OpenCV | Document Scanning & Optical Character Recognition (OCR)

Eranthe · Updated: 2024-11-10



Step 1. Import the required packages and a local Python file named resize.py for the project.

import cv2
import numpy as np
import resize

Step 2. Load and preprocess the image.
Read in the picture to be scanned. If its resolution is good enough, a frame captured with the laptop camera also works.

image = cv2.imread('test.jpg')
# Resize to a fixed working size (this assumes a 4:3 source image).
image = cv2.resize(image, (1500, 1125))
# Keep a copy of the resized image for the perspective transformation later.
orig = image.copy()
# Convert to grayscale, then apply a Gaussian blur to reduce noise.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
# Detect edges with the Canny algorithm (hysteresis thresholds 0 and 50).
edged = cv2.Canny(blurred, 0, 50)
# Keep a copy of the edge map.
orig_edged = edged.copy()
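
The fixed size (1500, 1125) leaves the picture undistorted only when the source is 4:3. Below is a minimal sketch of one way to derive the target size from the image's own proportions. The helper name fit_width is hypothetical and not part of the original project.

```python
def fit_width(width, height, target_width=1500):
    """Return (new_width, new_height) scaled to target_width, keeping the aspect ratio."""
    scale = target_width / float(width)
    return target_width, int(round(height * scale))


# A 4:3 photo maps to the size used in this article.
print(fit_width(4000, 3000))
# A 16:9 photo gets a different height instead of being stretched.
print(fit_width(1920, 1080))
```

With OpenCV, the result can be passed straight to cv2.resize, for example cv2.resize(image, fit_width(image.shape[1], image.shape[0])), because image.shape is (height, width, channels) while cv2.resize expects (width, height).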

Step 3. Get approximate contours of the image.

Find the contours in the edge image, sort them by area from largest to smallest, and keep the first one that can be approximated by a quadrilateral as the document outline.

# findContours() finds contours in a binary image.
# (OpenCV 4.x returns two values; OpenCV 3.x returns three.)
contours, hierarchy = cv2.findContours(edged, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
# Sort the contours by area, largest first.
contours = sorted(contours, key=cv2.contourArea, reverse=True)
# Approximate each contour with a polygon and stop at the first quadrilateral.
target = None
for c in contours:
    # Perimeter of the closed contour.
    p = cv2.arcLength(c, True)
    # Use 2% of the perimeter as the approximation accuracy. The contour is closed, so the closed parameter is True.
    approx = cv2.approxPolyDP(c, 0.02 * p, True)
    if len(approx) == 4:
        # This is the four-sided outline we are looking for.
        target = approx
        break
if target is None:
    raise RuntimeError('No four-sided contour was found.')

Step 4. Create a function to rectify the target, that is, to put its four corners in a consistent order.

Note: the rectify function is stored in resize.py.

import numpy as np

def rectify(h):
    # Order the four corners as top-left, top-right, bottom-right, bottom-left.
    h = h.reshape((4, 2))
    hnew = np.zeros((4, 2), dtype=np.float32)
    # x + y is smallest at the top-left corner and largest at the bottom-right corner.
    add = h.sum(1)
    hnew[0] = h[np.argmin(add)]
    hnew[2] = h[np.argmax(add)]
    # np.diff along axis 1 gives y - x: smallest at the top-right corner, largest at the bottom-left corner.
    diff = np.diff(h, axis=1)
    hnew[1] = h[np.argmin(diff)]
    hnew[3] = h[np.argmax(diff)]
    return hnew

# In the main script, order the corners of the detected document:
approx = resize.rectify(target)
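
A short worked example may make the ordering easier to follow. The function body below repeats the logic of rectify so the snippet runs on its own; the corner coordinates are made up and given in arbitrary order, in the (4, 1, 2) shape that approxPolyDP returns.

```python
import numpy as np


def rectify(h):
    # Same logic as rectify in resize.py, repeated so this example is self-contained.
    h = h.reshape((4, 2))
    hnew = np.zeros((4, 2), dtype=np.float32)
    add = h.sum(1)                 # x + y for each corner
    hnew[0] = h[np.argmin(add)]    # top-left
    hnew[2] = h[np.argmax(add)]    # bottom-right
    diff = np.diff(h, axis=1)      # y - x for each corner
    hnew[1] = h[np.argmin(diff)]   # top-right
    hnew[3] = h[np.argmax(diff)]   # bottom-left
    return hnew


# Hypothetical corners in arbitrary order.
corners = np.array([[[320, 260]], [[60, 40]], [[80, 240]], [[340, 60]]], dtype=np.int32)
ordered = rectify(corners)
print(ordered)
```

Here the sums x + y are 580, 100, 320 and 400, so (60, 40) becomes the top-left corner and (320, 260) the bottom-right. The differences y - x are -60, -20, 160 and -280, so (340, 60) becomes the top-right corner and (80, 240) the bottom-left.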

Step 5. Map the target quadrilateral onto a rectangle of size (400 * 600) with a perspective transformation.

# Corners of the output image, in the same order as the corners returned by rectify.
pts2 = np.float32([[0, 0], [400, 0], [400, 600], [0, 600]])
# Use the getPerspectiveTransform function to obtain the perspective transformation matrix.
# (approx holds the four corner points of the quadrilateral in the source image; pts2 holds the four corresponding corner points in the target image.)
M = cv2.getPerspectiveTransform(approx, pts2)
# Use the warpPerspective function to apply the transformation to the source image; the output image dst is 400 * 600.
dst = cv2.warpPerspective(orig, M, (400, 600))
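
For readers curious about what the 3 x 3 matrix contains, the following NumPy-only sketch solves for a matrix with the same meaning from four point pairs and applies it to a single point. It illustrates the underlying linear system and is not OpenCV's actual implementation; the function names and the source corners are made up.

```python
import numpy as np


def perspective_matrix(src, dst):
    """Solve for the 3x3 matrix that maps four src points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each point pair gives two linear equations in the eight unknowns.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=np.float64),
                        np.array(rhs, dtype=np.float64))
    # The ninth entry is fixed to 1.
    return np.append(h, 1.0).reshape(3, 3)


def apply_perspective(matrix, point):
    """Map one (x, y) point through the matrix, including the perspective divide."""
    x, y, w = matrix @ np.array([point[0], point[1], 1.0])
    return x / w, y / w


# Hypothetical ordered corners (top-left, top-right, bottom-right, bottom-left).
src = np.float32([[60, 40], [340, 60], [320, 260], [80, 240]])
dst = np.float32([[0, 0], [400, 0], [400, 600], [0, 600]])
matrix = perspective_matrix(src, dst)

# The bottom-right source corner should land very close to (400, 600).
print(apply_perspective(matrix, src[2]))
```

warpPerspective applies this kind of mapping to the whole image, which is why the corner order in approx has to match the order in pts2.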

Step 6. Try several thresholding methods on the transformed image to obtain the final result.
Compare the results of the methods below and choose the most suitable one as the final output. The results of each method are not shown in this article; if you are interested, try them yourself.

# Convert the transformed image to grayscale.
dst = cv2.cvtColor(dst, cv2.COLOR_BGR2GRAY)
# Draw the detected outline on the resized image: -1 draws every contour in the list, the color is green, and the thickness is 2.
cv2.drawContours(image, [target], -1, (0, 255, 0), 2)
# Global threshold at 127.
ret, th1 = cv2.threshold(dst, 127, 255, cv2.THRESH_BINARY)
# Otsu's binarization (the threshold is chosen automatically).
ret2, th2 = cv2.threshold(dst, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# Adaptive mean threshold (11 x 11 neighborhood, constant 2).
th3 = cv2.adaptiveThreshold(dst, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)
# Adaptive Gaussian threshold (11 x 11 neighborhood, constant 2).
th4 = cv2.adaptiveThreshold(dst, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
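
Otsu's method differs from the fixed threshold of 127 in that it chooses the threshold from the image histogram. The NumPy sketch below shows the idea on a synthetic page: it picks the gray level that maximizes the between-class variance. It is an illustration only; OpenCV's implementation may differ in details, and the function name and test image are made up.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the gray level that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # share of pixels at or below each level
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean up to each level
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    between = np.zeros(256)
    valid = denom > 1e-12                      # skip levels that put every pixel in one class
    between[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))


# Synthetic page: light paper (200) with a dark band of "text" (50).
page = np.full((100, 100), 200, dtype=np.uint8)
page[40:60, 10:90] = 50

t = otsu_threshold(page)
# Same rule as THRESH_BINARY: pixels above the threshold become white.
binary = np.where(page > t, 255, 0).astype(np.uint8)
print(t, binary[50, 50], binary[0, 0])
```

On scans with uneven lighting a single global threshold, even one chosen this way, can fail in the darker regions, which is where the two adaptive variants above tend to do better.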

The original image:

Result:

Step 7. Run optical character recognition (OCR).

Notes:
1. For Windows users: before installing the pytesseract package, install Tesseract itself on the Windows system with the tesseract-ocr-setup-4.00.00dev.exe installer.
2. For Mac users: install Homebrew on the Mac system, use Homebrew to install Tesseract, and then install the pytesseract package with pip.
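
Because pytesseract only wraps the Tesseract command-line program, a quick check that the program can be found may save some debugging. This is a small sketch using only the standard library; the function name is hypothetical.

```python
import shutil


def tesseract_on_path(executable="tesseract"):
    """Return True if the Tesseract command-line program can be found on PATH."""
    return shutil.which(executable) is not None


if tesseract_on_path():
    print("Tesseract found; pytesseract should be able to call it.")
else:
    print("Tesseract not found on PATH; install it before using pytesseract.")
```

On Windows, if the installer did not add Tesseract to PATH, pytesseract can be pointed at the executable by setting pytesseract.pytesseract.tesseract_cmd to its full path.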

Code:

from PIL import Image
import pytesseract
import cv2
import os

# Choose how to preprocess the image: 'thresh' or 'blur'.
preprocess = 'thresh'
image = cv2.imread('scan.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
if preprocess == 'thresh':
    gray = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
if preprocess == 'blur':
    gray = cv2.medianBlur(gray, 3)
# Write the preprocessed image to a temporary file named after the process ID.
filename = "{}.jpg".format(os.getpid())
cv2.imwrite(filename, gray)
# Use OCR to recognize the text in the image.
text = pytesseract.image_to_string(Image.open(filename))
# Print the recognized text, then delete the temporary file.
print(text)
os.remove(filename)
# Show the input image and the preprocessed image until a key is pressed.
cv2.imshow('image', image)
cv2.imshow('output', gray)
cv2.waitKey(0)
cv2.destroyAllWindows()

With the settings above, it recognizes only English text and numbers. The scanned picture, however, also contains Japanese, so the result is poor.
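
One option that was not tried in this article is to tell Tesseract which languages to expect through the lang parameter of image_to_string. The sketch below assumes that the Japanese language data (jpn) has been installed for Tesseract; the helper names are hypothetical, and whether the result improves depends on the scan.

```python
def join_languages(codes):
    """Build the language string Tesseract expects, for example 'eng+jpn'."""
    return "+".join(codes)


def recognize(path, codes=("eng", "jpn")):
    # Imported here so that join_languages works without the OCR packages installed.
    from PIL import Image
    import pytesseract
    return pytesseract.image_to_string(Image.open(path), lang=join_languages(codes))


print(join_languages(["eng", "jpn"]))
```

Calling recognize('scan.jpg') would then run OCR with both English and Japanese enabled.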

The scanned image:

result:
Therefore, we chose another scanned image, mostly English text and numbers, to see how well OCR really performs.
The scanned image:

The results are as follows.

The result is much better, but a few small mistakes remain where characters are not recognized correctly.
Thank you for reading!

--credit by dora 2020.4.13

Resources:
https://www.bilibili.com/video/BV1X4411Z7qV?p=10
https://blog.csdn.net/showgea/article/details/82656515
https://zhuanlan.zhihu.com/p/93092044
https://blog.csdn.net/gaoyu1253401563/article/details/84995349
https://zhuanlan.zhihu.com/p/59805070

Author: grid_vision
