Text extraction from image using OpenCV and OCR Python
In this article, you will learn how to extract text from an image using Python OpenCV and OCR.
OpenCV stands for Open Source Computer Vision Library. It is a free, open source library which is used for computer vision applications.
OCR (Optical Character Recognition) is the process of recognizing text in an image and converting it into machine-readable text. OCR is used across a wide range of industries and tasks: scanned records, bank statements, receipts, handwritten notes, coupons and other documents can all be processed with it. It is mainly used to convert scanned documents into searchable text files. Converting documents to text can also reduce file size for easier transfer and sharing, and it saves a lot of time when paper documents are frequently converted into electronic files.
Tesseract is an open-source text recognition engine. It is widely used to extract text from images and documents because it gives reasonably accurate results.
Install Tesseract OCR on Windows
For Windows users, download the Tesseract installer (exe file), either 32-bit or 64-bit depending on your system, from here. Run the downloaded exe file and follow the installer steps -
Then add the Tesseract installation folder to the Path variable under System Variables in the Environment Variables window -
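After updating the Path variable, one quick way to confirm that the executable can be found is to look it up from Python. The snippet below is a minimal sketch that uses only the standard library. The helper name find_tesseract is made up for this illustration, and a new terminal may need to be opened before the Path change becomes visible.

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH.

    find_tesseract is a hypothetical helper for this article, not part of pytesseract.
    """
    return shutil.which(executable)


path = find_tesseract()
if path is None:
    print("tesseract was not found on PATH")
else:
    print("tesseract found at:", path)
```

If the lookup returns None, the path can still be set explicitly in the script through pytesseract.pytesseract.tesseract_cmd.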
We need two Python modules, opencv-python and pytesseract, in addition to the Tesseract engine installed above. Pytesseract is a wrapper for the Tesseract-OCR Engine, so it relies on that separately installed executable. The engine itself comes from the installer described above rather than from pip, so a pip install tesseract step is not needed. Pytesseract is also useful as a stand-alone invocation script to tesseract. It can read all image types supported by the Pillow and Leptonica imaging libraries, including png, jpeg, gif, bmp, and others. We can install the modules using the pip tool -
pip install opencv-python
pip install pytesseract
Code to Extract Text From Image using Tesseract
Suppose we have the following test image, quotes.jpg, located in the working directory -
First, we have created a Python file and imported all the necessary modules at the top -
# text recognition
import cv2
import pytesseract
Next, we have used the imread() function to load the test image from the specified location -
# read image
img = cv2.imread('quotes.jpg')
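Note that cv2.imread() does not raise an error when the file is missing or unreadable. It returns None, and the failure only shows up later in the script. A small guard like the one below can make the problem obvious. This is only a sketch: check_loaded is a hypothetical helper name, not an OpenCV function.

```python
import os


def check_loaded(img, path):
    """Return img unchanged, or raise a clear error if cv2.imread() gave None.

    check_loaded is a hypothetical helper for illustration only.
    """
    if img is None:
        if not os.path.exists(path):
            raise FileNotFoundError(f"Image file not found: {path}")
        raise ValueError(f"File exists but could not be decoded as an image: {path}")
    return img


# Intended usage with OpenCV (assumes cv2 is installed):
#   img = check_loaded(cv2.imread('quotes.jpg'), 'quotes.jpg')
```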
Here, we set the custom configuration options -
# configurations
config = ('-l eng --oem 1 --psm 3')
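In this string, -l eng selects the English language data, --oem 1 selects the LSTM neural network engine (Tesseract 4 and later), and --psm 3 selects fully automatic page segmentation without orientation and script detection, which is Tesseract's default mode. If you expect to try several combinations, the string can be assembled from named values. The helper below is only a sketch; build_config and its parameter names are assumptions made for this example.

```python
def build_config(lang="eng", oem=1, psm=3):
    """Assemble a Tesseract option string from language, engine mode and segmentation mode.

    build_config is a hypothetical helper, not part of pytesseract.
    """
    return f"-l {lang} --oem {oem} --psm {psm}"


config = build_config()
print(config)  # -l eng --oem 1 --psm 3

# --psm 6 tells Tesseract to assume a single uniform block of text
single_block = build_config(psm=6)
print(single_block)  # -l eng --oem 1 --psm 6
```

The resulting string is passed to pytesseract in the same way as the hand-written one, through the config argument.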
If you have not configured the tesseract executable in your System variables PATH, include the following -
# pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'
Next, we convert from image to string using the method image_to_string() -
text = pytesseract.image_to_string(img, config=config)
Finally, we can print the text extracted from the image -
# print text
text = text.split('\n')
print(text)
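Splitting on '\n' usually leaves empty strings in the list, because OCR output tends to contain blank lines and trailing whitespace. If only the text lines are wanted, they can be filtered as sketched below. clean_lines is a hypothetical helper name, and the sample string merely stands in for real OCR output.

```python
def clean_lines(text):
    """Split OCR output into lines, dropping blank lines and surrounding whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Stand-in for OCR output: blank lines, stray spaces and a trailing form feed
sample = "First line \n\n  Second line\n\x0c"
print(clean_lines(sample))  # ['First line', 'Second line']
```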
Let's merge all the above code and execute -
# text recognition
import cv2
import pytesseract

# read image
img = cv2.imread('quotes.jpg')

# configurations
config = ('-l eng --oem 1 --psm 3')

# pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'
text = pytesseract.image_to_string(img, config=config)

# print text
text = text.split('\n')
print(text)
The above code returns the following output -
Extract Text from Image and Save to a Text File
Here, we first convert the image to grayscale and apply an Otsu threshold, then specify the kernel shape and size and dilate the thresholded image. Next, we find the contours, loop over them and crop the bounding rectangle of each one. Each cropped area is passed to pytesseract to extract its text, which is then written to a text file.
# import modules
import cv2
import pytesseract

# read image
img = cv2.imread('quotes.png')

# set configurations
config = ('-l eng --oem 1 --psm 3')

# Convert the image to gray scale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Performing OTSU threshold
ret, threshimg = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)

# Specifying kernel size and structure shape
rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (18, 18))

# Applying dilation on the threshold image
dilation = cv2.dilate(threshimg, rect_kernel, iterations=1)

# Getting contours (OpenCV 4.x returns two values; OpenCV 3.x returns three)
img_contours, hierarchy = cv2.findContours(dilation, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

# Open the text file once, in append mode; the with block closes it automatically
with open("recognized.txt", "a") as file:
    # Loop over contours, crop each text block and extract its text
    for cnt in img_contours:
        x, y, w, h = cv2.boundingRect(cnt)

        # Cropping the text block first, so the drawn border is not part of the crop
        cropped_img = img[y:y + h, x:x + w].copy()

        # Drawing a rectangle around the block on the original image
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

        # Applying tesseract OCR on the cropped image
        text = pytesseract.image_to_string(cropped_img, config=config)

        # Appending the text into the file
        file.write(text)
        file.write("\n")
Output of the above code -
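One caveat with the contour loop: cv2.findContours() does not promise to return contours in reading order, so the blocks may be appended to recognized.txt in a different order than they appear on the page. One possible remedy, sketched here under the assumption that text rows are roughly horizontal, is to collect the bounding rectangles first and sort them top to bottom, then left to right. sort_reading_order and row_height are names invented for this example, and the default of 20 pixels is only a guess that would need tuning per image.

```python
def sort_reading_order(boxes, row_height=20):
    """Sort (x, y, w, h) boxes top-to-bottom, then left-to-right.

    Boxes whose y values fall in the same band of row_height pixels are
    treated as one row. Hypothetical helper; row_height is a tuning guess.
    """
    return sorted(boxes, key=lambda b: (b[1] // row_height, b[0]))


# Example boxes in the (x, y, w, h) form returned by cv2.boundingRect()
boxes = [(200, 10, 50, 20), (10, 100, 50, 20), (10, 12, 50, 20)]
print(sort_reading_order(boxes))
# [(10, 12, 50, 20), (200, 10, 50, 20), (10, 100, 50, 20)]
```

In the script above, the list could be built with boxes = [cv2.boundingRect(cnt) for cnt in img_contours] and the loop would then iterate over the sorted boxes instead of the raw contours.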