How To Recognize Text With Colored Background Images?

I am new to OpenCV and Python, as well as Tesseract. I am creating a script that will recognize text from an image. My code works perfectly on black text and white background o

Solution 1:

Here are two different approaches:

1. Traditional image processing and contour filtering

The main idea is to extract the ROI then apply Tesseract OCR.

  • Convert the image to grayscale and apply a Gaussian blur
  • Apply an adaptive threshold
  • Find contours
  • Iterate through the contours and filter them using contour approximation and area
  • Extract the ROI

Once we obtain a binary image from adaptive thresholding, we find contours and filter them using contour approximation with cv2.arcLength() and cv2.approxPolyDP(). If the approximated contour has four points, we assume it is a rectangle or a square. In addition, we apply a second filter on contour area to ensure that we isolate the correct ROI. Here's the extracted ROI:

[Image: extracted ROI]

import cv2

# Load the image, convert to grayscale, blur, and apply an inverted adaptive threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3, 3), 0)
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 9, 3)

# findContours returns 2 values in some OpenCV versions and 3 in others
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

# Keep four-point contours with a large enough area and save each one as an ROI
ROI_number = 0
for c in cnts:
    area = cv2.contourArea(c)
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.05 * peri, True)
    if len(approx) == 4 and area > 2200:
        x, y, w, h = cv2.boundingRect(approx)
        ROI = image[y:y+h, x:x+w]
        cv2.imwrite('ROI_{}.png'.format(ROI_number), ROI)
        ROI_number += 1

Now we can throw this into Pytesseract. Note that Pytesseract expects the text to be black and the background to be white, so we do a bit of preprocessing first. Here's the preprocessed image and the result from Pytesseract:

[Image: preprocessed ROI]

Reboot

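import cv2
import pytesseract

# Windows only: point pytesseract at the Tesseract executable (adjust to your install path)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load the extracted ROI as grayscale (the script above saves ROI_0.png, ROI_1.png, ...)
image = cv2.imread('ROI.png', 0)
thresh = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

# Invert so the text is black on a white background
result = 255 - thresh

# --psm 10 treats the image as a single character; --psm 7 (single line) or 8 (single word) may also be worth trying
data = pytesseract.image_to_string(result, lang='eng', config='--psm 10')
print(data)

cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.waitKey()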

Normally, you would also use morphological transformations to smooth the image, but in this case the text is already clean enough.

2. Color Thresholding

The second approach is to use color thresholding with lower and upper HSV bounds to create a mask from which we can extract the ROI. The original answer linked to a complete example here. Once the ROI is extracted, we follow the same steps to preprocess the image before throwing it into Pytesseract.