Academic
Blog
Analytics

Document detection

2 mins read time ❤️ Views


Scanned images have always undesired white space. It is tedious to crop manually when you have several images. In this blog we show how to implemented a simple algorithm to detect the document of an image using image preprocessing functions of opencv.

Image preprocessing

We use an example image found on the internet that is similar to scanned image with excessive white space around the document.

We use opencv 3.3 to read the image, transform it in grey level, then we use a bilateral filter to remove noise and keep information of the borders, then we apply a equalized filter to increase the contrast of the image and finally we use canny edge detection algorithm to detect edges of the image.

img = cv2.imread('cni.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
bilateral = cv2.bilateralFilter(gray, 5, 5,5)
eq = cv2.equalizeHist(bilateral)
edged = cv2.Canny(eq, 0, 150)

Contour detection

We use Opencv function findContours to find connected pixels of the image. We use the retrieval hierarchy mode to obtain all contours in a nested hierarchy and simple contour method to obtain simplified contours. We set a condition of the area of contours to take the biggest contours in the image.

_, contours, hierarchy = cv2.findContours(edged, cv2.RETR_TREE,  cv2.CHAIN_APPROX_SIMPLE)
h, w = img.shape[:2]
thresh_area = 0.001
list_contours = list()
for c in contours:
    area = cv2.contourArea(c)

    if (area > thresh_area*h*w): 
        rect_page = cv2.minAreaRect(c)
        box_page = np.int0(cv2.boxPoints(rect_page))
        list_contours.append(box_page)

The result are the biggest contours in the image. Then we can use the biggest one to crop our image.

Image crop

We use the extreme points of the contours to crop the image.

Background color substitution

The biggest contour can be found using the max python function, ordering by contour area with the function cv2.contourArea. Other contour properties can be found at Opencv Contours. This biggest contour is filled with the function cv2.fillPoly to create the mask to select the important information of the image.

c = max(list_contours, key=cv2.contourArea)
mask = np.zeros(img.shape, dtype=np.uint8)
mask = cv2.fillPoly(img=img.copy(), pts=c.reshape(1, -2, 2), color=(0,0,0))
masked_image = cv2.bitwise_and(img, ~mask)

This mask is used again to initializate the background of the image and then add it to the masked image from the previous step.

background = np.zeros(img.shape, dtype=np.uint8)
background[:,:,:] = 128
masked_bg = cv2.bitwise_and(background, mask)

The complete code can be found on this notebook.

In relation with 🏷️ opencv, python:

TheFifthDriver: Machine learning driving assistance on FPGA

FPGA implementation of a highly efficient real-time machine learning driving assistance application using a camera circuit.

📅 Dec 7, 2020  Hackster.io   fpga   python   opencv
Intelligent Video Analytics using SSD mobilenet on NVIDIA's Jetson Nano

This project implements a deep learning model on a Jetson Nano to count and track people passing in front of a video camera.

Real time motion detection in Raspberry Pi

In this article I show how to use a Raspberry Pi with motion detection algorithms and schedule task to detect objects using SSD Mobilenet and Yolo models.

Object detection using a Raspberry Pi with Yolo and SSD Mobilenet

This post how how to implement a light object detection algorithm

Subscribe to my newsletter 📰

and share it with your friends: