Russian Car Plate Detection with OpenCV and TesseractOCR | by Kenneth Leung | Dec, 2020


We will be using a Python Jupyter notebook to build our project, and we will be leveraging the power of two open-source tools to make the magic happen, namely OpenCV and TesseractOCR. Before we proceed further, here are the steps to get these tools fully set up on your machine.

(i) OpenCV
OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. It mainly focuses on image processing, video capture and analysis, including features like face detection and object detection, and it helps to provide a common infrastructure for computer vision applications.

We install the Python version of OpenCV (in your virtual environment) with pip. The package is published on PyPI as opencv-python, so the command is: pip install opencv-python

(ii) Haar Cascade XML File
Besides installing the OpenCV library, another important thing to retrieve is the Haar Cascade XML file.

Let’s first talk a little about the theory behind Haar Cascades, since it’s an important concept to understand. In 2001, Paul Viola and Michael Jones came up with an object detection technique using Haar feature-based cascade classifiers. It is a machine learning based approach (involving AdaBoost) where a cascade function is trained from many positive and negative images. It extracts numerical values for features (e.g. edges, lines) efficiently with the concept of the integral image (or summed-area table), which beats the default, computationally heavy approach of subtracting sums of pixels across multiple regions of an entire image.
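To make the integral image idea more concrete, here is a minimal pure-Python sketch. It is not OpenCV’s implementation; the function names and the tiny 3x3 image are made up for illustration. It shows why the sum of any rectangle costs only four lookups once the table is built, which is what makes Haar features (differences between rectangle sums) cheap to evaluate.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros.

    ii[r][c] holds the sum of all pixels above and to the left of (r, c).
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            ii[r + 1][c + 1] = img[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii


def region_sum(ii, top, left, height, width):
    """Sum of a rectangle using four table lookups, whatever its size."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Illustrative 3x3 "image" (values are arbitrary).
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)

# A simple two-rectangle (edge-like) feature: left column minus right column.
left_sum = region_sum(table, 0, 0, 3, 1)
right_sum = region_sum(table, 0, 2, 3, 1)
edge_feature = left_sum - right_sum
print(left_sum, right_sum, edge_feature)
```

The table is built once per image, after which every feature at every position reuses it.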

In addition, it uses the concept of a ‘Cascade of Classifiers’. This means that instead of applying hundreds of classifiers for the many different features within the image in one go (which is really inefficient), the classifiers are applied one by one.

Take for example an image of a human face. If the first classifier for the ‘eyes’ feature has failed (i.e. it did not detect any human eyes in the image), the algorithm does not bother applying the subsequent classifiers (e.g. for the nose, for the mouth, etc.). Instead, it stops and declares that no face is detected. On the other hand, if this first ‘eyes’ feature is detected, the algorithm will apply the second stage of feature classification and continue with the classification process. In the end, if the image passes all classification stages, it can then be declared that a face region is indeed present.
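The early-exit behaviour described above can be sketched in a few lines of Python. This is a toy version of the analogy only: real Haar Cascade stages are boosted combinations of Haar features compared against thresholds, not named facial parts, and the stage functions and dictionary keys below are hypothetical.

```python
def run_cascade(window, stages):
    """Apply stages in order and stop at the first one that rejects the window.

    Returns (accepted, number_of_stages_passed).
    """
    for index, stage in enumerate(stages):
        if not stage(window):
            return False, index  # rejected early; later stages never run
    return True, len(stages)


# Hypothetical stages following the face example in the text.
stages = [
    lambda w: w["eyes"],
    lambda w: w["nose"],
    lambda w: w["mouth"],
]

no_eyes = {"eyes": False, "nose": True, "mouth": True}
full_face = {"eyes": True, "nose": True, "mouth": True}

print(run_cascade(no_eyes, stages))
print(run_cascade(full_face, stages))
```

Because most image regions contain no object at all, rejecting them at the first stage saves nearly all of the work.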

Visual illustration of the Haar Cascade process

OpenCV actually comes with pre-trained XML files of various Haar Cascades, where each XML file contains the feature set. We will be using the Haar Cascade XML file containing the features for Russian car plates, and here’s how you can download it:

  1. Visit the OpenCV GitHub page containing the Russian car plate Haar Cascade by clicking here.
  2. Right-click on the screen (which should be showing a wall of text with the top line being <?xml version="1.0"?>), and click ‘Save as...’.
  3. In the Save pop-up, you should see the default file name of ‘haarcascade_russian_plate_number’ and file type ‘XML file’. Leave them as the default, and save this XML file in the path where your Jupyter notebook is located.

To explore other XML files available for experimentation, take a look at the OpenCV Haar Cascades GitHub resource here. If you want to find out more about the theory of Haar Cascades, refer to the OpenCV tutorial here.

(iii) TesseractOCR
TesseractOCR is an open-source optical character recognition (OCR) engine. It is known as one of the most popular and most accurate open-source OCR engines. The fun fact is that this engine actually started off as proprietary software developed by Hewlett Packard, but was later open-sourced in 2005, and its development has since been sponsored by Google.

Here are the instructions for installing TesseractOCR (for Windows):

  1. Install the TesseractOCR application using the Windows installer. At the time of this project, I downloaded the 27 Nov 2020 (64-bit) version (tesseract-ocr-w64-setup-v5.0.0-alpha.20201127.exe).
  2. Run the downloaded installer and take note of where the application is installed. For me, I installed it in the folder D:\Program Files\Tesseract-OCR. We will be using this folder path later on, and this is important because we will need to point directly to the tesseract.exe within the folder.
  3. Install the Python version of TesseractOCR (i.e. PyTesseract) in your environment with pip. The package is published on PyPI as pytesseract, so the command is: pip install pytesseract

The above dependencies can then be initialized with the following code:
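The original code listing is not reproduced here, so what follows is only a minimal sketch of one way to do the initialization. It assumes the install folder mentioned in step 2 above. The helper names missing_packages and configure_tesseract are made up for illustration. The only library-specific detail is pytesseract.pytesseract.tesseract_cmd, the attribute PyTesseract reads to locate the executable.

```python
import importlib.util
from pathlib import PureWindowsPath

# Folder from step 2 above; change this to wherever the installer put Tesseract.
TESSERACT_EXE = str(PureWindowsPath(r"D:\Program Files\Tesseract-OCR") / "tesseract.exe")


def missing_packages(names=("cv2", "pytesseract", "matplotlib")):
    """Return the names of required packages that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def configure_tesseract(exe_path=TESSERACT_EXE):
    """Point PyTesseract directly at tesseract.exe and return the module."""
    import pytesseract  # imported here so the availability check can run first

    pytesseract.pytesseract.tesseract_cmd = exe_path
    return pytesseract


missing = missing_packages()
if missing:
    print("Install these packages first:", ", ".join(missing))
else:
    print("All packages found; call configure_tesseract() next.")
```

In the notebook itself, you would then typically run import cv2, import matplotlib.pyplot as plt, and pytesseract = configure_tesseract().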

First of all, we import the input car image we want to work with. Because OpenCV imports images in BGR (Blue-Green-Red) format by default, we will need to run cv2.cvtColor to switch it to RGB format before we ask matplotlib to display the image.
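To see what the conversion does to each pixel, here is a small pure-Python illustration on a nested list (the helper name and the two-pixel image are made up). On a real image, cv2.cvtColor(image, cv2.COLOR_BGR2RGB) performs the same channel reordering on the NumPy array that OpenCV loads.

```python
def bgr_to_rgb(image):
    """Return a copy of a nested-list image with each (B, G, R) pixel reordered to (R, G, B)."""
    return [[(r, g, b) for (b, g, r) in row] for row in image]


# One row with two pixels: pure blue and pure red, both written in BGR order.
bgr_image = [[(255, 0, 0), (0, 0, 255)]]
rgb_image = bgr_to_rgb(bgr_image)
print(rgb_image)  # → [[(0, 0, 255), (255, 0, 0)]]
```

Without this step, matplotlib would interpret the blue channel as red, and the displayed car would have its reds and blues swapped.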

Our input car image. Source: “FAB Design!” by Niklas Emmerich, licensed under CC BY-NC-SA 2.0

It’s now time to bring in the Haar Cascade feature set (XML file) for Russian car plates, with the use of OpenCV’s CascadeClassifier function.

Next we make use of the detectMultiScale method of the CascadeClassifier to run the detection.
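As a rough sketch of these two steps (not the article’s original listing), the helper below wraps the detectMultiScale call. The function name detect_plates and the image file name in the comments are hypothetical. scaleFactor and minNeighbors are the method’s real keyword arguments, and the values shown are the ones used in this article.

```python
def detect_plates(image, cascade, scale_factor=1.1, min_neighbors=5):
    """Run the cascade on an image and return plain (x, y, w, h) tuples."""
    rects = cascade.detectMultiScale(
        image, scaleFactor=scale_factor, minNeighbors=min_neighbors
    )
    # detectMultiScale yields one row of four numbers per detection.
    return [tuple(int(value) for value in rect) for rect in rects]


# In the notebook (assumes opencv-python is installed and the XML file
# sits next to the notebook; "car.jpg" is a placeholder file name):
#   import cv2
#   cascade = cv2.CascadeClassifier("haarcascade_russian_plate_number.xml")
#   image = cv2.imread("car.jpg")
#   print(detect_plates(image, cascade))
```

Passing the classifier in as an argument keeps the helper independent of where the XML file lives.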

Let’s briefly talk about the OpenCV detectMultiScale method. The method allows us to detect objects of different sizes in the input image, and it returns a list of rectangle bounds where objects are detected. For each rectangle, four values are returned, and they correspond to the following respectively:

  • x-coordinate of the top-left corner of the rectangle (x)
  • y-coordinate of the top-left corner of the rectangle (y)
  • width of the rectangle (w)
  • height of the rectangle (h)
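The four values can be turned into corner points (for drawing a box) or used to crop the detected region. The sketch below uses made-up helper names and a nested-list image for illustration. It relies on the image convention that the origin is the top-left pixel, with y increasing downwards. With a NumPy image, the equivalent crop is image[y:y + h, x:x + w].

```python
def rect_corners(rect):
    """Convert (x, y, w, h) into its top-left and bottom-right corner points."""
    x, y, w, h = rect
    return (x, y), (x + w, y + h)


def crop(image, rect):
    """Cut the rectangle out of a nested-list image (rows first, then columns)."""
    x, y, w, h = rect
    return [row[x:x + w] for row in image[y:y + h]]


# A 4-row by 5-column test image where each value encodes row*10 + column.
image = [[r * 10 + c for c in range(5)] for r in range(4)]

print(rect_corners((10, 20, 30, 40)))  # → ((10, 20), (40, 60))
print(crop(image, (1, 2, 3, 2)))       # → [[21, 22, 23], [31, 32, 33]]
```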

The key parameters involved in the detectMultiScale function are scaleFactor and minNeighbors.

  • scaleFactor specifies how much the image size is reduced at each image scale (as part of the scale pyramid, which is a multi-scale representation of an image). In essence, when object detection models are trained, they are trained to detect objects (i.e. car plates in our case) of a fixed size, and may miss car plates that are bigger or smaller than expected. As part of the scale pyramid, the image is resized several times in the hope that a car plate will end up being a “detectable” size. I used the default scale factor of 1.1, which means that OpenCV will scale the image down by roughly 10% at each step to try and match the car plates better.
  • minNeighbors allows us to specify how many neighbors each candidate rectangle should have in order for the candidate rectangle to be retained. In simpler terms, this parameter influences the quality of the detected objects. A higher value results in fewer detections, but the detections come with higher quality and accuracy. This means that a higher value can actually help with reducing the number of false positives. In my case, the default value of 3 gave me several false positives, so I increased it to 5.
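To get a feel for what scaleFactor does, the sketch below lists the image sizes a scale pyramid would visit. It is a simplified model, not OpenCV’s internal code: the function name is made up, and the minimum window size is a placeholder rather than the actual detection window of the Russian plate cascade.

```python
def pyramid_sizes(width, height, scale_factor=1.1, min_size=(20, 20)):
    """List the (width, height) of each pyramid level, from largest to smallest.

    Each level divides the original size by a further factor of scale_factor,
    stopping once the image would be smaller than the detection window.
    """
    if scale_factor <= 1:
        raise ValueError("scale_factor must be greater than 1")
    sizes = []
    scale = 1.0
    while True:
        w, h = round(width / scale), round(height / scale)
        if w < min_size[0] or h < min_size[1]:
            break
        sizes.append((w, h))
        scale *= scale_factor
    return sizes


fine = pyramid_sizes(640, 480, scale_factor=1.1)
coarse = pyramid_sizes(640, 480, scale_factor=1.5)
print(len(fine), "levels at 1.1 versus", len(coarse), "levels at 1.5")
```

A smaller scaleFactor visits more levels, so it is less likely to skip the scale at which a plate matches, at the cost of more computation.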

These two parameters can be tuned accordingly in order to improve your results. For more details, take a look at the OpenCV documentation here.

Let’s now run our function for car plate detection:

Input image with red rectangle bounds around the detected car license plate

We can see that our function worked a treat! The car license plate has been successfully detected and bounded by a red rectangle. Our next step is to put our focus on the license plate itself, and work towards extracting the numbers and text of the car plate using OCR capabilities.

