Tuesday, 19 September, 2017 UTC


Summary

A few days ago, while browsing the internet, I came across a very interesting article, How I replicated an $86 million project in 57 lines of code. The article leans heavily on open source technology for license plate recognition. Its author also argues for on-device image processing, which would require installing image processing software on every device.
Reading this, a few thoughts came to mind:
  1. Open source technology is great, but it also has its downsides. For instance, a bug in the open source software means every device that has the software running could be hacked easily and instantly. Remember the Heartbleed bug?
  2. Fixing and updating the software installed on these devices would be a massive pain. They would have to be upgraded individually.
  3. I can definitely build a similar system using alternative technology because I am a software engineer.
I use Cloudinary in production for virtually every project that involves image, audio, or video uploads, filtering, and transformations. So I skimmed through Cloudinary’s documentation to see if there was an easy way to extract text from images. Unsurprisingly, I found a hidden gem, and I’ll show you how to use it as we build our license plate recognition system in 31 lines of code.
Wait, what’s Cloudinary, anyway?
Cloudinary is a cloud-based solution for image and video management. It covers server-side and client-side upload, a huge range of on-the-fly image and video manipulation options (including face detection), fast delivery through a content delivery network (CDN), and powerful asset management options.
Cloudinary enables web and mobile developers to address all of their media management needs with simple bits of code in their favorite programming languages or frameworks, freeing them to focus primarily on their own product's value proposition.
Step 1: Create a Cloudinary Account
Sign up for a free Cloudinary account.
Once you are signed up, you will be redirected to the dashboard where you can get your credentials.
Take note of your Cloud name, API Key, and API Secret.
Step 2: Set Up A Node Server
Initialize a package.json file:
 npm init
Install the following modules:
 npm install express connect-multiparty cloudinary cors body-parser --save
  • express: We need this module for our API routes
  • connect-multiparty: Needed for parsing HTTP requests with content-type multipart/form-data
  • cloudinary: Node SDK for Cloudinary
  • body-parser: Needed for attaching the request body on express’s req object
  • cors: Needed for enabling CORS
Step 3: Activate OCR Text Detection and Extraction Add-on
Go to the dashboard add-ons section. Click on OCR Text Detection and Extraction Add-on and select the Free Plan.
Note: You can change to other plans as your usage increases.
Step 4: Set Up License Plate Character Recognition
Create a server.js file in your root directory. Require the dependencies we installed:
const express = require('express');
const app = express();
const multipart = require('connect-multiparty');
const cloudinary = require('cloudinary');
const cors = require('cors');
const bodyParser = require('body-parser');

app.use(bodyParser.json());
app.use(bodyParser.urlencoded({ extended: true }));
app.use(cors());

const multipartMiddleware = multipart();
Next, configure Cloudinary:
cloudinary.config({
    cloud_name: 'xxxxxxxx',
    api_key: 'xxxxxxxx',
    api_secret: 'xxxxxxx'
});
Replace xxxxxx with the real values from your dashboard.
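Hard-coding the API secret in server.js is fine for a quick demo, but it is easy to commit by accident. Here is a minimal sketch that reads the three values from environment variables instead. The variable names CLOUDINARY_CLOUD_NAME, CLOUDINARY_API_KEY and CLOUDINARY_API_SECRET and the helper cloudinaryConfigFromEnv are hypothetical, chosen only for illustration:

```javascript
// Hypothetical helper: builds the object passed to cloudinary.config()
// from environment variables instead of literals in the source.
// The variable names below are assumptions, not Cloudinary requirements.
function cloudinaryConfigFromEnv(env) {
  const required = {
    cloud_name: 'CLOUDINARY_CLOUD_NAME',
    api_key: 'CLOUDINARY_API_KEY',
    api_secret: 'CLOUDINARY_API_SECRET'
  };
  const config = {};
  for (const [option, variable] of Object.entries(required)) {
    if (!env[variable]) {
      // Fail early with a clear message rather than at upload time.
      throw new Error('Missing environment variable: ' + variable);
    }
    config[option] = env[variable];
  }
  return config;
}

// Usage in server.js (commented out so the sketch runs on its own):
// cloudinary.config(cloudinaryConfigFromEnv(process.env));
```

Failing at startup when a value is missing makes a misconfigured server obvious before the first upload is attempted.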
Add the route for uploading. Let’s make the route /upload.
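app.post('/upload', multipartMiddleware, function(req, res) {
  cloudinary.v2.uploader.upload(req.files.image.path,
    {
      ocr: "adv_ocr"
    }, function(error, result) {
        if (error) {
          return res.status(500).json({ error: error.message });
        }
        if( result.info.ocr.adv_ocr.status === "complete" ) {
          return res.json(result); // result.info.ocr.adv_ocr.data[0].textAnnotations[0].description (more specific)
        }
        // OCR did not complete, so say so instead of leaving the request hanging
        res.status(502).json({ error: 'OCR status: ' + result.info.ocr.adv_ocr.status });
    });
});

// Port 3333 matches the URL used when testing the route
app.listen(3333);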
Once a user makes a POST request to the /upload route, the route grabs the image file from the HTTP request and uploads it to Cloudinary, which passes it to Google Vision for text detection. The OCR add-on is powered by the Google Vision API and integrates seamlessly with Cloudinary’s upload and manipulation functionality.
A JSON response is quickly sent back, containing the characters of the license plate in the uploaded image. Let’s quickly test this functionality with Postman.
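If you only want the plate characters rather than the whole response, the path noted in the route's comment can be wrapped in a small helper. This is a sketch under assumptions: the helper name extractPlateText is hypothetical, and the sample object is made up to mirror that path rather than copied from a real upload.

```javascript
// Hypothetical helper: returns the detected text from an upload result,
// or null when the OCR data is missing or incomplete.
function extractPlateText(result) {
  const ocr = result && result.info && result.info.ocr && result.info.ocr.adv_ocr;
  if (!ocr || ocr.status !== 'complete' || !Array.isArray(ocr.data)) {
    return null;
  }
  const annotations = ocr.data[0] && ocr.data[0].textAnnotations;
  if (!Array.isArray(annotations) || annotations.length === 0) {
    return null;
  }
  // Groups of text are separated by newlines; join them with single spaces.
  const description = String(annotations[0].description || '');
  return description.split(/\s+/).filter(Boolean).join(' ');
}

// Made-up result object shaped like the path used in the route's comment.
const sample = {
  info: { ocr: { adv_ocr: { status: 'complete', data: [
    { textAnnotations: [{ description: 'LMIO\nOHH\n' }] }
  ] } } }
};

console.log(extractPlateText(sample)); // prints "LMIO OHH"
```

Returning null instead of throwing lets the route decide how to respond when nothing was detected.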
Make sure your server is running (nodemon has to be installed separately; plain node server.js works as well):
nodemon server.js
Uploaded this car image
I sent the image as a POST request to the http://localhost:3333/upload route.
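If you would rather not use Postman, the same request can be sent from a script. Here is a minimal sketch, assuming Node 18 or later for the global fetch, FormData and Blob. The helper name buildUploadRequest and the file name car.jpg are hypothetical. The form field has to be called image because the route reads req.files.image.

```javascript
// Hypothetical helper: builds the URL and fetch options for the /upload route.
// Assumes Node 18+ where fetch, FormData and Blob are globals.
function buildUploadRequest(imageBytes, filename) {
  const form = new FormData();
  // The field name must be "image" to match req.files.image.path on the server.
  form.append('image', new Blob([imageBytes]), filename);
  return {
    url: 'http://localhost:3333/upload',
    options: { method: 'POST', body: form }
  };
}

// Usage (commented out because it needs the server running and a real file):
// const fs = require('fs');
// const { url, options } = buildUploadRequest(fs.readFileSync('car.jpg'), 'car.jpg');
// fetch(url, options).then((res) => res.json()).then((json) => console.log(json));
```

Passing a FormData body lets fetch set the multipart/form-data content type and boundary itself, so no headers need to be written by hand.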
Results from Uploaded Car image
Boom! It extracted the plate number from the image: LMIO OHH.
A comprehensive JSON response is returned. The ocr node of the response includes the following:
  • The name of the OCR engine used by the add-on (adv_ocr)
  • The status of the OCR operation
  • The detected language of the text
  • The outer bounding rectangle containing all of the detected text
  • A description listing the entirety of the detected text content, with a newline character (\n) separating groups of text
  • For multi-page files (e.g. PDFs), a node indicating the containing page
  • The bounding rectangle of each individual detected text element and the description (text content) of that individual element
Multi-page PDF files? Yes, you can extract text from PDF files using Cloudinary’s OCR add-on.
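To make that structure a little more concrete, here is a small sketch that lists the individual text elements from the ocr node. It assumes the layout follows the Google Vision response format, where the first textAnnotations entry holds the full text and each remaining entry holds one element with a boundingPoly of vertices. Those field names, the helper listTextElements and the sample data are assumptions for illustration, not values from a real upload.

```javascript
// Hypothetical helper: flattens the adv_ocr node into one record per text element.
function listTextElements(advOcr) {
  const elements = [];
  (advOcr.data || []).forEach((page, pageIndex) => {
    const annotations = page.textAnnotations || [];
    // Entry 0 is assumed to be the full text; the rest are individual elements.
    annotations.slice(1).forEach((annotation) => {
      const vertices = (annotation.boundingPoly && annotation.boundingPoly.vertices) || [];
      let box = null;
      if (vertices.length > 0) {
        const xs = vertices.map((v) => v.x || 0);
        const ys = vertices.map((v) => v.y || 0);
        // Axis-aligned rectangle around the element's vertices.
        box = {
          left: Math.min(...xs), top: Math.min(...ys),
          right: Math.max(...xs), bottom: Math.max(...ys)
        };
      }
      elements.push({ page: pageIndex, text: annotation.description, box: box });
    });
  });
  return elements;
}

// Made-up sample data in the assumed shape.
const sampleOcr = { status: 'complete', data: [{ textAnnotations: [
  { locale: 'en', description: 'LMIO\nOHH\n',
    boundingPoly: { vertices: [{ x: 10, y: 20 }, { x: 110, y: 20 }, { x: 110, y: 60 }, { x: 10, y: 60 }] } },
  { description: 'LMIO',
    boundingPoly: { vertices: [{ x: 10, y: 20 }, { x: 55, y: 20 }, { x: 55, y: 60 }, { x: 10, y: 60 }] } },
  { description: 'OHH',
    boundingPoly: { vertices: [{ x: 65, y: 20 }, { x: 110, y: 20 }, { x: 110, y: 60 }, { x: 65, y: 60 }] } }
] }] };

console.log(listTextElements(sampleOcr).map((element) => element.text));
```

For a multi-page file, the page field is simply the index of the entry in data that the element came from.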
One More Gem
You can blur, pixelate, or overlay other images on all detected text with simple transformation parameters. You can also use the add-on to ensure that important text isn’t cut off when you crop your images.
cloudinary.image("image.jpg", {transformation: [
  {width: 1520, height: 1440, gravity: "west", x: 50, crop: "crop"},
  {effect: "pixelate_region:15", gravity: "ocr_text"}
  ]})
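For readers curious what that call produces, the sketch below hand-builds the delivery URL I would expect the two transformation steps to map to. It illustrates the URL syntax only and is not the SDK's implementation. The helper names and the demo cloud name are placeholders, and the parameter order within each step is my own choice.

```javascript
// Short URL keys for the options used in the example above.
const URL_KEYS = { crop: 'c', effect: 'e', gravity: 'g', height: 'h', width: 'w', x: 'x' };

// Hypothetical helper: turns one transformation step into "key_value,key_value".
function transformationToUrlPart(step) {
  const options = Object.keys(step);
  for (const option of options) {
    if (!Object.prototype.hasOwnProperty.call(URL_KEYS, option)) {
      throw new Error('Option not covered by this sketch: ' + option);
    }
  }
  return options
    .sort((a, b) => URL_KEYS[a].localeCompare(URL_KEYS[b]))
    .map((option) => URL_KEYS[option] + '_' + step[option])
    .join(',');
}

// Hypothetical helper: chains the steps with "/" in front of the public ID.
function buildImageUrl(cloudName, publicId, transformation) {
  const parts = transformation.map(transformationToUrlPart);
  return ['https://res.cloudinary.com', cloudName, 'image/upload', ...parts, publicId].join('/');
}

console.log(buildImageUrl('demo', 'image.jpg', [
  { width: 1520, height: 1440, gravity: 'west', x: 50, crop: 'crop' },
  { effect: 'pixelate_region:15', gravity: 'ocr_text' }
]));
// prints https://res.cloudinary.com/demo/image/upload/c_crop,g_west,h_1440,w_1520,x_50/e_pixelate_region:15,g_ocr_text/image.jpg
```

In a real project you would let the SDK build the URL. The point here is only that each step becomes one slash-separated component, with the OCR gravity applied after the crop.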
Check out the documentation for more gems on using the OCR add-on.
Conclusion
The advantage of using Cloudinary’s cloud-based system for building services, like the one we showed here with 31 lines of code, is that you can perform a string of operations other than just detecting and extracting text.
You can leverage a full-blown suite of image transformation features in your applications and services.