Comparing OCR Engines in UiPath

Many popular OCR engines are on the market today, and several of them are available inside the UiPath developer platform. In this article, we walk through a recent hackathon use case, compare OCR engines in UiPath, and share our experience with each.

What is OCR in RPA?

UiPath, with its drag-and-drop interface, provides an extensive tool set for designing and programming bots. One of its built-in activity categories is Optical Character Recognition, or OCR for short. OCR is technology that recognizes text inside images, such as scanned documents and photos, and converts it into machine-readable text data.

In a recent Smartbridge Hackathon, we harnessed OCR capabilities by integrating cloud engine processing into the receipt validation feature inside NetSuite Expense Reports. With this integration, we can check whether the total expense entered on the report matches the total amount shown on the Expense Report attachment.
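The validation step can be illustrated with a small sketch. It is not the hackathon workflow itself (that was built in UiPath); it only assumes that the OCR engine returns the receipt as plain text and that the receipt total is the largest amount on it. The function names and the regular expression are hypothetical.

```python
import re
from decimal import Decimal

# Matches amounts such as 45.90, 1,234.56 or $12.00 (two decimal places required).
AMOUNT_PATTERN = re.compile(r"\$?\s*(\d{1,3}(?:,\d{3})*|\d+)\.(\d{2})\b")


def extract_amounts(ocr_text: str) -> list[Decimal]:
    """Return every currency-like amount found in the OCR output."""
    amounts = []
    for whole, cents in AMOUNT_PATTERN.findall(ocr_text):
        amounts.append(Decimal(whole.replace(",", "") + "." + cents))
    return amounts


def receipt_matches_report(ocr_text: str, reported_total: str) -> bool:
    """Assumption: the receipt total is the largest amount on the receipt."""
    amounts = extract_amounts(ocr_text)
    if not amounts:
        return False  # nothing readable, so the report cannot be confirmed
    return max(amounts) == Decimal(reported_total)
```

Using `Decimal` rather than `float` keeps the comparison exact for currency values. A real receipt may list amounts larger than the total (a cash tendered line, for example), so the "largest amount" rule is only a starting point.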

Expense Report attachments are usually pictures of meal receipts, hotel invoices, taxi fares, parking receipts, gas receipts and so on. In the vast majority of cases, these receipts include a handwritten tip and total. This is where RPA and OCR capabilities come into play. When configuring the OCR step, a developer is prompted to choose one of the seven OCR engines that UiPath offers.

To narrow the scope of the hackathon, we focused on three of the most widely used OCR engines. We also identify the key differences between them: what they can and can't do, how easy they are to use, and how long they take to process an image. The three engines highlighted in this article are the following:

  • Google Cloud OCR: This requires a Google Cloud API key, which comes with a free trial once you sign up on the Google Cloud platform.

  • Microsoft Cloud OCR: This uses the Microsoft Computer Vision API, which is also free to sign up for.

  • Abbyy OCR: This requires you to install Abbyy FineReader on your local machine and purchase a license.
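A comparison like this can be made repeatable with a small timing harness. The sketch below is an assumption about how one might structure it outside UiPath, not what we ran in the hackathon: each engine is represented by a hypothetical callable that takes image bytes and returns the recognized text.

```python
import time


def compare_engines(engines, image_bytes, expected_text):
    """Run each engine once; record elapsed seconds and whether expected_text was found.

    `engines` maps a display name to a hypothetical callable: bytes -> str.
    """
    results = {}
    for name, run_ocr in engines.items():
        start = time.perf_counter()
        try:
            text = run_ocr(image_bytes)
            found = expected_text in text
        except Exception:
            # A failed call (bad key, unsupported image) counts as "not found".
            found = False
        elapsed = time.perf_counter() - start
        results[name] = {"seconds": elapsed, "found": found}
    return results
```

Running the same set of receipt images through every engine and averaging the results would give a fairer picture than a single call per engine.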

Google Cloud OCR

This OCR engine is easy to use because it's built directly into UiPath, and it performed accurately in our tests. We were able to create an API key and get it working with UiPath much faster than we originally expected.

To get it working, we had to enable billing on the Google Cloud account before we could retrieve the API key. Google Cloud then gives you $300 of credit, valid for 365 days. It also provides 1,000 free OCR image requests per month (beyond that, the cost comes to $0.15 per image processed).
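For readers curious about what the API key is used for, here is a minimal sketch of a direct request to the Cloud Vision REST API. UiPath's activity hides this call, and the sketch is not its implementation; it only builds the JSON body so it can be inspected without a network connection. The function name is hypothetical.

```python
import base64
import json

# Public REST endpoint of the Cloud Vision API; the API key is passed as ?key=...
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_vision_request(image_bytes: bytes) -> str:
    """Build the JSON body for a TEXT_DETECTION request on a single image."""
    body = {
        "requests": [
            {
                # The image is sent inline as base64-encoded content.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }
    return json.dumps(body)
```

The body would be sent as an HTTP POST to `VISION_URL` with the key appended as a query parameter. For dense or handwritten text, the `DOCUMENT_TEXT_DETECTION` feature type may be a better fit than `TEXT_DETECTION`.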

Microsoft Cloud OCR

This is similar to the built-in Google Cloud OCR; however, it's free and easy to use. In some scenarios, it was also more accurate than Google Cloud OCR.

We signed up for an API key, but received an error in UiPath when we started using it. The integration was not smooth because of the way Microsoft's API accepts images: the image either had to be encoded and sent in the request itself, or uploaded to a publicly accessible URI that the OCR engine could fetch.
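The two ways of handing an image to the Computer Vision API can be sketched as follows. This is an illustration under stated assumptions rather than the UiPath activity's code: the helper name is hypothetical, and it only prepares the headers and body so the difference between the two modes is visible.

```python
import json


def build_ocr_request(subscription_key, image_bytes=None, image_url=None):
    """Return (headers, body) for either a binary upload or a public image URL."""
    if (image_bytes is None) == (image_url is None):
        raise ValueError("provide exactly one of image_bytes or image_url")

    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    if image_bytes is not None:
        # Mode 1: send the raw image bytes in the request body.
        headers["Content-Type"] = "application/octet-stream"
        return headers, image_bytes

    # Mode 2: send a JSON document pointing at a publicly reachable image.
    headers["Content-Type"] = "application/json"
    return headers, json.dumps({"url": image_url}).encode("utf-8")
```

The headers and body would then be POSTed to the OCR operation of your Computer Vision resource's endpoint. A mismatch between the content type and the body is one plausible source of the kind of error described above, although we did not confirm that this was the cause.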

Abbyy OCR

We downloaded a free trial of Abbyy FineReader to the machine we were using. FineReader performs quite well at converting a scanned PDF to a searchable PDF, or to other document formats.

Unfortunately, when we tried to use Abbyy to process handwritten text on an image, we were not successful.

Final OCR Conclusion

Having compared OCR engines in UiPath, we found that any of them has the potential to perform well. The key difference between these tools lies in their intended use. For example, if we only want to process raster or vector PDFs, Abbyy would likely be the best option. It is full of features for PDF processing, but it lacks handwritten text recognition.

Microsoft OCR excels at handwritten text; however, the integration seems too complex and is not easy to use with third-party applications. Google, on the other hand, works for both handwritten text and PDF processing, with engine accuracy being the only potential caveat.
