Comparing OCR Engines in UiPath
There are many popular OCR engines on the market today, several of which are available in the UiPath developer platform. In this article, we walk through a recent Smartbridge hackathon use case, compare OCR engines in UiPath, and share our experience with each.
What is OCR in RPA?
With its drag-and-drop interface, UiPath provides an extensive tool set in order to program and design bots. One of the built-in activity categories is called Optical Character Recognition or OCR for short. It’s a form of technology programmed to recognize text inside images, like scanned documents and photos. OCR is used to convert various forms of imaging that contain written text into machine-readable text data.
In a recent Smartbridge hackathon, we harnessed OCR capabilities by integrating cloud engine processing into the receipt validation feature inside NetSuite Expense Reports. Through this integration, we could check whether the total expense entered matched the total amount shown on the Expense Report attachment.
Expense Report attachments are usually pictures of meal receipts, hotel invoices, taxi fares, parking receipts, gas receipts, and so on. In the vast majority of scenarios, these receipts include a handwritten tip and total. This is where RPA and OCR capabilities come into play. As shown in the picture below, a developer is prompted to choose one of the seven OCR engines that UiPath houses.
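The comparison step described above could be sketched in Python as follows. This is an illustration only, not the UiPath workflow from the hackathon: the function names are hypothetical, and treating the largest amount found on the receipt as its total is a simplifying assumption that will not hold for every receipt.

```python
import re
from decimal import Decimal

# Matches amounts with exactly two decimal places, e.g. 12.50, $48.00, 1,234.56.
AMOUNT_PATTERN = re.compile(r"\$?\s*(\d{1,3}(?:,\d{3})*|\d+)\.(\d{2})\b")


def extract_amounts(ocr_text: str) -> list:
    """Return every currency-like amount found in the OCR output."""
    amounts = []
    for match in AMOUNT_PATTERN.finditer(ocr_text):
        whole = match.group(1).replace(",", "")
        amounts.append(Decimal(f"{whole}.{match.group(2)}"))
    return amounts


def receipt_matches_expense(ocr_text: str, expense_total: str) -> bool:
    """Compare the reported expense total with the receipt total.

    Assumption (hypothetical heuristic): the largest amount on the
    receipt is the grand total, including any handwritten tip.
    """
    amounts = extract_amounts(ocr_text)
    if not amounts:
        return False  # nothing readable on the receipt
    return max(amounts) == Decimal(expense_total)


# Example OCR output for a meal receipt with a tip.
sample_text = "Subtotal 40.00\nTip 8.00\nTotal $48.00"
print(receipt_matches_expense(sample_text, "48.00"))
```

In practice, a mismatch would more likely flag the expense line for manual review than reject it outright, since OCR on handwritten totals is not always reliable.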
To narrow the scope of the hackathon, we focused on three of the most widely used OCR engines. We also identified the key differences between these engines: what they can and can't do, their ease of use, and their processing time. The three engines highlighted in this article are Google Cloud OCR, Microsoft Cloud OCR, and Abbyy FineReader.
Google Cloud OCR
This OCR engine is easy to use because it's built directly into UiPath, and it performs accurately. We were able to create an API key and get it working with UiPath much faster than we originally expected.
To get it working, you must turn on billing before you can retrieve the API key. Afterwards, Google Cloud gives you $300 of credit, valid for 365 days. It also provides 1,000 free OCR image-processing requests per month; by our estimate at the time, usage beyond that came to about $0.15 per image processed.
Microsoft Cloud OCR
This is similar to the built-in Google Cloud OCR, and it’s free and easy to use. In some scenarios, it was more accurate than Google Cloud OCR as well.
We signed up for an API key but received an error in UiPath when we started using it. The integration was not smooth because of the way Microsoft's service expects images to be submitted: the image either had to be encoded before being sent or had to be uploaded to a publicly accessible URI that the OCR engine could read.
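The two submission options mentioned above (encoding the image, or pointing the engine at a public URI) can be illustrated with a short Python sketch. The payload field names `url` and `content_base64` are hypothetical placeholders, not the actual Microsoft request schema, so check the service documentation before relying on them.

```python
import base64
from typing import Optional


def build_ocr_payload(image_bytes: Optional[bytes] = None,
                      image_url: Optional[str] = None) -> dict:
    """Build a request body from either raw image bytes or a public URL.

    The dictionary keys are hypothetical and only illustrate the idea.
    """
    # Exactly one of the two inputs must be supplied.
    if (image_bytes is None) == (image_url is None):
        raise ValueError("Provide exactly one of image_bytes or image_url")
    if image_url is not None:
        # Option 1: the engine fetches the image from a public URI.
        return {"url": image_url}
    # Option 2: the image travels inside the request as Base64 text.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"content_base64": encoded}


# Stand-in bytes; a real call would read them from the receipt image file.
payload = build_ocr_payload(image_bytes=b"fake image bytes")
print(sorted(payload.keys()))
```

Base64 keeps the receipt out of public storage at the cost of a larger request body, which matters when the attachment contains expense details.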
Abbyy FineReader
We downloaded a free trial of Abbyy FineReader to the machine we were using. FineReader performs quite well at converting a scanned PDF to a searchable PDF or to other document formats.
Unfortunately, when we tried using Abbyy to process handwritten text on an image, we were not successful.
Final OCR Conclusion
Any of the OCR engines we compared in UiPath has the potential to perform well. The key differentiator between these tools is the intended use. For example, if we only want to process raster or vector PDFs, Abbyy would likely be the best option. It is full of features for PDF processing, but it lacks handwritten text recognition.
Microsoft OCR excels at processing handwriting, but the integration is complex and not easy to use with third-party applications. Google, on the other hand, works well for both handwritten and PDF processing, with engine accuracy being the only potential caveat.