Optical Character Recognition, or OCR for short, is the ability of software to convert images or scanned PDFs into text that can be copied, searched, edited, and further transformed. However, pretty much all of the OCR software out there can perform this basic function, so choosing between products comes down to what else they offer. If you're a developer looking to implement OCR in your software, or a business user evaluating OCR software to bring efficiency to your organization, this article is for you. We will discuss some of the best OCR software options available in 2022 and provide some advice on how to choose the right option for your needs.
Top Free, Freemium, and Paid OCR Software
Butler Labs
Butler Labs combines AI and OCR to extract information accurately from any document. Butler uses recent advancements in NLP and computer vision to extract both printed text and handwriting. It can also associate regions of text with concepts so that it extracts just the information you care about, which lets you convert unstructured files such as PDFs and images into a structured JSON or CSV format. These same NLP and computer vision improvements let Butler handle files with a lot of variation, or files containing tabular information, with ease.
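To make "structured" concrete, here is a minimal Python sketch of turning extracted fields into JSON or CSV. The file names, field names, and values are hypothetical and chosen only for illustration; they do not reflect Butler's actual response format or any other vendor's schema.

```python
import csv
import io
import json

# Hypothetical extracted fields for two invoices. These names and values are
# illustrative assumptions, not any vendor's real response schema.
extracted = [
    {"file": "invoice_001.pdf", "invoice_number": "INV-1001", "total": "1250.00"},
    {"file": "invoice_002.pdf", "invoice_number": "INV-1002", "total": "89.90"},
]


def to_json(records):
    """Serialize the extracted records as a JSON array."""
    return json.dumps(records, indent=2)


def to_csv(records):
    """Serialize the extracted records as CSV, one row per document."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_json(extracted))
print(to_csv(extracted))
```

Once every document is reduced to the same set of named fields, it can be loaded into a spreadsheet or database without anyone retyping values from the original PDF.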
Pros
- Handles high-variance files (invoices, POs, driver's licenses) as well as low-variance files (W-9s, W-2s, etc.)
- Generous monthly free tier with a pay-as-you-go usage plan
- Very cost-effective; see Butler's pricing plan
- Deployed via a REST API with extensive documentation and features that let developers implement OCR in their own applications
- Lets you train new AI models for documents unique to your company
- Handles tables in files
- Handles most languages
- Easy to use
Cons
- Doesn't natively integrate with ERPs. It instead provides a REST API that can be used to integrate with other applications.
- Lacks an on-premises offering
Adobe Acrobat Pro DC
Adobe Acrobat Reader DC is a free PDF reader offered by Adobe. Adobe Acrobat Pro DC is the paid product that adds PDF editing features, including the ability to run OCR on scanned PDFs.
Pros
- Native integration with the PDF reader makes copy-and-paste workflows smooth
- No additional charge to use the OCR feature
- Files are locally processed
- Easy to use
Cons
- No API Access
- Tasks can't be fully automated without human involvement; for example, it can't extract one piece of information from a batch of documents all at once
- No way to tune accuracy for your business's documents
- Can slow down your computer, because OCR runs locally and needs more of its processing power
ABBYY FineReader
ABBYY FineReader is a PDF reader and editor with OCR capabilities built in. Like Adobe Acrobat, FineReader comes with extensive capabilities for manipulating PDF files; unlike Acrobat, it puts much more emphasis on its OCR capabilities.
Pros
- Handles most languages
- Allows for corrections to be made in the application for OCR mistakes
- Supports exporting information after OCR Extraction to Microsoft Office, XML, and other PDF formats
- Advanced workflows let data entry specialists move between files and OCR them quickly
Cons
- No API access; meant primarily to be used by data entry specialists
- Expensive
- Feature-rich but unintuitive UI; people need training before they can use FineReader efficiently
ABBYY FlexiCapture
ABBYY's FlexiCapture product combines OCR with AI. Unlike FineReader, FlexiCapture is meant for setting up full or partial automation of back office functions, and it is often paired with a robotic process automation (RPA) solution. See this article for how RPA and OCR are used in document processing use cases.
Pros
- Handles low and high variance documents
- Integrates with ERPs and other internal software through RPA software
- Ability to improve accuracy with more examples
Cons
- Limited REST API Support
- No Free Trial
- Expensive
- Difficult to learn and set up
Tesseract Open Source OCR Engine
Tesseract is a well-known open source OCR engine that has been used by many large organizations, including Google and Microsoft. Tesseract itself is a C++ engine with a command-line interface; engineers commonly call it from Python through wrapper packages such as pytesseract to extract text from images. However, customizing Tesseract's accuracy for your documents tends to eventually require robust data science and MLOps knowledge.
Pros
- Free; you only pay for compute costs
- Has the ability to run locally or on a server
- Popular open source library, used by thousands of companies
Cons
- Output is organized as word and line blocks, which are difficult to parse for typical document processing use cases
- Requires some MLOps knowledge to deploy Tesseract in a performant way
- Has to be combined with additional tooling to handle highly varied documents such as invoices and POs
- Requires data science knowledge to reliably extract information from rotated, zoomed-out, and blurry images
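To illustrate the word and line block issue, here is a minimal Python sketch that regroups word-level OCR output into text lines. It assumes the input is a dict of parallel lists shaped like what `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)` returns. The function name `group_words_into_lines`, the `min_conf` parameter, and the sample values are hypothetical, and the sketch assumes left-to-right text.

```python
from collections import defaultdict


def group_words_into_lines(data, min_conf=0.0):
    """Rebuild text lines from word-level OCR output.

    `data` is assumed to be a dict of parallel lists with the keys used
    below, as produced by pytesseract's image_to_data in DICT mode.
    """
    lines = defaultdict(list)
    for i, text in enumerate(data["text"]):
        word = text.strip()
        # Skip structural rows (empty text) and low-confidence words.
        if not word or float(data["conf"][i]) < min_conf:
            continue
        key = (
            data["page_num"][i],
            data["block_num"][i],
            data["par_num"][i],
            data["line_num"][i],
        )
        lines[key].append((data["left"][i], word))
    # Order lines by (page, block, paragraph, line), words left to right.
    return [
        " ".join(word for _, word in sorted(words))
        for _, words in sorted(lines.items())
    ]


# Hand-written sample standing in for real OCR output (hypothetical values).
sample = {
    "text":      ["", "Invoice", "#1001", "", "Total:", "$1,250.00", "~"],
    "conf":      [-1, 96, 91, -1, 95, 88, 12],
    "page_num":  [1, 1, 1, 1, 1, 1, 1],
    "block_num": [1, 1, 1, 1, 1, 1, 1],
    "par_num":   [1, 1, 1, 1, 1, 1, 1],
    "line_num":  [1, 1, 1, 2, 2, 2, 2],
    "left":      [40, 40, 130, 40, 40, 110, 220],
}

print(group_words_into_lines(sample, min_conf=50))
# → ['Invoice #1001', 'Total: $1,250.00']

# With Tesseract and pytesseract installed, real input could come from:
#   import pytesseract
#   from PIL import Image
#   data = pytesseract.image_to_data(
#       Image.open("scan.png"), output_type=pytesseract.Output.DICT
#   )
```

Even with lines rebuilt, you only have plain text. Deciding which line holds the invoice total or the vendor name is still left to you, and that is the extra layer the AI-based products in this list provide.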
Google Document AI
Google Document AI extracts text from PDFs and images. It also uses recent advancements in deep learning, NLP, and computer vision to extract the relevant pieces of information from documents, primarily for the purpose of automating business workflows.
Pros
- Part of Google Cloud Platform, and integrates seamlessly with other GCP Services
- Easy-to-use API once you've gone through the GCP setup process
- Easy to scale to thousands of documents
- Has a free tier, with pay-as-you-go pricing
- Has a REST API
Cons
- Expensive
- Unable to customize Document AI for your unique documents
- Lack of customer support
AWS Textract
AWS Textract is an API from Amazon that extracts text from PDFs and images. It also offers dedicated APIs for common document types, which makes its output easier to consume.
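As a rough sketch of what consuming the plain text extraction looks like, the Python snippet below pulls text lines out of a response shaped like the one Textract's `DetectDocumentText` operation returns (a `Blocks` list whose entries carry `BlockType`, `Text`, and `Confidence`). The helper name `lines_from_textract`, the `min_confidence` parameter, and the sample response are hypothetical; the sample is hand-written so the code runs without AWS credentials.

```python
def lines_from_textract(response, min_confidence=0.0):
    """Return (text, confidence) for each LINE block in a Textract-style response."""
    return [
        (block["Text"], block["Confidence"])
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
        and block.get("Confidence", 0.0) >= min_confidence
    ]


# Hand-written stand-in for a real response (hypothetical values).
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Form W-9", "Confidence": 99.1},
        {"BlockType": "WORD", "Text": "Form", "Confidence": 99.4},
        {"BlockType": "WORD", "Text": "W-9", "Confidence": 98.8},
        {"BlockType": "LINE", "Text": "Narne: J. Doe", "Confidence": 62.3},
    ]
}

for text, confidence in lines_from_textract(sample_response):
    # Flag lines whose confidence falls under an arbitrary review threshold.
    flag = "" if confidence >= 90 else "  <- review"
    print(f"{confidence:5.1f}  {text}{flag}")

# With boto3 installed and AWS credentials configured, a real response
# could come from something like:
#   import boto3
#   client = boto3.client("textract")
#   with open("scan.png", "rb") as f:
#       response = client.detect_document_text(Document={"Bytes": f.read()})
```

The 90 threshold is an arbitrary choice for the example. In practice you would tune it against your own documents and route anything below it to a human reviewer.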
Pros
- Part of the AWS Stack, and integrates well with other AWS Services
- Easy-to-use API
- Cost-effective
- Has a REST API
Cons
- Unable to be trained for unique documents
- Performance issues at scale
- Lack of accuracy
- Lack of support