Optical character recognition, commonly referred to as OCR, has commonly been used as a method of digitizing printed texts. But what does OCR mean, how does it work, and can optical character recognition really be beneficial to your business?
What Is Optical Character Recognition?
Optical character recognition refers to software that transforms a scanned image into text, typically by transcribing it character by character.
Optical Character Recognition vs. Intelligent Character Recognition
OCR has been around for decades, and as the technology has evolved to include extraction of text from hundreds of languages, so has its name. One such offshoot of OCR is intelligent character recognition (ICR). ICR works like OCR, except with the purpose of capturing specific fonts and styles of writing. ICR relies on a “constrained handprint” that separates handwritten characters into individual boxes to extract characters one-by-one.
The History Of OCR
Interestingly, the origins of optical character recognition can be traced back to telegraphy. During World War 1, physicist Emanuel Goldberg created a machine that could read characters from a document and convert them to telegraph code. This method was updated during the 1920’s, and a system that closely resembles modern-day OCR was born.
At first, OCR was used almost exclusively to assist with the labor-intensive task of microfilming financial records, whereby Goldberg used a photoelectric cell to recognise patterns on a surface. This innovation led to the first steps toward the digital record keeping that we’re familiar with today.
While early use of OCR was limited by font and the range of characters that could be recognized, today we see the introduction of intelligent character recognition as a way to process an increasing number of handwriting styles and reduce errors.
How Does OCR Work?
OCR software normally follows the following steps to convert a scanned document into a digital format:
OCR software often “pre-processes” images or scanned documents to reduce distortions and noise, improving the quality of the recognition by applying techniques such as:
- De-skew and rotation – Applying filters to correct image orientation and make lines of text perfectly horizontal and/or vertical.
- Despeckle – Removing positive and negative spots, smoothing edges
- Binarization – Converting an image from color or grayscale to black-and-white pixels so there’s a clear distinction between the characters to be read (black pixels) and the background (white pixels).
The scanned document is separated into light and dark images. The lighter areas of the images are recognised as a background, and the dark areas are recognised as text that needs to be deciphered. Darker areas are then analyzed to locate letters or digits. These letters or digits are then identified using an algorithm.
These algorithms can recognise patterns and features:
- Pattern Recognition
Pattern Recognition occurs when OCR software has been “trained” with examples of text in many different fonts so that it can learn to recognize this as it works.
- Feature Recognition
Feature recognition takes place when the software applies a rule in relation to certain features of a letter or digit to distinguish what the letter or digit may be.
For example, crossed lines, curves, or angled lines are all features that make up the formation of letters and digits, and so the OCR system will recognize these features and determine which letter or digit is being used within a document.
Some OCR tools use a list of words (called a lexicon) that are allowed to occur in a document to increase output accuracy, but this can cause problems if the document contains proper nouns or technical language that are not part of the list.
The output file is often a plain text stream, although some OCR systems can preserve the original layout of the page.
The most common uses of OCR involve extracting data from electronic or paper documents for data entry into other business applications.
OCR is often used as an “under the radar” solution, powering common systems and services from daily life, such as passport recognition, automatic license plate identification, or scanning business card information into a contact list.
OCR has also enabled the digitalization of volumes of information by converting paper and scanned image documents into machine-readable, searchable files. This can be seen with initiatives such as Project Gutenberg or Google Books.
Businesses continue to invest in OCR technology, frequently as a piece of a wider intelligent automation plan, and this is expected to increase exponentially over the next decade. With this in mind, it’s important to gain a clear understanding of how OCR can benefit a business.
The main benefit of OCR technology is that it streamlines the data-entry process, allowing businesses to digitally store files and ensuring instant, consistent access to all documentation.
The benefits of using OCR technology include:
- Accelerate Workflows – Optical character recognition provides the ability to shorten each phase of document processing workflows, resulting in a higher and faster output.
- Reduce Costs – OCR reduces costs by significantly decreasing the number of errors, as well as the time taken to resolve them.
- Secure And Centralize Data – Data from documents is stored in a digital format, greatly reducing the risk of data being lost, stolen, or damaged. Centralizing data in a secure area is ideal for businesses that deal with large amounts of data.
- Improve The Service For End Users – Maintaining current customer information, helps businesses benefit from smoother interactions with customers. Having this data drastically reduces response times, keeping customers happier.
IDP vs. OCR
OCR simply transcribes a document, leaving you with a text representation of the image, but does not provide the structured data needed for downstream processing. Different document types can present challenges for OCR.
While most intelligent document processing (IDP) solutions incorporate some form of OCR, these solutions offer much more than just character recognition. IDP solutions are built to identify the business data from a document, and can also deal with the real-world challenges different documents may present.
For example, in the case of processing checks, OCR engines can normally read the payor address information, check number, and MICR (routing/banking info), but would not be able to capture the handwriting for the date, LAR (written out amount in words), and CAR (written amount in numbers).
Modern IDP solutions use specialized machine learning models to maximize accuracy and automation for more complex documents, like checks. They take into consideration the context (the same way a human would), can read cursive and handprint, and can even reconcile the values in the LAR and CAR fields.
How does Hyperscience use Optical Character Recognition?
At Hyperscience, we recognise the importance of having a solid foundation to support all of the elements of intelligent document processing, with optical character recognition being one of the most integral parts.
But an OCR engine alone is not enough. Organizations want accurate data extraction and insights, but they need to be able to process all sorts of documents, including complex ones with high variability and handwriting. That’s why at Hyperscience we have developed our own text recognition engine called Optical Intelligent Character Recognition (OICR), an ML-based OCR that takes context into account and can understand human intent (e.g. to ignore cross-outs, or read messy handwriting). This approach offers the highest levels of accuracy in the industry for both printed and handwritten text.