OCR data entry is mostly referred to as a way to automate data entry tasks. It can also be thought of as the manual data entry process that either verifies the scanned data for accuracy or extracts it for further use. In this sense, OCR data entry is the third major part of a conversion process: scanning the paper documents, converting the images to text with OCR, then validating the results with manual data entry.
The last step is often necessary because OCR technology is not completely accurate. In addition, the data originally written on the source forms may not be accurate, in which case OCR simply digitizes the errors already on the page. For example, say the forms in question have an email address field, and the person filling out the form leaves out the ‘@’ sign. Imaging and then OCR conversion to digitized text will produce an email address that is not valid. Now say you wish to do a mailing based on that customer list. The email will not send correctly because of at least one invalid address, which will have to be located and then corrected in the database manually. This can turn into a bottleneck that slows business processes down.
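As a rough illustration of how such records might be located before a mailing goes out, the sketch below flags email values that fail a basic format check. The pattern, the function name `find_invalid_emails`, and the sample records are all assumptions made for this example; a simple check like this catches a missing ‘@’ but is not a complete email validator.

```python
import re

# Deliberately simple pattern: something@something.tld with no spaces.
# This is an assumption for illustration, not a full email address validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def find_invalid_emails(records):
    """Return (row_number, value) pairs for emails that fail the basic check."""
    invalid = []
    for row_number, record in enumerate(records, start=1):
        email = record.get("email", "").strip()
        if not EMAIL_PATTERN.match(email):
            invalid.append((row_number, email))
    return invalid


# Hypothetical OCR output: the second form was filled in without the '@' sign.
customers = [
    {"name": "A. Smith", "email": "asmith@example.com"},
    {"name": "B. Jones", "email": "bjones.example.com"},
]

for row_number, email in find_invalid_emails(customers):
    print(f"Row {row_number}: '{email}' needs manual review")
```

A check like this only locates the problem rows. Someone still has to go back to the source form, or the customer, to find out what the correct address is.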
Capabilities of OCR Technology
The biggest misconception about OCR technology is that it is accurate. Even under the best of circumstances (typewritten, unblemished form data), errors will still occur.
- Something as innocent as an extra dot on a form can cause problems.
- OCR cannot reliably handle handwritten data, unusual fonts, or documents with a wide variety of character sizes (some much larger than others).
- OCR is also very scanner dependent. Inexpensive or lower-quality hardware makes it much more difficult for the software to do its job accurately.
As noted before, the data originally entered on the forms may also be inaccurate. Because of these accuracy problems, OCR is best used for basic archival purposes.
The question to ask when deciding between double key manual data entry and OCR data entry is this: is the data actionable? That is, does it have practical value because it needs to serve a useful function? If the answer is yes, then OCR should not be used. The mailing list example is one scenario where actionable data (email addresses) serves a critical function for the purpose of communication. Inaccurate data of this sort will stop a process in its tracks. In addition, if OCR’d data is being used as a proxy for the actual document, manual data entry correction will be necessary after OCR to ensure an accurate representation of the document. Some examples of documentation that would benefit from manual data entry to capture the content are:
- Archival material
- Names and addresses
- Forms such as registrations and rebates
- Forms containing data in an index or post card format
OCR cannot convert handwritten form data to text accurately. According to an AIIM study (Forms Processing 2012), a significant number of businesses work with forms in which half the data is handwritten, such as the forms listed above.
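For readers unfamiliar with double key entry, the idea is that two operators key the same form independently and any field where they disagree is sent back for verification. The sketch below shows that comparison step only, under simplifying assumptions: the function name `compare_double_key`, the field names, and the sample values are hypothetical and do not describe any particular vendor's workflow.

```python
def compare_double_key(first_pass, second_pass):
    """Return the fields where two independent keying passes disagree."""
    mismatches = {}
    # Look at every field that appears in either pass.
    for field in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(field, "")
        b = second_pass.get(field, "")
        # Ignore leading/trailing whitespace; any other difference is flagged.
        if a.strip() != b.strip():
            mismatches[field] = (a, b)
    return mismatches


# Hypothetical entries keyed by two operators from the same handwritten form.
operator_one = {"name": "Jane Doe", "zip": "30301", "email": "jdoe@example.com"}
operator_two = {"name": "Jane Doe", "zip": "30307", "email": "jdoe@example.com"}

for field, (a, b) in compare_double_key(operator_one, operator_two).items():
    print(f"{field}: '{a}' vs '{b}' -> send back for verification")
```

Because the two passes are keyed independently, the same mistake is unlikely to appear in both, which is why a disagreement is treated as a signal to re-check the source form.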
Cost of OCR Services
Another false impression of OCR technology concerns its cost. While it is advertised as the cheap automated solution for data entry and capture, many pricing models have turned OCR software into software as a service (SaaS). (OCR is optical character recognition software, not a piece of hardware, which is another common misconception.) This means that a license is purchased and fees accumulate on a per character basis. It is still cheaper than, say, double key data entry and verification, but any cost comparison should take into account how accurate the data needs to be: whether it will be put to future use or is just being preserved as a basic archival record. Finding out later that substantial error correction of OCR’d data is needed can lead to significant additional costs.
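To make the cost trade-off concrete, the sketch below adds expected correction costs to per character OCR fees and compares the total with a flat per character rate for manual entry. Every number in it is invented for illustration (the fees, the error rate, and the cost to fix each error are assumptions, not actual pricing from any provider), so the result shows only how the arithmetic works, not which option is cheaper in practice.

```python
def ocr_total_cost(characters, fee_per_character, error_rate, correction_cost_per_error):
    """Per-character OCR fees plus the cost of fixing the errors afterwards."""
    ocr_fees = characters * fee_per_character
    expected_errors = characters * error_rate
    return ocr_fees + expected_errors * correction_cost_per_error


def manual_total_cost(characters, cost_per_character):
    """Flat per-character cost for double key entry and verification."""
    return characters * cost_per_character


# All figures below are hypothetical placeholders.
characters = 1_000_000
ocr_cost = ocr_total_cost(
    characters,
    fee_per_character=0.001,         # assumed OCR fee per character
    error_rate=0.02,                 # assumed share of characters needing a fix
    correction_cost_per_error=0.25,  # assumed cost to locate and fix one error
)
manual_cost = manual_total_cost(characters, cost_per_character=0.004)

print(f"OCR plus correction: {ocr_cost:,.2f}")
print(f"Double key manual:   {manual_cost:,.2f}")
```

With these made-up inputs the correction work outweighs the OCR fees themselves. With a lower error rate, or data that never needs correcting (a basic archival record), the comparison would come out differently.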
Coleman Data Solutions does offer OCR services, because customers sometimes ask for them and we are a full service data and document management company. However, we always recommend data entry services or imaging and indexing to customers whose actionable data is key to their business processes. Contact us with any questions you may have in deciding whether manual or OCR data entry services would best fit your situation.