Guide to better mobile images (from cell phone camera) for higher quality OCR

Mobile images make up a large share of the traffic going through the OCR-IT OCR Cloud 2.0 API. Conventional office documents are typically black-on-white images at 200 to 400 dpi, and OCR technology has been fine-tuned for them for over a decade. Mobile images, by contrast, vary greatly in resolution, quality, and content, and they present new challenges that go beyond technology. With mobile image capture, technology is no longer the only important factor: simple actions performed by users can make or break the results of any available technology. User behavior has therefore become much more important for ‘distributed capture’ of mobile images across a very wide network of users with different skills and hardware. The industry did not see this dependency on user actions before cell phones, because scanners, fax machines, MFPs, and copiers provided predictable image quality, controlled by mature technology and requiring little user intervention.

In this post I will describe the most common situations encountered when OCR-IT Cloud OCR processes pictures from mobile devices, although the text should apply to OCR in general. I will use specific examples to describe and document common issues with mobile images.

Document type: business card (one of the five most frequently requested document types through the OCR-IT Cloud API; receipt images are the most frequent)
Mobile device: iPhone 4 (equivalent to an average mobile camera, not a top-end camera by today’s standards)
Environment: office desk, 8 PM (winter night), one fluorescent desk lamp for lighting

For simplicity of explanation, and to further explain how some OCR engines operate internally,...

Standard Process for Managed Document Conversion and Outsourcing

The OCR-IT Document Conversion Services Team uses the following methodology and project progress tracking for every Document Conversion task. Each step is listed as Party, Stage: Task.

Client, INITIATION: Issue the Order for Service, complete the Service Agreement, and discuss project progression.

Client, PREPARATION: Prepare documents for processing. Documents should be in PDF (without password protection or content extraction limitations), TIF, JPEG, BMP, or PNG file format. Documents may be in a sub-folder structure or in a single folder. This original structure will be preserved for Delivery.

Client, SEND: Provide the documents to OCR-IT for processing.
Media options are:
– FTP (OCR-IT will provide a secure FTP location)
– HDD (recommended for large volumes over several GB)
– Any other standard storage media, such as a USB drive, flash card, DVD, etc.
Sending options are:
– FedEx
– FTP
– Local pickup (for urgent projects)

OCR-IT, SETUP: Documents are received and checked for transmission errors. A processing profile is created. Processing settings are confirmed to the client.

OCR-IT, PROOF RUN: A small sample set is processed using the created settings. The processed set is delivered to the client for review.

Client, PROOF CHECK: The sample is reviewed and the settings are confirmed. Upon confirmation, OCR-IT locks down the settings to be used for the entire volume.

OCR-IT, PRODUCTION: The entire volume goes into production with the confirmed settings. Progress updates are provided every 48 hours until completion.

OCR-IT, QA: Upon completion, results are checked using the following techniques:
– Total count IN = total count OUT
– File name IN = file name OUT
– Random spot check to verify successful processing (searchability) and the desired file format output (per settings)
NOTE: Some projects may or may not include manual verification of text.

OCR-IT, DELIVERY...
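The count and file-name checks in the QA stage above could be sketched roughly as follows. This is only an illustration, not OCR-IT's actual tooling: the function names `relative_stems` and `qa_check` are hypothetical, and it assumes that "File name IN = file name OUT" means the relative folder and base name match while the extension may change (for example TIF in, PDF out).

```python
from pathlib import PurePosixPath


def relative_stems(paths):
    """Reduce each relative path to folder plus file name without extension."""
    return {str(PurePosixPath(p).with_suffix("")) for p in paths}


def qa_check(paths_in, paths_out):
    """Compare input and output file lists (hypothetical helper).

    Assumption: names are compared without extension, because the
    output format may differ from the input format.
    """
    stems_in = relative_stems(paths_in)
    stems_out = relative_stems(paths_out)
    return {
        # Total count IN = total count OUT
        "count_ok": len(paths_in) == len(paths_out),
        # File name IN = file name OUT (reported as two difference lists)
        "missing": sorted(stems_in - stems_out),
        "unexpected": sorted(stems_out - stems_in),
    }


# Made-up example: one input file has no matching output file.
report = qa_check(
    ["batch1/a.tif", "batch1/b.tif", "batch2/c.jpg"],
    ["batch1/a.pdf", "batch2/c.pdf"],
)
print(report)
```

Because the sub-folder is part of each compared name, the same check also confirms that the original folder structure was preserved for Delivery. The random spot check for searchability is not covered by this sketch.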

User Scenario: Process digital camera pictures and OCR to extract specific numbers

In this project, requested by one of our users, we provide analysis and suggestions on how to process photos of marathon runners and use OCR to extract text data from them. This article describes the fully automated OCR Cloud 2.0 API approach and automated tools that developers can use to process these images without human intervention. If you are interested in a semi-automated process that includes human verification options, please contact us separately.

We will discuss several parts of this project separately, but overall we believe it is possible to achieve good recognition results on most good images. This can be considered a project of medium-to-hard complexity, due to multiple factors, technology limitations, and multiple decision steps in the approach. We will test several images from the same category to illustrate how OCR works internally, what limitations exist in these specific images, and what we can do to optimize output quality.

First, we will test one random image and describe every step applied to it in background processes. The same processes will run on each image processed. The original color photograph looks like this:

NOTE: The original photographs have high resolution and are large files, around 3 MB each. The images above and below were reduced in size only for the purposes of this illustration.

For simplicity of explanation, and to further explain how OCR engines operate internally, let’s review the binarized image next. Binarization – the process of converting every pixel in the photo to either black or white, which effectively converts the photo into...
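To make the idea of binarization concrete, here is a minimal sketch in plain Python. It assumes a simple fixed global threshold on pixel brightness, which is only one of several possible methods; the excerpt does not say which method the OCR engine actually uses. The names `to_gray` and `binarize`, the threshold of 128, and the tiny sample "photo" are all illustrative assumptions.

```python
def to_gray(rgb):
    """Approximate brightness of an (R, G, B) pixel using ITU-R BT.601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(pixels, threshold=128):
    """Map every RGB pixel to 1 (white) or 0 (black).

    Assumption: one fixed threshold for the whole image. Photos with
    uneven lighting usually need a locally adaptive threshold instead.
    """
    return [
        [1 if to_gray(px) >= threshold else 0 for px in row]
        for row in pixels
    ]


# A made-up 2x3 "photo": light background pixels, dark ink, one red pixel.
photo = [
    [(250, 250, 245), (30, 30, 40), (240, 235, 230)],
    [(200, 60, 50), (20, 25, 20), (255, 255, 255)],
]
bitmap = binarize(photo)
for row in bitmap:
    print(row)
```

Note that the saturated red pixel falls below the threshold and turns black, which shows how colored regions of a photo (clothing, backgrounds) can end up as black areas in the binarized image.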