OCR-IT Featured as an Exhibitor Sponsor at Apttus Accelerate 2015

Fremont, CA, 2015 – OCR-IT, a provider of world-class OCR technology for document processing and automation solutions, is being featured as an exhibitor sponsor at Apttus Accelerate 2015, held at the prestigious Palace Hotel in downtown San Francisco from April 7–10, 2015. OCR-IT is exhibiting its highly accurate optical character recognition (OCR) technology, services, and software. Its OCR technology can recognize 190 different languages and is available to organizations that are either looking to migrate to Apttus or are existing Apttus clients.

For organizations interested in migrating to Apttus, OCR-IT is showcasing its high-volume OCR services, which help clients migrate large, existing libraries of contracts, agreements, and other legal documents to the digital world of Apttus. OCR-IT prepares and provides documents optimized specifically for the Apttus ecosystem.

“With our high-volume OCR services, our clients are able to convert huge volumes of faxes, images, and other unsearchable documents and agreements into highly compressed, searchable PDFs that retain the documents’ original look and feel. This gives our clients the ability to extract and analyze the specific content and data contained within them,” said Ilya Evdokimov, the lead architect behind OCR-IT. “This service has been saving our customers tremendous amounts of resources and money, which they previously needed to manually manage these large volumes of unstructured information.”

For legal organizations that are already Apttus customers, OCR-IT offers a simple solution to convert daily documents into searchable PDFs almost effortlessly. This OCR-IT system, unlike every other available solution, has no user interface (UI) and virtually zero user training requirements, while empowering every user in an organization to have access to high...

Helping in research to extract OCR data from WWII records

Someone asked: “I am working on a research project that deals with American military casualties during WWII. Specifically, I am attempting to construct a count of casualties for each service at the county level. There are two sources of data here, each presenting their own challenges. 1. Army and Air Force data. 2. Navy and Marine Corps data.”

Full question is here: http://datascience.stackexchange.com/questions/5047/ocr-text-recognition-and-recovery-problem/5078#5078

Source 1 Sample Image | Source 2 Sample Image

The answer for both data sets is an OCR application with some post-processing, but a more specialized one than a generic low-quality or open-source OCR program. Essentially, the harder the problem, the more capable and advanced the tools needed to solve it. There will be two major stages in this task: generating the data (image to text, i.e. OCR), and processing the data (doing the actual count). Look at them separately in order to select the best method for each stage.

The main challenges in these images and OCR are:

a) Images have low resolution. For example, image #1 has a resolution of about 72 dpi. The suggested scanning resolution for text of this quality is 300 to 400 dpi, but re-scanning or controlling the scan resolution is clearly not an option now. That is why one option is to clean the image and increase its size using image pre-processing tools. This is what the original #1 image snippet looks like after adaptive binarization, zoomed to 300%. It is clear that each character has too few pixels and that characters can easily be misread.

b) GIF format in #1 is not supported by many OCR...
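The pre-processing idea in point a) can be sketched in a few lines. This is only a minimal illustration in plain Python, not the tool used for the sample images: the function names, the window size, and the offset are assumptions, the image is represented as a list of rows of grayscale values (0 to 255), and nearest-neighbor scaling stands in for the interpolation a real image library would provide.

```python
def upscale_nearest(img, factor):
    """Enlarge a grayscale image (list of rows) by an integer factor.

    Nearest-neighbor scaling adds no detail; it only repeats pixels.
    """
    out = []
    for row in img:
        wide = [px for px in row for _ in range(factor)]
        for _ in range(factor):
            out.append(list(wide))
    return out


def adaptive_binarize(img, window=15, offset=10):
    """Mean-based local threshold (illustrative parameters).

    A pixel becomes black (0) if it is darker than the mean of its
    window minus `offset`; otherwise it becomes white (255).
    """
    h, w = len(img), len(img[0])
    # Integral image with an extra zero row and column, so any
    # window sum can be read with four lookups.
    integ = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            integ[y + 1][x + 1] = integ[y][x + 1] + row_sum

    r = window // 2
    out = []
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        out_row = []
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            total = (integ[y1][x1] - integ[y0][x1]
                     - integ[y1][x0] + integ[y0][x0])
            mean = total / ((y1 - y0) * (x1 - x0))
            out_row.append(0 if img[y][x] < mean - offset else 255)
        out.append(out_row)
    return out
```

Because the threshold follows the local background, unevenly lit or faded scans binarize more cleanly than with a single global cutoff. Neither step can restore detail that a 72 dpi source never captured, which is the limitation described above.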

Tips for recognizing multiple languages and processing documents with mixed languages

The OCR-IT API can recognize text in over 180 languages, more than most other OCR systems in or out of cloud environments. This powerful feature makes the API useful in every region of the world without the need to change the API structure, develop different code, or sign up for any other services. We currently have numerous users implementing our API to process text from images generated globally, and we continue to expand the list of supported languages. The language setting is one of the primary parameters for successful OCR conversion, and it is FREE for you to use, unless you turn on one of the specialty languages (see list below; these cost extra). Selecting an incorrect language for a document will most likely degrade the speed and quality of OCR, and frequently all text may become unreadable, so it is an important parameter.

Auto-detect multiple languages? Sure! If you do not know in advance what language will be present in the next picture or document, select several language choices at once, and the OCR-IT API will select the best language to use. For example, in Canada, a user may take a picture of something in French, immediately followed by another picture in English. Or a company in Germany may receive a fax in English, followed by a fax in German, followed by a fax in French. In such situations, selecting multiple languages in the OCR-IT API automatically resolves this complex technical challenge. But there are a few suggestions that will optimize your multi-language environment:

Use the fewest possible languages for the highest recognition quality. If you can know precisely which language to use with which document, such as when folders are separated by language, that...
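The folder-per-language suggestion could be sketched as follows. Everything in this example is hypothetical: the folder names, the language identifiers, and the `languages_for` helper are illustrations only and are not taken from the OCR-IT API documentation. The point is simply to send one language when the source is known and a short list otherwise.

```python
from pathlib import PurePosixPath

# Hypothetical mapping: intake folder name -> the single language
# expected for documents in that folder.
FOLDER_LANGUAGES = {
    "english": ["english"],
    "french": ["french"],
    "german": ["german"],
}

# Hypothetical fallback for unsorted documents: a short list of the
# languages this particular office actually receives, not every
# language the service supports.
FALLBACK_LANGUAGES = ["english", "french", "german"]


def languages_for(path):
    """Return the smallest language list that fits the document's folder."""
    folder = PurePosixPath(path).parent.name.lower()
    return list(FOLDER_LANGUAGES.get(folder, FALLBACK_LANGUAGES))
```

With this layout, `languages_for("inbox/french/fax_001.tif")` yields a single language, while a file in an unsorted folder falls back to the short three-language list.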

Resources and suggestions for iOS developers

The OCR-IT Cloud-based OCR API was one of the first high-quality online OCR (Optical Character Recognition) services on the market. It launched in 2009 and started to appear in various implementations by 2010. One of the first of these on the Apple App Store was the FotoNote app, which to this day gets a 5-star rating due to its high OCR quality. Many other apps followed with unique and creative uses of OCR. OCR-IT offers a number of plans and resources to enable iOS developers to use the OCR-IT API in their own apps.

Pricing Plans

All currently available pricing plans are listed here: Pricing Plans

Development Account – developers receive a free account and full access to the API for the entire development and testing lifecycle. Full access to resources is provided, along with a live testing environment. Sign up for the Development & Testing plan to start development.

Production Account – once the app is ready to go live, a subscription other than the Development & Testing plan is needed. The developer can choose any other plan available on the OCR-IT plan selection page, depending on the estimated volume of images to be processed. Alternatively, a custom plan can be discussed and created if the developer finds that a different licensing model would be more beneficial.

API Technical Resources

There are three major sources of technical information for iOS developers:

API Documentation – detailed technical documentation explaining every part of the OCR service and its usage.

OCR-IT Blog – a number of articles containing tips, tricks, and best approaches to creating powerful and effective OCR-based apps.

OCR-IT Support Team – technical experts with many years of OCR and image processing experience. OCR-IT staff can help answer theoretical and practical questions regarding image and text quality, use...

Speed of processing

The OCR-IT Cloud OCR API provides access to high-quality OCR from devices and environments where OCR does not reside locally due to technical limitations and other constraints. This enables such environments to perform OCR-related tasks without the use of local resources or maintenance and upkeep. In some cases, cloud-based OCR is the only option to enable image processing and text recognition. As a result, since images are processed off-device, developers should consider several optimization techniques at every stage of their submission process.

In general, the Web OCR process is represented here:

The entire conversion workflow can be separated into these logical steps:

1. Image capture, creation, optimization
2. Transmission to cloud
3. Processing
4. Transmission back to source
5. Text/data processing

There are multiple actions developers can take at each stage to achieve the fastest possible processing. Let’s explore each stage separately.

1. Image capture, creation, optimization – preparation of the image for submission to processing. This is one of the most important steps in a successful workflow, since all subsequent stages depend on the result of this stage. The image should be as clear as possible to achieve higher OCR quality. This means using various techniques such as user guidance and training to achieve better images, on-device quality checks, resolution checks, shake detection, and image cleanup to prepare a clean and small image for transmission. The average 3G upload speed on an iPhone or Android device is about 0.85 Mbps (0.11 MBps), according to PCWorld field tests. The average photo size is about 2.5 MB. This means the upload of the original photo alone will take about 23 seconds. However, if the image is binarized prior to transmission, the resulting black & white image file size can be about 30 KB,...
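The upload arithmetic above can be checked with a short calculation. This is a rough estimate under stated assumptions: decimal units (1 MB = 1,000,000 bytes, 1 Mbps = 1,000,000 bits per second), no latency, and no protocol overhead. The function name is illustrative.

```python
def upload_seconds(size_bytes, link_mbps):
    """Estimate transfer time, ignoring latency and protocol overhead."""
    bits = size_bytes * 8
    return bits / (link_mbps * 1_000_000)


# Figures quoted in the text: 0.85 Mbps uplink, 2.5 MB photo,
# roughly 30 KB after binarization.
photo_s = upload_seconds(2_500_000, 0.85)
binarized_s = upload_seconds(30_000, 0.85)

print(f"original photo:  {photo_s:.1f} s")
print(f"binarized image: {binarized_s:.2f} s")
```

Under these assumptions the original photo takes about 23.5 seconds, in line with the figure of about 23 seconds quoted above, while the 30 KB binarized image takes well under a second. Real transfers will be somewhat slower once connection setup and overhead are included.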

OCR-IT Team attended Apps World 2013 tradeshow in San Francisco – post visit summary

If you attended Apps World 2013 last week, you could not have missed the OCR-IT corner booth right in front of the “Media & Speakers Lounge”. OCR-IT Team members were wearing bright yellow sweaters and gave away environmentally friendly carry bags, cell phone stands, and other items to booth visitors. We had many great conversations and received suggestions about future cloud-based OCR products, questions about the API, and valuable feedback about our existing services. It was interesting to see people look at our diagrams for a few seconds, and then exclaim “That’s a good idea!” or “I did not know that was possible!” or “I could definitely use it in my apps!”

On the first day, over 7,000 attendees walked the show floor learning about new and emerging technologies, major service providers, and big-name players in the apps and mobile markets. OCR-IT shared two major offerings: OCR-IT Cloud 2.0 API and Managed OCR Services. Visitors attended seminars, visited vendor booths, and overall enjoyed the modern high-tech environment between sessions.

OCR-IT Team members demonstrated OCR capabilities on the spot via a Web browser and mobile apps created by third-party developers using the OCR-IT API on iPhone, iPad, and Android devices. Visitors asked to take pictures of signs, their badges, business cards, receipts, books they were carrying, and other text-based documents just to see how OCR-IT could process them. Within a few seconds of processing, and after seeing the results as digital text, they were impressed with the high accuracy of OCR from the OCR-IT Web service. It was great to see their reactions, smiles, and the sparkle in their eyes as numerous ideas for how to use OCR-IT services jumped into their minds.

The second day was slower, and the OCR-IT Team had more time...