API Documentation
OCR CLOUD API GUIDE
Enhanced API
Release 2.3
Updated 2012-10-10
SIGN UP, TESTING AND PRICING
OPEN NEW ACCOUNT & GET PRICING:
http://www.ocr-it.com/ocr-cloud-2-0-api-subscription-plans-pricing/
The OCR-IT OCR Cloud 2.0 API also provides a free Development/Testing account for more automated POST API testing. Easy subscription to a Trial account is available: http://www.ocr-it.com/free-ocr-cloud-2-0-api-trial
For assisted testing and advice, send your image or PDF to support@ocr-it.com, and one of our experienced OCR-IT Technicians will convert it and return the results to you, along with recommendations on how best to process your documents. It’s free, with no obligations. Please tell us which OCR language to use if your documents are not in English, and which output format you would like to receive if you want anything other than TXT and Searchable PDF.
More information: http://www.ocr-it.com/ocr-cloud-2-0-api/
Technical support: support@ocr-it.com
Overview
The OCR-IT LLC OCR Web API allows you to submit OCR requests (images in PDF / TIF / PNG / JPG / BMP / PCX / DCX formats) and get back textual results (in TXT / PDF / RTF / Word / Excel / XML / CSV / others, with full Unicode support). Multilingual OCR in a variety of languages (listed at the end of this document) is supported.
Key Features:
- Temporary Cloud Storage for Images
- Support of Common Image Formats
- Variety of Print Types
- Image Cleanup: Deskew, Despeckle, Remove Texture, Automatic Rotation Detection
- Over 150 OCR Languages
- Mixed Languages Auto-detection
- Barcode Recognition
- Two modes of Text Recognition: Quality, Speed
- Specialized Text Extraction Algorithms
- Support for Different Fonts and Print Types
- All Popular Output Formats
- Enhanced Error Handling
By using the various API settings, you can optimize the OCR process for a variety of sources (scans, digital camera images, etc.) and a variety of purposes (full-text indexing of articles, invoice scanning, etc.). Barcode scanning is also supported. For assistance in optimizing the API for your particular task, please contact the Support Team.
Using the API consists of the following stages:
- Submit a Job
- Handle Job Status – one or both of the following:
- Check Job Status manually
- Get notified about Job Status automatically
- Get Results of a Job
1. SUBMITTING A JOB
OVERVIEW
Submit a job by sending an HTTP POST request to the following URL:
http://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey=[your API key]
NOTE: Make sure to include the “/submit” in the URL
The request message body should contain XML of the following format (explained in detail below):
<Job>
<InputURL>[input image URL]</InputURL>
<InputType>[input type (PDF/TIF/PNG/JPG/etc.)]</InputType> <!-- Optional -->
<NotifyURL>[job status notification URL]</NotifyURL> <!-- Optional -->
<CleanupSettings>[image cleanup settings]</CleanupSettings> <!-- Optional -->
<OCRSettings>[OCR settings]</OCRSettings> <!-- Optional -->
<OutputSettings>[Output settings]</OutputSettings> <!-- Optional -->
</Job>
The XML is case-sensitive, using proper leading letter and acronym capitalization, as listed in this documentation.
The order of XML elements should be as listed in this document. For example, <CleanupSettings> may NOT be placed before <InputURL> within XML.
The Content-Type of the request should be “text/xml”.
The Content-Length is required.
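As a rough illustration of the requirements above, the following Python sketch builds (but does not send) a submission request using only the standard library. The helper name `build_submit_request` and the placeholder API key are assumptions made for this example; they are not part of the API.

```python
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

SUBMIT_URL = "http://svc.webservius.com/v1/wisetrend/wiseocr/submit"

def build_submit_request(api_key, input_url, notify_url=None):
    """Build (but do not send) the HTTP POST for a job submission."""
    job = ET.Element("Job")
    # Child elements are added in the order listed in this document.
    ET.SubElement(job, "InputURL").text = input_url
    if notify_url is not None:
        ET.SubElement(job, "NotifyURL").text = notify_url
    body = ET.tostring(job)  # ElementTree XML-encodes "&" as "&amp;"
    return urllib.request.Request(
        SUBMIT_URL + "?" + urlencode({"wsvKey": api_key}),
        data=body,
        method="POST",
        headers={"Content-Type": "text/xml",
                 "Content-Length": str(len(body))},
    )

req = build_submit_request("YOUR_API_KEY",
                           "http://example.com/images?id=565&size=large")
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Passing the returned object to `urllib.request.urlopen` would send the request. Building the XML with a library rather than by string concatenation keeps the element order explicit and handles XML encoding automatically.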
No place to store your images online for conversion? We got you covered with Short Term Storage!
Sometimes it is more convenient to upload the actual image file that you want OCRed, instead of giving its URL. If this is the case, you can use the convenient WebServius Short Term Storage API. It allows you to very easily store a file at a temporary URL in the cloud, which you can then pass to the OCR API.
http://www.webservius.com/cons/subscribe.aspx?p=wsv&s=sts
Please see the link above for the latest pricing details and subscription information. The WebServius Short Term Storage API offers free and paid accounts as long as you use it in conjunction with the OCR-IT OCR Cloud API; special discounts and some amount of free usage may also be available.
In case of success, the response will be an HTTP 200 (Success) response code, and the following XML (explained in detail below):
<JobStatus>
<JobURL>[URL for checking job status]</JobURL>
<Status>Submitted</Status>
</JobStatus>
In case of an error, an HTTP error code is returned along with XML explaining the error (see section on Error Elements at the end of this document).
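A minimal sketch of reading the success response with Python's standard library is shown here. The function name `parse_submit_response` is hypothetical, and the sample XML is the response example used elsewhere in this guide.

```python
import xml.etree.ElementTree as ET

def parse_submit_response(xml_text):
    """Return (job_url, status) from a <JobStatus> document."""
    root = ET.fromstring(xml_text)
    if root.tag != "JobStatus":
        raise ValueError("unexpected root element: " + root.tag)
    # strip() guards against stray whitespace around the values.
    job_url = (root.findtext("JobURL") or "").strip()
    status = (root.findtext("Status") or "").strip()
    return job_url, status

sample = """<JobStatus>
<JobURL>http://api.ocr-it.com/ocr/v2/getStatus/123ABCResponseExample123ABC</JobURL>
<Status>Submitted</Status>
</JobStatus>"""
print(parse_submit_response(sample))
```

Store the whole JobURL as returned, rather than rebuilding it from the job ID, since the host may vary between submissions.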
COSTS AND CHARGES PER UNIT
Your account is charged 1 Unit (as defined below) of OCR upon successful job submission, and for the rest of the pages upon successful job completion. Please note that certain errors (such as a corrupt input file) can only be detected once you’ve already been charged 1 Unit for the submission.
A single “Unit” is defined as a single page of size “international A4 format or smaller”, by actual surface area. The smallest processing unit is 1 Unit, so even the smallest images will count as 1 Unit.
Please note that if your image is larger than A4, an appropriate number of Units will be deducted based on a simple formula: your area / A4 area, with the result rounded up. A4 measures 210 x 297 mm, or 8.27 x 11.69 inches, which is an area of 96.68 sq in. If you submit a large engineering drawing in size B2, which is 19.69 × 27.83 inches (547.97 sq in), that drawing will use 547.97 / 96.68 = 5.67, which will be charged as 6 Units.
In multi-page files such as PDFs, each page gets evaluated separately using the above criteria. For example, if a PDF has 5 pages under A4 size, the PDF will be processed at the cost of 5 pages (Units).
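The Unit arithmetic above can be sketched in a few lines of Python. This is only an estimate based on the formula as stated in this section; the function names are made up for illustration, and how the service measures page dimensions in practice is not specified here.

```python
import math

A4_AREA_SQ_IN = 8.27 * 11.69  # about 96.68 sq in

def units_for_page(width_in, height_in):
    """Units for one page: page area / A4 area, rounded up, minimum 1."""
    return max(1, math.ceil((width_in * height_in) / A4_AREA_SQ_IN))

def units_for_document(pages):
    """Sum per-page Units; pages is a list of (width, height) in inches."""
    return sum(units_for_page(w, h) for w, h in pages)

print(units_for_page(19.69, 27.83))             # B2 drawing -> 6
print(units_for_document([(8.27, 11.69)] * 5))  # five A4 pages -> 5
```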
There is a detailed visual diagram and specifications of international sizes here:
http://en.wikipedia.org/wiki/Paper_size
Please note that quantity of export formats does not affect the Unit count charges, i.e. multiple output formats can be requested in a single request and will be charged only based on surface area, not on quantity of outputs.
INPUT PARAMETERS
wsvKey (required)
This is your API key, which is issued to you when you subscribe to the OCR-IT LLC OCR Cloud 2.0 API.
InputURL (required)
The URL of the image on which you want to perform OCR (must be http://, https:// or ftp://)
NOTE 1: Make sure that the InputURL is properly XML-encoded. This is especially a concern if the URL contains query parameters. For example, if your image is at:
http://example.com/images?id=565&size=large,
the job request should be:
<Job><InputURL>http://example.com/images?id=565&amp;size=large</InputURL></Job>
Note that the “&” in the original URL has turned into “&amp;”, as required by XML encoding rules.
Normally, if a standard library is used for dealing with XML, this would be done automatically. However, if you are constructing XML manually from strings, you may need to do this manually.
NOTE 2: Do not URL-encode (percent-encode) the InputURL. For example, if your image is at: http://example.com/My%20Picture.jpg
The job request should be:
<Job><InputURL>http://example.com/My Picture.jpg</InputURL></Job>
Note that a real space is used instead of the “%20” percent-encoded version.
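When the XML is assembled from strings, both notes can be handled with standard library helpers, as in this hedged Python sketch. The function name `input_url_element` is an assumption for illustration.

```python
from urllib.parse import unquote
from xml.sax.saxutils import escape

def input_url_element(url):
    """Return an <InputURL> element for a URL that may be percent-encoded."""
    decoded = unquote(url)  # NOTE 2: "%20" becomes a real space
    # NOTE 1: escape() turns "&" into "&amp;" (and "<", ">" likewise)
    return "<InputURL>" + escape(decoded) + "</InputURL>"

print(input_url_element("http://example.com/images?id=565&size=large"))
print(input_url_element("http://example.com/My%20Picture.jpg"))
```

One caveat: `unquote` decodes every percent-escape, including ones such as `%26` that may be significant inside a query string, so check the result for URLs that rely on encoded reserved characters.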
The image cannot exceed 20MB in size and cannot take more than 5 minutes to download.
The image must be in a supported format (see table below). If the image URL path (not counting the query string, if any) does not end in a dot followed by a supported extension (case-insensitive, see table below), the InputType parameter must be provided. E.g.:
http://example.com/scan001.tif – InputType not required (TIF auto-detected)
http://example.com/scan001.tif?resolution=high – InputType not required (TIF auto-detected)
http://example.com/scan001 – InputType required
http://example.com/scan001?format=.tif – InputType required
Supported formats and extensions are:
| FORMAT | EXTENSIONS | SUPPORTED FORMAT DETAILS |
| Version 1.6 or earlier | ||
| BMP | bmp | 2-bit – Uncompressed Black & White 4- and 8-bit – Uncompressed Palette 16-bit – Uncompressed Mask 24-bit – Uncompressed Palette and TrueColor 32-bit – Uncompressed Mask |
| PCX | pcx | 2-bit Black & White, 4- and 8-bit Gray |
| DCX | dcx | 2-bit Black & White, 4- and 8-bit Gray |
| JPG | jpg, jpeg | Jpeg: Gray, Color Jpeg 2000: Gray Part 1, Color Part 1 |
| TIF | tif, tiff | Black&White: uncompressed, CCITT3, CCITT3FAX, CCITT4, PackBits, ZIP, LZW Gray: uncompressed, Packbits, JPEG, ZIP, LZW TrueColor: uncompressed, JPEG, ZIP, LZW Palette: uncompressed, Packbits, ZIP Multi-image TIFF |
| PNG | png | Black&white, gray, color |
InputType (optional)
Specifies the input type. Must be one of the Supported Formats (left column in the table above). Not required if the type can be auto-detected from the URL (see InputURL above).
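The auto-detection rule can be approximated on the client side to decide whether InputType needs to be sent. The sketch below is an assumption about how to mirror that rule in Python, not the service's actual implementation; the mapping covers only the extensions listed in the table above.

```python
import posixpath
from urllib.parse import urlsplit

# Extension -> InputType, taken from the table of supported formats.
EXTENSION_TO_TYPE = {
    "bmp": "BMP", "pcx": "PCX", "dcx": "DCX", "png": "PNG",
    "jpg": "JPG", "jpeg": "JPG", "tif": "TIF", "tiff": "TIF",
}

def detect_input_type(url):
    """Return the InputType implied by the URL path, or None if
    InputType would have to be supplied explicitly."""
    path = urlsplit(url).path           # excludes the query string
    ext = posixpath.splitext(path)[1]   # e.g. ".tif" or ""
    return EXTENSION_TO_TYPE.get(ext[1:].lower())

for u in ("http://example.com/scan001.tif",
          "http://example.com/scan001.tif?resolution=high",
          "http://example.com/scan001",
          "http://example.com/scan001?format=.tif"):
    print(u, "->", detect_input_type(u))
```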
NotifyURL (optional)
The URL to which a notification should be sent when the job succeeds or fails (see section 2 on handling job status). Must be http:// or https://.
NOTE: The NotifyURL must not be URL-encoded (i.e. should use “ ” and not “%20”), and must be XML-encoded (i.e. should use “&amp;” and not “&”), just like the InputURL. See the InputURL section above for more details and examples.
CleanupSettings (optional):
Settings that control image cleanup, in the following form (every element is optional):
<CleanupSettings>
<Deskew>[true/false]</Deskew> <!-- Optional, default is ‘true’ -->
<RemoveGarbage>[true/false]</RemoveGarbage> <!-- Optional, default is ‘true’ -->
<RemoveTexture>[true/false]</RemoveTexture> <!-- Optional, default is ‘true’ -->
<RotationType>[see below]</RotationType> <!-- Optional, default is ‘Automatic’ -->
</CleanupSettings>
The settings are explained below:
| Deskew | (Boolean) Specifies whether the skew angle for an image should be corrected during preprocessing. This mode is recommended if you want to automatically correct skew for images you work with. The default value is ‘true’. |
| RemoveGarbage | (Boolean) Specifies whether garbage (excess dots that are smaller than a certain size) should be removed from the image during preprocessing. The default value is ‘true’. |
| RemoveTexture | (Boolean) Specifies whether structured background noise should be cleared before the recognition process starts. The default value is ‘true’. |
| RotationType | (String) Specifies what type of rotation will be performed upon the image during preprocessing. The default value is “Automatic”, which means that rotation will be detected automatically. Allowed values:
NoRotation – no rotation
Automatic – auto-detect rotation
Clockwise – rotate by 90 degrees clockwise
Counterclockwise – rotate by 90 degrees counterclockwise
Upsidedown – rotate by 180 degrees |
OCRSettings (optional)
Settings that control image recognition; in the following form (every element is optional):
<OCRSettings>
<PrintType>[see below]</PrintType> <!-- Optional, default is ‘Print’ -->
<OCRLanguage>[see below]</OCRLanguage> <!-- Optional, default is ‘English’ -->
<SpeedOCR>[true/false]</SpeedOCR> <!-- Optional, default is ‘false’ -->
<AnalysisMode>[see below]</AnalysisMode> <!-- Optional, default is ‘MixedDocument’ -->
<LookForBarcodes>[true/false]</LookForBarcodes> <!-- Optional, default is ‘true’ -->
</OCRSettings>
The settings are described below:
| PrintType | (Semicolon-delimited list of strings) Specifies the types of printed text in the image. The default value is “Print”, which corresponds to common typographic text equivalent to laser printer output. Allowed values:
Print – Modern Text
Typewriter
DotMatrix
OCR_A – OCR-A Text
OCR_B – OCR-B Text
MICR_E13B
If you would like to recognize more than one text type in the same document, separate types with semicolons without spaces. For example, “Print;Typewriter”. |
| OCRLanguage | (Semicolon-delimited list of strings) This property allows you to specify which of over 200 supported languages should be used for OCR, including mixed languages within the same document. See list of supported languages at the end of this document. The default value is “English“. To specify more than one language, separate languages with semicolons (without spaces) – for example: “English;Danish”. |
| SpeedOCR | (Boolean) This property provides faster recognition speed (by as much as 2-2.5 times, depending on server load) at the cost of a moderately increased error rate (1.5-2 times more errors). On good, print-quality texts, OCR makes an average of 1-2 errors per page more in this mode, which in some cases is a small sacrifice for the substantial increase in speed. Such moderate increase in error rate can be easily tolerated in many cases, such as full text indexing with “fuzzy” searches, preliminary recognition, etc. The default value is ‘false’. |
| AnalysisMode | (String) Specifies how aggressively the text should be extracted. The default value is “MixedDocument”. Allowed values:
MixedDocument – This mode is useful if you export your text to document archives: the full page layout is retained and full-text search is available if you save in this mode. This mode will look for images and text within an image.
TextIndexing – This mode is used to extract data from a document, including text in pictures. Note that the OCR retains both the picture and the text in it. Text extracted from a picture block can only be exported to TXT, PDF and XML formats (XML export support is coming soon). The data can then be used for subsequent full-text indexing and search. The program retains the logical reading order, pictures, and tables.
TextAggressive – This mode is used to pre-process documents with many small text zones. Usually they are noisy, low-quality images that may contain text within other objects. This mode extracts all text from the image, including tables, pictures, small text areas, and noise. The result is plain text without table blocks and picture blocks.
BarcodesOnly – This mode is used to extract barcodes only. NOTE: Barcode values are extracted in all modes as long as LookForBarcodes is true.
To understand the specific differences between these modes of image analysis, sample processing and testing is encouraged. Results will vary considerably based on a) your need for text vs. preserving pictures and b) your specific images. For example, in “TextAggressive” mode, OCR will sacrifice images in order to extract as much text as possible: a road sign in the background or a license plate will be treated as potential text content. In “MixedDocument” mode, the same road sign will be treated as a picture. In “TextIndexing” mode, the road sign will be preserved as a picture, but any recognizable text content will still be available for searching. |
| LookForBarcodes | (Boolean) Specifies whether barcodes should be recognized. Default is ‘true’. |
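To show how the list-valued and Boolean settings might be serialized, here is a small Python sketch that builds an OCRSettings element. The function `build_ocr_settings` and its parameter names are illustrative assumptions; only the element names, defaults, and the semicolon rule come from this section.

```python
import xml.etree.ElementTree as ET

def build_ocr_settings(print_types=("Print",), languages=("English",),
                       speed_ocr=False, analysis_mode="MixedDocument",
                       look_for_barcodes=True):
    """Build an <OCRSettings> element with children in the listed order."""
    settings = ET.Element("OCRSettings")
    # Lists are joined with semicolons and no spaces, e.g. "English;Danish".
    ET.SubElement(settings, "PrintType").text = ";".join(print_types)
    ET.SubElement(settings, "OCRLanguage").text = ";".join(languages)
    # Booleans are written in lowercase, matching the examples in this guide.
    ET.SubElement(settings, "SpeedOCR").text = "true" if speed_ocr else "false"
    ET.SubElement(settings, "AnalysisMode").text = analysis_mode
    ET.SubElement(settings, "LookForBarcodes").text = (
        "true" if look_for_barcodes else "false")
    return settings

xml_bytes = ET.tostring(build_ocr_settings(languages=["English", "Danish"],
                                           speed_ocr=True))
print(xml_bytes.decode("ascii"))
```

The sketch does not validate values; misspelled or wrongly capitalized values would presumably be rejected by the service with one of the error codes listed under failed job submissions.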
OutputSettings (optional)
Settings that control text result output, in the following form (every element is optional):
<OutputSettings>
<ExportFormat>[see below]</ExportFormat> <!-- Optional, default is ‘Text;PDF’ -->
</OutputSettings>
The settings are explained below:
| ExportFormat | (Semicolon-delimited list of strings) Specifies the desired formats for text output. The default value is “Text;PDF”, which corresponds to both Text and PDF output. Allowed values:
RTF – export to *.RTF (rich-text) format. Retains full page layout and preserves pictures. The program will automatically select the most suitable paper size when saving the recognized text and pictures.
MSWord – export to *.DOC (Microsoft Word) format. Retains full page layout and preserves pictures. The program will automatically select the most suitable paper size when saving the recognized text and pictures.
MSExcel – export to *.XLS (Microsoft Excel) format
PDF – export to *.PDF format
DBF – export to *.DBF format
Text – export to *.TXT common formatted ASCII text-only output
CSV – export to *.CSV format
PPT – export to *.PPT format
XML – export to *.XML format
UnicodeText_UTF8 – export to *.UTF8.TXT format
UnicodeText_UTF16 – export to *.UTF16.TXT format
UnicodeCSV_UTF8 – export to *.UTF8.CSV format
UnicodeCSV_UTF16 – export to *.UTF16.CSV format
HTML – export to *.HTM format ^
UnicodeHTML – export to *.UNICODE.HTM format ^
^ HTML output will provide access to the HTML result containing all text. If the original image contained pictures along with the text data, pictures will be referenced in HTML but will not be returned. Please test HTML output to make sure it matches your desired result.
If you would like to produce more than one output format from the same image request, separate your desired output formats with semicolons without spaces. For example, “PDF;Text;UnicodeText_UTF8”.
NOTE: You will need to know the file extension of the desired format (specified above) to retrieve the job results. |
Error Element On Failed Job Submission
If the job submission fails, you will receive an appropriate HTTP error code, as well as an <Error>Code</Error> response. The possible values of ‘Code’ are:
| Code | HTTP Error Code | Description |
| BadInputURL | 400 | InputURL is invalid or missing, or is not an HTTP/HTTPS URL |
| BadNotifyURL | 400 | NotifyURL is invalid or missing, or is not an HTTP/HTTPS URL |
| BadInputType | 400 | The specified InputType is invalid, OR InputType is missing, and auto-detected file type is not valid, OR InputType is missing, and auto-detection of file type has failed |
| BadRotationType | 400 | Rotation specified in CleanupSettings is invalid. Please note that it is case-sensitive. |
| BadAnalysisType | 400 | AnalysisMode specified in OCRSettings is invalid. Please note that it is case-sensitive. |
| BadPrintType | 400 | PrintType specified in OCRSettings is invalid. Please note that it is case-sensitive. |
| BadExportFormat | 400 | ExportFormat specified in OutputSettings is invalid. Please note that it is case-sensitive. |
| OCRSettingsTooComplex | 400 | OCRSettings are too complex. Try reducing the number of OCRLanguages and PrintTypes you are recognizing. |
| InternalError:ErrorNumber | 500 | Internal error has occurred. Contact Support |
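A client might extract the error code from a failed submission along the following lines. This Python sketch assumes the response body is the `<Error>Code</Error>` element described above, and it treats the part after the colon in “InternalError:ErrorNumber” as a separate detail; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

def parse_error_code(xml_text):
    """Return (code, detail) from an <Error>Code</Error> response body."""
    root = ET.fromstring(xml_text)
    # Accept <Error> as the root, or nested in case it is wrapped.
    node = root if root.tag == "Error" else root.find(".//Error")
    if node is None or not (node.text or "").strip():
        raise ValueError("no <Error> element found")
    code, _, detail = node.text.strip().partition(":")
    return code, detail or None

print(parse_error_code("<Error>BadInputURL</Error>"))
print(parse_error_code("<Error>InternalError:1234</Error>"))
```

With `urllib.request`, a non-2xx response raises `urllib.error.HTTPError`; calling `.read()` on that exception yields the body that can be passed to this function.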
POST Examples
URL Example:
HTTP POST to http://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey=[your_key_here]
Message body example (simple):
<Job>
<InputURL>http://www.ocr-it.com/online_ocr/english_photo_bw.tif</InputURL>
</Job>
Message body example (enhanced):
<Job>
<InputURL>http://www.ocr-it.com/online_ocr/english_photo_bw.tif</InputURL>
<InputType>TIF</InputType>
<NotifyURL>http://www.example.com/ocrNotify.php</NotifyURL>
<OCRSettings>
<OCRLanguage>English</OCRLanguage>
</OCRSettings>
</Job>
Message body example (full):
<Job>
<InputURL>http://www.ocr-it.com/online_ocr/english_photo_bw.tif</InputURL>
<InputType>TIF</InputType>
<CleanupSettings>
<Deskew>true</Deskew>
<RemoveGarbage>true</RemoveGarbage>
<RemoveTexture>true</RemoveTexture>
<RotationType>Automatic</RotationType>
</CleanupSettings>
<OCRSettings>
<PrintType>Print</PrintType>
<OCRLanguage>English</OCRLanguage>
<SpeedOCR>false</SpeedOCR>
<AnalysisMode>MixedDocument</AnalysisMode>
<LookForBarcodes>true</LookForBarcodes>
</OCRSettings>
<OutputSettings>
<ExportFormat>Text</ExportFormat>
</OutputSettings>
</Job>
Response example:
<JobStatus>
<JobURL>http://api.ocr-it.com/ocr/v2/getStatus/123ABCResponseExample123ABC</JobURL>
<Status>Submitted</Status>
</JobStatus>
Code examples are available for C#, PHP, Python, and Java. For these and additional sample code, please visit our Code Samples section.
2. HANDLING JOB STATUS
There are two ways to handle job status:
- You can manually check the status of any job by sending an HTTP GET request to the <JobURL> that you received when you submitted the job.
- You can automatically get notified when the job succeeds or fails if you provide a <NotifyURL> when you submit a job. There will only be one attempt to notify you. It will be made when the job fully succeeds or fails (you will not get any intermediate status notifications). The notification will consist of an HTTP POST containing XML status information.
Regardless of which method you use, the status report is in the same format, as described below.
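For the notification route, a receiver could look roughly like the Python sketch below, built on the standard `http.server` module. The handler class, the port, and the choice to answer with 200 (or 400 on malformed XML) are assumptions; this guide does not state what response the service expects.

```python
import xml.etree.ElementTree as ET
from http.server import BaseHTTPRequestHandler, HTTPServer

def status_from_notification(body):
    """Extract (job_url, status) from the XML body of a notification."""
    root = ET.fromstring(body)
    return ((root.findtext("JobURL") or "").strip(),
            (root.findtext("Status") or "").strip())

class NotifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            job_url, status = status_from_notification(self.rfile.read(length))
        except ET.ParseError:
            self.send_response(400)  # body was not well-formed XML
            self.end_headers()
            return
        print("job", job_url, "ended with status", status)
        self.send_response(200)      # acknowledge the notification
        self.end_headers()

def run(port=8080):
    """Serve notifications until interrupted (not called automatically)."""
    HTTPServer(("", port), NotifyHandler).serve_forever()
```

Because only one notification attempt is made, a client that cannot afford to miss it may want to fall back to checking the JobURL manually after a reasonable delay.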
2.1 Status for jobs in progress
For jobs that are not yet complete, the status report looks as follows:
<JobStatus>
<JobURL>http://api.ocr-it.com/ocr/v3/getStatus/xxxxx_your_job_id_xxxxx</JobURL>
<Status>[status]</Status>
</JobStatus>
NOTE: Sub-domain may be different depending on which server responds to your initial request. Make sure to retrieve the entire specific Job URL after your submission.
An example of job ID is “583659A247BCFE55110C2229FFEA7601”, which is a randomly generated value assigned to a job at the time of submission.
“Status” can either be “Submitted” (meaning that the job has been submitted but the image to be OCRed has not yet been downloaded), or “Processing” (meaning that the image has been downloaded and is in the process of being OCRed). Other status values (such as “Finished”) are described below, in the sections about successful/expired/failed jobs.
“JobURL” repeats the URL where updated job status may be obtained.
2.2 Status for successful jobs
For jobs that have completed successfully, the status report looks as follows:
<JobStatus>
<JobURL>http://api.ocr-it.com/ocr/v2/getStatus/xxxxx_your_job_id_xxxxx</JobURL>
<Status>Finished</Status>
<Download>
<File>
<Uri>http://api.ocr-it.com/ocr/v3/download/123456789.PDF</Uri>
<OutputType>PDF</OutputType>
</File>
<File>
<Uri>http://api.ocr-it.com/ocr/v3/download/123456789.TXT</Uri>
<OutputType>TXT</OutputType>
</File>
</Download>
</JobStatus>
NOTE: Sub-domain may be different depending on which server responds to your initial request. Make sure to retrieve the entire specific Job URL after your submission.
There will be one <File> entry for each requested output format – by default, there will be one for TXT (plaintext) and the other for PDF. The <File> entries may appear in any order. Each contains an <OutputType> indicating the output type (file extension), and a <Uri> containing the address where the output may be downloaded.
As usual, “JobURL” repeats the URL where updated job status may be obtained.
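Since the File entries may arrive in any order, it can help to index them by output type. The following Python sketch does that for a status report like the one above; `download_uris` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def download_uris(status_xml):
    """Map each OutputType to its download Uri for a Finished job."""
    root = ET.fromstring(status_xml)
    if (root.findtext("Status") or "").strip() != "Finished":
        return {}
    files = {}
    # <File> entries may appear in any order, so key them by OutputType.
    for entry in root.findall("./Download/File"):
        output_type = (entry.findtext("OutputType") or "").strip()
        files[output_type] = (entry.findtext("Uri") or "").strip()
    return files

finished = """<JobStatus>
<JobURL>http://api.ocr-it.com/ocr/v2/getStatus/xxxxx_your_job_id_xxxxx</JobURL>
<Status>Finished</Status>
<Download>
<File><Uri>http://api.ocr-it.com/ocr/v3/download/123456789.PDF</Uri><OutputType>PDF</OutputType></File>
<File><Uri>http://api.ocr-it.com/ocr/v3/download/123456789.TXT</Uri><OutputType>TXT</OutputType></File>
</Download>
</JobStatus>"""
print(download_uris(finished))
```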
2.3 Status for expired jobs
Job results are not guaranteed to be kept for more than 24 hours, or after initiating a “clear” command (documented below) for a particular job. If a job has expired, it will NOT have a <Download> element, and the <Status> will be “Expired”.
2.4 Status for failed jobs
For jobs that have failed, the status report looks as follows:
<JobStatus>
<JobURL>http://api.ocr-it.com/ocr/v2/getStatus/xxxxx_your_job_id_xxxxx</JobURL>
<Status>[FailedStatus]</Status>
<Errors>
<Error>
<Code>[Code]</Code>
<Message>[Message]</Message>
</Error>
</Errors>
</JobStatus>
The <Status> may be one of the following:
| FailedDownload | Could not download the image to be OCRed |
| FailedConversion | Could not perform OCR |
| FailedNoFunds | Insufficient funds for the number of pages you are attempting to OCR |
| FailedInternalError | Internal error, please contact Support |
The <Errors> element may or may not be present. If it is present, it may contain one or more <Error> elements with <Code> and <Message> sub-elements that can help you debug the problem. Here are some common <Code> values:
| ConvertFailed | OCR engine reported an error during conversion. Make sure that the input file is not corrupt and is not password-protected. |
| SubmitFailed | Could not submit the OCR job. Possibly an internal error, contact Support |
| DownloadRejected | Could not download the input image. Ensure that it does not exceed maximum size and that the server with the image responds promptly. |
| DownloadFailed | Could not download the input image. Ensure that the image URL exists and does not require authentication. |
As usual, “JobURL” repeats the URL where updated job status may be obtained.
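Putting the status values of section 2 together, manual checking can be sketched as a simple polling loop in Python. The interval, the maximum number of checks, and the function names are assumptions chosen for illustration; `get_status` stands for any caller-supplied function that fetches the JobURL and returns the text of the Status element.

```python
import time

IN_PROGRESS = {"Submitted", "Processing"}

def wait_for_job(get_status, interval_seconds=5.0, max_checks=60,
                 sleep=time.sleep):
    """Call get_status() until the job leaves the in-progress states.

    Returns the final status string (Finished, Expired, or a Failed* value).
    """
    for attempt in range(max_checks):
        status = get_status()
        if status not in IN_PROGRESS:
            return status
        if attempt < max_checks - 1:
            sleep(interval_seconds)
    raise TimeoutError("job still in progress after %d checks" % max_checks)

def is_failure(status):
    """True for FailedDownload, FailedConversion, FailedNoFunds, etc."""
    return status.startswith("Failed")

# Demo with canned statuses instead of real HTTP requests.
canned = iter(["Submitted", "Processing", "Finished"])
print(wait_for_job(lambda: next(canned), sleep=lambda seconds: None))
```

Injecting `get_status` and `sleep` keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.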
3. RETRIEVING JOB RESULTS
To get the results of the job, use the URLs from the successful job status reports (see section 2.2 above). Results will be returned with the correct Content-Type header.
NOTE: Processing results are deleted automatically 24 hours after submission.
CLEARING JOB RESULTS
For additional security, you may choose to delete your processing result. Once the processing result has been picked up and is no longer needed, making a simple call will initiate the purge process immediately.
Template:
POST {base_URL}/clear/{JobID}
Example:
POST http://api.ocr-it.com/ocr/v3/clear/e83026a84aa0400897f3000883897ce9
A return status of 200 indicates successful completion. The status URL and JobID remain valid, but all images and data become inaccessible and are queued for purging. The JobID URL should return the status “Expired” after a successful clear call.
NOTE: The Content-Length header is required, but should be set to 0. A message body is not required.
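A clear request might be built as in the Python sketch below. It assumes that {base_URL} is the part of the JobURL before “/getStatus/”, which matches the examples in this guide but is not stated explicitly; the function name is hypothetical, and the request is only constructed, not sent.

```python
import urllib.request

def build_clear_request(job_url):
    """Build (but do not send) the POST that clears a job's results."""
    base, sep, job_id = job_url.strip().rpartition("/getStatus/")
    if not sep or not job_id:
        raise ValueError("not a getStatus JobURL: " + job_url)
    return urllib.request.Request(
        base + "/clear/" + job_id,
        method="POST",
        headers={"Content-Length": "0"},  # required, with no message body
    )

req = build_clear_request(
    "http://api.ocr-it.com/ocr/v3/getStatus/e83026a84aa0400897f3000883897ce9")
print(req.get_method(), req.full_url)
```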
PRIMARY AND ALTERNATE DNS
There are two (2) addresses available through two different DNS hosting providers for added reliability through redundancy.
| PROVIDER # 1 (primary) | ocrcloud-api.dyndns.org |
| PROVIDER # 2 (backup) | api.ocr-it.com |
All addresses can be used interchangeably, and applications that require an added protection from failing DNS can implement a primary and secondary DNS from two different hosting providers to be checked automatically. For example, the default status report:
<JobStatus>
<JobURL>http://api.ocr-it.com/ocr/getStatus/583659A247BCFE55110C2229FFEA7601</JobURL>
<Status>[status]</Status>
</JobStatus>
is equivalent to:
<JobStatus>
<JobURL>http://ocrcloud-api.dyndns.org/ocr/getStatus/583659A247BCFE55110C2229FFEA7601</JobURL>
<Status>[status]</Status>
</JobStatus>
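An application that wants to try both addresses could derive the alternate URLs from any job URL, for instance as in this Python sketch. The helper name and the idea of trying hosts in the listed order are assumptions; note also that the host in a returned JobURL may differ from both addresses, in which case the original URL should be tried first.

```python
from urllib.parse import urlsplit, urlunsplit

API_HOSTS = ["ocrcloud-api.dyndns.org", "api.ocr-it.com"]  # primary, backup

def alternate_urls(url):
    """Return the URL rewritten for each known host, in priority order."""
    parts = urlsplit(url)
    return [urlunsplit(parts._replace(netloc=host)) for host in API_HOSTS]

for candidate in alternate_urls(
        "http://api.ocr-it.com/ocr/getStatus/583659A247BCFE55110C2229FFEA7601"):
    print(candidate)
```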
TESTING WITH FIDDLER2 (OUT-OF-BOX)
Fiddler is a Web Debugging Proxy which generates and logs HTTP(S) traffic between your computer and external servers. Fiddler allows you to inspect all HTTP(S) traffic, set breakpoints, and “fiddle” with incoming or outgoing data. Fiddler is freeware and can debug traffic from virtually any application, including Internet Explorer, Mozilla Firefox, Opera, and thousands more.
Fiddler is a useful tool in testing your OCR requests or debugging. The following screenshot demonstrates the out-of-box setup to test any combination of settings available in OCR Cloud 2.0. See POST Examples section for pre-set requests.
LIST OF SUPPORTED LANGUAGES
Languages with full dictionary support
- ArmenianEastern
- ArmenianGrabar
- ArmenianWestern
- Bashkir
- Bulgarian
- Catalan
- Chinese Simplified*
- Chinese Traditional*
- Croatian
- Czech
- Danish
- DutchBelgiun
- DutchNetherlands
- English
- Estonian
- Finnish
- French
- German
- GermanNewSpelling
- Greek
- Hebrew*
- Hungarian
- Indonesian
- Italian
- Japanese*
- Korean*
- Latvian
- Lithuanian
- Norwegian
- NorwegianBokmal
- NorwegianNynorsk
- OldEnglish
- OldFrench
- OldGerman
- OldItalian
- OldSpanish
- Polish
- PortugueseBrazil
- PortuguesePortugal
- Romanian
- Russian
- Slovak
- Slovenian
- Spanish
- Swedish
- Tatar
- Turkish
- Ukrainian
NOTE: Languages marked with “*” are not available in this API release, but are available upon special account. Consult additional documentation or contact OCR-IT Team for further information.
Languages without dictionary support
- Abkhaz
- Afrikaans
- Agul
- Albanian
- Altaic
- Avar
- Aymara
- AzerbaijaniCyrillic
- AzerbaijaniLatin
- Basque
- Belarussian
- Bemba
- Blackfoot
- Breton
- Bugotu
- Buryat
- Cebuano
- Chamorro
- Chechen
- Chukchee
- Chuvash
- Corsican
- Crimean Tatar
- Crow
- Dakota
- Dargwa
- Dungan
- EskimoCyrillic
- EskimoLatin
- Even
- Evenki
- Faroese
- Fijian
- Frisian
- Friulian
- Gagauz
- Galician
- Ganda
- GermanLuxembourg
- Guarani
- Hani
- Hausa
- Hawaiian
- Icelandic
- Ingush
- Irish
- Jingpo
- Kabardian
- Kalmyk
- Karachay-Balkar
- Karakalpak
- Kasub
- Kawa
- Kazakh
- Khakas
- Khanty
- Kikuyu
- Kirghiz
- Kongo
- Koryak
- Kpelle
- Kumyk
- Kurdish
- Lak
- Latin
- Lezgin
- Luba
- Macedonian
- Malagasy
- Malay
- Malinke
- Maltese
- Mansi
- Maori
- Mari
- Maya
- Miao
- Minangkabau
- Mohawk
- Mongol
- Mordvin
- Nahuatl
- Nenets
- Nivkh
- Nogay
- Nyanja
- Ojibway
- Ossetian
- Papiamento
- Provencal
- Quechua
- Rhaeto-Romanic
- RomanianMoldavia
- Romany
- Ruanda
- Rundi
- RussianOldSpelling
- SamiLappish
- Samoan
- ScottishGaelic
- Selkup
- SerbianCyrillic
- SerbianLatin
- Shona
- Somali
- Sorbian
- Sotho
- Sunda
- Swahili
- Swazi
- Tabassaran
- Tagalog
- Tahitian
- Tajik
- Tongan
- Tswana
- Tun
- Turkmen
- Tuvan
- Udmurt
- UighurCyrillic
- UighurLatin
- UzbekCyrillic
- UzbekLatin
- Welsh
- Wolof
- Xhosa
- Yakut
- Zapotec
- Zulu
- Adyghe
- TokPisin
Artificial languages
- Esperanto
- Ido
- Interlingua
- Occidental
Formal languages
- Basic
- C_C++
- COBOL
- Fortran
- Java
- Pascal
- SimpleChemicalFormulas
- MICRE-13B
- NumbersOnly
CONTACT SUPPORT
Contact support@ocr-it.com