FAQ

OCR Cloud 2.0 API Status Log

 

Don’t see an answer to your question?  Send your questions to support@ocr-it.com.

OCR Cloud 2.0 API

Input Formats
PDF TIF JPG PNG BMP

Output Formats
JPG, TIF, PDF (image export)
Searchable PDF
  Text under image
  Text over image
PDF/A
PDF Compressed with Custom Security/Metadata/Tags
TXT (standard or Unicode)
DOC / RTF
XLS
XML
HTML
Multiple output streams available

 

<?php

//************************************************

// 1. First, we need to build an XML string to send

// to the API. In the example below, the string is

// built in a very straightforward way.

//************************************************

 

// Let's start by adding the input URL and file type to our string; this is

// the file you want to process

$filename = 'http://www.wisetrend.com/online_ocr/english_photo_bw.tif';

$inputURL = '<InputURL>' . $filename . '</InputURL>';

$inputTYPE = '<InputType>TIF</InputType>';

 

// If you want to build a page where the result will

// be processed, you need to specify the Notify URL.

// Uncomment the next line if you want to use this feature,

// and don't forget to add it to the job string below

//$notifyURL = '<NotifyURL>http://www.example.com/ocrNotify.php</NotifyURL>';

 

// (Optional) The next step is to add cleanup options.

$cleanup = '<CleanupSettings>';

$cleanup .='<Deskew>false</Deskew>';

$cleanup.='<RemoveGarbage>true</RemoveGarbage>';

$cleanup.='<RemoveTexture>false</RemoveTexture>';

$cleanup .='<RotationType>Automatic</RotationType>';

$cleanup .='</CleanupSettings>';

 

// (Optional) After the cleanup options, we specify the

// OCR settings for your file

$settings='<OCRSettings>';

$settings.='<PrintType>Print;Typewriter</PrintType>'; // multiple print types are separated by ';'

$settings.='<OCRLanguage>English;Danish</OCRLanguage>'; // and again, multiple items are separated by ';'

$settings.='<SpeedOCR>true</SpeedOCR>';

$settings.='<AnalysisMode>TextAggressive</AnalysisMode>';

$settings.='<LookForBarcodes>false</LookForBarcodes>';

$settings.='</OCRSettings>';

 

// (Optional) You can also specify the format in which you want to receive

// your processed file

$output='<OutputSettings>';

$output.='<ExportFormat>Text;PDF</ExportFormat>';

$output.='</OutputSettings>';

 

// Now create a job with all the settings you have specified

$job = '<Job>' . $inputURL . $inputTYPE . $cleanup . $settings . $output .'</Job>';

 

 

//****************************************************

// 2. Now that we have made our upload string with all

// the settings we desire, we are going to create an

// xml request using curl.

//****************************************************

 

// First, we are going to define POST URL

$key = 'your_key_here';

define('XML_REQUEST', $job); //this is the actual request to be sent

define('XML_POST_URL', 'http://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey=' . $key ); //this is where we want it to be sent

 

// Then we will initialize handle and set request options

$header[] = 'Content-type: text/xml';

$header[] = 'Connection: close';

 

// Now, we will use defined settings above to create an xml

// request using curl

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, XML_POST_URL);

curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

curl_setopt($ch, CURLOPT_TIMEOUT, 4);

curl_setopt($ch, CURLOPT_POSTFIELDS, XML_REQUEST);

curl_setopt($ch, CURLOPT_HTTPHEADER, $header);

 

// After we have created the request, we will

// execute it and (optionally) time the transaction

$start = microtime(true); // current time in seconds, as a float

$result = curl_exec($ch); //executes our xml request

$stop = microtime(true);

$totalTime = $stop - $start;

 

// It is always good practice to check for errors.

// In my case, I just output them for visual review/debugging

if ( curl_errno($ch) )

{

$result = 'ERROR -> ' . curl_errno($ch) . ': ' . curl_error($ch);

}

else

{

$returnCode = (int)curl_getinfo($ch, CURLINFO_HTTP_CODE); //checks for url errors

switch($returnCode){

case 404:

$result = 'ERROR -> 404 Not Found';

break;

default:

break;

}

}

 

// Once the execution is done close the handle

curl_close($ch);

 

// (Optional) Output the results and time

echo '<br/>';

echo 'Total time for request: ' . $totalTime . "\n";

echo 'Submit Result: ' . $result;

 

 

//*****************************************************

// 3. After we have successfully sent the request over to

// the API, we need to receive a result. Usually, the result

// is available several seconds later, so we need

// to query the API with the job URL we received. Or, you can

// build a webpage that will alert you once the links

// to the newly generated files are ready.

//*****************************************************

 

//Let's start by reading the xml we received as a result

// of executing curl xml request

$dom = new DOMDocument;

$dom->loadXML($result);

$xml = simplexml_import_dom($dom); // we are using simplexml here

$jobURL = $xml->JobURL; // a job URL has been obtained

 

// Now that we have the URL for our pending job,

// we will query it until we receive a status

// indicating that the job is done.

$url = array(); // collects the URI(s) of the generated file(s)

$finished = false;

while(!$finished){

$reader = new XMLReader(); // we are using the XMLReader class

$reader->open($jobURL);

while($reader->read()){ // walk through every element of the status document

if($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'Status'){ // find the Status xml element

$reader->read(); // go to its value

if($reader->value=='Finished'){ // the job is finished, so the URI(s) that follow can be read

$finished = true;

}

}

if($finished && $reader->nodeType == XMLReader::ELEMENT && $reader->name == 'Uri'){ // this is where the link to the new file(s) is located

$reader->read();

$url[] = $reader->value; // store each URI in the array

}

}

$reader->close(); // close the xml reader

if(!$finished){

sleep(3); // wait 3 seconds before doing another query

}

}

// (Optional) Output the URI of files you have requested

// If you requested more or less filetypes, change output accordingly

echo '<br/> JobTXT: ' . $url[0];

echo '<br/> JobPDF: ' . $url[1];

?>

 

URL Example:

HTTP POST to http://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey={your_key}

Message body example (minimum):

<Job>
<InputURL>http://www.wisetrend.com/online_ocr/english_photo_bw.tif</InputURL>
</Job>

Message body example (enhanced):

<Job>
<InputURL>http://www.wisetrend.com/online_ocr/english_photo_bw.tif</InputURL>
<InputType>TIF</InputType>
<NotifyURL>http://www.example.com/ocrNotify.php</NotifyURL>
<OCRSettings>
<OCRLanguage>English</OCRLanguage>
</OCRSettings>
</Job>

Message body example (full):

<Job>
<InputURL>http://www.wisetrend.com/online_ocr/english_photo_bw.tif</InputURL>
<InputType>TIF</InputType>
<CleanupSettings>
<Deskew>true</Deskew>
<RemoveGarbage>true</RemoveGarbage>
<RemoveTexture>true</RemoveTexture>
<RotationType>Automatic</RotationType>
</CleanupSettings>
<OCRSettings>
<PrintType>Print</PrintType>
<OCRLanguage>English</OCRLanguage>
<SpeedOCR>false</SpeedOCR>
<AnalysisMode>MixedDocument</AnalysisMode>
<LookForBarcodes>true</LookForBarcodes>
</OCRSettings>
<OutputSettings>
<ExportFormat>Text</ExportFormat>
</OutputSettings>
</Job>
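
The message bodies above can also be generated programmatically instead of by string concatenation. The following is only a minimal sketch in Python: the function name `build_job` and its parameters are made up for this illustration, it covers just a subset of the settings, and it assumes the element order shown in the examples above is acceptable to the service.

```python
from xml.etree import ElementTree as ET


def build_job(input_url, input_type=None, languages=(), export_formats=(),
              notify_url=None):
    """Return a <Job> request body as a string (hypothetical helper)."""
    job = ET.Element("Job")
    # strip() removes stray whitespace around the URL
    ET.SubElement(job, "InputURL").text = input_url.strip()
    if input_type:
        ET.SubElement(job, "InputType").text = input_type
    if notify_url:
        ET.SubElement(job, "NotifyURL").text = notify_url
    if languages:
        ocr = ET.SubElement(job, "OCRSettings")
        # Multiple values are joined with ';', as in the PHP sample
        ET.SubElement(ocr, "OCRLanguage").text = ";".join(languages)
    if export_formats:
        out = ET.SubElement(job, "OutputSettings")
        ET.SubElement(out, "ExportFormat").text = ";".join(export_formats)
    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    print(build_job("http://www.wisetrend.com/online_ocr/english_photo_bw.tif",
                    input_type="TIF",
                    languages=["English", "Danish"],
                    export_formats=["Text", "PDF"]))
```

Building the body with an XML library escapes special characters in the URL (such as `&`) automatically, which plain string concatenation does not.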

Response example:

<JobStatus>
<JobURL>http://webocr.wisetrend.com/ocr/getStatus/123ABCResponseExample123ABC</JobURL>
<Status>Submitted</Status>
</JobStatus>
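
A response like the one above can be read with any XML parser. The sketch below uses Python's standard library; the function name `parse_job_status` is hypothetical, and the `Download/File/OutputType/Uri` layout of a finished job is inferred from the C# and Python samples on this page rather than from a formal schema.

```python
from xml.etree import ElementTree as ET


def parse_job_status(xml_text):
    """Return (status, job_url, downloads) from a <JobStatus> document.

    downloads maps each OutputType to its Uri and is empty until the
    job reports the status "Finished".
    """
    root = ET.fromstring(xml_text)
    status = root.findtext("Status")
    job_url = root.findtext("JobURL")
    downloads = {}
    # Assumed layout: <Download><File><OutputType/><Uri/></File>...</Download>
    for file_element in root.findall("Download/File"):
        downloads[file_element.findtext("OutputType")] = file_element.findtext("Uri")
    return status, job_url, downloads


if __name__ == "__main__":
    response = (
        "<JobStatus>"
        "<JobURL>http://webocr.wisetrend.com/ocr/getStatus/123ABCResponseExample123ABC</JobURL>"
        "<Status>Submitted</Status>"
        "</JobStatus>"
    )
    print(parse_job_status(response))
```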

 

using System;

using System.Collections.Generic;

using System.Linq;

using System.Xml.Linq;

using System.Text;

using System.Net;

using System.Threading;

 

namespace OCRSampleClient

{

class Program

{

static void Main(string[] args)

{

string imageUrl = "http://www.wisetrend.com/online_ocr/english_photo_bw.tif";

string key = "your_key_here";

 

XElement xe = new XElement("Job",

new XElement("InputURL", imageUrl)

);

 

WebClient wc = new WebClient();

wc.Headers[HttpRequestHeader.ContentType] = "text/xml";

ServicePointManager.Expect100Continue = false;

 

string results = wc.UploadString(

"https://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey=" + key,

"POST",

xe.ToString()

);

 

string textURL = null;

 

//Use "dumb polling" every 2 seconds in this simplified example, until done.

//In a real application, notification with NotifyURL should be used instead.

while (true)

{

XElement resultsXml = XElement.Parse(results);

string jobURL = resultsXml.Element("JobURL").Value;

string status = resultsXml.Element("Status").Value;

 

if (status == "Finished")

{

textURL = (

from elem in resultsXml.Element("Download").Elements("File")

where elem.Element("OutputType").Value == "TXT"

select elem.Element("Uri").Value

).First();

break;

}

 

//Exit if there's an error

if ((status != "Submitted") && (status != "Processing")) break;

 

Thread.Sleep(2000); // 2 seconds

results = wc.DownloadString(jobURL);

}

 

if (textURL == null)

{

Console.WriteLine("An error has occurred");

}

else

{

string text = wc.DownloadString(textURL);

Console.WriteLine(text);

}

Console.WriteLine("PRESS ENTER TO EXIT");

Console.ReadLine();

}

}

}

 

"""
Use WiseTREND OCR cloud from Python.
Example:
import wisetrend
import time
joburl = wisetrend.submit("API-KEY", "http://url/to/the/document")
while True:
    time.sleep(5)
    status, downloads = wisetrend.getstatus(joburl)
    print status
    if status == "Finished":
        for outputtype, url in downloads.iteritems():
            print outputtype, url
        break
Credits:
Copyright (c) 2011 Bernt R. Brenna
Licensed under the MIT license: http://www.opensource.org/licenses/mit-license.php
"""
import urllib2
from xml.etree import ElementTree as E
class SubmitError(Exception):
    pass
URL = "http://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey="
def submit(key, inputurl):
    """Returns the job url"""
    job_element = E.Element("Job")
    E.SubElement(job_element, "InputURL").text = inputurl
    request = urllib2.Request(URL + key, data=E.tostring(job_element), headers={
        "Content-Type": "text/xml"
        })
    try:
        response = urllib2.urlopen(request)
    except urllib2.HTTPError as e:
        raise SubmitError(e.read())
    else:
        jobstatus = E.fromstring(response.read())
        if jobstatus.find("Status").text == "Submitted":
            return jobstatus.find("JobURL").text
        else:
            raise SubmitError(E.tostring(jobstatus))
def getstatus(joburl):
    """Returns a tuple (Status, {"OutputType": downloaduri})
If the status is not Finished, returns (Status, {})
"""
    jobstatus = E.fromstring(urllib2.urlopen(joburl).read())
    if jobstatus.find("Status").text == "Finished":
        return "Finished", dict((file_element.find("OutputType").text, file_element.find("Uri").text)
                                for file_element in jobstatus.findall("Download/File"))
    else:
        return jobstatus.find("Status").text, {}
if __name__ == "__main__":
    from optparse import OptionParser
    parser = OptionParser(usage="\n1. %prog submit --key secretkey --inputurl inputurl\n2. %prog getstatus --joburl joburl")
    parser.add_option("--key")
    parser.add_option("--inputurl")
    parser.add_option("--joburl")
    opts, args = parser.parse_args()
    if len(args) != 1:
        parser.error("Specify submit or getstatus")
    if args[0] == "submit":
        joburl = submit(opts.key, opts.inputurl)
        print "Job URL:", joburl
    if args[0] == "getstatus":
        status, downloads = getstatus(opts.joburl)
        print "Status:", status
        for outputtype, url in downloads.iteritems():
            print outputtype, url
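
The polling example in the module docstring above loops until the job is finished, so it never returns if a job ends in any other state. A bounded variant could look like the sketch below (written for Python 3). The names `wait_for_job` and `JobFailed` are hypothetical; `getstatus` is passed in as a callable with the same `(status, downloads)` return shape as the module's `getstatus`; and, following the C# sample, any status other than "Submitted" or "Processing" is treated as an error.

```python
import time


class JobFailed(Exception):
    """Raised when the job reports a status that is not a pending one."""


def wait_for_job(joburl, getstatus, interval=5.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll getstatus(joburl) until the job finishes, fails or times out.

    Returns the downloads dict of the finished job.
    """
    deadline = clock() + timeout
    while True:
        status, downloads = getstatus(joburl)
        if status == "Finished":
            return downloads
        # Same rule as the C# sample: anything else is an error
        if status not in ("Submitted", "Processing"):
            raise JobFailed(status)
        if clock() >= deadline:
            raise TimeoutError("job not finished after %s seconds" % timeout)
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting. In a real application, notification with NotifyURL is preferable to polling.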

 

// *******************************************************

//       Contributor: Vinothkumar Arputharaj

//       Contact: vino4all@gmail.com

// *******************************************************

import org.apache.http.HttpHost;

import org.apache.http.HttpResponse;

import org.apache.http.client.methods.HttpGet;

import org.apache.http.client.methods.HttpPost;

import org.apache.http.client.params.ClientPNames;

import org.apache.http.client.params.CookiePolicy;

import org.apache.http.conn.params.ConnRoutePNames;

import org.apache.http.entity.StringEntity;

import org.apache.http.impl.client.DefaultHttpClient;

import org.apache.http.params.BasicHttpParams;

import org.apache.http.params.HttpConnectionParams;

import org.apache.http.params.HttpParams;

import org.apache.http.util.EntityUtils;

 

 

public class OCRRestClient {

 

private static int proxyPort = 80;                  // Assign your Proxy Port here

private static String proxyHost="myproxyhost.com";  // Assign your Proxy Host here

 

 

public static void main(String[] args) {

 

HttpParams myParams = new BasicHttpParams();// variable to add additional parameters to the httpClient

 

// The next two lines are necessary to avoid "Connection Timeout" Exception

HttpConnectionParams.setConnectionTimeout(myParams, 10000);

HttpConnectionParams.setSoTimeout(myParams, 10000);

 

// Create a client object

DefaultHttpClient httpClient = new DefaultHttpClient(myParams);

 

HttpHost proxy = new HttpHost(proxyHost, proxyPort);                                                                     // For connections that need proxy

 

httpClient.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY,proxy);// Set proxy to the http client

 

try {

String ret = null;        // Actual response as string

httpClient.getParams().setParameter(ClientPNames.COOKIE_POLICY,CookiePolicy.RFC_2109);

String key = "your key";  // Enter your secret key obtained from WiseTrend

 

// Image URL. This needs to be a "http" / "https" url.

String imageURL = "http://www.wisetrend.com/online_ocr/english_photo_bw.tif";

 

HttpResponse response  = null;    // Response object

 

HttpPost httppost = new HttpPost("http://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey="+key);

 

// Set the content type of the request

httppost.setHeader("Content-Type","text/xml");

 

// Set the request parameter as Entity. See the documentation for detailed request entity.

httppost.setEntity(new StringEntity("<Job>" +

"<InputURL>"+imageURL+"</InputURL>"+

"</Job>"));

 

// Execute the post request

response = httpClient.execute(httppost );

 

System.out.println(response.toString()); // Printing the response

 

if (response != null) {

ret = EntityUtils.toString(response.getEntity());

 

System.out.println("Response: "+ret); // Printing the response Entity

 

// Create a StaXParser object to parse the response string (which is in XML format)

StaXParser stp = new StaXParser();

 

stp.readXMLStream(ret.toString());         // Parse the response string

 

System.out.println("Final status : "+StaXParser.STATUS); // Printing the status

 

if(StaXParser.STATUS.equalsIgnoreCase("Submitted"))

{

// Send GET request to get the URI of the output file

HttpGet httpget = new HttpGet(StaXParser.JOBURL);

 

System.out.println("Sending GET Request");

while(true)

{

HttpResponse response1 = httpClient.execute(httpget);

 

if(response1 != null)

{

ret = EntityUtils.toString(response1.getEntity());

 

System.out.println("RES 1 : "+ret);

// Printing response for GET request

 

StaXParser stp1 = new StaXParser();

stp1.readXMLStream(ret);

// Parse the response from GET request

 

System.out.println("GET Job status : "+StaXParser.STATUS);                                                                    //Printing Job Status

 

// Rest of the code is self explanatory

if(StaXParser.STATUS.equalsIgnoreCase("Finished"))

{

String tempFileURI = null;

 

for(int p=0; p<StaXParser.URIARRAY.length; p++)

{

if(StaXParser.FILETYPEARRAY[p] != null && StaXParser.FILETYPEARRAY[p].equalsIgnoreCase("TXT"))

{

System.out.println(StaXParser.URIARRAY[p]+"\n"+StaXParser.FILETYPEARRAY[p]);

// Printing the output "Txt" file path & type

tempFileURI = StaXParser.URIARRAY[p];

}

}

System.out.println("**********************************************");

ReadTextFile.readFile(tempFileURI);

System.out.println("**********************************************");

System.exit(0); // exit code 0 signals successful completion

 

}

}

Thread.sleep(2000); // wait 2 seconds before polling again

}

}

}

}catch(Exception e){

e.printStackTrace();

}

}

 

}

 

Security and privacy are among our top priorities. The Terms of Use and Privacy Policy provide additional legal information. On the technology level, our platform is designed with multiple security layers in mind. Some of them are:

  • Upon subscribing to online services, each developer receives an auto-generated long alpha-numeric value as their account ID. That ID is used for submissions instead of any developer-identifying private information.
  • Trusted secure connection (HTTPS) is available with every subscription. Each developer can choose between HTTP and HTTPS transmission in their Subscription Portal.
  • Any submitted document is auto-encoded with a long alpha-numeric randomly generated value.
    For example: 8d4227c3eb5d4744bc0ba7b8a67bfe37.PDF
  • Any submitted job request is auto-encoded with a long alpha-numeric randomly generated value (different from the document name). Job ID does not carry any external information about owner or submitter.
    For example: http://api.ocr-it.com/ocr/v2/getStatus/b727fd6e7a4241ebb1bbc01e74a4a135
  • The entire system is automated and operates without human operator involvement. A small group of highly controlled internal developers has access to different separate parts of the system relevant to that developer’s activities.
  • The platform has auto-cleanup features. By default, an image and its conversion result are available on the server for seven (7) days from the conversion date. After the 7 days expire, all images and results associated with that job request are automatically deleted. The job ID and the valid URL to that job remain in the internal database in case they need to be referenced in the future.
  • Immediate on-demand auto-cleanup. Using the API, developers may choose to remove the results of their conversion before the seven (7) day default expiration period. An API call deletes images and conversion results the instant it is made, for example as soon as the result has finished downloading. This capability allows storing your sensitive data for only seconds at a time, as necessary for processing.

If you have other questions, concerns or ideas about the security of the OCR Cloud 2.0 API, please contact our support team at support@ocr-it.com.