FAQ
Don’t see an answer to your question? Send your questions to support@ocr-it.com.
OCR Cloud 2.0 API
Input Formats
PDF, TIF, JPG, PNG, BMP
Output Formats
JPG, TIF, PDF (image export)
Searchable PDF
Text under image
Text over image
PDF/A
PDF Compressed with Custom Security/Metadata/Tags
TXT (standard or Unicode)
DOC / RTF
XLS
XML
HTML
Multiple output streams available
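As a rough illustration of the input formats listed above, the following Python sketch guesses an InputType value from a document URL. Only the value TIF appears in the examples on this page; the other values in the table, and the helper name guess_input_type, are assumptions made for illustration.

```python
import os
from urllib.parse import urlparse

# File extension -> InputType value. Only "TIF" is confirmed by the examples
# on this page; the remaining values are assumed to follow the same pattern.
INPUT_TYPES = {
    ".pdf": "PDF",
    ".tif": "TIF",
    ".tiff": "TIF",
    ".jpg": "JPG",
    ".jpeg": "JPG",
    ".png": "PNG",
    ".bmp": "BMP",
}


def guess_input_type(url):
    """Return the assumed InputType for a document URL, or None if unsupported."""
    path = urlparse(url.strip()).path  # ignore query strings and stray spaces
    extension = os.path.splitext(path)[1].lower()
    return INPUT_TYPES.get(extension)
```

A caller could use the returned value to fill the optional InputType element, and skip the element when the helper returns None.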
<?php
//************************************************
// 1. First, we need to build an XML string to send
// to the API. In the example below, the string is
// built in a very straightforward way.
//************************************************
// Let's start by adding the input URL and file type to our string.
// The URL points to the file you want to process.
$filename = 'http://www.wisetrend.com/online_ocr/english_photo_bw.tif';
$inputURL = '<InputURL>' . $filename . '</InputURL>';
$inputTYPE = '<InputType>TIF</InputType>';
// If you want the result to be delivered to a page of your own
// for processing, you need to specify the Notify URL.
// Uncomment the next line if you want to use this feature,
// and don't forget to add it to the job string further down.
//$notifyURL = '<NotifyURL>http://www.example.com/ocrNotify.php</NotifyURL>';
// (Optional) The next step is to add cleanup options.
$cleanup = '<CleanupSettings>';
$cleanup .='<Deskew>false</Deskew>';
$cleanup.='<RemoveGarbage>true</RemoveGarbage>';
$cleanup.='<RemoveTexture>false</RemoveTexture>';
$cleanup .='<RotationType>Automatic</RotationType>';
$cleanup .='</CleanupSettings>';
// (Optional) After the cleanup options, we specify the
// OCR settings for your file
$settings='<OCRSettings>';
$settings.='<PrintType>Print;Typewriter</PrintType>'; // multiple print types are separated by ';'
$settings.='<OCRLanguage>English;Danish</OCRLanguage>'; // again, multiple items are separated by ';'
$settings.='<SpeedOCR>true</SpeedOCR>';
$settings.='<AnalysisMode>TextAggressive</AnalysisMode>';
$settings.='<LookForBarcodes>false</LookForBarcodes>';
$settings.='</OCRSettings>';
// (Optional) You can also specify the format(s) in which you want to receive
// your processed file
$output='<OutputSettings>';
$output.='<ExportFormat>Text;PDF</ExportFormat>';
$output.='</OutputSettings>';
// Now create a job with all the settings you have specified
$job = '<Job>' . $inputURL . $inputTYPE . $cleanup . $settings . $output .'</Job>';
//****************************************************
// 2. Now that we have made our upload string with all
// the settings we desire, we are going to create an
// xml request using curl.
//****************************************************
// First, we are going to define POST URL
$key = 'your_key_here';
define('XML_REQUEST', $job); //this is the actual request to be sent
define('XML_POST_URL', 'http://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey=' . $key ); //this is where we want it to be sent
// Then we will initialize handle and set request options
$header[] = 'Content-type: text/xml';
$header[] = 'Connection: close';
// Now, we will use defined settings above to create an xml
// request using curl
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, XML_POST_URL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 4);
curl_setopt($ch, CURLOPT_POSTFIELDS, XML_REQUEST);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
// After we have created the request we will
// execute it (Optional) and also time the transaction
$start = array_sum(explode(' ', microtime()));
$result = curl_exec($ch); //executes our xml request
$stop = array_sum(explode(' ', microtime()));
$totalTime = $stop - $start;
// It is always a good practice to check for errors
// In my case, I just output them for visual review/debugging
if ( curl_errno($ch) )
{
$result = 'ERROR -> ' . curl_errno($ch) . ': ' . curl_error($ch);
}
else
{
$returnCode = (int)curl_getinfo($ch, CURLINFO_HTTP_CODE); //checks for url errors
switch($returnCode){
case 404:
$result = 'ERROR -> 404 Not Found';
break;
default:
break;
}
}
// Once the execution is done close the handle
curl_close($ch);
// (Optional) Output the results and time
echo '<br/>';
echo 'Total time for request: ' . $totalTime . "\n";
echo 'Submit Result: ' . $result;
//*****************************************************
// 3. After we have successfully sent the request to
// the API, we need to receive a result. Usually, the result
// is available several seconds later, so we need
// to query the API with the received job URL. Alternatively, you can
// build a web page that will alert you once the links
// to the newly generated files are ready.
//*****************************************************
//Let's start by reading the xml we received as a result
// of executing curl xml request
$dom = new DOMDocument;
$dom->loadXML($result);
$xml = simplexml_import_dom($dom); // we are using simplexml here
$jobURL = $xml->JobURL; // a job URL has been obtained
// Now that we have the URL of our pending job,
// we will query it until we receive a status
// indicating that the job is done.
$i = 0;
$url = array();
$finished = false;
while (!$finished) {
$reader = new XMLReader(); // we are using XMLReader here
$reader->open($jobURL);
while ($reader->read()) { // walk through every node of the status document
if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'Status') { // find the Status element
$reader->read(); // go to its value
if ($reader->value == 'Finished') {
$finished = true; // the job is finished, so the Uri elements that follow can be read
}
}
if ($finished && $reader->nodeType == XMLReader::ELEMENT && $reader->name == 'Uri') { // this is where the link to the new file(s) is located
$reader->read();
$url[$i] = $reader->value; // store each URI in the array
$i++;
}
}
$reader->close(); // close the xml reader after each query
if (!$finished) {
sleep(3); // wait 3 seconds before doing another query
}
}
// (Optional) Output the URI of files you have requested
// If you requested more or less filetypes, change output accordingly
echo '<br/> JobTXT: ' . $url[0];
echo '<br/> JobPDF: ' . $url[1];
?>
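The polling in step 3 above runs until the job reports Finished, so it never stops if the job fails or stalls. The Python sketch below shows one way to bound the wait. The status names Submitted, Processing, and Finished come from the samples on this page; the function name wait_for_job, the JobFailed exception, and the default interval and timeout are assumptions chosen for illustration, and fetch_status stands for any callable that returns a (status, downloads) pair.

```python
import time

# Status values seen in the samples on this page.
IN_PROGRESS = {"Submitted", "Processing"}


class JobFailed(Exception):
    """Raised when the service reports a status other than the known ones."""


def wait_for_job(fetch_status, interval=3.0, timeout=120.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until the job finishes, fails, or times out.

    fetch_status must return a (status, downloads) tuple.
    Returns the downloads mapping once the status is "Finished".
    """
    deadline = clock() + timeout
    while True:
        status, downloads = fetch_status()
        if status == "Finished":
            return downloads
        if status not in IN_PROGRESS:
            # Any unknown status is treated as an error, as in the C# sample.
            raise JobFailed(status)
        if clock() >= deadline:
            raise TimeoutError("job did not finish within %s seconds" % timeout)
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting; production callers can leave them at their defaults.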
URL Example:
HTTP POST to http://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey={your_key}
Message body example (minimum):
<Job>
<InputURL>http://www.wisetrend.com/online_ocr/english_photo_bw.tif</InputURL>
</Job>
Message body example (enhanced):
<Job>
<InputURL>http://www.wisetrend.com/online_ocr/english_photo_bw.tif</InputURL>
<InputType>TIF</InputType>
<NotifyURL>http://www.example.com/ocrNotify.php</NotifyURL>
<OCRSettings>
<OCRLanguage>English</OCRLanguage>
</OCRSettings>
</Job>
Message body example (full):
<Job>
<InputURL>http://www.wisetrend.com/online_ocr/english_photo_bw.tif</InputURL>
<InputType>TIF</InputType>
<CleanupSettings>
<Deskew>true</Deskew>
<RemoveGarbage>true</RemoveGarbage>
<RemoveTexture>true</RemoveTexture>
<RotationType>Automatic</RotationType>
</CleanupSettings>
<OCRSettings>
<PrintType>Print</PrintType>
<OCRLanguage>English</OCRLanguage>
<SpeedOCR>false</SpeedOCR>
<AnalysisMode>MixedDocument</AnalysisMode>
<LookForBarcodes>true</LookForBarcodes>
</OCRSettings>
<OutputSettings>
<ExportFormat>Text</ExportFormat>
</OutputSettings>
</Job>
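The message bodies above can also be generated rather than concatenated by hand. The following Python sketch builds a Job document from optional settings, using the element names and the ';' separator shown in the examples on this page. The function name build_job and its parameters are hypothetical, and the element order simply follows the examples, since this page does not say whether order matters.

```python
from xml.etree import ElementTree as E


def _format(value):
    """Render a Python value the way the examples write setting values."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (list, tuple)):
        return ";".join(value)  # multiple items are separated by ';'
    return str(value)


def build_job(input_url, input_type=None, notify_url=None,
              cleanup=None, ocr=None, output=None):
    """Return the Job XML as a string; optional parts are omitted when empty."""
    job = E.Element("Job")
    E.SubElement(job, "InputURL").text = input_url.strip()
    if input_type:
        E.SubElement(job, "InputType").text = input_type
    if notify_url:
        E.SubElement(job, "NotifyURL").text = notify_url
    sections = (("CleanupSettings", cleanup),
                ("OCRSettings", ocr),
                ("OutputSettings", output))
    for tag, settings in sections:
        if settings:
            section = E.SubElement(job, tag)
            for name, value in settings.items():
                E.SubElement(section, name).text = _format(value)
    return E.tostring(job, encoding="unicode")
```

Building the document with an XML library also escapes characters such as '&' in URLs, which plain string concatenation does not.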
Response example:
<JobStatus>
<JobURL>http://webocr.wisetrend.com/ocr/getStatus/123ABCResponseExample123ABC</JobURL>
<Status>Submitted</Status>
</JobStatus>
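A response like the one above can be read with a few lines of Python. This sketch extracts the job URL and status, and, for finished jobs, the download links. The Download/File/OutputType/Uri layout is taken from the client samples on this page rather than from a full response example, so treat it as an assumption; the function name parse_job_status is hypothetical.

```python
from xml.etree import ElementTree as E


def parse_job_status(xml_text):
    """Return (job_url, status, downloads) from a JobStatus document.

    downloads maps each OutputType to its Uri and is empty until the
    job is finished.
    """
    root = E.fromstring(xml_text)
    job_url = root.findtext("JobURL")
    status = root.findtext("Status")
    downloads = {
        file_element.findtext("OutputType"): file_element.findtext("Uri")
        for file_element in root.findall("Download/File")
    }
    return job_url, status, downloads
```

For the response example above, the function yields the JobURL, the status Submitted, and an empty downloads mapping.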
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Text;
using System.Net;
using System.Threading;
namespace OCRSampleClient
{
class Program
{
static void Main(string[] args)
{
string imageUrl = "http://www.wisetrend.com/online_ocr/english_photo_bw.tif";
string key = "your_key_here";
XElement xe = new XElement("Job",
new XElement("InputURL", imageUrl)
);
WebClient wc = new WebClient();
wc.Headers[HttpRequestHeader.ContentType] = "text/xml";
ServicePointManager.Expect100Continue = false;
string results = wc.UploadString(
"https://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey=" + key,
"POST",
xe.ToString()
);
string textURL = null;
//Use "dumb polling" every 2 seconds in this simplified example, until done.
//In a real application, notification with NotifyURL should be used instead.
while (true)
{
XElement resultsXml = XElement.Parse(results);
string jobURL = resultsXml.Element("JobURL").Value;
string status = resultsXml.Element("Status").Value;
if (status == "Finished")
{
textURL = (
from elem in resultsXml.Element("Download").Elements("File")
where elem.Element("OutputType").Value == "TXT"
select elem.Element("Uri").Value
).First();
break;
}
//Exit if there's an error
if ((status != "Submitted") && (status != "Processing")) break;
Thread.Sleep(2000); // 2 seconds
results = wc.DownloadString(jobURL);
}
if (textURL == null)
{
Console.WriteLine("An error has occurred");
}
else
{
string text = wc.DownloadString(textURL);
Console.WriteLine(text);
}
Console.WriteLine("PRESS ENTER TO EXIT");
Console.ReadLine();
}
}
}
"""Use WiseTREND OCR cloud from Python.Example:import wisetrendimport timejoburl = wisetrend.submit("API-KEY", "http://url/to/the/document")while True:time.sleep(5)status, downloads = wisetrend.getstatus(joburl)print statusif status == "Finished":for outputtype, url in downloads.iteritems():print outputtype, urlbreakCredits:Copyright (c) 2011 Bernt R. BrennaLicensed under the MIT license: http://www.opensource.org/licenses/mit-license.php"""import urllib2from xml.etree import ElementTree as Eclass SubmitError(Exception):passURL = "http://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey="def submit(key, inputurl):"""Returns the job url"""job_element = E.Element("Job")E.SubElement(job_element, "InputURL").text = inputurlrequest = urllib2.Request(URL + key, data=E.tostring(job_element), headers={"Content-Type": "text/xml"})try:response = urllib2.urlopen(request)except urllib2.HTTPError as e:raise SubmitError(e.read())else:jobstatus = E.fromstring(response.read())if jobstatus.find("Status").text == "Submitted":return jobstatus.find("JobURL").textelse:raise SubmitError(E.tostring(jobstatus))def getstatus(joburl):"""Returns a tuple (Status, {"OutputType": downloaduri})If the status is not Finished, returns (Status, {})"""jobstatus = E.fromstring(urllib2.urlopen(joburl).read())if jobstatus.find("Status").text == "Finished":return "Finished", dict((file_element.find("OutputType").text, file_element.find("Uri").text)for file_element in jobstatus.findall("Download/File"))else:return jobstatus.find("Status").text, {}if __name__ == "__main__":from optparse import OptionParserparser = OptionParser(usage="\n1. %prog submit --key secretkey --inputurl inputurl\n2. 
%prog getstatus --joburl joburl")parser.add_option("--key")parser.add_option("--inputurl")parser.add_option("--joburl")opts, args = parser.parse_args()if len(args) != 1:parser.error("Specify submit or getstatus")if args[0] == "submit":joburl = submit(opts.key, opts.inputurl)print "Job URL:", joburlif args[0] == "getstatus":status, downloads = getstatus(opts.joburl)print "Status:", statusfor outputtype, url in downloads.iteritems():print outputtype, url
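The module above is written for Python 2 (urllib2, the print statement, iteritems). For readers on Python 3, the submit step might look roughly like the sketch below. It is an adaptation under stated assumptions, not part of the contributed module: the names SUBMIT_URL and build_submit_request are introduced here, and the request is built in a separate function so it can be inspected without contacting the service.

```python
import urllib.request
from xml.etree import ElementTree as E

SUBMIT_URL = "http://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey="


def build_submit_request(key, inputurl):
    """Build the POST request that submits a minimal Job document."""
    job = E.Element("Job")
    E.SubElement(job, "InputURL").text = inputurl
    return urllib.request.Request(
        SUBMIT_URL + key,
        data=E.tostring(job),  # bytes in Python 3, which makes this a POST
        headers={"Content-Type": "text/xml"},
    )


def submit(key, inputurl):
    """Send the job and return its job URL, as the Python 2 module does."""
    with urllib.request.urlopen(build_submit_request(key, inputurl)) as response:
        jobstatus = E.fromstring(response.read())
    if jobstatus.findtext("Status") != "Submitted":
        raise RuntimeError(E.tostring(jobstatus, encoding="unicode"))
    return jobstatus.findtext("JobURL")
```

Calling submit("API-KEY", "http://url/to/the/document") would then return the job URL to poll, mirroring the usage shown in the module docstring.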
// *******************************************************
// Contributor: Vinothkumar Arputharaj
// Contact: vino4all@gmail.com
// *******************************************************
import org.apache.http.HttpHost;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.client.params.ClientPNames;
import org.apache.http.client.params.CookiePolicy;
import org.apache.http.conn.params.ConnRoutePNames;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.BasicHttpParams;
import org.apache.http.params.HttpConnectionParams;
import org.apache.http.params.HttpParams;
import org.apache.http.util.EntityUtils;
public class OCRRestClient {
private static int proxyPort = 80; // Assign your Proxy Port here
private static String proxyHost="myproxyhost.com"; // Assign your Proxy Host here
public static void main(String[] args) {
HttpParams myParams = new BasicHttpParams();// variable to add additional parameters to the httpClient
// The next two lines are necessary to avoid "Connection Timeout" Exception
HttpConnectionParams.setConnectionTimeout(myParams, 10000);
HttpConnectionParams.setSoTimeout(myParams, 10000);
// Create a client object
DefaultHttpClient httpClient = new DefaultHttpClient(myParams);
HttpHost proxy = new HttpHost(proxyHost, proxyPort); // For connections that need proxy
httpClient.getParams().setParameter(ConnRoutePNames.DEFAULT_PROXY,proxy);// Set proxy to the http client
try {
String ret = null; // Actual response as string
httpClient.getParams().setParameter(ClientPNames.COOKIE_POLICY,CookiePolicy.RFC_2109);
String key = "your key"; // Enter your secret key obtained from WiseTrend
// Image URL. This needs to be a "http" / "https" url.
String imageURL = "http://www.wisetrend.com/online_ocr/english_photo_bw.tif";
HttpResponse response = null; // Response object
HttpPost httppost = new HttpPost("http://svc.webservius.com/v1/wisetrend/wiseocr/submit?wsvKey="+key);
// Set the content type of the request
httppost.setHeader("Content-Type","text/xml");
// Set the request parameter as Entity. See the documentation for detailed request entity.
httppost.setEntity(new StringEntity("<Job>" +
"<InputURL>"+imageURL+"</InputURL>"+
"</Job>"));
// Execute the post request
response = httpClient.execute(httppost );
System.out.println(response.toString()); // Printing the response
if (response != null) {
ret = EntityUtils.toString(response.getEntity());
System.out.println("Response: "+ret); // Printing the response Entity
// Create a StaXParser object to parse the response string (which is in xml format)
StaXParser stp = new StaXParser();
stp.readXMLStream(ret.toString()); // Parse the response string
System.out.println("Final status : "+StaXParser.STATUS); // Printing the status
if(StaXParser.STATUS.equalsIgnoreCase("Submitted"))
{
// Send GET request to get the URI of the output file
HttpGet httpget = new HttpGet(StaXParser.JOBURL);
System.out.println("Sending GET Request");
while(true)
{
HttpResponse response1 = httpClient.execute(httpget);
if(response1 != null)
{
ret = EntityUtils.toString(response1.getEntity());
System.out.println("RES 1 : "+ret);
// Printing response for GET request
StaXParser stp1 = new StaXParser();
stp1.readXMLStream(ret);
// Parse the response from GET request
System.out.println("GET Job status : "+StaXParser.STATUS); //Printing Job Status
// Rest of the code is self explanatory
if(StaXParser.STATUS.equalsIgnoreCase("Finished"))
{
String tempFileURI = null;
for(int p=0; p<StaXParser.URIARRAY.length; p++)
{
if(StaXParser.FILETYPEARRAY[p] != null && StaXParser.FILETYPEARRAY[p].equalsIgnoreCase("TXT"))
{
System.out.println(StaXParser.URIARRAY[p]+"\n"+StaXParser.FILETYPEARRAY[p]);
// Printing the output "Txt" file path & type
tempFileURI = StaXParser.URIARRAY[p];
}
}
System.out.println("**********************************************");
ReadTextFile.readFile(tempFileURI);
System.out.println("**********************************************");
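System.exit(0); // exit with a success code once the text has been printed
}
Thread.sleep(2000); // wait 2 seconds before polling again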
}
}
}
}
}catch(Exception e){
e.printStackTrace();
}
}
}
Security and privacy are among our top priorities. Terms of Use and Privacy Policy provide additional legal information. On the technology level, our platform is designed with multiple security layers in mind. Some of them are:
- Upon subscribing to online services, each developer receives an auto-generated long alpha-numeric value as their account ID. That ID is used for submissions instead of any developer-identifying private information.
- A trusted secure connection (HTTPS) is available with every subscription. Each developer can choose between HTTP and HTTPS transmission in their Subscription Portal.
- Any submitted document is auto-encoded with a long alpha-numeric randomly generated value.
For example: 8d4227c3eb5d4744bc0ba7b8a67bfe37.PDF
- Any submitted job request is auto-encoded with a long alpha-numeric randomly generated value (different from the document name). Job ID does not carry any external information about owner or submitter.
For example: http://api.ocr-it.com/ocr/v2/getStatus/b727fd6e7a4241ebb1bbc01e74a4a135
- The entire system is automated and operates without human operator involvement. A small group of highly controlled internal developers has access to different separate parts of the system relevant to that developer’s activities.
- The platform has auto-cleanup features. By default, an image and its conversion result remain available on the server for seven (7) days from the conversion date. After those 7 days, all images and results associated with that job request are deleted automatically. The job ID and the valid URL to that job remain in the internal database in case they need to be referenced in the future.
- Immediate on-demand auto-cleanup. Using the API, developers may remove the results of their conversion before the seven (7) day default expiration period. An API call deletes images and conversion results as soon as the call is made, for example right after the result has finished downloading. This capability allows your sensitive data to be stored only for seconds at a time, as necessary for processing.
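Because results are removed seven days after conversion by default, a client may want to record when its download links stop working. The Python sketch below does that bookkeeping on the client side. The names RETENTION, expires_at, and is_expired are hypothetical, and the sketch assumes the client records the conversion time itself, since this page does not say whether the API reports it.

```python
from datetime import datetime, timedelta, timezone

# Default retention period stated above: seven days from the conversion date.
RETENTION = timedelta(days=7)


def expires_at(converted_at):
    """Return the moment after which results are expected to be deleted."""
    return converted_at + RETENTION


def is_expired(converted_at, now=None):
    """Return True once the default retention period has passed."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expires_at(converted_at)
```

Results removed earlier through the on-demand cleanup call are not covered by this check.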
If there are other questions, concerns or ideas about security of OCR Cloud 2.0 API, please contact our support team at support@ocr-it.com.
