[Promotion-technology] Fw: Google and open source OCR, by T.V. Raman

Robert Jaquiss rjaquiss at earthlink.net
Thu Jun 21 23:31:47 CDT 2007


Hello Colleagues:

     I thought this would be of interest. I presume that, since Emacspeak is 
mentioned, OCRopus runs on a Unix-type platform.


Regards,

Robert

----- Original Message ----- 
From: "BlindNews Mailing List" <blindnews at blindprogramming.com>
To: <BlindNews at BlindProgramming.com>
Sent: Thursday, June 21, 2007 8:54 PM
Subject: Google and open source OCR, by T.V. Raman


> The Official Google Blog
> Thursday, June 21, 2007
>
> Google and open source OCR, by T.V. Raman
>
> By T.V. Raman, Research Scientist
>
> From time to time, our own T.V. Raman shares his tips on how to use Google 
> from his perspective as a technologist who cannot see -- tips that sighted 
> people, among others, may also find useful. - Ed.
>
> As someone who cannot see, I prefer to live in a mostly paperless world. 
> This means ruthlessly turning every piece of paper that enters my life 
> into a set of bits that I can process digitally. I scan in everything. 
> Until now, I have relied on commercial OCR packages to convert these 
> images into readable text. OCR is perhaps one of the areas where the 
> benefits of Moore's Law are most evident; today, OCR can do remarkably 
> well when handed a page image. Until now, my only dissatisfaction with the 
> status quo in this area has been that commercial OCR engines afford me 
> little flexibility with respect to training them to do better on documents 
> that are specific to me.
>
> The advent of our own open source OCR initiative, OCRopus (source code: 
> Ocropus Sources) is a welcome change in this regard.
>
> LINK:
> http://code.google.com/p/ocropus/
>
> I introduced support for OCRopus in Emacspeak recently, and the HTML 
> output this produces compares favorably with output from commercial OCR 
> engines, provided you place the page at the right orientation on the 
> scanner.
>
> LINK:
> http://code.google.com/p/emacspeak
>
> OCRopus' extensibility and its ability to express OCR output as a 
> structured HTML document make it an ideal starting point for producing 
> rich spoken output. The possibilities are enormous for people being able 
> to collectively train, customize and improve an OCR engine.
>
> 6/21/2007 09:11:00 AM
> Posted by T.V. Raman, Research Scientist
>
>
>
> http://googleblog.blogspot.com/2007/06/google-and-open-source-ocr.html
>


