
Posts tagged ocr

Lego Mindstorms + Kindle + Laptop = E-book Scanner
September 7, 2013 | 6:47 pm

Paging Rube Goldberg… Found via BoingBoing, Arik Hesseldahl has a report at AllThingsD about an Austrian university professor who has used a Lego Mindstorms kit to hack together an e-book de-DRM scanner out of his Kindle and his laptop. Professor Peter Purgathofer built a Lego device that presses the page-down button on the Kindle, then the space bar on the computer, to take a picture of one Kindle page at a time. The computer then submits the picture to a text recognition service to OCR it into a text file. Purgathofer created the project to protest against...

The 5 Steps of Intelligent Proofreading
December 21, 2012 | 10:07 pm

Over the years I’ve scanned and OCR’ed many printed books into electronic form for Gutenberg Australia (most of the Edgar Wallace collection there is my work, for instance), and during that time it’s become clear that not all typos are equal. After a while, in fact, it became possible for me to divide typos into categories, as follows: Category 1: Typos due to English orthography. Some letter sequences in English serif text happen to resemble others. The sequence ‘of her’, for instance, looks very much like ‘other’, and ‘thing’ looks very much like ‘tiling’. Every second or third book I scanned had these mistakes in it...
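The look-alike errors described in that excerpt can be hunted for mechanically. What follows is a minimal sketch, not the author's proofreading method: the confusion pairs, the tiny word list, and the function name `suspect_words` are all assumptions made for illustration. It reverses one look-alike substitution at a time and reports a word when the result is a known common word. It does not handle confusions that involve spaces, such as ‘of her’ versus ‘other’.

```python
import re

# Hypothetical confusion pairs: (what the OCR produced, what the page likely said).
CONFUSIONS = [("il", "h"), ("rn", "m"), ("J", "f"), ("i", "l")]

# Tiny stand-in vocabulary; a real pass would load a full word list.
COMMON_WORDS = {"thing", "few", "pollo", "modern"}


def suspect_words(text):
    """Return (found, suggestion) pairs for words that turn into a
    common word when a single confusion pair is reversed."""
    hits = []
    for word in re.findall(r"[A-Za-z]+", text):
        for wrong, right in CONFUSIONS:
            start = word.find(wrong)
            while start != -1:
                # Undo the confusion at this one position only.
                candidate = word[:start] + right + word[start + len(wrong):]
                if candidate.lower() in COMMON_WORDS:
                    hits.append((word, candidate))
                start = word.find(wrong, start + 1)
    return hits


if __name__ == "__main__":
    print(suspect_words("a strange tiling happened"))
```

A check like this only produces candidates for a human to review, since ‘tiling’ is also a legitimate word.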

Embarrassing e-book typo proves ‘shift’ happens
September 13, 2011 | 5:15 am

I had thought that I wouldn’t find an e-book typo more hilarious than “the next Jew chapters” or “arroz con polio” from the Young Wizards series. But The Guardian Books blog has found what may very well be one of the greatest typos of all time, in Susan Andersen’s novel Baby, I’m Yours. The passage in question in the e-book was supposed to read, “He stiffened for a moment but then she felt his muscles loosen as he shifted on the ground.” [emphasis mine] However, the accidental change of an “f” to a “t” (presumably in the OCR process;...

Young Wizards e-book errors to be fixed by publisher, thanks to reader feedback
March 30, 2011 | 12:26 pm

Diane Duane has posted an update to her blog on the error correction issue with Young Wizards e-books. She contacted her editor, who contacted the digital editions department at her publisher, and she’s received a response from them that they have developed a new error-correction process that looks specifically for commonly occurring OCR errors and eliminates them at the XML level (so that corrected e-books can be generated in multiple formats from the new source material). They would like to run the books through this process. Then, Diane can go back through and look to see what errors still exist, which...

Ongoing publisher inattention to e-book quality is highly annoying
March 13, 2011 | 8:35 pm

Update: Diane Duane has written an interesting piece in response that I will cover in full when I have time. I was going to bring this up in my review of the Nook Reader app, but realized that doing so would be putting the blame in the wrong place. When I was reading the Young Wizards series by Diane Duane on the Nook Reader, I ran into this particularly egregious typo in the first chapter of So You Want to Be a Wizard. “The reader is invited to examine the next Jew chapters…” And it was not the last. Characters might enjoy meals...

Publisher pricing and quality issues make piracy more attractive
March 9, 2011 | 1:13 am

Audrey Watters at ReadWriteWeb takes a look at the contentious issue of e-book vs. paper pricing and whether it is likely to promote piracy. Mentioning Random House’s decision to join the agency pricing crowd, and the ongoing anti-trust investigation in Europe, she links to a Reddit thread discussing examples of e-books priced higher than their paperback or hardcover versions. The Reddit thread is kicked off by one person complaining about the prices on these books (“I love the kindle but this pricing stuff right now is making me question all of it. I have a hard time placing...

New Google Books tool traces use of words over time
December 19, 2010 | 2:51 pm

Google Books isn’t just an e-book store. It’s a pile of data, waiting to be mined. And while the metadata on many of the books in Google’s database may not be in the best of shape, enough books have good metadata that they can be used for some fairly interesting projects. Ars Technica has the story on one of these. A group of Harvard researchers created a tool that could be used to trace the usage of words or phrases in books over the last few centuries. And what’s more, Google has made the tool publicly available via a...
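The idea of tracing a word's usage over time can be illustrated with a small sketch. This is not the researchers' or Google's implementation: the function name `usage_by_year`, the `(year, text)` input shape, and the simple tokenizer are assumptions for illustration. It computes, for each publication year, the share of all word tokens that match the target word.

```python
import re
from collections import Counter


def usage_by_year(books, word):
    """Map each year to the relative frequency of `word` among all
    tokens published that year. `books` is an iterable of (year, text)."""
    totals = Counter()  # all tokens seen per year
    hits = Counter()    # occurrences of the target word per year
    target = word.lower()
    for year, text in books:
        tokens = re.findall(r"[a-z']+", text.lower())
        totals[year] += len(tokens)
        hits[year] += tokens.count(target)
    return {year: hits[year] / totals[year]
            for year in sorted(totals) if totals[year]}


if __name__ == "__main__":
    sample = [
        (1900, "the telegraph and the telephone"),
        (2000, "the internet and the telephone and the web"),
    ]
    print(usage_by_year(sample, "telephone"))
```

Dividing by the yearly token total matters because far more text survives from recent years, so raw counts alone would make almost every word appear to grow in popularity.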

Haptic Braille device could let blind read print books in braille
December 10, 2010 | 2:12 pm

Anyone who has seen the movie Sneakers is familiar with the idea of braille screens for reading by the blind. In the real world, however, braille screens are gimmicky, expensive, non-portable devices prone to mechanical failure, and I am told most blind computer users make do with speech synthesizers instead (be it on their computer, or via hand-held devices like the Intel or LookTel gadgets I’ve mentioned before). But speech synthesis does have some drawbacks, especially for reading a book: the voice can be annoying, for one thing, getting in the way of immersion into the book (which is why I...

ABBYY FineReader Express, a phone-camera-compatible OCR tool
September 10, 2010 | 7:15 am

Mediabistro’s GalleyCat has a post about ABBYY FineReader Express, an OCR program that can even use cell phone cameras (though for best results, a 5 megapixel version is recommended, which would seem to limit it to the iPhone 4’s camera). The post mentions it in the context of scanning “orphan works” such as the “hundreds of pages from 1930s novels, periodicals, and self-published materials that couldn't leave the New York Public Library” that GalleyCat editor Jason Boog read through during a project. As a demonstration, it includes a photograph of a page from such a work, a screenshot...

iPod Touch 4 is not quite an iPhone, but close
September 4, 2010 | 9:47 pm

It turns out that the iPod Touch 4 is going to be just as good an e-book reader as the iPhone 4, but is not quite up to par in other respects. On GigaOm, Kevin C. Tofel writes about why the iPod Touch 4 is almost but not quite the “contract-free iPhone” Jobs touted it as. In particular, he focuses on the lack of built-in GPS, data plan connectivity, and the 5 megapixel camera that graces the iPhone 4. The reason for this is that, regardless of what the pricing article I cited the other day indicates, Apple actually...

A happy ending to Douglas Cootey’s problems with his error-encrusted ebook – thanks to TOR
August 9, 2010 | 2:33 am

On July 28 we published a story about how Douglas Cootey bought a copy of Ender's Game from the iBookstore and how it was full of errors. Apple refused to replace the book, even though a corrected edition had been posted to the store. Now the end of the story. From Douglas' website: On Thursday, August 5th, Apple contacted me via email with instructions on how to redownload the corrected version. A reader let me know they had received that email, too, so I know that others with the corrupted text received their corrected versions. It all took place within...

Prizmo photo OCR software coming soon to iPhone
July 21, 2010 | 6:12 pm

Want to use your iPhone to photograph and OCR scan printed matter? Your chance may be coming soon. CNet reports that Creaceed’s Prizmo software, a desktop photo-OCR package that includes camera tethering and perspective correction, will soon be coming to the iPhone. No word yet on price; the desktop version costs $40. The app's crowning feature is that it can fix bad perspective, just like its desktop sibling, as well as let users snap photos without having to press the shutter button. Creaceed has devised a system through which users can simply say "take picture,"...