
It’s true. The Monitor has an article about Luis von Ahn, who helped develop the "Captcha," the squiggly text we have to decode on some websites before we can enter text of our own. Well, von Ahn realized that doing this was a waste of tens of thousands of man-hours. So:

In 2007, he came up with reCaptchas. Now, instead of frittering away their time typing random characters, Internet users spell actual words plucked from old books that computers have trouble reading.

But computer programs are only 80 percent accurate with older books. They stumble over blurry lines, places where the ink has bled together over time, and less uniform fonts.

Carnegie Mellon computers send the indecipherable words to more than 100,000 websites that use them in the reCaptcha security checks. Any website or blogger can sign up for the free service.

The Internet user sees two distorted words. One is a control word that the computer already knows. The other is a word that computers failed to read.

Once that word has been identified by multiple people it’s accepted as correct. The system’s accuracy rate of 99.1 percent is about the same as professional human transcribers.
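The two-word check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual reCaptcha code: the names `UnknownWord`, `check_challenge`, and `AGREEMENT_THRESHOLD` are hypothetical, and the threshold of three matching answers is a guess, since the article says only "multiple people."

```python
from collections import Counter

# Hypothetical threshold: the article says only "multiple people".
AGREEMENT_THRESHOLD = 3


class UnknownWord:
    """Collects human guesses for one word the computer failed to read."""

    def __init__(self):
        self.votes = Counter()
        self.accepted = None  # set once enough people agree

    def add_guess(self, guess):
        normalized = guess.strip().lower()
        self.votes[normalized] += 1
        if self.votes[normalized] >= AGREEMENT_THRESHOLD:
            self.accepted = normalized


def check_challenge(control_word, control_answer, unknown, unknown_answer):
    """Return True if the user passes; record their guess for the unknown word."""
    if control_answer.strip().lower() != control_word.lower():
        # The user missed the word the computer already knows,
        # so their guess at the other word is not trusted.
        return False
    if unknown.accepted is None:
        unknown.add_guess(unknown_answer)
    return True
```

In this sketch, only the control word decides whether a user gets through. The guess at the unread word is counted as a vote, and the word is treated as solved once enough matching votes arrive.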

Web users now provide about 3,000 man-hours a day of free labor in 10-second bursts of human computation, correcting more than 10 million words every day. ReCaptchas have solved 5 billion words in less than two years. Most people aren’t even aware that their brain power is being harnessed, although every reCaptcha includes a button that users can click to explain the program. …

The reCaptcha program is also helping to digitize The New York Times newspaper. It’s about halfway through archiving every edition printed from 1851 to 1980, when the paper went digital.

“The New York Times will have been transcribed word by word by people around the world in less than a year,” says von Ahn.

“The total number of people who have helped to do this is about 400 million,” he adds. “In other words, about 6 percent of the world’s population has helped digitize the New York Times. They’re not really wasting their time typing reCaptchas.”

TeleRead previously covered ReCaptcha here.

 