5

On October 1, Distributed Proofreaders quietly celebrated its fifth birthday. Distributed Proofreaders, in case you did not know, is the principal supplier of etexts to Project Gutenberg. It is a web-based environment for the computer-assisted, distributed proofreading of OCR-ed texts.

Project Gutenberg now hosts more than 16,000 ebooks, but until recently that number was in the hundreds. Part of this enormous growth was caused by a web application written by Charles Franks. It offered a way to contribute to readers who had always wanted to give back to the project, but had thus far been scared away by the long production time of any given ebook.

That web application was Distributed Proofreaders: a site where a volunteer could correct a single page of a book if that was all they had time for.

Michael Hart started Project Gutenberg in 1971 with the publication of an electronic version of the US Declaration of Independence, and in the process invented the ebook and the spam run (he sent the file to all hundred computers on the internet). In 1990, almost twenty years later, Project Gutenberg had 100 etexts to offer its visitors.

Until that time, the project’s volunteers pretty much did their own thing; there was a nebulous person at the equally opaque Project Gutenberg that they talked to, but other than that it was monk’s business as usual. In 1999, however, Greg Newby and Pietro Di Miceli introduced forums where volunteers could talk to each other. And apparently talk they did, coming up with new ways to share the burden of producing etexts.

“It was in this new, expansive atmosphere,” as Jim Tinsley writes in the preface to The Project Gutenberg FAQ 2002 (recommended reading for the introduction alone!) “with ideas flooding in from enthusiasts newly energized by the project, that Charles Franks (Charlz) came up with the idea of a web site that would serve to distribute the work of proofing a book among many volunteers. But not only did he think of the concept; he went ahead and did it!”

Distributed Proofreaders was run from a computer in Charles Franks’ home until sometime in 2002, when the Internet Archive offered to host the site instead. On November 8, Charles Franks and another volunteer, Charles Aldarondo, decided to try a little experiment to stress test the new hardware and to get more volunteers involved. They sent a story to the highly popular “News for Nerds” website Slashdot, knowing full well that if it got published, Distributed Proofreaders would experience a Slashdotting: an increase in visitors so large that only the strongest webservers could survive.

As it happened, the story did get published, the server held up, the old guard volunteers almost did not, and Distributed Proofreaders corrected a hundred etexts within a few days. Membership skyrocketed from 841 to 5,156 within a week.

Over the next few days I will be talking a little bit more about the history of Distributed Proofreaders, highlighting special books we produced, the sort of texts a volunteer can run into, and the community that also is Distributed Proofreaders. And squirrels.

In all fairness, Distributed Proofreaders was not the first of its kind: the Christian Classics Ethereal Library, a “Project Gutenberg” for Christian etexts, uses a similar system that predates Charles Franks’ software.

 