<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Crowd sourcing error-corrections in books&#8212;and maybe newspapers, magazines and Web sites</title>
	<atom:link href="http://www.teleread.com/2009/08/27/crowd-sourcing-error-corrections-in-books-and-maybe-newspapers-magazines-and-web-sites/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.teleread.com/ebooks/crowd-sourcing-error-corrections-in-books-and-maybe-newspapers-magazines-and-web-sites/</link>
	<description>News &#38; views on e-books, libraries, publishing and related topics</description>
	<lastBuildDate>Wed, 15 Feb 2012 18:02:38 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: pond</title>
		<link>http://www.teleread.com/ebooks/crowd-sourcing-error-corrections-in-books-and-maybe-newspapers-magazines-and-web-sites/comment-page-1/#comment-1138636</link>
		<dc:creator>pond</dc:creator>
		<pubDate>Sun, 30 Aug 2009 11:49:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.teleread.org/2009/08/27/crowd-sourcing-error-corrections-in-books-and-maybe-newspapers-magazines-and-web-sites/#comment-1138636</guid>
		<description>This is exactly what&#039;s needed over at Google Books to correct their abysmal OCR of the old public-domain texts.

One thing not noted in the big announcement that Google is offering their books in epub: used to be Google gave us the choice of pdf or txt. Now txt is gone, replaced by epub. But the epub, unlike the txt, is readable only on a few programs, and how many of those programs let you correct the bad OCR?

I pulled down Hilaire Belloc&#039;s 1902 edition of &#039;The Path to Rome&#039; to check. As this was featured on the Google Books home page, I reckoned this might be an edition they could be proud of. Not so.

Every page header was included, and the pages didn&#039;t break according to the original. Some page headers were separate paragraphs, others were set inline into other paragraphs. In all, a terrible reading experience.

If I were a teacher trying to save my students some cash, could I recommend this epub edition? Not at all. The pdf version not only escapes all these problems, it also gives teacher and students the same page number references for in-class study. The big problem with the pdf, of course, is it&#039;s impossible to read on a small PDA or smartphone screen.

Setting all the Google public-domain books into some sort of wiki and allowing readers to correct them would go a long, long way to rectifying all the epub drawbacks, and actually making the books useful.</description>
		<content:encoded><![CDATA[<p>This is exactly what&#8217;s needed over at Google Books to correct their abysmal OCR of the old public-domain texts.</p>
<p>One thing not noted in the big announcement that Google is offering their books in epub: used to be Google gave us the choice of pdf or txt. Now txt is gone, replaced by epub. But the epub, unlike the txt, is readable only on a few programs, and how many of those programs let you correct the bad OCR?</p>
<p>I pulled down Hilaire Belloc&#8217;s 1902 edition of &#8216;The Path to Rome&#8217; to check. As this was featured on the Google Books home page, I reckoned this might be an edition they could be proud of. Not so.</p>
<p>Every page header was included, and the pages didn&#8217;t break according to the original. Some page headers were separate paragraphs, others were set inline into other paragraphs. In all, a terrible reading experience.</p>
<p>If I were a teacher trying to save my students some cash, could I recommend this epub edition? Not at all. The pdf version not only escapes all these problems, it also gives teacher and students the same page number references for in-class study. The big problem with the pdf, of course, is it&#8217;s impossible to read on a small PDA or smartphone screen.</p>
<p>Setting all the Google public-domain books into some sort of wiki and allowing readers to correct them would go a long, long way to rectifying all the epub drawbacks, and actually making the books useful.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Con</title>
		<link>http://www.teleread.com/ebooks/crowd-sourcing-error-corrections-in-books-and-maybe-newspapers-magazines-and-web-sites/comment-page-1/#comment-1136616</link>
		<dc:creator>Con</dc:creator>
		<pubDate>Thu, 27 Aug 2009 22:29:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.teleread.org/2009/08/27/crowd-sourcing-error-corrections-in-books-and-maybe-newspapers-magazines-and-web-sites/#comment-1136616</guid>
		<description>The National Library of Australia has crowd-sourced OCR-correction on their &lt;a href=&quot;http://newspapers.nla.gov.au/&quot; rel=&quot;nofollow&quot;&gt;Australian Newspapers&lt;/a&gt; website.</description>
		<content:encoded><![CDATA[<p>The National Library of Australia has crowd-sourced OCR-correction on their <a href="http://newspapers.nla.gov.au/" rel="nofollow">Australian Newspapers</a> website.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

