When Scribd announced their partnership with Smashwords last week, I did some poking around various boards to check the reaction. What I discovered was that Scribd and piracy were linked together in the minds of many people. On Friday, I had a chance to talk with Andrew Weinstein, VP of Content Strategy at Scribd, to ask him what they were doing to change that impression. He was candid, and some of his answers surprised me (in a good way).

One of the complaints I had read was from a reader who refused to subscribe to the service because of what she perceived as Scribd supporting piracy. Her evidence? Copies of To Kill a Mockingbird, which has never been officially released as an eBook, so it’s pretty obvious that any copies in Scribd would be unauthorized. I checked, and yes, I found several copies.

When I asked Weinstein about it, he started by acknowledging that Scribd has long had a reputation for harboring pirated works. Then he added, “It’s reasonable that someone is upset if a pirated piece of content is there. We do what we can, but the system isn’t perfect.”

He also noted that Scribd began as a document uploading and sharing service. That’s a strength, in that they have a business model that is six years old. It’s a weakness in that their model attracts people who want to upload documents, some legitimate, some not.

Acknowledging that the system isn’t perfect won’t count for much with many people, which leads to the question: exactly what are they doing to deal with piracy?

He said they have a three-tier approach:

1. Legal (Terms of Use)

2. Proactive process to prevent pirated works from being uploaded

3. Reactive process when something slips through

1. Legal (Terms of Use)

Scribd recently updated all its legal documents and terms of service to be clear that they do not condone or tolerate piracy. They laid out exactly what is expected in the Uploader Agreement. Here’s the relevant clause for this article:

(ii) You are the creator and owner of or have the necessary licenses, rights, consents, releases and permissions to use and to authorize Scribd and Scribd’s Users to use Your User Content in the manner permitted herein;

All right, you may be saying, so what? Terms of Use agreements only stop the law-abiding. Pirates don’t care about them, anyway. True, but for any of Scribd’s other measures to work, they do have to start by clearly stating what is and is not acceptable.

2. Proactive process to prevent pirated works from being uploaded

This is where things get more interesting, and we’ll get to the To Kill a Mockingbird example.

Scribd has a database of published works. When a document is uploaded, it’s checked against the database, and if it matches a document in the database, it’s assumed to be infringing, and it’s kicked back.

Sounds simple, which left me still confused about the To Kill a Mockingbird example. I mean, catching a title that famous should be easy, right?

Not so much. Weinstein reminded me that, since the book has never been officially released, uploaded copies are either scanned page images or OCR’d copies. Scanned images have no text for their crawlers to crawl. And OCR’d copies are imperfect. That’s right. Their system is too picky. It’s looking for exact matches, and it’s unlikely any scanned copy would be an exact match, so the system doesn’t recognize it.

Hmm. System tweaking? Yes, that’s exactly what they are doing. They expect to have an upgrade by the end of next month which will be more forgiving of imperfect matches and should let fewer imperfect copies through.
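To make the difference concrete for myself, here is a minimal sketch of exact versus "more forgiving" matching. It is not Scribd's code, and Weinstein didn't describe their implementation. Everything in it is an assumption for illustration: the function names, the hash-based exact check, the character-shingle comparison, and the 0.5 threshold.

```python
import hashlib
import re


def normalize(text):
    """Lowercase the text and collapse runs of non-alphanumerics into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def fingerprint(text):
    """Hash of the normalized text; any single changed character changes it."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def exact_match(upload, catalog):
    """Hypothetical strict check: True only if the upload hashes to a catalog entry."""
    upload_fp = fingerprint(upload)
    return any(fingerprint(reference) == upload_fp for reference in catalog.values())


def shingles(text, k=5):
    """Set of overlapping k-character substrings of the normalized text."""
    cleaned = normalize(text)
    return {cleaned[i:i + k] for i in range(max(len(cleaned) - k + 1, 1))}


def jaccard(a, b):
    """Share of shingles two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def fuzzy_match(upload, catalog, threshold=0.5):
    """Hypothetical forgiving check: first catalog title whose shingle
    overlap with the upload reaches the (assumed) threshold, else None."""
    upload_shingles = shingles(upload)
    for title, reference in catalog.items():
        if jaccard(upload_shingles, shingles(reference)) >= threshold:
            return title
    return None
```

Feed this an OCR'd copy with a few misread characters (say, "qu1ck" for "quick") and the exact check misses it, while the shingle comparison still flags it because most of the short character runs survive the scan. The trade-off is the threshold: set it too low and legitimate uploads that merely quote a book start getting kicked back.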

Alert readers will spot another problem. The system is only as good as the data in it, and it doesn’t contain every book ever published. I think that’s one of the best reasons to publish through Smashwords and opt in to Scribd. That guarantees your book will be in their system.

He also said they have a three-strike policy on infringing users.

3. Reactive process when something slips through

We’ve seen why the system isn’t yet 100% accurate, so what if something slips through? Scribd does provide a mechanism to report infringing works. They also support counterclaims in case something is reported erroneously. Weinstein says that happens when someone does have the rights to upload but someone else doesn’t realize the rights are valid.

To me, as a layperson, it sounds like a decent system. Humans are tricky, and given an eighth of a chance, we’ll figure out some way to game any system. I agree with Scribd that their system can’t be perfect, but yes, they can do better, and it looks like they are working toward that.

I’m comfortable remaining a subscriber, and, as I said above, distributing my books through Scribd puts them into the database, which sounds like a good first step to avoiding having my content pirated there.