Might an algorithm for predicting success of novels offer hope for the slushpile?
January 10, 2014 | 9:17 am
Scientists have analyzed what goes into a best-selling or poorly performing novel and come up with an algorithm that predicts a book’s commercial success with 84% accuracy. Oddly enough, the criteria for commercial success seem to be the same sorts of advice you get from writing coaches and workshops:
They found several trends that were often found in successful books, including heavy use of conjunctions such as “and” and “but” and large numbers of nouns and adjectives.
Less successful work tended to include more verbs and adverbs and relied on words that explicitly describe actions and emotions such as “wanted”, “took” or “promised”, while more successful books favoured verbs that describe thought processes such as “recognised” or “remembered”.
So, there are more nouns and adjectives, fewer verbs and adverbs…and even “show, don’t tell.” Funny how that works. The article also cites Dan Brown’s The Lost Symbol as an example of a “less successful” book, because despite its commercial success the critics didn’t like it. (And besides, what article on literature is complete without a Dan Brown potshot?)
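Just to make the idea concrete: the researchers' actual system used part-of-speech statistics and machine learning, but the flavor of it can be sketched with a toy feature counter. Everything here is illustrative, not the paper's method: the word lists are tiny hand-picked stand-ins for the categories the article names (conjunctions, "tell" verbs of action and emotion, "think" verbs of thought), and the crude punctuation-stripping tokenizer is my own simplification.

```python
# Toy sketch of stylistic feature extraction, NOT the researchers' method.
# Word lists below are tiny illustrative stand-ins for the categories the
# article describes.
CONJUNCTIONS = {"and", "but", "or", "yet", "so"}
TELL_VERBS = {"wanted", "took", "promised"}                # explicit action/emotion
THINK_VERBS = {"recognised", "recognized", "remembered"}   # thought processes

def stylistic_features(text):
    """Return per-1000-word rates for a few word classes."""
    # Crude tokenizer: split on whitespace, strip surrounding punctuation.
    words = [w.strip(".,;:!?\"'").lower() for w in text.split()]
    n = max(len(words), 1)
    def rate(vocab):
        return 1000 * sum(w in vocab for w in words) / n
    return {
        "conjunctions": rate(CONJUNCTIONS),
        "tell_verbs": rate(TELL_VERBS),
        "think_verbs": rate(THINK_VERBS),
    }

sample = "She remembered the letter, and she recognised the hand that wrote it."
print(stylistic_features(sample))
```

A real classifier would feed rates like these, over far richer feature sets, into a trained model; the point is just that the inputs are surface statistics of the prose, not anything about plot or theme.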
I was going to wonder what it might mean for the world if the publishing industry started using this algorithm as a predictor of success for books in its slushpile—but then I realized, it essentially already does. It’s just that its algorithms are called “people,” who use their own internal criteria to decide whether to accept or reject a particular book. I wonder whether the human slushpile readers do better or worse than 84% at picking winners.
Thinking about it further, though: most major publishers these days reject slushpile entries without even a glance, simply because nobody has time to go through them all and figure out which ones are any good. Perhaps feeding the slushpile into a computer running this algorithm to separate the sheep from the goats could serve as a sort of pre-reading filter, allowing publishers to consider as many “good” works as possible where previously they would have had to reject everything.
Of course, with only an 84% success rate some sheep would be rejected and some goats would make it through, but that’s better than turning away all the potentially good manuscripts, or having to sift through all the dreck in search of the few good submissions. Could this help publishers compete with self-publishing by turning away fewer of the writers who would otherwise go on to publish themselves?
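A quick back-of-envelope run shows what that trade-off looks like. The numbers here are assumptions of mine, not from the article: I'm treating the reported 84% as the accuracy on both good and bad manuscripts, and guessing that 5% of a 10,000-manuscript slushpile is publishable.

```python
# Back-of-envelope filter math. Assumptions (mine, not the paper's):
# 84% accuracy applies symmetrically to sheep and goats, and 5% of the
# slushpile is actually publishable.
total = 10_000
good = round(total * 0.05)   # 500 "sheep"
bad = total - good           # 9,500 "goats"

accuracy = 0.84
good_passed = round(good * accuracy)      # sheep correctly let through
good_rejected = good - good_passed        # sheep wrongly turned away
bad_passed = bad - round(bad * accuracy)  # goats slipping through the filter

print(f"editors read {good_passed + bad_passed} manuscripts "
      f"instead of {total}; {good_rejected} good ones are lost")
```

Under those assumptions the filter hands editors roughly 1,940 manuscripts instead of 10,000, containing 420 of the 500 good ones—a far cry from rejecting everything unread, even though 80 worthy books still fall through the cracks.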
For that matter, in readers’ hands this algorithm might assist in picking through the “Internet slushpile” of self-published titles. It wouldn’t necessarily tell them whether they would like a book, but it would at least let them know whether it was badly written before they bought it. (Though, granted, sample chapters can already do that now, at least to some extent.)
I also confess to wishing I could run some of my own Internet fiction through it to see how well I scored…
(Found via The Digital Reader.)