Stereotyping in plot construction can be the bane of the writer, but what if there were only six basic formats for all plots, and those types could be plotted? That’s the contention of Matthew Jockers as part of his ongoing research into “the relationship between sentiment and plot shape in fiction,” via “an R package titled ‘syuzhet’ … designed to extract sentiment and plot information from prose.”

Inspired partly by some remarks by Kurt Vonnegut, Jockers “set out to develop a systematic way of extracting plot arcs from fiction. I felt this might help me to better understand and visualize how narrative is constructed. The fundamental idea, of course, was nothing new. What I was after is what the Russian formalist Vladimir Propp had defined as the narrative’s syuzhet (the organization of the narrative) as opposed to its fabula (raw elements of the story).”

Exactly how to quantify and map “the organization of the narrative elements,” however, is the problem. Jockers cites Christopher Booker’s 2004 book, The Seven Basic Plots: Why We Tell Stories, based on seven basic Jungian archetypes, but admits that this method, like all the other proposed taxonomies, “suffers from anecdotalism.” In other words, they don’t demonstrate “how to compare, mathematically and computationally, the shape of one story to another.”

Jockers’s solution lies in “the Fourier transformation,” which, he claims, “provides a way of transforming the sentiment-based plot trajectories into an equivalent data form.” And with this method, he adds, he “measured and compared 40,000+ plot shapes and then clustered the resulting data in order to reveal six common, perhaps archetypal, plot shapes.”
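For readers wondering what that could look like in practice, here is a minimal sketch in Python. It is not Jockers’s code (his package is written in R), and everything in it is an assumption made for illustration: the function names, the made-up sentiment scores, and the choice to keep only the three lowest frequencies. The idea is simply that a Fourier transform lets you throw away the fast wiggles in a sentiment series, keep the slow ones, and resample the result to a fixed length, so that a short story and a long novel end up as shapes of the same size that can be compared with an ordinary distance measure.

```python
import cmath
import math


def dft(values):
    """Plain discrete Fourier transform (slow, but fine for a short example)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values))
        for k in range(n)
    ]


def low_pass_shape(scores, keep=3, points=20):
    """Smooth a sentiment series and resample it to a fixed number of points.

    Only the mean and the `keep` lowest frequencies are retained, so the
    result is a broad arc rather than a sentence-by-sentence zigzag.
    `keep` and `points` are illustrative defaults, not values from the source.
    """
    n = len(scores)
    keep = min(keep, (n - 1) // 2)  # stay below the Nyquist frequency
    coeffs = dft(scores)
    shape = []
    for i in range(points):
        x = i / points  # relative position in the narrative, from 0 up to 1
        total = coeffs[0].real / n  # the average sentiment
        for k in range(1, keep + 1):
            # For a real-valued series, frequency k and its mirror image
            # combine into twice the real part of this term.
            total += 2 * (coeffs[k] * cmath.exp(2j * math.pi * k * x)).real / n
        shape.append(total)
    return shape


def shape_distance(a, b):
    """Euclidean distance between two shapes of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    # Hypothetical per-sentence sentiment scores for two invented stories
    # of different lengths; real scores would come from a sentiment lexicon.
    story_a = [0.2, 0.5, 0.1, -0.4, -0.8, -0.3, 0.0, 0.6, 0.9, 0.7]
    story_b = [0.1, 0.3, 0.4, 0.2, -0.1, -0.5, -0.9, -0.6, -0.2, 0.1, 0.5, 0.8, 0.9, 0.6]
    arc_a = low_pass_shape(story_a)
    arc_b = low_pass_shape(story_b)
    print(round(shape_distance(arc_a, arc_b), 3))
```

Shapes reduced this way could then be handed to any clustering routine. Note that the sketch says nothing about where the sentiment scores come from, which is exactly the step the rest of this post worries about.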

Those actual results are due in a follow-up post. For the time being, I’ve no doubt that Jockers got six basic patterns from his data. But the project reminds me of Leo Tolstoy’s criticism, in War and Peace, of another misapplication of the scientific method to the human domain, where he talks about “theorists who believed in a science of war with immutable laws—laws of oblique movements, outflankings, and so forth.” As yet, at least, there is no clear rationale for how Jockers can systematically and verifiably quantify plot elements. Is a plot element something you can define mathematically? What does it produce that’s mathematically verifiable? If nothing, how can you crunch its numbers? My suspicion is that the designation of plot elements is going to be as arbitrary, and even anecdotal, as any of the preceding taxonomies. In other words: Garbage in, garbage out. Still, those pretty shapes might prove inspiring to some writers …



  1. This might prove useful for English professors in desperate need of article topics, but I see no value for authors.

    Most of us don’t use the various plot flow charts and methodologies out there already, so a new one wouldn’t float our boats either.

    I also have a problem with the term “sentiment.” Sentiment is fake glitter as opposed to the pure gold of human emotion, and it’s the last thing authors want found in their work.
