Want to write a best-seller? Scientists claim this algorithm will tell you how

'Statistical stylometry' looks at vast amounts of data in order to sift out the stylistic tropes that define a popular novel

Sophie Murray-Morris
Friday 10 January 2014 15:31 GMT
Encyclopædia Britannica, Eleventh Edition (Stewart Butterfield / Creative Commons)

Ever wondered what the secret is to a novel’s success? Computer scientists from the US think they might have found it.

The new technique, which the researchers say is 84% accurate, can tell aspiring writers whether their book will shoot to fame or flop before it is even published.

Researchers at New York-based Stony Brook University analysed over 40,000 books from a broad range of genres, as well as film scripts, to collate the findings. Notable titles included A Tale of Two Cities by Charles Dickens and The Lost Symbol by Dan Brown.

The technique, called statistical stylometry, differentiates between highly successful literature and less prosperous literary works by using vast amounts of data to define variations in literary style between one writer or genre and another.

The researchers defined a book’s success by looking at its download figures and Amazon sales records.

A high percentage of verbs, adverbs and foreign words could be the reason why some books fail, according to the research. Less successful books may also rely on verbs that explicitly describe actions and emotions, such as “wanted”, “took”, “promised”, “cried” and “cheered”. They may also depend on overused, clichéd words such as “love”, and may be set in common geographical locations.

In contrast, more successful books use more conjunctions, such as “and”, “but” and “or”. They also include more thought-processing verbs, such as “recognised” and “remembered”, the research revealed.
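
To give a rough sense of what counting stylistic features looks like in practice, here is a minimal Python sketch. It is not the researchers' method: it simply measures what share of a text's words fall into three small groups built from the example words quoted above. The function name `style_profile` and the word groups are illustrative assumptions, and a real stylometric model would draw on far richer features.

```python
import re
from collections import Counter

# Word groups built only from the examples quoted in the article.
# They are illustrative assumptions, not the study's actual feature set.
CONJUNCTIONS = {"and", "but", "or"}
THOUGHT_VERBS = {"recognised", "remembered"}
ACTION_VERBS = {"wanted", "took", "promised", "cried", "cheered"}


def style_profile(text):
    """Return the share of words in `text` that fall into each word group."""
    # Lowercase the text and split it into simple alphabetic tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {"conjunctions": 0.0, "thought_verbs": 0.0, "action_verbs": 0.0}
    counts = Counter(tokens)
    return {
        "conjunctions": sum(counts[w] for w in CONJUNCTIONS) / total,
        "thought_verbs": sum(counts[w] for w in THOUGHT_VERBS) / total,
        "action_verbs": sum(counts[w] for w in ACTION_VERBS) / total,
    }


sample = "She remembered the house, and she recognised the door, but he cried."
profile = style_profile(sample)
print(profile)
```

Comparing such profiles across many books with known sales figures is the kind of input a statistical classifier could learn from, although the article does not say which model the team used.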

Yejin Choi, assistant professor at Stony Brook University, said: “Predicting the success of literary works poses a massive dilemma for publishers and aspiring writers alike.”

She added: “Based on novels across different genres, we investigated the predictive power of statistical stylometry in discriminating successful literary works, and identified the stylistic elements that are more prominent in successful writings.”

“Our work is the first that provides quantitative insights into the connection between the writing style and the success of literary works.”
