Want to write a best-seller? Scientists claim this algorithm will tell you how

'Statistical stylometry' looks at vast amounts of data in order to sift out the stylistic tropes that define a popular novel
  • @smurraymorris

Ever wondered what the secret is to a novel’s success? Computer scientists from the US think they might have discovered it.

The new technique, which has a reported accuracy rate of 84%, can tell aspiring writers whether their book will shoot to fame or flop, even before it is published.

Researchers at New York-based Stony Brook University analysed more than 40,000 books from a broad range of genres, as well as film scripts, to reach their findings. Notable titles included A Tale of Two Cities by Charles Dickens and The Lost Symbol by Dan Brown.

The technique, called statistical stylometry, differentiates between highly successful and less successful literary works by using vast amounts of data to measure variations in literary style between one writer or genre and another.

The researchers defined a book’s success by looking at its download figures and Amazon sales records.

A high percentage of verbs, adverbs and foreign words could be the reason why some books fail, according to the research. Less successful books may also rely on verbs that explicitly describe actions and emotions, such as “wanted”, “took”, “promised”, “cried” and “cheered”. They may also depend on overused, clichéd words such as “love”, and on commonplace geographical settings.

In contrast, more successful books use more conjunctions, such as “and”, “but” and “or”. They also include more thought-processing verbs, such as “recognised” and “remembered”, the research revealed.
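
To give a rough idea of what counting such stylistic markers might look like in practice, here is a minimal Python sketch. It is not the researchers' method: the word lists contain only the examples quoted above, and the function name `style_profile` and its category labels are invented for illustration. It simply measures what share of a text's words fall into each group.

```python
import re
from collections import Counter

# Illustrative word lists, taken only from the examples quoted in the article.
# A real stylometric study would use far richer features than these.
CATEGORIES = {
    "conjunctions": {"and", "but", "or"},
    "thought_verbs": {"recognised", "remembered"},
    "action_verbs": {"wanted", "took", "promised", "cried", "cheered"},
}


def style_profile(text):
    """Return, for each category, the fraction of words in `text` that belong to it."""
    # Lowercase the text and split it into simple alphabetic tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {name: 0.0 for name in CATEGORIES}
    counts = Counter(tokens)
    return {
        name: sum(counts[word] for word in words) / total
        for name, words in CATEGORIES.items()
    }


if __name__ == "__main__":
    sample = "She cried and he cheered, but nobody remembered why."
    for name, share in style_profile(sample).items():
        print(f"{name}: {share:.2%}")
```

Proportions like these would only be raw inputs. Judging whether a book is likely to succeed would still require comparing them across many books whose outcomes are already known.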

Yejin Choi, assistant professor at Stony Brook University, said: “Predicting the success of literary works poses a massive dilemma for publishers and aspiring writers alike.”

She added: “Based on novels across different genres, we investigated the predictive power of statistical stylometry in discriminating successful literary works, and identified the stylistic elements that are more prominent in successful writings.”

“Our work is the first that provides quantitative insights into the connection between the writing style and the success of literary works.”