After Danielle Jones disappeared on 18 June 2001, a series of text messages was sent from her phone. Police suspected that the later messages in the series were not written by Danielle, and linguistic analysis was able to show that they were in fact more likely to have been written by her uncle, Stuart Campbell. Likewise, when Jenny Nicholl disappeared in 2005, a linguistic analysis showed that text messages sent from her handset were probably written not by her but by her ex-lover, David Hodgson. Both Hodgson and Campbell were convicted of murder, and the linguistic evidence played an important role in their prosecution.
In the Hodgson case, Professor Malcolm Coulthard was able to show that the suspect messages were stylistically close to Hodgson's undisputed messages. They shared features such as the lack of a space after a digit substitution, as in "go2shop", which contrasts with "ave 2 go" in Ms Nicholl's messages.
This is powerful evidence. But soon, forensic authorship analysis may move from a skill based on expert intuition to a true forensic science. There has been considerable work in developing statistical and computational approaches to authorship analysis. One of the more successful approaches introduces the statistical metaphor of stylistic distance between texts. Texts that are charted close to each other can be linked to the same author, whereas for texts that are more distant, it can be argued that there is a lower likelihood of shared authorship.
Such techniques are very useful in literary and historical authorship analysis, but they tend to depend on having sufficient textual data to work with. The problem for a forensic linguist is that cases frequently involve small amounts of data. In this sense, authorship analysis of text messages provides what might be seen as an extreme challenge. Last year, I worked on that challenge with my colleagues Jessica Woodhams and Andrew Price. We built a measure of similarity that could be used to answer a number of authorship analysis questions. It can show, for example, that pairs of texts known to be written by Jenny Nicholl differ significantly less than pairs where one text is taken from Ms Nicholl and one from David Hodgson, and that the disputed messages are stylistically closer to Hodgson's messages than to Ms Nicholl's.
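To make the idea of pairwise stylistic similarity concrete, here is a minimal sketch in Python. It is not the measure built for the cases described above. It assumes a Jaccard coefficient over a handful of hand-picked features, and the feature names, the regular expressions and the helper functions are all hypothetical choices made for illustration. The only message fragments used are the two quoted earlier, "go2shop" and "ave 2 go".

```python
import re
from itertools import combinations, product

# Hypothetical stylistic features, each detected by a regular expression.
# These are illustrative assumptions, not the feature set used in any real case.
FEATURES = {
    "fused_digit": r"[a-z][0-9][a-z]",        # digit with no spaces, as in "go2shop"
    "spaced_digit": r"(?<!\S)[0-9](?!\S)",    # digit standing alone, as in "ave 2 go"
    "dropped_h": r"\bave\b",                  # "ave" written for "have"
}

def features(message):
    """Return the set of feature names found in one message."""
    text = message.lower()
    return {name for name, pattern in FEATURES.items() if re.search(pattern, text)}

def jaccard(a, b):
    """Jaccard similarity of two messages' feature sets (0 = disjoint, 1 = identical)."""
    fa, fb = features(a), features(b)
    union = fa | fb
    # If neither message shows any feature there is no evidence either way;
    # this sketch arbitrarily scores that case as 0.
    return len(fa & fb) / len(union) if union else 0.0

def mean_within(messages):
    """Mean similarity over all pairs drawn from one author's messages."""
    pairs = list(combinations(messages, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

def mean_between(first, second):
    """Mean similarity over all pairs with one message from each author."""
    pairs = list(product(first, second))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

With real data, one would compare `mean_within` for each author's undisputed messages against `mean_between` for the two authors, and then ask which author's messages the disputed texts sit closer to. Whether any difference is statistically significant is a separate question that this sketch does not address.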
It could be argued that breaking through anonymity in writing is yet another encroachment on civil liberties. But being able to analyse these short and fragmentary electronic texts is hugely useful to investigators. And in the novel context of mass anonymity that the explosion in the number of electronic texts provides, these techniques are a tremendous social good.
Dr Tim Grant is deputy director of the Centre for Forensic Linguistics at Aston University. He was speaking at the BA Festival of Science in Liverpool yesterday.