Rupert Cornwell: Twitter dialect says you are what you tweet

Out of America: Got 'sumthin' or 'suttin' to get off your chest? Go ahead. But from your 140 characters, experts will work out where you come from

We British pride ourselves on our array of regional dialects and idiom – not like those Americans, who all speak the same. Well, maybe not all of them. There's the south of course, with its slow drawl, and idiosyncratic pronunciation and turns of phrase. And then you've got New Jersey/New York, as used in The Sopranos and countless Mafia movies, and doubtless by all those mobsters (sorry, "alleged" mobsters) rounded up by the FBI the other day. But the rest of them talk the same, don't they?

Well, not exactly. American English, according to linguistics scholars, is as varied as any language can be. There's midland speech, which breaks down into north midland and south midland, the former prevalent across the Midwest, the latter stretching south to the point where it becomes indistinguishable from southern speech – which itself is anything but monolithic. In turn, midland American English is distinct from the northern variant, which is split between inland north and north central.

Then you've got New England English, and California English, and the English they speak in the uppermost parts of the upper Midwest (which to my inexpert ear has a distinctly Canadian timbre to it). Throw in the accents and verbal peculiarities of the old eastern industrial cities like Boston, Philadelphia, and Pittsburgh and you realise that, far from being homogeneous, the US is home to a veritable linguistic stew.

If there is an American equivalent of our received pronunciation (RP), it probably exists in the north midland area. This, it has been theorised, is why so many tele-marketing companies are based in Omaha, Nebraska, whose inhabitants are held to speak a "neutral" American that sounds normal everywhere across the country.

Internal linguistic boundaries are shifting. Once the Mason-Dixon line, roughly Maryland's border with Pennsylvania, was supposed to be the cultural demarcation line between the north and south, with Washington DC firmly in the southern camp. These days, the city feels more northern than southern, and the expansion of its suburbs in northern Virginia, it is argued, has pushed the linguistic frontier southwards to Richmond, capital of the old Confederacy.

More to the point, in this age of instant communication, you would have imagined that such linguistic differences were disappearing, flattened by the triple juggernaut of cable television, the internet and proliferating social networks. But not a bit of it. A fascinating new study of traffic on Twitter suggests that even in a 140-character universe, regional dialects and peculiarities are thriving.

Researchers at Carnegie Mellon University in Pittsburgh zeroed in on a single week in March last year. They looked at a total of 380,000 messages, each geo-tagged so that the location of the tweeter could be pinpointed. And amid the distortions and abbreviations of Twitter-speak, geography shone through – so much so that Jacob Eisenstein, one of the study's authors, boasted that, with a reasonable sample, his team could identify where a microblogger was based to within 300 miles. Not bad in a country 3,000 miles across.

Some tweets, of course, were dead giveaways. Take "yinz" and "y'all". The former is particular to Pittsburgh, while the latter is a standard identifying mark of a southerner. But both stem from the same problem, the lack of a distinct form for the second person plural in English: "yinz" is a contraction of "you ones", while "y'all" is short for "you all".

But other distinctions are more surprising. Take "cool", in its non-meteorological sense. In southern California, it is tweeted as "coo"; in northern California, as "koo". You can be very tired anywhere – but in northern California, people tend to be "hella" tired, in New York you're "deadass" tired, while in LA you're simply "tired af" (the last two letters signal a common vulgarism).

When tweeters are amused, they frequently say "LoL". But in Washington, they prefer another vulgar acronym: "LLS". In the gritty northeastern troika of Pittsburgh, Cleveland and Philadelphia, tweeters are coarser still. "Ctfu", they tweet, which stands for "cracking the fuck up".

Another oddity is provided by New York. In Twitterese, something is often written "sumthin". In the Big Apple, a common variant is "suttin". "U" is a standard abbreviation for "you", in texting and tweeting. But the Big Apple is again the exception, preferring "uu" or "youu". And when they're talking about themselves, they again double up. It's not "I", but "II".

Maybe we shouldn't be too surprised. Slang is often shorthand, and frequently it is regionally based. Not only does Twitter demand succinctness, it also links people with similar interests, who are likely to live in the same part of the country. Purists will be dismayed at what they see as further debasement of the tongue. Studies like this one, they would argue, serve only to lend respectability to the deplorable.

But you can't help wondering. The vitality of English in part reflects the relative closeness of the written language to the spoken. Twitter is accelerating this process. The intriguing question is: will some or other bit of regional Twitter-speak go global – in other words, amid the clutter, could a new OK be lurking?