Spies in the 'forests'

Last week, The Independent reported that a US spy agency had patented a system for eavesdropping on phone calls. Now it is lab-testing software that can sift through calls and e-mails in search of key phrases.

THE US Department of Defense is lab-testing technology that could make it easier to sift automatically through a vast pool of private communications, including international telephone calls, in a manner similar to using an Internet search engine.

The technology, called "Semantic Forests", is a software program that analyses voice transcripts and other documents in order to allow intelligent searching for specific topics. The software could be used to analyse computer-transcribed telephone conversations. It is named for its use of an electronic dictionary to make a weighted "tree" of meanings for each word in a target document.
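The idea of a weighted "tree" of meanings can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy dictionary, the function names `semantic_tree` and `document_forest`, and the halving of weight at each level are hypothetical, and the article does not describe how the actual software assigns its weights.

```python
from collections import defaultdict

# Hypothetical toy dictionary: each headword maps to the words of its definition.
# A real system would use a full electronic dictionary instead.
TOY_DICTIONARY = {
    "sanction": ["penalty", "trade", "restriction"],
    "penalty": ["punishment"],
    "trade": ["commerce"],
}


def semantic_tree(word, dictionary, depth=2, decay=0.5):
    """Return {term: weight} for a word and the terms reached via its definitions.

    The word itself gets weight 1.0; each step further down the tree
    multiplies the weight by `decay` (an assumed weighting scheme).
    """
    weights = defaultdict(float)
    frontier = [(word, 1.0)]
    for _ in range(depth + 1):  # root level plus `depth` levels of definitions
        next_frontier = []
        for term, weight in frontier:
            weights[term] += weight
            for child in dictionary.get(term, []):
                next_frontier.append((child, weight * decay))
        frontier = next_frontier
    return dict(weights)


def document_forest(words, dictionary, depth=2, decay=0.5):
    """Combine the trees of every word in a document into one weighted term map."""
    totals = defaultdict(float)
    for word in words:
        for term, weight in semantic_tree(word, dictionary, depth, decay).items():
            totals[term] += weight
    return dict(totals)


if __name__ == "__main__":
    print(semantic_tree("sanction", TOY_DICTIONARY))
    print(document_forest(["sanction", "trade"], TOY_DICTIONARY))
```

In this sketch a document mentioning "sanction" also carries some weight for "punishment" and "commerce", so a query phrased in different words could still match it. The depth limit keeps the expansion finite even if dictionary definitions refer back to one another.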

Two US Department of Defense academic papers, published as part of the Text Retrieval Conference (TREC) in 1997 and 1998, provide the first evidence that the US government has actually built a working prototype of this technology and is testing it. The papers reveal that the US military had been honing Semantic Forests over at least two years, from 1996 to 1998, to make it more effective at siphoning off useful information.

According to the 1998 paper, the software was originally developed to "work with imperfect speech recogniser transcripts". The US Department of Defense declined to comment on the matter.

In a series of lab tests, the software sifted through large pools of documents, including transcripts of speech and data from Internet discussion groups. In one set of tests, scientists increased the average precision rate for finding relevant documents per query from 19 per cent to 27 per cent in just one year, from 1997 to 1998.
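To give a sense of what an "average precision" figure measures, here is a minimal sketch. It assumes the standard non-interpolated average precision commonly used in retrieval evaluation; the article does not say exactly which measure the papers report, and the function names and document IDs below are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query.

    At each rank where a relevant document appears, take the fraction of
    documents retrieved so far that are relevant; then average those
    fractions over the total number of relevant documents.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    seen = set()
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant and doc_id not in seen:
            seen.add(doc_id)
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of per-query average precision over (ranking, relevant set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


def relative_gain(old, new):
    """Relative improvement of `new` over `old`, as a fraction of `old`."""
    return (new - old) / old


if __name__ == "__main__":
    # Relevant documents "a" and "c" are found at ranks 1 and 3.
    print(average_precision(["a", "b", "c", "d"], {"a", "c"}))
    # The reported move from 19 per cent to 27 per cent.
    print(relative_gain(0.19, 0.27))
```

On the figures quoted above, a rise from 19 per cent to 27 per cent is eight percentage points, or a relative improvement of roughly 42 per cent.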

It appears that Semantic Forests is intelligent enough to handle questions given in plain English. One of the sample questions used to test the software was, "What have the effects of the UN sanctions against Iraq been on the Iraqi people, the Iraqi economy, or world oil prices?"

The US National Security Agency is also closely associated with Semantic Forests. One of the authors of Semantic Forests, Patrick Shone, was also one of the inventors of an NSA-patented system for eavesdropping on international phone calls, which is similar to Semantic Forests.

The NSA applied for the patent, No 5,937,422, seven months before the first Semantic Forests paper was delivered at TREC. However, the patent only became public after winning US Patent Office approval in August this year.

The NSA is believed to conduct large-scale, automatic eavesdropping on some types of written international communications such as e-mail, according to a May 1999 interim report commissioned by the European Parliament's Scientific and Technical Options Assessment (STOA) panel.

Glyn Ford MEP, who instigated the STOA's investigation, said he was concerned that the US was testing technology that might be used to eavesdrop on international telephone calls. "It appears the NSA has abilities over and above what has been indicated to us to date," he said.

There was "strong circumstantial evidence" that the NSA had been engaged in economic espionage on occasion, passing intercepted information on to American companies to give them a competitive advantage, he said. While he was happy for intelligence agencies to spy on terrorists, he said that the NSA's "blanket approach" to monitoring telephone calls and e-mails was "a serious breach of privacy rights".

Cryptographer Julian Assange, who moderates the online Australian discussion forum AUCRYPTO, discovered the department papers while investigating NSA capabilities. "This is not some theoretical exercise. The US has actually built and lab tested this technology, which is clearly aimed at telephone calls. You don't make a wheel like this unless you have something to put it on," he said.

US Congressman Bob Barr, who previously served with the CIA, said: "This report underscores the need to update oversight procedures and legal standards designed in the 1970s and not updated since, in light of the revolutionary technological changes of the past two decades. A perfected system to intercept voice communications and allow government agencies to precisely pinpoint conversational topics of interest would create a truly awesome potential for privacy-invading abuses."

The outspoken Georgia Republican has been a driving force behind proposed legislation to force the NSA and CIA to report the legal standards that they use while conducting signals intelligence activities, including electronic surveillance. The legislation has passed both houses of Congress and is awaiting signature by President Clinton.

Dr Brian Gladman, the former director of Strategic Electronic Communications at the Ministry of Defence, said the NSA would always like to find better ways to filter "voice traffic" - international phone conversations - automatically for information. "The NSA's problem is finding needles in haystacks, and any technology that can chuck out hay without chucking out needles is of value to them," he said.

"Automation is essential. It is likely the success rate will be low, but this may not be an issue. It is better to deploy something that will allow 10 per cent of the interesting traffic to be found, than doing nothing and finding nothing."

Dr Gladman speculated that the NSA was not using the new technology on international telephone calls at the moment, but was doing trials on it "to see if it is worth deploying".

The two Semantic Forests academic papers came from the speech research branch of the US Department of Defense at Fort Meade, Maryland - the location of the headquarters of the NSA. When the 1998 paper was downloaded from the TREC conference Internet site, the name of the file was listed as "nsa-rev.pdf".

Bruce Schneier, the author of Applied Cryptography, claims that, paired with other types of spying technology, this software could have a significant impact on people's privacy. "This technology can be combined with voice-recognition technology to automatically find certain conversations by a particular person or ethnic group," he said.