Science: Hunt for the killer chip: From airliners to washing machines, our lives are controlled by computer software. Susan Watts talks to the experts who have to make it safe

Eleanor Bellamy, an 82-year-old widow, died after running her washing machine overnight. It burst into flames while she lay in bed. The incident, in Sheffield two years ago, was a tragic lesson in the dangers of poor software engineering. An error in the program controlling the machine had caused it to overheat.

The threat from computer-controlled washing machines may be less obvious than that from computer-controlled chemical plants or transport systems, but it is every bit as real. The quiet, all-pervasive march of the microprocessor means that almost every piece of machinery with which people come into contact now contains a computer chip.

The software that runs on these chips can go spectacularly wrong. Big, noisy software disasters will always make headlines - exploding space shuttles, air crashes and misdirected ambulances. But the reliability of more mundane software, whether controlling traffic lights or domestic appliances, is just as much a matter of life and death.

Experience in such 'safety-critical' software is limited to a select group of experts scattered through a mixed bunch of industries. They all face the same technical challenges and the same potentially disastrous outcomes, yet vital experience that could help to prevent accidents is not being passed on.

This month 50 specialists in safety-critical systems gathered in Berkshire to swap ideas. It was a rare opportunity to tackle a serious obstacle to the prevention of software disasters - a block on the free flow of information created by fears over commercial confidentiality.

Exchanging information, even within the same industry, is often a battle. The chance to swap experience between sectors as diverse as the nuclear, transport and automotive industries is rare indeed. Brian Wichmann, an expert on safety-critical software from the National Physical Laboratory, gave the keynote address. He said it usually takes a disaster or a close shave before an industry wakes up to the importance of checking the reliability of the control software it relies on.

Nasa paid little attention to software quality until the Challenger disaster, a speaker from the space agency told the meeting. The medical world woke up to the role of software reliability only after an error in the system controlling a cancer therapy machine was blamed for several deaths after radiation overdoses.

It would be invaluable if companies published documents setting out how they have tackled the design of safety-critical software - a potentially vital guidebook for those about to embark on building similar systems of their own. This rarely happens. Just last week, Eurotunnel refused to publish the results of tests on the vast array of computerised systems in the Channel tunnel.

Mr Wichmann said that pressure to keep information confidential means the results of near misses are rarely written up. 'What is happening is a bit like what happens with people. Every generation is having to learn its lessons by making the same mistakes,' he said.

Perhaps most worrying is the reluctance of organisations which have real accidents or disasters to publish details of what went wrong. There are exceptions - often these are publicly-funded organisations on whom commercial pressures are less acute, and the sense of moral obligation to help society at large avoid repeating mistakes stronger.

The report on the 1992 breakdown of the London Ambulance Service computer was one example. The investigation into the software and quality management problems behind computer failures, alleged to have cost between 10 and 20 lives, resulted in a comprehensive document of enormous value to others.

There have been attempts to prompt industries to share expertise and experience. In 1990 the Department of Trade and Industry launched a 'Safe IT' programme, aimed at identifying best practice and producing guidelines.

Some industry experts say privately that the scheme has been a flop. People are prepared to share just so much information, but reach a cut-off point beyond which they will not venture for fear of giving away trade secrets.

Mike Hennell, from Liverpool Data Research Associates, is a consultant on safety-critical software. 'The people we really need to get to are the ones that don't even come to the conferences - the medics and the motor industry in particular,' he said after the recent meeting.

In the past few years the automotive industry has seen the arrival of computer-controlled engine management and braking systems. Even the airbags on Ford's latest Mondeo cars are controlled by chips. Ford and Rover were represented at the Berkshire meeting - aware, perhaps, of the damage to public relations and profits from any major recall of cars through a software fault.

Steve Collins, managing director of Real Time Associates, a computer language company, organised the Berkshire meeting. He said one surprise for delegates was the similarity of the approaches they had adopted independently.

Many of the delegates shared a lack of faith in international standards on software quality, and had made only passing reference to these in their systems design. They said these standards were inflexible and criticised them for ignoring the human aspect of systems design.

Fiona Hovenden, from the Open University's computer department, highlighted this problem at the meeting. She said one of the biggest difficulties in assuring software quality is taking control of the industry's 'mavericks' - talented but quirky people who often fail to document properly the software they write, making it hard for others to monitor the quality of their work. The answer, she said, is to force them to get involved by placing them in charge of software quality for a team of designers.

The level of ignorance of the risks is frightening. 'Sometimes I despair,' Mr Hennell said. 'I have been saying the same things for 20 years now, yet there are still some industries in which people are not even aware that the systems they are devising are safety-critical.'
