AI system will trawl through millions of academic journals to find links missed by scientists

For the dogged scientist who has dedicated a lifetime to investigating minutiae such as the mating rituals of Amazonian ants or the weather patterns on far-off planets, it is a harsh reality that barely half of published research papers are read by anyone other than the authors themselves and their editors.

Such is the volume of august learning pouring from the world’s academic institutions that it is estimated some 1.8 million articles are produced every year to fill some 28,000 journals - far more than it is humanly possible to read and digest, even within relatively narrow fields.

But while it is easy to poke fun at the obscurity of some areas of scientific endeavour, the more worrying impact of this deluge of findings is the risk that small but vital breakthroughs are being missed. If the claims of an American artificial intelligence (AI) venture are to be believed, that peril is about to diminish dramatically thanks to computers.

The Allen Institute for Artificial Intelligence (AI2), set up by Microsoft co-founder Paul Allen, yesterday launched a search system that will use AI to trawl through the millions of academic papers publicly available online to tease out connections and findings which may have been previously missed. According to one study, one in two academic papers is only ever read by its authors, journal editors and referees. Some 90 per cent are never cited elsewhere.

Such is the pace of advances in AI - the science of trying to make machines capable of independent thought - that the creators of the Semantic Scholar tool believe that within a generation it and similar search engines will be able to relieve scientists of the tiresome business of having to read their peers’ research altogether.

Oren Etzioni, head of AI2, said: “What if a cure for an intractable cancer is hidden within the tedious reports on thousands of clinical studies? In 20 years’ time, AI will be able to read - and more importantly understand - scientific text.

“These AI readers will be able to connect the dots between disparate studies to identify novel hypotheses and to suggest experiments which would otherwise be missed. AI-based discovery engines will help find the answers to science’s thorniest problems.”

The tool will allow researchers to narrow their searches by a range of criteria, from specific diseases or drugs to age groups or types of information such as graphs or data sets. Users will also be able to pose questions in natural language, for example “What are papers saying about middle-aged women with diabetes and this particular drug?”

The search engine, which is being made available free of charge, will initially look only at publications relating to computer science, with other disciplines such as biology and chemistry expected to follow in the coming months.

Other organisations are also perfecting systems for analysing scientific discoveries. The research arm of the Pentagon is working on a project to identify potential treatments for some cancers by using a search mechanism to mine all existing research on the diseases for missed breakthroughs.

Dr Etzioni said: “With millions of papers coming out each year, there are no Renaissance men or women any more. People’s eyes glaze over and they miss that key paper or technique that they could use, in a medical case, to save somebody’s life.”