Cloudy with a chance of finding new cures for cancer: CanSAR database adapts weather-forecasting computer programs


The world’s largest cancer database, which will use computer programs similar to those that forecast the weather to process millions of experimental results in seconds, has been launched by British scientists. The technological breakthrough has the potential to revolutionise the search for cancer cures.

CanSAR, developed by the Institute of Cancer Research and Cancer Research UK, will handle more data than would be generated by a million years of observations from the Hubble Space Telescope.

In total, 1.7 billion experimental results will be available, free of charge, to researchers all over the world. The database uses artificial intelligence to help scientists understand work in other fields and to find genes that affect cancer, a search that is otherwise expensive and time-consuming. Research that previously took months could now be done in minutes.

Dr Bissan Al-Lazikani, a member of the CanSAR team, said scientific advancements had created “gold mines” of raw data, but also a problem of information overload.

“The database is capable of extraordinarily complex virtual experiments drawing on information from patients, genetics, chemistry and other laboratory research. It can spot opportunities for future cancer treatments that no human eye could be expected to see.”

CanSAR contains more than eight million experimentally derived measurements, information on nearly a million biologically active chemical compounds, and data from more than 1,000 cancer cell lines. It also holds drug-target information from the human genome and laboratory animals.

“We are living in an exciting era where new technologies are allowing us to build huge databases of patient data, gene variations that are related to disease and many more clinical observations,” Dr Al-Lazikani said.

“The problem is, the more of these gold mines of raw material that we have, the more important the following question becomes: how do we bridge the gap from this raw knowledge to drugs for patients?

“CanSAR links such raw gold mines of genetic data to a whole raft of independent chemistry, biology, patient data and disease information. It then uses sophisticated computer machinery and artificial intelligence to draw paths of knowledge between them, predict risks and opportunities and make suggestions that can be tested in the lab and take us closer to a drug.”

CanSAR in numbers

The CanSAR database holds information on nearly a million biologically active chemical compounds and data from more than 1,000 cancer cell lines. It condenses more data than would be generated by operating the Hubble Space Telescope for a million years.