The value of using data

The world creates 2.5 quintillion bytes of data every day, equivalent to the storage of over 150 million 16 GB iPads. In medical research, large and complex datasets contain huge amounts of information. Using this data in research is incredibly valuable, particularly when different datasets are linked together, as it can help us to understand the causes of disease, develop more effective treatments and improve our health services.

The key to this type of research is that it can be carried out on a large scale and reveal insights that would otherwise remain hidden. In clinical and population studies, huge numbers of people – sometimes in the millions – can be followed over many years to provide rich information on their health and behaviour. At the same time, large-scale studies in genetics and imaging are providing us with important information on disease mechanisms and the behaviour of cells and organs. These datasets are also offering opportunities to model complex bodily systems using computational approaches, which will allow us to understand how different components, such as cells, interact and underlie the behaviour of the whole system. Ultimately, this can help to shed light on the complex interaction between biological and lifestyle factors that cause disease, and how the body responds to different treatments.

The UK has a unique research advantage in this area. The NHS holds a wealth of patient information on the whole UK population throughout the life course. Researchers can analyse this data securely and link it with genetic and biological data to gain new medical insights. No other country has this amount and type of data held by a single healthcare provider.

Data in research

The use of data is extremely valuable in medical research, and allows researchers to make discoveries that could not be achieved using other methods.

Researchers from the Wellcome Trust-MRC Cambridge Stem Cell Institute and Microsoft Research created the first computer model that can simulate blood cell development. The team based the model on measurements of gene activity in over 3,900 blood stem cells. Having first modelled how the various blood cells in the body develop, researchers could then use the model to explore how blood cancers form and to find new treatments for them. “Because the computer simulations are very fast, we can quickly screen through lots of possibilities to pick the most promising ones as pathways for drug development,” says Professor Bertie Gottgens.

Data-driven research can also provide insight into how a drug works and who is most likely to benefit from taking it. Metformin, for example, is a drug widely used by patients with type 2 diabetes to control their blood sugar levels, but it also appears to be linked to a lower occurrence of cancer. In 2009 a group of MRC scientists explored whether metformin can protect against cancer by bringing together and analysing data from several health datasets. Their research showed that metformin does reduce cancer rates in diabetic patients by activating a cancer suppressor gene. This finding could have important implications in the fight against cancer.