What might it allow us to do that we have not been able to do with the SEER database? The SEER database was created by the National Cancer Institute (NCI).

Dr Yu: First, I think that the SEER database is the gold standard, or the best source of cancer observational data. It was designed as a research tool to actually ask questions.

Dr Miller: Many of us who have used them can tell which ones oncologists had a hand in developing, and maybe which ones were done elsewhere. A hand in developing, a hand in the input and language of the field, or domain experts who understand what doctors really need, what patients really need, and, more importantly, how work flows so doctors can work efficiently.

Dr Yu: CancerLinQ traces back to the Institute of Medicine, now called the National Academy of Medicine, which held a series of workshops a few years ago about big data and what was called "rapid learning systems." The idea was, with the massive data that we will be acquiring in the decades to come, how can we learn from that data and create a system where we learn from real-world experiences, understand what is actually happening out in the field, and supplement what we learn from randomized clinical trials? One of the things we are finding in CancerLinQ and other data sources is that we need to work with both the EHR vendors and our physicians to do a better job, frankly, of documentation so that it is more clear and accurate. We have all seen cutting and pasting; things like this really lead to the problem of not very good data. Besides working with our practices to get better-quality data into their records so that better data come up to CancerLinQ, we have been having discussions with SEER for over a year now about linking data.

Dr Yu: That gets to another point, which is that data are only as good as what people enter. You can try to clean it up, but it takes a lot of human work, which is expensive and somewhat wasteful and inefficient.

The NCI is about 50 or 60 years old, so this is not a new idea. The NCI itself has said that observational data are important, important for research use. The SEER database does not cover the entire US population but a good chunk of it. Welcome to Medscape Oncology Insights, coming to you from the 2016 annual meeting of the American Society of Clinical Oncology (ASCO). I am Kathy Miller, professor of medicine at the Indiana University School of Medicine in Indianapolis, Indiana. We have been hearing a lot about big data over the past year or so. We need to think carefully about what the clinical value is of huge compilations of patient information.

Dr Miller: I really appreciate you taking the time from your busy schedule to talk to us about big data. How can we dive deep into data to understand that, and maybe develop a randomized clinical trial to definitively answer [the questions]? Real-world big data (observational data, or whatever you want to call it) really complement the randomized clinical trial. One of the ways to make observational data more useful, more accurate, more reliable and trustworthy is to link datasets so that you can triangulate, fill in the gaps, and have a more complete understanding.

Dr Miller: Maybe also verify some of those data fields, because the SEER data are curated by humans.


