Dealing with the data deluge

March 12, 2013 | Melissa Stamm

In recent years, technological advances have exponentially increased the speed and decreased the cost of cancer genomics.

“The cost of sequencing is now plummeting,” said William Pao, M.D., Ph.D., director of the Division of Hematology and Oncology and of Personalized Cancer Medicine at Vanderbilt.

Sequencing one person’s genome cost around $100 million in 2001 – but now costs less than $10,000, according to the National Human Genome Research Institute. The time required to complete a sequence has gone from years to about one month.

But all that progress comes with a downside: the need to develop resources that can handle this deluge of genomic data.

In September 2011, Vanderbilt University Medical Center established the Center for Quantitative Sciences (CQS), directed by Yu Shyr, Ph.D., professor of Biostatistics, Biomedical Informatics, and Cancer Biology, to help investigators manage, analyze and interpret genomic data.

“Our primary goal is not only to provide traditional data analysis support…but to provide a systems biology approach,” Shyr said.

Computer storage and processing capacity are limiting factors in dealing with data of this magnitude, he said. For example, storing and processing the whole genome sequence from one individual can require half a terabyte of space. That’s roughly the amount of space it would take to store 500 million pages of text.
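
As a rough check on that comparison, the arithmetic can be sketched in a few lines of Python. The figures for half a terabyte and 500 million pages come from the article; the decimal definition of a terabyte (10^12 bytes) and the helper `genomes_storage_tb` are assumptions made here for illustration only.

```python
# Back-of-the-envelope check; decimal units (1 TB = 10**12 bytes) are an assumption.
TERABYTE_BYTES = 10**12

genome_bytes = TERABYTE_BYTES // 2   # half a terabyte per whole genome (from the article)
pages = 500_000_000                  # 500 million pages of text (from the article)

# Size of one page implied by the article's comparison.
bytes_per_page = genome_bytes / pages
print(f"Implied size of one page: {bytes_per_page:.0f} bytes")


def genomes_storage_tb(n_genomes: int) -> float:
    """Hypothetical helper: terabytes needed for n_genomes at half a terabyte each."""
    return n_genomes * genome_bytes / TERABYTE_BYTES


print(f"Storage for 1,000 genomes: {genomes_storage_tb(1000):.0f} TB")
```

Under these assumptions the comparison works out to about 1,000 bytes, or roughly 1,000 characters of plain text, per page, and sequencing 1,000 individuals would call for about 500 terabytes.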

While Shyr notes that computing and processing limitations are a concern, Vanderbilt’s institutional resources – like the computing resource ACCRE (Advanced Computing Center for Research and Education) and genomic resources like VANTAGE (Vanderbilt Technologies for Advanced Genomics) and VANGARD (Vanderbilt Technologies for Advanced Genomics Analysis and Research Design) – provide fertile ground for genomics research.

Shyr, who leads a team of more than 30 scientists and draws on a long history of collaboration across the Vanderbilt campus, believes that a team-oriented approach is crucial to addressing the complex issues in genomics research.

The faculty and staff of the CQS are involved throughout the experimental process – from experimental design to analysis and interpretation of the data.

Shyr’s group also helps integrate genomic data from Vanderbilt investigators with national and international databases. His group has developed tools to allow Vanderbilt-Ingram investigators to access data from The Cancer Genome Atlas project – an initiative by the National Cancer Institute and the National Human Genome Research Institute to characterize the genomes of several major cancer types.

This level of integration is essential to progress in untangling cancer’s complex genetic roots.

“We’re not only looking for a single gene, we’re looking for a pathway. And not only a single pathway, but a network of pathways at the systems biology level,” he said. “So I do think for the future of genomics research, we need a team of quantitative scientists to support it.”