The Cancer Genome Atlas is mining a treasure trove of genetic data in search of new cancer insights
May 21, 2014 | Leslie Hill
The human genome encompasses 3.3 billion base pairs in 25,000 genes, sequenced over 13 years in a $3 billion international effort.
Like the moon landing or cloning Dolly the sheep, the Human Genome Project—the mission to map a human’s complete set of DNA—captured the public’s attention and blew open the genetics field.
In the conclusion of an article analyzing the genetic sequence in the journal Nature in 2001, the researchers recognized the scientific cusp they were standing on: “The Human Genome Project … provides a capstone for efforts in the past century to discover genetic information and a foundation for efforts in the coming century to understand it.”
One of those efforts is The Cancer Genome Atlas, which uses large-scale genome sequencing to identify changes in the DNA of individual cancers. Understanding which mutations occur and how they drive errant cell growth is the groundwork for improving cancer care.
Vanderbilt-Ingram Cancer Center members have contributed tumor biospecimens to TCGA, have used its data in research projects and are now leading the extension of this work into proteomics. It’s all part of a national effort to better prevent, detect and treat cancer.
Cancer is a disease of genes gone wrong, but we don’t know what all of those errors are. As gene sequencing technologies became cheaper, faster and more accurate through the 2000s, and with the fruit fly, chimpanzee and human genomes all sequenced, it was natural to turn to cancer next.
The Cancer Genome Atlas (cleverly initialed TCGA, the letters of the four building blocks that make up DNA) began as a pilot in 2006 with $100 million in funding from the National Cancer Institute and National Human Genome Research Institute, and focused on sequencing three cancers: lung, ovarian and glioblastoma (a type of brain tumor).
The pilot proved the concept and goals: that teams around the country could pool their results and create valid data, and that data could be made freely available to anyone in the world for further research.
Based on that success, the NCI committed the resources to sequence additional tumor types, and as of 2014, there are 32 tumor types completely sequenced or in progress. More than 350 research papers have been published citing TCGA data.
“TCGA provides tremendous opportunities for researchers to further unlock the mysteries of cancer at the molecular level. Eventually, our investment in this effort may contribute to the foundation of precision medicine, including individualized therapies to exploit tumor weaknesses, resulting in hope for a future with reduced cancer mortality,” said Stephen J. Chanock, M.D., director of NCI’s Division of Cancer Epidemiology and Genetics.
Some of those hopes have already been realized. Basal-like breast cancer shares striking molecular similarities with serous ovarian cancer, suggesting the two could be treated with the same therapies. Genomic patterns in colon and rectal tumors are the same, suggesting these cancer types should be grouped as one.
From the glioblastoma genetic sequencing, researchers discovered three previously unrecognized but frequently occurring mutations, plus a potential reason some patients are resistant to a common chemotherapy drug. In kidney cancer samples, researchers found a mutation that may cause increased aggressiveness in clear cell renal cell carcinoma.
Ron Eavey, M.D., Guy M. Maness Professor of Otolaryngology and chair of the Department of Otolaryngology, who supports head and neck cancer research, said TCGA is proving that the genes responsible for a tumor are more important than the location of the tumor.
“My sense is that in the future we’re not going to care whether someone has cancer on their nose or their toe, and we’ll stop classifying it as nose cancer or toe cancer. We’re going to start thinking about the genetic profile. How did the cells become aberrant? And that obviously is going to be a lot of different mechanisms, but some of those mechanisms have vulnerabilities and this will obviously help cancer researchers and clinicians develop new treatments.”
Eavey said TCGA discoveries also have implications for widespread cancer prevention.
“Another hope in the future is that if you know the genetic profile, perhaps there is some way of not having to wait to treat cancer. If my genetic profile says I’m highly susceptible to a certain cancer, maybe there are things that can be done to detect it early or potentially even prevent it altogether.”
The success of TCGA depended on medical centers around the country committing to collect tumor samples from patients during treatment and submit them to TCGA biospecimen cores for processing.
Eavey and research assistants Brandee Brown and Michelle Pham coordinated the collection of head and neck cancer specimens at Vanderbilt.
“The Barry and Amy Baker Laboratory at Vanderbilt produced 80 specimens for The Cancer Genome Atlas, and the entire rest of the United States produced 420. We’re extremely proud of that contribution and grateful to the patients who donated their tissue,” Eavey said.
The lab is now collecting at an even higher rate, from around 8-15 new patients per week, for its own Head and Neck Cancer Tissue Biorepository, which now holds more than 30,000 specimens.
Sarcoma samples were collected by Ginger Holt, M.D., associate professor of Orthopaedic Surgery and Rehabilitation, Justin Cates, M.D., Ph.D., associate professor of Pathology, Microbiology and Immunology, and Kim Johnson, senior research specialist. Nearly 30 samples were contributed to TCGA.
Sarcoma is the term given to a broad spectrum of rare cancers in bone or soft tissue. There are only 9,000-10,000 cases in the U.S. each year, and Holt said TCGA is an unprecedented opportunity to research this cancer.
“We have failed miserably to do genome sequencing in our own labs because it is so scarce, and there are 50 subtypes of sarcoma. But for the variety of tumors, it is very important to do this genetic analysis and comparison,” Holt said. “It’s great that a central body has taken interest in this, and it’s important enough that we need and want to invest our energies in it.”
Holt points to Ewing sarcoma to illustrate the importance of having genetic sequencing for a cancer.
“In Ewing sarcoma, we have found the translocation and know fusion gene targets. There has been a ton of research due to that, and we have some good treatments,” she said.
“This project will give us targets to attack in other sarcomas. Amazingly, the overall survival for sarcoma has not changed since 1970. Surgery has improved greatly, but we’re still giving the same chemo we gave in 1970. It’s time we had some new targets and therapies, and TCGA should do that.”
Sarcoma sample collection ended in December and analysis is underway.
“TCGA is good proof of the concept, that if you have the funding, the network and the energy, good things will come out of it. The sum is far greater than the parts,” Holt said.
The Next Step
If cancer were a restaurant, DNA would be the menu, the list of all possible orders. RNA would be the waiter’s slip, the communication between the menu and the kitchen. And proteins would be the food on the plate.
“Understanding cancer from just doing genomics is like judging a restaurant by licking the menu,” said Dan Liebler, Ph.D., Ingram Professor of Cancer Research and director of the Jim Ayers Institute for Precancer Detection and Diagnosis.
Liebler is interested in the food, and many believe that his field—proteomics, or the study of proteins—is the future of cancer research and treatment.
“The DNA and RNA give rise to abnormalities in proteins and their function, and it’s the proteins that actually make things go wrong. That’s what gives rise to abnormal function and growth and resistance to drugs.”
From 2006 to 2011, Liebler led one of five centers participating in an NCI-funded project called Clinical Proteomic Technology Assessment for Cancer (CPTAC), which sought to demonstrate that proteomic data could be collected, analyzed and shared at the same quality level as genomic data.
The project was renewed in 2011 as the Clinical Proteomic Tumor Analysis Consortium with the charge to use TCGA genomic data and tumor tissues to complete a proteomic analysis of certain cancers. The Vanderbilt Proteome Characterization Center was assigned colorectal cancer, while the Broad Institute studied breast cancer, and Johns Hopkins and Pacific Northwest National Laboratory collaborated on ovarian cancer.
“TCGA made it possible for us to do this. Not only did they collect all the specimens, but they gave them the full genomic annotation, which is the baseline information that we used to do our proteomics. We used their genomics to do better proteomics,” Liebler said.
By February 2013, Liebler’s team had used 95 TCGA colorectal tumor samples to generate the largest proteomics data set ever collected, which was made available for public download in September.
“We had to buy a new server because we had never had that much data before,” Liebler said. The research has been accepted for publication in the journal Nature.
One of the big questions to answer was how messenger RNA levels (the waiter’s order slip) compare to protein levels (the food served). It turns out only 1 in 3 orders is significantly correlated. The restaurant gets the order wrong two-thirds of the time.
“In the one-third that correlate, we know that is a very reinforced message. In the two-thirds that don’t correlate, the question is why? We can learn from that discordance,” Liebler explained.
This research was also the first to demonstrate proteomic subtypes for any type of cancer. It found five subtypes in colorectal cancer.
“Our ability to classify cancers into subgroups that are biologically distinct will really help us decide who we can give our established therapies to and expect good outcomes, and to develop new drug targets.”
The research is also helping scientists understand why some tumors may not respond to a drug even though they carry its known target. Certain subtypes of lung and colon cancer, for example, share the same genetic mutation, but the colon cancer patients do not respond to the therapy.
“We now know that the colon tumors have the same protein target but have already developed two or three ways to signal around that target. So even though you hit it with the drug, you haven’t blocked that signaling. Lung cancers aren’t wired that same way.”
Liebler says proteomic technologies rely on a totally different kind of chemistry than genomic technologies and have lagged behind them by about 15 years. With proteomics catching up, he sees a future where the two technologies are used in tandem for better diagnosis and treatment.
“Cancers do fall into bins that have tendencies to behave in a certain way. We now know that there are bins, and we know that some of the features of the bins from cancer to cancer are recurring. That’s why certain drugs tend to work in multiple cancers but not all cancers,” he said.
“The thing that I’m excited about from a diagnostics perspective is that we can maybe use protein measurements in the near future to really clearly establish which bin you’re in.”