By Patrick Snodgrass ’13, thurj Staff

Illustrated by Sam Mendez

Since their inception, the Broad Institute of Harvard and MIT and the Dana-Farber Cancer Institute have been interested in researching the genetic underpinnings of cancer. It is therefore no surprise that in 2004 these two institutions decided to collaborate with the Meyerson, Sellers, Getz, and Lander labs at Harvard Medical School (HMS) and MIT to fund the largest study in history investigating the mutations underlying cancer.

In this massive study, researchers have mapped and statistically analyzed the somatic copy number alterations (common insertions and deletions of genetic material implicated in cancer) of 3,131 cancer specimens from a range of cancer types. The resulting data have enabled the researchers to identify numerous previously undiscovered oncogenes. In addition, the researchers have made the startling discovery that the majority of mutations that lead to cancer are conserved across cancer types. In the process, the study has revealed exciting new research avenues in the fight against cancer, resulting in a seminal Nature publication and numerous new opportunities for future work.

Step 1: The Technology

Craig Mermel, a graduate student in Matthew Meyerson’s lab, was heavily involved in this research study. Describing somatic copy number alterations (SCNAs), Mermel explains, “[cells] gain some regions of the genome and lose some regions of the genome” when they are altered by cancer. The technology necessary to map these copy number changes that result from cancer, he asserts, has only been available in the last few years with the development of the single nucleotide polymorphism (SNP) array, which allows researchers to determine the genetic code of a cancer sample at numerous polymorphic sites in the genome.

The Meyerson lab extended the use of the SNP array when they applied it to SCNAs. As Mermel describes, “What Matthew’s [Meyerson’s] lab was interested in early on was extending this technology to enable you to actually read out the intensity of the DNA in addition to the genotype.” Researchers could then determine how many copies of a particular region were present in a cancer sample based on how much DNA was interacting with the SNP array probes. This allowed the lab to document the copy number at approximately 250,000 sites in each cancer sample.
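
The idea of reading copy number from probe intensity can be sketched in a few lines. This is a minimal illustration, not the Meyerson lab's actual pipeline: it assumes each site has a tumor intensity and a reference intensity from normal (two-copy) DNA, and that intensity scales linearly with the amount of DNA. The function names and the sample intensities are hypothetical.

```python
import math


def estimate_copy_number(tumor_intensity, normal_intensity, normal_copies=2):
    """Estimate copy number at one array site (hypothetical helper).

    Assumes probe intensity is proportional to the amount of DNA and that
    the normal reference carries `normal_copies` copies of the site.
    """
    if tumor_intensity < 0 or normal_intensity <= 0:
        raise ValueError("tumor intensity must be >= 0 and reference must be > 0")
    return normal_copies * tumor_intensity / normal_intensity


def log2_ratio(tumor_intensity, normal_intensity):
    """Log2 ratio of tumor to reference intensity; 0 means no change."""
    if tumor_intensity <= 0 or normal_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(tumor_intensity / normal_intensity)


# Made-up intensities, for illustration only.
print(estimate_copy_number(1500.0, 1000.0))  # signal is 1.5x the reference
print(log2_ratio(500.0, 1000.0))             # a negative value suggests a loss
```

On a log2 scale, gains come out positive and losses negative, which makes the two kinds of alteration easy to compare side by side.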

In order to determine the significance of the information gathered from this study, the researchers still had to develop a set of statistical tools to analyze the massive data sets they gathered. Dr. Rameen Beroukhim, an associate professor at HMS, provides insight into these statistical tools. He explains that “the way we determine which copy number changes are important is we determine which ones rise above the background rate” of mutations that would normally occur without the presence of cancer. The collaboration between his lab and the Meyerson lab over the last two to three years has resulted in the development of this methodology.
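
To make "rising above the background rate" concrete, here is a minimal sketch using a simple binomial model. This is an assumption chosen for illustration and is not the statistical method the Beroukhim and Meyerson labs developed, which the article does not describe in detail. The function name, the background rate, and the sample counts below are all hypothetical.

```python
from math import comb


def binomial_tail(n_samples, n_altered, background_rate):
    """Probability of seeing at least `n_altered` altered samples out of
    `n_samples` if each sample were altered independently at
    `background_rate` (hypothetical helper, simple binomial model)."""
    if not 0.0 <= background_rate <= 1.0:
        raise ValueError("background_rate must be between 0 and 1")
    if not 0 <= n_altered <= n_samples:
        raise ValueError("n_altered must be between 0 and n_samples")
    return sum(
        comb(n_samples, k)
        * background_rate ** k
        * (1.0 - background_rate) ** (n_samples - k)
        for k in range(n_altered, n_samples + 1)
    )


# Made-up numbers: 30 of 100 samples altered at a site where chance alone
# would alter about 5 percent of samples.
p_value = binomial_tail(100, 30, 0.05)
print(p_value < 0.001)  # a small tail probability flags the site as notable
```

A very small tail probability means the alteration recurs far more often than the assumed background would predict, which is the sense in which a copy number change "rises above" the background.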

Relying on both of these technologies, the research collaborative analyzed 3,131 cancer samples for copy number alterations at approximately 250,000 sites with an SNP array. They then used the newly developed statistical tools to determine which genes had statistically significant copy number alterations within individual cancer types and across all cancer types. This analysis gave them the data they needed to draw the following conclusions.

Step 2: The Results

After amassing sizeable datasets and statistically analyzing them within and between cancer types, the research group was able to come to two major conclusions.

First, the two different types of SCNAs (arm-level and focal) differ in their correlation to developmental lineage. Arm-level SCNAs occur when chromosome-length pieces of the genome are gained or lost, and focal SCNAs occur when very short pieces of the genome are gained or lost. Whereas the same arm-level SCNAs appeared in cancers that arose from the same developmental lineages, focal SCNAs showed differing patterns. The data from arm-level SCNAs alone, Mermel says, demonstrated that “whatever pattern of changes that you get seems to be constrained by some larger developmental lineage. There is some organization to cancer.” However, some other force seems to be driving the changes in focal SCNAs.
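
The distinction between the two SCNA types can be sketched as a simple length rule. This is only an illustration: the article does not say how the researchers drew the line, so the cutoff used here (an event spanning at least half of a chromosome arm counts as arm-level) is an assumption, as are the function name and the example lengths.

```python
def classify_scna(event_length, arm_length, arm_fraction_threshold=0.5):
    """Label a gain or loss as "arm-level" or "focal" (hypothetical helper).

    `arm_fraction_threshold` is an assumed cutoff, not a value taken from
    the study: events covering at least this fraction of the chromosome
    arm are called arm-level, shorter ones focal.
    """
    if arm_length <= 0 or event_length < 0:
        raise ValueError("arm_length must be > 0 and event_length must be >= 0")
    fraction = event_length / arm_length
    return "arm-level" if fraction >= arm_fraction_threshold else "focal"


# Made-up lengths in base pairs, for illustration only.
print(classify_scna(90_000_000, 100_000_000))  # covers most of the arm
print(classify_scna(1_000_000, 100_000_000))   # covers a small stretch
```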

Second, although cancer researchers have long known that certain mutations are common among cancers, this group was able to quantify, in the case of SCNAs, the extent to which the alterations found in one cancer type are conserved across others. As Beroukhim states, “we started to find that the events that were recurrent in any one cancer type tended to also be recurrent in other cancer types.” What is most surprising, as Mermel asserts, is that “eighty percent of the lesions in one cancer type are present in other types.”
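
A figure like the one Mermel quotes is, at bottom, a measure of overlap between sets of recurrent events. The sketch below shows one way such a fraction could be computed. It is an assumption about the bookkeeping, not the study's actual procedure, and the cancer type names and event labels are invented placeholders rather than real results.

```python
def shared_fraction(events_by_type, cancer_type):
    """Fraction of the recurrent events in `cancer_type` that are also
    recurrent in at least one other cancer type (hypothetical helper)."""
    own_events = events_by_type[cancer_type]
    if not own_events:
        return 0.0
    # Pool the recurrent events seen in every other cancer type.
    other_events = set().union(
        *(events for name, events in events_by_type.items() if name != cancer_type)
    )
    return len(own_events & other_events) / len(own_events)


# Placeholder labels only; these are not findings from the study.
toy_events = {
    "type_1": {"event_a", "event_b", "event_c", "event_d", "event_e"},
    "type_2": {"event_a", "event_b"},
    "type_3": {"event_c", "event_d", "event_x"},
}
print(shared_fraction(toy_events, "type_1"))  # 4 of the 5 events recur elsewhere
```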

Step 3: The Implications

In the process of analyzing these data sets, the researchers identified many genes that are among the most conserved across cancer types, such as those regulated by NF-κB and the genes MCL1, BCL2, and BCL2L1. Many of these genes were not previously believed to play a large role in cancer. Because cancer therapy is increasingly moving towards addressing the underlying biology of cancer after the success of Gleevec in treating chronic myeloid leukemia, these genes provide new targets for researchers designing cancer therapies.

This study also addresses a large debate within the cancer research community as to whether cancer should be treated as one or a handful of diseases with numerous manifestations or as a collection of thousands of different diseases each with its independent mutation signature. Although Beroukhim believes that “there will continue to be important effort in both directions,” the study does provide evidence that it is possible to look for cancer therapies that address the conserved genetic underpinnings of all cancer types. This discovery goes against the current movement in the cancer research community of focusing on cancer as thousands of separate diseases.

Finally, the study provides insight into a new way to diagnose cancer. “We are still diagnosing lung cancer by looking at it under a microscope,” says Mermel. He goes on to state that this study shows that it would be much more effective to diagnose cancer not only by tissue type but also by the mutations that lead to that cancer. Although it would take time to develop the technology and therapies necessary to make this a reality, it would lead to much better treatment outcomes.

Step 4: Future Work

Despite the success of the study, there is much work that must be done to extend and verify the results. Although over 3,000 cancer specimens were analyzed, the researchers understand that many conserved genes were missed. As a result, Mermel says, “we’d love more samples.” The more samples they analyze, the more small SCNA events they can identify. Thus, this is one major area of research that both Mermel and Beroukhim are currently investigating.

Using a higher resolution SNP array would also allow these researchers to identify more conserved mutations as well as glean more accurate information about the extent of their conservation. Regarding the current state of SNP array technology, Mermel comments, “where we had 250,000 markers across the genome, we can now do this at 2,000,000 markers across the genome.” Mermel even goes on to state that it is now possible to sequence the entire genome of these cancer samples, which would enable perfect resolution. Analyzing cancer samples at higher resolution is thus another area of interest.

Finally, this study only looked at SCNAs, which are but one of numerous types of mutations involved in cancer. As a result, Beroukhim asserts that it is essential to extend this study to identify “point mutations and other sources of genetic changes like translocations and fusion events.” What these researchers are eventually working toward is a study that analyzes all types of mutations across hundreds of samples from every cancer subtype. The hope is that such a study will enable doctors to design more effective diagnostic and treatment options for cancer. However, such a study is years away from realization.

This massive study was made possible only through the partnership of multiple labs at both Harvard and MIT as well as through the support of numerous organizations such as the Dana-Farber Cancer Institute and the Broad Institute. As the natural sciences, and especially the biomedical sciences, become increasingly interdisciplinary, such seminal work is often only possible through these academic and research partnerships. Given the enthusiasm for scientific collaboration that pervades the Beroukhim and Meyerson labs, the researchers will surely confront the challenges of extending this study and probing its implications with the same collaborative spirit that has driven their success to date.