Project GENIE Aims to Make Sense of Vast Amounts of Genomic Data

By Julie Grisham

Tuesday, January 5, 2016

Advances in gene sequencing methods have revolutionized our genetic understanding of cancer. But they’ve also led to such large quantities of data that scientists are struggling to analyze it all. Now a multicenter project aims to find better ways to interpret this abundance of information.

  • MSK’s Charles Sawyers is spearheading the initiative, called GENIE.
  • Six other leading cancer centers are involved in phase one of the project.
  • The goal is to improve clinical decision-making and provide new information for diagnosis and treatment.
  • The GENIE database already holds nearly 17,000 genomic records.

Genes hold an important key to our understanding of cancer and its underlying causes. A technology called next-generation sequencing, which significantly improves upon older gene sequencing methods, is revolutionizing the way cancer research is conducted. But the large amount of genomic data it produces has cancer researchers still figuring out the best ways to analyze the information.

To address this urgent demand, the American Association for Cancer Research (AACR) helped create a program called Project GENIE (Genomics Evidence Neoplasia Information Exchange). Memorial Sloan Kettering physician-scientist Charles Sawyers, a past president of the AACR, is spearheading the effort, which was announced late last year. Along with MSK, six other leading cancer research institutions will collaborate on the program.

We recently spoke with Dr. Sawyers about the creation of Project GENIE and his goals for it.

Why was a program like this needed?

Every year, thousands of cancer patients have the genomic changes in their tumors analyzed. The idea is that mutations found in the tumor will provide insights for personalized medicine. In some cases, the findings from tumor analyses clearly suggest the most appropriate therapy. But this level of clarity is available for only a handful of the hundreds of mutations found in patients’ tumors.

One of the reasons is that it’s difficult for one institution to collect enough data to make statistically significant connections between a particular mutation and a particular cancer, especially for tumors and mutations that are rare. The numbers are just too small. By aggregating existing and future genomic data from all seven participating institutions, we hope to address this problem.

Once the program can interpret the data, how will it be used?

Beyond clinical decision-making, the data within the project can be used to identify and validate biomarkers [substances that may indicate the presence of cancer in the body], which can help with screening and diagnosis. They can also be used to identify additional mutations that can be targeted with drugs — either new drugs or existing ones. We hope they will also help justify to insurance companies and other healthcare payers why genomic analysis of tumors is valuable and worthwhile. Finally, we envision that the shared learnings from GENIE will benefit other global consortia and vice versa.

How is the project set up?

The first phase of the project is limited to the seven founding members, but other institutions will have opportunities to join in later phases. Keeping in line with the themes of openness and sharing, all the project’s policies are available on its website. Announcements of data releases and membership applications will be equally open and transparent.

At the time GENIE was created, the database already held nearly 17,000 genomic records. This number is growing quickly as more patients are treated at participating institutions. In phase I, we’ll look at clinical outcomes data in response to specific clinical queries. These data will then be aggregated and linked in a way that will allow us to make correlations between various gene alterations and particular types of cancer, as well as how patients respond to therapies.

This process will continue to evolve and become more streamlined as the project moves into its later stages. In addition to making the data open to the community, we will be evaluating requests for various clinical queries and undertaking subprojects through sponsored research agreements.

What are your goals for this project in the longer term?

In the not-so-distant future, it is likely that all cancer patients around the world will have their tumor genomically sequenced, and that their physician will use a data registry to help make treatment decisions. The information will flow freely and patients will benefit rapidly. When this time comes, everyone involved in GENIE will know that he or she played a major role in making this new reality happen.

The other institutions involved in Project GENIE include the Center for Personalized Cancer Treatment in the Netherlands, the Dana-Farber Cancer Institute, Institut Gustave Roussy in France, the Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins University, Princess Margaret Cancer Centre in Canada, and the Vanderbilt-Ingram Cancer Center. Two bioinformatics partners will be involved to help with the computational aspects — the nonprofit, Seattle-based research group Sage Bionetworks and the cBioPortal for Cancer Genomics, which is located within MSK.


Hello, my wife has stage 4 NHFL. She was diagnosed in 2011 and had 6 months of chemo followed by 2 years of maintenance therapy. Back in 2002, while she was pregnant with our daughter, it was found that she had a balanced chromosome translocation 1 and 19. Does what you're researching have anything to do with this? And if so, does this increase our daughter's chance of getting cancer? Did the translocation have anything to do with my wife's cancer?
Thank you
Philip Rizk

Philip, thank you for your question. Research is ongoing but it is not clear what causes follicular and other non-Hodgkin's lymphomas. Unlike some cancers, they are not passed down in families.

For questions about inherited risk for cancers, you can contact our Clinical Genetics Service. They can be reached at 646-888-4050. You can learn more about hereditary genetics and cancer at…

For additional information, you might contact the National Cancer Institute’s Cancer Information Service at 800-4-CANCER (800-422-6237). To learn more about the CIS, including Live Chat help and how to send them an email message, go to

I have read that one of the conclusions of The Cancer Genome Atlas study is that the mutations in tumors have a high degree of randomness and it is impossible to find any kind of consistent pattern from cell-to-cell or from patient-to-patient. Is this not true?

David, thank you for your comment. We sent your question to researcher Barry Taylor, who replied, “Thanks for your question. Indeed, one of the many conclusions of The Cancer Genome Atlas studies is that there is a tremendous degree of complexity in the mutations and genomic changes present in tumors, both within and across cancer types. Nevertheless, they have also found very specific and consistent patterns that are expanding our understanding of many cancer types. Of course, there is much work left to be done to reveal what this complexity means and how it can be exploited for the benefit of patients.”
