Machine Learning May Help Classify Cancers of Unknown Primary

MSK-IMPACT, a test designed to detect mutations and other critical changes in the genes of tumors, can supply important information for cancer diagnosis.

Experts estimate that between 2% and 5% of all cancers are classified as cancer of unknown primary (CUP), also called occult primary cancer. This means that the place in the body where the cancer began cannot be determined. Despite many advances in diagnostic technologies, the original site of some cancers will never be found. However, characteristic patterns of genetic changes occur in cancers of each primary site, and these patterns can be used to infer the origin of individual cases of CUP.

In a study published November 14 in JAMA Oncology, a team from Memorial Sloan Kettering reports that it has harnessed data from MSK-IMPACT to develop a machine-learning algorithm that helps determine where a tumor originated. MSK-IMPACT is a test that detects mutations and other critical changes in the genes of tumors. When combined with other pathology tests, the algorithm may be a valuable addition to the tool kit used to make more accurate diagnoses.

“This tool will provide additional support for our pathologists to diagnose tumor types,” says geneticist Michael Berger, one of the senior authors of the new study. “We’ve learned through clinical experience that it’s still important to identify a tumor’s origin, even when conducting basket trials involving therapies targeting genes that are mutated across many cancers.”

Basket trials are designed to take advantage of targeted treatments by assigning drugs to people based on the mutations found in their tumors rather than where in the body the cancer originated. Yet doctors who prescribe these treatments have learned that, in many cases, the tissue or organ in which the tumor started is still an important factor in how well targeted therapies work. Vemurafenib (Zelboraf®) is one drug where this is the case. It is effective at treating melanoma with a certain mutation but doesn’t provide the same benefit in colon cancer, even when it’s driven by the same mutation.

Harnessing Valuable Data

Since MSK-IMPACT launched in 2014, more than 40,000 people have had their tumors tested. The test is now offered to all people treated for advanced cancer at MSK.

In addition to providing detailed information about thousands of patients’ tumors, the test has led to a wealth of genomic data about cancers. It has become a major research tool for learning more about cancer’s origins.

The primary way that pathologists diagnose tumors is to look through a microscope at tissue samples. They also examine the specific proteins expressed by cancers, which can help predict a cancer’s origin. But these tests do not always allow a definitive conclusion.

“However, there are occasionally cases where we think we know the diagnosis based on the conventional pathology analysis, but the molecular pattern we observe with MSK-IMPACT suggests that the tumor is something different,” Dr. Berger explains. “This new tool is a way to computationally formalize the process that our molecular pathologists have been performing based on their experience and knowledge of genomics. Going forward, it can help them confirm these diagnoses.”

“Because cancers that have spread usually retain the same pattern of genetic alterations as the primary tumor, we can leverage the specific genetic changes to suggest a cancer site that was not apparent by imaging or conventional pathologic testing,” says co-author David Klimstra, Chair of MSK’s Department of Pathology.

“Usually the first question from patients and doctors alike is: ‘Where did this cancer start?’ ” says study co-author Anna Varghese, a medical oncologist who treats many people with CUP. “Although even with MSK-IMPACT we can’t always determine where the cancer originated, the MSK-IMPACT results can point us in a certain direction with respect to further diagnostic tests to conduct or targeted therapies or immunotherapies to use.”

Collecting Data on Common Cancers

In the current study, the investigators used data from nearly 7,800 tumors representing 22 cancer types to train the algorithm. The researchers excluded rare cancers, for which not enough data were available at the time. But all the most common types are represented, including lung cancer, breast cancer, prostate cancer, and colorectal cancer.

The analysis incorporated not only individual gene mutations but more complex genomic changes. These included chromosomal gains and losses, changes in gene copy numbers, structural rearrangements, and broader mutational signatures.
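To make the idea of learning tumor type from genomic features concrete, here is a minimal sketch in Python. It is not the study's algorithm, which the article does not specify. It swaps in a deliberately simple technique, a Bernoulli naive Bayes classifier, and every feature name, sample, and label below is a hypothetical toy value chosen for illustration, not MSK-IMPACT data.

```python
import math
from collections import defaultdict

# Hypothetical binary genomic features (1 = alteration present).
# These names and the toy samples below are illustrative assumptions,
# not data or features taken from the study.
FEATURES = [
    "KRAS_mutation", "APC_mutation", "BRAF_V600E", "EGFR_mutation",
    "ERBB2_amplification", "TMPRSS2_ERG_fusion", "UV_signature",
]


def encode(alterations):
    """Turn a set of alteration names into a fixed-length 0/1 vector."""
    return [1 if name in alterations else 0 for name in FEATURES]


def train(samples):
    """Fit a Bernoulli naive Bayes model with add-one smoothing.

    samples: list of (set_of_alterations, tumor_type) pairs.
    Returns (class_counts, per-class feature probabilities).
    """
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: [0] * len(FEATURES))
    for alterations, tumor_type in samples:
        class_counts[tumor_type] += 1
        for i, bit in enumerate(encode(alterations)):
            feature_counts[tumor_type][i] += bit
    probs = {
        t: [(feature_counts[t][i] + 1) / (n + 2) for i in range(len(FEATURES))]
        for t, n in class_counts.items()
    }
    return dict(class_counts), probs


def predict(model, alterations):
    """Return (tumor_type, probability) pairs, most likely first."""
    class_counts, probs = model
    total = sum(class_counts.values())
    x = encode(alterations)
    log_scores = {}
    for t, n in class_counts.items():
        score = math.log(n / total)  # class prior
        for bit, p in zip(x, probs[t]):
            score += math.log(p if bit else 1 - p)
        log_scores[t] = score
    # Convert log scores to probabilities that sum to 1.
    top = max(log_scores.values())
    weights = {t: math.exp(s - top) for t, s in log_scores.items()}
    norm = sum(weights.values())
    return sorted(((t, w / norm) for t, w in weights.items()),
                  key=lambda pair: pair[1], reverse=True)


# Tiny made-up training set; a real model needs thousands of tumors.
TOY_TRAINING = [
    ({"KRAS_mutation", "APC_mutation"}, "colorectal"),
    ({"APC_mutation"}, "colorectal"),
    ({"BRAF_V600E", "UV_signature"}, "melanoma"),
    ({"UV_signature"}, "melanoma"),
    ({"EGFR_mutation"}, "lung"),
    ({"KRAS_mutation"}, "lung"),
    ({"TMPRSS2_ERG_fusion"}, "prostate"),
    ({"ERBB2_amplification"}, "breast"),
]

model = train(TOY_TRAINING)
ranking = predict(model, {"BRAF_V600E", "UV_signature"})
for tumor_type, probability in ranking:
    print(f"{tumor_type}: {probability:.3f}")
```

The sketch returns a ranked list of probabilities rather than a single answer, so the output can be weighed alongside conventional pathology instead of replacing it. A realistic model would also need the richer inputs described above, such as copy number changes and structural rearrangements, and far more training data than this toy set.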

“The type of machine learning we use in this study requires a lot of data to train it to perform accurately,” says computational oncologist Barry Taylor, the study’s other senior author. “It would not have been possible without the large data set that we have already generated and continue to generate with MSK-IMPACT.”

Both Drs. Berger and Taylor emphasize that this is still early research that will need to be validated with further studies. In addition, since the method was developed specifically using test results from MSK-IMPACT, it may not be as accurate for genomic tests developed by companies or other institutions.

Improving Diagnosis for Cancer of Unknown Primary

MSK’s pathologists and other experts hope this tool will be particularly valuable in diagnosing tumors in people who have CUP. Up to 50,000 people in the United States are diagnosed with CUP every year. If validated for this purpose, MSK-IMPACT could make it easier to select the best therapies and to enroll people in clinical trials.

“This study emphasizes that the diagnosis and treatment of cancer is truly a multidisciplinary effort,” Dr. Taylor says. “We want to get all the data we can from each patient’s tumor so we can inform the diagnosis and select the best therapy for each person.”

This work was funded in part by Illumina, the Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Cycle for Survival, National Institutes of Health grants (P30-CA008748, R01 CA204749, and R01 CA227534), an American Cancer Society grant (RSG-15-067-01-TBG), the Sontag Foundation, the Prostate Cancer Foundation, and the Robertson Foundation.

Dr. Varghese has received institutional research support from Eli Lilly and Company, Bristol-Myers Squibb, Verastem Oncology, BioMed Valley Discoveries, and Silenseed. Dr. Klimstra reports equity in Paige.AI, consulting activities with Paige.AI and Merck, and publication royalties from UpToDate and the American Registry of Pathology. Dr. Berger reports research funding from Illumina and advisory board activities with Roche. All stated activities were outside of the work described in this study.