Despite its critical role in cancer diagnosis, the practice of pathology has changed very little in the last 150 years. It still requires taking a biopsy and studying a thin layer of tissue on a glass slide under a microscope.
A type of artificial intelligence called machine learning holds the potential to transform cancer pathology. Thomas Fuchs, a data scientist and expert in machine learning at Memorial Sloan Kettering, is leading a team that trains supercomputers to recognize cancer on digitized microscope slides. Speeding the process of analyzing samples could enable pathologists to focus their attention on the most relevant slides.
This initiative has now reached a major milestone with the publication of a study analyzing more than 44,000 digitized glass slide images from more than 15,000 people with cancer. The results show that this advanced machine-learning approach identifies nearly 100% of the cancer-containing biopsies, allowing pathologists to focus on the most informative portions of the biopsies. The findings are reported in Nature Medicine.
“We have built a model that can detect cancer with great accuracy, in line with what you would find with trained expert pathologists,” Dr. Fuchs says. “This novel machine-learning system could potentially be used by pathologists everywhere to help them make more accurate diagnoses.”
The new method developed by Dr. Fuchs’s team harnesses powerful computer technology trained on a vast amount of data. This is essential for building a reliable system that is clinical grade and safe for use in medical practice.
Other models can collapse when faced with actual clinical samples. “It’s like training a self-driving car in an empty parking lot,” Dr. Fuchs says. “It may work at first, but as soon as you hit Manhattan, it will fail drastically.”
Eliminating a Bottleneck
Gabriele Campanella, a graduate student of Dr. Fuchs’s, led the implementation of the system. It is based on deep learning, an advanced form of machine learning that mimics how the human brain recognizes objects by looking at examples. The system adjusts its parameters whenever it makes a mistake, which is how the computer model learns to detect cancer with high accuracy.
A powerful feature of this kind of artificial intelligence is its ability to train itself to identify cancer with less human input. Previous methods required experienced pathologists to painstakingly annotate slides to mark precisely where the cancer was. Only then was the information used to train the computer models to detect the cancer.
“With earlier approaches, the burden of annotation fell on a very small pool of human experts, which severely limited the number of slides that could be examined,” Mr. Campanella says. “Only 500 to 1,000 slides have been used in the past to train deep-learning models in pathology.”
The deep learning approach used by Dr. Fuchs’s team does not require annotations on the images. First, each slide is divided into very small sections, several thousand per slide. Then, through constant iteration, the computer model teaches itself what is or is not cancer by comparing its own result with the pathology report.
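The tile-and-compare idea described above can be illustrated with a short Python sketch. This is not the team's implementation: the function names, the tile size, and the rule that a slide is scored by its most suspicious section (a common assumption in this kind of weakly supervised learning) are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def tile_coordinates(width: int, height: int, tile: int) -> List[Tuple[int, int]]:
    """Return the top-left corner of each non-overlapping tile covering the slide."""
    return [(x, y) for y in range(0, height, tile) for x in range(0, width, tile)]


def slide_score(tile_scores: List[float]) -> float:
    """Assumed aggregation rule: the slide is as suspicious as its most suspicious tile."""
    return max(tile_scores)


def slide_error(tile_scores: List[float], report_label: int) -> float:
    """Disagreement between the model's slide-level result and the pathology report.

    report_label is 1 if the report says cancer and 0 otherwise. No per-tile
    annotation is needed; only this slide-level label is used.
    """
    return abs(report_label - slide_score(tile_scores))


if __name__ == "__main__":
    tiles = tile_coordinates(width=1000, height=500, tile=100)
    print(len(tiles), "tiles")
    # Hypothetical per-tile cancer probabilities produced by some model.
    scores = [0.1, 0.9, 0.2]
    print("error vs. report:", slide_error(scores, report_label=1))
```

In a real training loop, an error like this would drive the parameter updates, so the model improves on each pass without anyone marking where on the slide the cancer is.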
The automated process could be used in places that don’t have access to specialists at major cancer centers. Pathologists can spend hours each day studying slides that may yield little useful information. A prostate biopsy procedure from one patient, for example, can produce more than 45 slides, and each slide must be individually reviewed.
“Detecting small cancer lesions in these large images is like looking for a needle in a haystack, and pathologists have to be trained for years to be able to do that dependably,” Dr. Fuchs says. “This model provides a reproducible, robust system that helps them be accurate and more efficient, which is also better for their patients.”
Real-World Clinical Data
The achievement described in the study is also notable because the data were drawn from all the slides taken over a year at MSK, reflecting what a pathologist would encounter in actual practice. The patients had one of three common types of cancer: prostate cancer, skin cancer, or breast cancer that had spread to lymph nodes. Some slides included common artifacts such as air bubbles and slicing irregularities that can make them hard to read.
“This is the first truly clinical-grade artificial intelligence model in pathology,” Dr. Fuchs says. “We showed that we can train deep learning models on a large scale, using messy, real-world data.”
This is in stark contrast with earlier deep learning studies, which used “curated” sets of slides selected because they were clear, well defined, and easy to analyze.
A Widespread Benefit
Digital pathology has not been widely used for primary diagnosis in the United States. This is in part because the technology is costly, and the benefits of using a digital platform have been difficult to demonstrate. This artificial intelligence system is the culmination of several years of close collaboration among MSK’s computer scientists, machine learning experts, pathologists, and oncologists.
“Memorial Sloan Kettering is a unique environment, with experts in different fields constantly working hand in hand,” Dr. Fuchs says. “But the ultimate goal is to democratize this knowledge so that pathologists and patients worldwide can benefit.”