Cancer is complex. Cancer cells take on aberrant behavior, which causes the symptoms of the disease and sabotages the function of the organs those cancer cells invade. Our immune response to cancer is equally complex; the same mechanisms that we use to recognize and fight infectious diseases recognize what’s different about tumor cells and kill them. While it’s hard to know how often the immune system successfully rids the body of nascent tumors before we can detect them, we do know that when cancer turns fatal, the immune system has failed in its mission.
Immunotherapy describes a collection of biological agents that boost the strength and specificity of immune cells, with the goal of marshaling an army of immune cells trained to fight disease. These therapies work very well on some types of cancer, but not others — and very well in some patients, but not others. We in IPOP want to know why this is and how to make this strategy effective for every patient.
We are using immunogenomics — the combined parallel study of the genomics of tumor cells and immune cells — to learn more about the immune response to cancer. Among the questions we aim to answer are: How does immunotherapy change the immune cell repertoire, and thus the selective pressure leveled against tumor cells? How does the tumor cell population evolve as this pressure changes? What is going on inside a tumor when anti-tumor immunity fails? And eventually, how can we tailor immunotherapy so that every patient experiences the greatest possible benefit with the least risk?
Principles of Immunogenomics
Tumor Evolution and Immune Selection
In cancer, normal cells lose their ability to control their own growth and become immortal. This process can result from rare inherited mutations in critical genes, but more often occurs gradually as a cell sustains damage to its DNA over time. DNA serves as a blueprint for how a cell should function, and so changes in DNA — mutations — can be dangerous.
Our immune system not only fights off infections due to viruses and bacteria but also provides surveillance to protect us from cancer. Both mutations that help cancer cells grow (“driver mutations”) and ones that do not (“passenger mutations”) can cause cells to express new versions of native proteins, which the immune system can then recognize as foreign because they do not occur anywhere else in the body. These foreign proteins, recognizable by our immune system, are called neo-antigens. The key immune cells that protect us from cancer by recognizing these neo-antigens are called T cells.
Cancers evolve through characteristic interactions with the immune system, referred to as the “three Es.” When the immune system is able to destroy an abnormal growth before it becomes a cancer, that’s called “elimination.” When the immune system is able to keep the tumor from growing but not eliminate it completely, that’s called “equilibrium.” When the immune system is either suppressed by the tumor or no longer recognizes it as foreign, that leads to immune “escape” — and net growth of the cancer cell population.
Immunotherapies reinvigorate the immune system when a tumor is escaping, helping to restore equilibrium or even to eliminate the tumor. Immunogenomics can help us understand what underlying biology makes immunotherapy successful by providing information about the landscape of mutations and neo-antigens in the tumor as it changes over time, including over the course of therapy. In parallel, it can help us understand the specificity and strength of the immune response as it acts upon the tumor. Research in this area will help us to better understand how a cancer escapes immune control and to devise better therapies for patients.
Cancer is an evolutionary process and tumor cells can accumulate hundreds of mutations as they grow and divide. Some of these mutations are immunogenic — recognizable as “non-self” by our immune system. For a mutation to be immunogenic, the mutated protein has to be processed inside the cancer cell, and the resulting mutated peptide (called a neo-peptide) must bind to one of the patient’s major histocompatibility complex (MHC) class I molecules in order to be presented on the cell surface. Then, a T cell must be able to recognize the neo-peptide with its T cell receptor (TCR) in order to subsequently trigger an immune response. The physical feature of the neo-peptide recognized by the TCR is called an epitope. Increasing evidence suggests that the immune response to these mutation-derived antigens is very specific and critical for a successful response to immunotherapies, including immune checkpoint blockade and adoptive T cell therapy. (For background and research, see the following additional references: 1, 2, 3, 4, 5, 6, 7.)
Immune Activation and Exhaustion
When a T cell recognizes an antigen with its T cell receptor, it becomes activated and begins to proliferate. The number of activated T cells in the population with that particular receptor increases, and together they can kill the invading tumor cells or other threat.
Because any immune reaction takes energy to sustain, can cause damage to healthy tissues, and takes away from the body’s ability to react to other challenges, the immune system has mechanisms to maintain balance. When a T cell becomes activated, it begins to express other receptors on its cell surface that serve as “off switches.” These switches can be triggered by other cells (such as other immune cells, healthy tissue, and even tumor cells) to shut down the killing action of the activated T cells. Furthermore, the longer a T cell is exposed to the antigen it recognizes, the weaker its ability to kill becomes, a phenomenon called exhaustion. It is thought that both of these mechanisms exist because most threats the immune system faces are acute: they occur suddenly, as in an infection, and are cured by the immune response in days or weeks. Tumors (as well as some infectious diseases), on the other hand, pose a chronic challenge in which the immune cells are stimulated by the same antigens over longer periods of time (months or years), and so the T cells specific to that threat may become exhausted or actively suppressed when both tumor cells and healthy cells trigger their “off switches” to protect themselves.
Many immunotherapies used against cancer are designed to protect or rescue T cells from this exhaustion or suppression, allowing tumor-specific immune cells to regain their fully active killing functions. We are using high-throughput sequencing of both the immune cells and the tumor cells to: 1) improve immunodiagnostics for determining what aspect of a patient’s immune system is not functioning optimally; 2) describe how the mutations in the tumor population change when it is being selectively killed by rescued immune cells; 3) understand why these immunotherapies work better in some patients than in others; and 4) devise precision combinations of immunotherapies with chemotherapy and radiation therapy to maximize the killing of tumor cells while minimizing the damage to healthy tissues in every patient.
Mutations accumulate in cells due to environmental insults such as UV light and cigarette smoke and from sporadic DNA replication errors that occur during normal cell proliferation. Mutations that confer the ability to proliferate unchecked by the body’s normal regulatory systems are often referred to as driver mutations. Cells with such driver mutations can become abundant in the tumor population. Every time these cells divide, there is a chance that additional mutations will occur due to errors copying the DNA. Thus, in addition to driver mutations, tumor cells often accumulate random damage to many other parts of the genome, including those that do not accelerate cancer’s growth; these are called passenger mutations.
The mutational landscape of a tumor is composed of both driver and passenger mutations, which can be identified using high-throughput next-generation sequencing. Studying the number of each, their abundance in the population, and which mutations seem to have evolved together can reveal key information about the selective pressure the tumor is under (due to competing for limited resources like nutrients and oxygen, struggling to maintain essential cell processes despite rapid growth, or being attacked by the immune system) and can help us choose precise combinations of therapies to target the genetic and immunogenic weaknesses of the tumor.
We use whole-exome sequencing, whole-genome sequencing, and targeted gene sequencing to identify the genomic factors affecting antitumor immune activity. Briefly, our refined pipeline maps raw sequence reads to the human reference genome (GRCh37/hg19); the positions of insertions, deletions, and nucleotide variations are annotated; and artifacts from library preparation are removed.
We are interested in understanding the clonal composition of tumors. A clone is defined as a cluster of cells that share the same mutations, possibly due to a shared lineage. When a tumor contains many such lineages, it is called “subclonal,” and these distinctly arising subclones can accumulate new mutations that provide growth advantages, allowing them to out-grow less competitive subclones. Over time, the most competitive subclones make up a higher overall proportion of the tumor.
Not all subclones in a tumor necessarily respond to immunotherapy the same way. Some subclones may carry mutations that cause a stronger immune response than others. Therefore it is important to understand the clonal composition of tumors in order to design strategies that target enough of the tumor to perturb its growth at a clinically measurable level.
We estimate the relative frequency of cells within a tumor that carry a mutation based on genome sequencing data. For each mutation, we calculate the cancer cell fraction (CCF) based on variant allele frequency of the mutation, its copy number, as well as the sample’s purity. Analysis of CCF can help us identify subclones of cells that develop independently over the lifetime of a tumor, and deduce the relationship between the fitness of those subclones relative to others, as well as their susceptibility to immune targeting.
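The relationship between variant allele frequency, copy number, and purity can be sketched as follows. This is a minimal illustration of one commonly used CCF formula, not necessarily the calculation IPOP's pipeline performs. The function name, the `multiplicity` parameter (the number of mutated copies per cancer cell), and the assumption that normal cells carry two copies of the locus are all choices made for this example.

```python
def cancer_cell_fraction(vaf, purity, tumor_copy_number,
                         multiplicity=1, normal_copy_number=2):
    """Estimate the fraction of cancer cells that carry a mutation.

    Assumptions made for this sketch (not taken from the source text):
      - normal cells in the sample carry `normal_copy_number` copies of the locus
      - each mutated cancer cell carries `multiplicity` mutated copies
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")

    # Average number of copies of the locus per cell in the mixed sample
    # (tumor cells weighted by purity, normal cells by 1 - purity).
    average_copies = (purity * tumor_copy_number
                      + (1.0 - purity) * normal_copy_number)

    # Rescale the observed VAF to the fraction of cancer cells carrying it.
    ccf = vaf * average_copies / (purity * multiplicity)

    # Sampling noise can push the estimate above 1, so cap it there.
    return min(ccf, 1.0)


if __name__ == "__main__":
    # A heterozygous mutation in a diploid region of a 50%-pure sample.
    print(cancer_cell_fraction(vaf=0.25, purity=0.5, tumor_copy_number=2))
```

Under these assumptions, a mutation with a CCF near 1 is present in essentially every cancer cell (clonal), while a clearly lower CCF points to a subclone.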
A major obstacle to the development of a strong, effective immune response to a growing tumor is the fact that tumor cells are very similar to healthy tissue. Antigens that arise in tumor cells due to mutations (neo-antigens) allow the immune system to recognize those tumor cells as non-self and can thereby trigger a tumor-specific immune response. It is thought that the number of neo-antigens present in a tumor is a crucial factor determining whether an immunotherapy will be successful at marshaling an effective antitumor immune response.
We are actively developing novel computational approaches to identify neo-antigens in human cancers. Our current method utilizes the same somatic mutation-calling pipeline as described above (see Genomic Sequencing), followed by neo-epitope analysis.
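One early step in many neo-epitope analyses is to enumerate every short peptide that spans a mutated residue, since any of them could in principle be presented by MHC class I. The sketch below illustrates that step only. It assumes a single amino acid substitution, a 0-based position, and candidate lengths of 8 to 11 residues; the function name and the toy sequence are hypothetical, and the code does not predict MHC binding.

```python
def mutant_peptides(protein, position, mutant_residue, lengths=(8, 9, 10, 11)):
    """Return all peptides of the given lengths that span one substituted residue.

    `position` is the 0-based index of the substituted amino acid.
    """
    if not 0 <= position < len(protein):
        raise ValueError("position is outside the protein sequence")

    # Apply the substitution to obtain the mutated protein sequence.
    mutated = protein[:position] + mutant_residue + protein[position + 1:]

    peptides = []
    for k in lengths:
        # A window of length k covers `position` when its start lies in
        # [position - k + 1, position]; clip that range to the sequence ends.
        first_start = max(0, position - k + 1)
        last_start = min(position, len(mutated) - k)
        for start in range(first_start, last_start + 1):
            peptides.append(mutated[start:start + k])
    return peptides


if __name__ == "__main__":
    toy_protein = "ACDEFGHIKLMNPQRSTVWYA"  # made-up sequence, for illustration
    for peptide in mutant_peptides(toy_protein, position=10, mutant_residue="W"):
        print(peptide)
```

In a real workflow, each candidate peptide would then be scored for binding to the patient's own HLA class I alleles with a dedicated prediction tool.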
High-Throughput Tumor Antigen Screening
To better understand the specific interplay between a patient’s mutations and the immune system, mutant peptides are systematically tested for immunogenicity — the ability to activate T cells taken from the same patient. Results of this type of antigen screening can help in the creation of more personalized immunotherapies, such as tumor-specific vaccines or adoptive T cell therapies. Furthermore, IPOP seeks to understand the relative contributions of different types of mutations and antigens to effective immune responses with the goal of making patient-specific therapies more precise.
High-throughput T Cell Receptor Sequencing – TCRseq
Adaptive immune cells (T cells and B cells) help us to recognize specific threats, such as microbial pathogens (e.g., bacteria, viruses, fungi) and tumors. Each T cell or B cell expresses a receptor on its surface, the T cell receptor (TCR) or B cell receptor (BCR), respectively, that can bind to a particular molecular target and that differs from one immune cell to the next. When a TCR or BCR finds its target molecule, called an antigen, the T or B cell is signaled to divide and multiply. Each receptor is unique, generated by random recombination and alteration of DNA during development into a mature T or B cell, and the number of different TCRs that can be generated by one person is huge: between 10^12 and 10^20 over the course of a lifetime, with ~10^9 present in the repertoire at any given time. It is the vast diversity of these receptors that enables any one person to respond to antigens his or her immune system has never encountered before, and to raise an “army” against a particular antigen if it represents a threat.
Many of these immune cells are not circulating freely in the blood, but infiltrate and provide surveillance in tissues. This population of tissue-infiltrating lymphocytes (TILs) differs from the circulating population in that the former represents only a small sample of the total repertoire; T cells surveilling any particular tissue may be selected — on the basis of their receptors as well as growth factors and other signaling molecules — to reside in that particular organ or tissue.
Recent advances in high-throughput next-generation sequencing let us capture the TCRs from a whole sample (TCRseq) — circulating blood cells or T cell-infiltrated tissue — and describe the population in terms of the distribution of those TCRs. Using statistics, we analyze the diversity of these populations, compare them to one another, and look for patterns across groups of patients being treated for cancer. How does the TCR repertoire inside a tumor differ from that in the circulating blood?
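As an illustration of the kind of statistics involved, the sketch below summarizes a repertoire by its richness, Shannon entropy, and clonality (one minus normalized entropy). These are widely used repertoire measures, but the source does not say which ones IPOP computes. The function name and the toy clonotype labels are hypothetical.

```python
import math
from collections import Counter


def repertoire_diversity(clonotypes):
    """Summarize a TCR repertoire given one clonotype label per sequenced read.

    Returns richness (number of distinct clonotypes), Shannon entropy in nats,
    and clonality, defined here as 1 minus entropy divided by log(richness).
    """
    counts = Counter(clonotypes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("the repertoire is empty")

    frequencies = [count / total for count in counts.values()]
    entropy = -sum(p * math.log(p) for p in frequencies)
    richness = len(counts)

    # With a single clonotype the normalization is undefined; treat the
    # repertoire as fully clonal in that case.
    if richness > 1:
        clonality = 1.0 - entropy / math.log(richness)
    else:
        clonality = 1.0

    return {"richness": richness,
            "shannon_entropy": entropy,
            "clonality": clonality}


if __name__ == "__main__":
    # Toy samples: an even repertoire and one dominated by a single clone.
    even_sample = ["cloneA", "cloneB", "cloneC", "cloneD"]
    skewed_sample = ["cloneA"] * 97 + ["cloneB", "cloneC", "cloneD"]
    print(repertoire_diversity(even_sample))
    print(repertoire_diversity(skewed_sample))
```

A clonality near 0 indicates an even repertoire, while a value near 1 indicates that a few expanded clones dominate, which is one way a tumor-infiltrating population can differ from circulating blood.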
We are currently defining properties that indicate tumor-specific reactivity: What does the antitumor T cell response look like when it’s working? When it’s failing? When it has been restored through immunotherapy? These properties may be useful as multi-dimensional biomarkers to monitor tumor progression and therapeutic response. We are also using TCR repertoire sequencing to identify receptors that could be adapted for use as antitumor therapeutics.
TCRseq @ MSK
We perform TCR sequencing of clinical samples on-site in collaboration with the Integrated Genomics Operation, provide analysis of the raw sequencing data (where applicable), and offer supported end-user analysis (under development). IPOP currently supports the following commercial TCR library generation platforms:
- iR Profile (TCRa and TCRb) — iRepertoire
- SMARTer Human TCR a/b Profiling (TCRa and TCRb) — Clontech
- ImmunoSEQ (TCRb only) — Adaptive Biotechnologies
We are actively testing and integrating new products and platforms, and developing immunotherapy-related analytical tools in collaboration with cBioPortal.
One of IPOP’s goals is to extract the immunogenomic information that will allow doctors to anticipate which patients are most likely to respond to immunotherapy. We study tumor phenotype, or cell behavior, which is largely determined by the levels at which each gene is expressed. In particular, we use high-throughput sequencing of RNA from tumor biopsies to study how expression of genes changes as cancer progresses, when therapy is given, and when therapy is effective. Comparing tumors from patients who respond with those from patients who do not allows us to identify any distinct sets of tumor features that can be translated into diagnostic, prognostic, and therapeutic biomarkers to be used for future patients.
The expression levels of genes also provide information about the environment in which the tumor evolves, particularly how the patient’s own immune system reacts to it. Using cutting-edge computational techniques, we can integrate this information to understand what types of immune cells are successful in this process.
Tumors differ from one another, in part because each patient’s immune system reacts to a tumor using a unique set of cells to try to destroy it. Abnormal tumor cell behavior, specific antitumor immune activity, non-specific inflammatory immune activity, and tissue damage shape the gene expression profiles of both tumor and non-tumor cells in unique ways.
One application of differential gene expression analyses is to compare the pre-treatment and post-treatment profiles of tumors that responded to immunotherapy with those that did not. We can also identify marker genes or groups of functionally related genes that, if unusually high or low prior to treatment, correspond with better responses to particular therapies. Such predictive signatures could enable a simple pre-treatment biopsy to help tailor a patient’s treatment regimen.
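The core comparison behind such an analysis can be sketched with a per-gene log2 fold change between two groups of samples. This is a deliberately simplified illustration: real differential expression analysis also models variance and corrects for multiple testing, and nothing here is taken from IPOP's pipeline. The function names, the pseudocount, and the placeholder gene names are assumptions.

```python
import math
from statistics import mean


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """Log2 ratio of mean expression in group A over group B.

    The pseudocount (an assumption of this sketch) avoids division by zero
    and dampens ratios between very low expression values.
    """
    return math.log2((mean(group_a) + pseudocount)
                     / (mean(group_b) + pseudocount))


def rank_genes(responders, non_responders, pseudocount=1.0):
    """Rank genes shared by both groups by absolute log2 fold change.

    Each argument maps a gene name to a list of expression values,
    one value per patient sample.
    """
    shared_genes = responders.keys() & non_responders.keys()
    scores = {
        gene: log2_fold_change(responders[gene], non_responders[gene], pseudocount)
        for gene in shared_genes
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    # Placeholder gene names and made-up expression values.
    responders = {"GENE_A": [7.0, 7.0], "GENE_B": [3.0, 3.0]}
    non_responders = {"GENE_A": [1.0, 1.0], "GENE_B": [3.0, 3.0]}
    for gene, score in rank_genes(responders, non_responders):
        print(gene, score)
```

Genes at the top of such a ranking are candidates for a predictive signature, subject to proper statistical testing and validation in independent patients.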
In IPOP, our pipeline for automating and visualizing these analyses is constantly improving. High-dimensional data visualization tools such as oncoprints and viSNE maps allow us to organize and render dozens of parameters (e.g., RNA-seq gene expression data in parallel with clinical parameters) simultaneously, without sacrificing their complexity, to enrich our understanding of the cancer immune environment.
Cell composition (in silico deconvolution)
Many different immune cell types infiltrate tissues, where they perform different roles in surveilling for tumors, injuries, or infections. For example, certain types of T cells are capable of directly killing dysfunctional, tumorigenic, or infected cells, while monocytes and macrophages take up free-floating cell debris and present these potential antigens to T cells. This interaction, which requires both T cells and antigen-presenting cells, can help locally activate or suppress all the T cells that recognize the same antigens. Meanwhile, B cells produce antibodies that can rapidly spread throughout the body to neutralize a particular threat. Thus the relative abundance of different cell types can indicate which modes of tumor recognition are active, and which may be suppressed.
The type and degree of immune infiltration into tumors play an important role in the efficacy of immunotherapy. The abundance of the messenger RNA (mRNA) of particular genes in a tissue biopsy not only allows us to identify differentially expressed genes between samples, but also enables us to calculate the relative abundance of different immune cell types in the local microenvironment. Briefly, from the mRNA of the bulk sample, we can detect high expression of signature genes or enrichment of a subset of genes that are specific to one cell type, and compare it to the expression of genes specific to other cell types. We use computational algorithms such as Support Vector Machines (SVMs) or single-sample Gene Set Enrichment Analysis (ssGSEA) to translate the expression of these signatures into relative abundances of the corresponding immune cell populations.
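The idea of scoring cell-type signatures within a single bulk sample can be sketched as below. This is a simplified rank-based score, not ssGSEA or an SVM-based method: each gene is replaced by its expression rank percentile within the sample, and a cell type's score is the mean percentile of its signature genes. The function name, the two-gene signatures, and the expression values are illustrative assumptions.

```python
def signature_scores(expression, signatures):
    """Score each cell-type signature by mean rank percentile in one sample.

    `expression` maps gene name to expression level for a single bulk sample.
    `signatures` maps cell-type name to a list of signature genes.
    Ties in expression are ordered by their insertion order in `expression`.
    """
    # Order genes from lowest to highest expression and convert the
    # position of each gene into a percentile in (0, 1].
    ordered = sorted(expression, key=expression.get)
    n_genes = len(ordered)
    percentile = {gene: (rank + 1) / n_genes for rank, gene in enumerate(ordered)}

    scores = {}
    for cell_type, genes in signatures.items():
        measured = [gene for gene in genes if gene in percentile]
        if measured:
            scores[cell_type] = sum(percentile[g] for g in measured) / len(measured)
        else:
            # None of the signature genes were measured in this sample.
            scores[cell_type] = float("nan")
    return scores


if __name__ == "__main__":
    # Made-up expression values; the tiny signatures are for illustration only.
    sample = {"CD8A": 50.0, "CD3E": 40.0, "CD19": 2.0, "MS4A1": 1.0, "ACTB": 100.0}
    signatures = {"T cells": ["CD8A", "CD3E"], "B cells": ["CD19", "MS4A1"]}
    print(signatures and signature_scores(sample, signatures))
```

Because the score depends only on ranks within a sample, it can be compared across cell types in that sample, but it is not a calibrated cell fraction; methods such as those named above are needed for that.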
Because immunotherapies perform different functions — such as maintaining immune cell activation, rescuing immune cells that were activated then became exhausted, or stimulating antitumor reactivity among immune cells that were previously unexposed to tumor antigens — understanding which types of immune cells are present (or not) in the tumor microenvironment has implications for predicting response to these immunotherapies, and choosing the right one for each patient.
T cells recognize microbial threats and cancer by binding to degraded bits of foreign proteins (peptides) presented to them by the molecules of the major histocompatibility complex (MHC). These presentation molecules are expressed on the surface of most cell types, but especially strongly on certain immune cells that provide surveillance of tissues.
The genes that encode MHC class I proteins (called the HLA class I genes in humans) are located on chromosome 6, and there are three of them: HLA-A, HLA-B, and HLA-C. Every person has two copies (alleles) of each gene (one from each parent), and since these genes are the most polymorphic (variable in DNA sequence) in the entire human genome, the six alleles each person has are often all different, and rarely do they match those of genetically unrelated individuals. There are specific alleles (e.g. HLA-A*02:01) that are more prevalent worldwide. Moreover, the frequency of HLA alleles varies across geographic regions and populations.
We are examining how the HLA alleles a patient carries affect responsiveness to immunotherapy. The presentation of peptides to T cells by the MHC proteins plays a critical role in the adaptive immune response, and strongly influences how T cells respond to that peptide. For example, some MHC molecules activate T cells strongly, which is desirable if that antigen represents a threat (such as a viral infection or a dysfunctional or mutated protein produced by a tumor) but can be dangerous if the antigen is normal and occurs on healthy cells. Because potentiating the correct recognition of self versus non-self peptides by T cells is a major function of MHCs, and this distinction becomes muddled in the case of cancer, it is important to use genomic sequencing data to identify which six HLA alleles a patient has when trying to determine how that patient’s immune system will react to mutated tumor peptides.
Currently the gold standard for identifying which HLA alleles a patient has is PCR-based typing, in which the HLA locus is specifically amplified and then sequenced. As genomic sequencing has achieved higher and higher coverage, in silico HLA genotyping offers an efficient alternative that is economical when a patient’s genome is already being sequenced. Current software tools provide up to 99% accurate resolution for most clinical applications. For clinical applications that require higher accuracy, such as prediction of tumor antigen presentation by certain HLA alleles, which differ from their closest other alleles by only a few nucleotides, we are refining the computational pipelines for HLA identification using ensemble approaches, population-based weighting, and alternative assemblies of the human reference genomes.
High-dimensional Functional Immune Profiling (CyTOF)
Understanding the cellular composition of tumor and immune cells on the level of phenotypic protein markers is a critical part of investigating tumor immunology. IPOP utilizes several experimental techniques to better quantify the expression of proteins of interest in individual tumor and immune cells. Antibody-based flow cytometry allows for the precise quantification of extracellular and intracellular proteins of interest. Using fluorescence-activated cell sorting (FACS), individual immune or tumor cell populations can be further subdivided for downstream analysis including DNA and RNA sequencing.
Occasionally, investigators may wish to quantify the expression of a large number of intracellular and extracellular proteins simultaneously from a single sample. Conventional flow cytometry limits the number of parameters that can be detected simultaneously because of spectral overlap between fluorophores. To overcome this barrier, IPOP utilizes mass cytometry by time-of-flight (CyTOF) technology. CyTOF identifies intracellular and extracellular proteins using antibodies conjugated to rare earth heavy metals. After antibody-based staining, the sample is ionized and the antibody composition of single cells is subsequently identified. The primary advantage of CyTOF is its ability to analyze a robust user-defined panel of cellular targets simultaneously from a single sample using an antibody-based approach. Multi-parametric data can subsequently be analyzed using conventional flow cytometry software or more sophisticated techniques including SPADE or viSNE plots. IPOP CyTOF projects are currently done in collaboration with the Mount Sinai Human Immune Monitoring CORE (HIMC) (212-824-9354, firstname.lastname@example.org).
Personnel: Rajarsi Mandal