What Is Immunogenomics?


Cancer is complex. Cancer cells take on aberrant behavior, which causes the symptoms of the disease and sabotages the function of the organs those cancer cells invade. Our immune response to cancer is equally complex; the same mechanisms that we use to recognize and fight infectious diseases recognize what’s different about tumor cells and kill them. While it’s hard to know how often the immune system successfully rids the body of nascent tumors before we can detect them, we do know that when cancer turns fatal, the immune system has failed in its mission.

Immunotherapy describes a collection of biological agents that boost the strength and specificity of immune cells, with the goal of marshaling an army of immune cells trained to fight disease. These therapies work very well on some types of cancer, but not others — and very well in some patients, but not others. We in IPOP want to know why this is and how to make this strategy effective for every patient.

We are using immunogenomics — the combined parallel study of the genomics of tumor cells and immune cells — to learn more about the immune response to cancer. Among the questions we aim to answer are: How does immunotherapy change the immune cell repertoire, and thus the selective pressure leveled against tumor cells? How does the tumor cell population evolve as this pressure changes? What is going on inside a tumor when anti-tumor immunity fails? And eventually, how can we tailor immunotherapy so that every patient experiences the greatest possible benefit with the least risk?

Principles of Immunogenomics

Tumor Evolution and Immune Selection

In cancer, normal cells lose their ability to control their own growth and become immortal. This process can result from rare inherited mutations in critical genes, but more often occurs gradually as a cell sustains damage to its DNA over time. DNA serves as a blueprint for how a cell should function, and so changes in DNA — mutations — can be dangerous.

Our immune system not only fights off infections due to viruses and bacteria but also provides surveillance to protect us from cancer. Both mutations that help cancer cells grow (“driver mutations”) and ones that do not (“passenger mutations”) can cause cells to express new versions of native proteins, which the immune system can then recognize as foreign because they do not occur anywhere else in the body. These foreign proteins, recognizable by our immune system, are called neo-antigens. The key immune cells that protect us from cancer by recognizing these neo-antigens are called T cells.

mutual selection figure

Mutual selection of tumor cells and immune cells. Some mutations that occur in a tumor cell population (colored dots) give rise to mutant proteins that are immunogenic — recognizable by immune cells, such as lymphocytes. When tissue-infiltrating lymphocytes (TILs) encounter these neo-antigens, those with receptors that bind the neo-antigens will proliferate and become activated. These activated immune cells are then capable of killing the tumor cells that express the neo-antigen. Because not all neoantigens are shared by all tumor cells in the population, this process often leads not to complete elimination of the tumor, but to tumor evolution (bottom right). Other tumor antigens now need to be targeted for immune killing or else the tumor will escape immune control and grow. Gene expression by the tumor cells and the surrounding tumor microenvironment is often a critical variable shaping whether the immune response is strong enough to eliminate the cancer, or weak, allowing escape.

Cancers evolve through characteristic interactions with the immune system, referred to as the “three Es.” When the immune system is able to destroy an abnormal growth before it becomes a cancer, that’s called “elimination.” When the immune system is able to keep the tumor from growing but not eliminate it completely, that’s called “equilibrium.” When the immune system is either suppressed by the tumor or no longer recognizes it as foreign, that leads to immune “escape” — and net growth of the cancer cell population.

Immunotherapies reinvigorate the immune system when a tumor is escaping, helping to restore equilibrium, or even eliminate the tumor. Immunogenomics can help us better understand what underlying biology makes immunotherapy successful by providing information about the landscape of mutations and neoantigens in the tumor as it changes over time, including over the course of therapy; in parallel, it can help us understand the specificity and strength of the immune response as it acts upon the tumor. Research in this area will help us to better understand how a cancer escapes immune control and to devise better therapies for patients.

Antigen Presentation

Cancer is an evolutionary process and tumor cells can accumulate hundreds of mutations as they grow and divide. Some of these mutations are immunogenic, meaning they are recognizable as “non-self” by our immune system. For a mutation to be immunogenic, the mutated protein has to be processed inside the cancer cell, and the resulting mutated peptide (called a neo-peptide) must bind to one of the patient’s major histocompatibility complex (MHC) class I molecules in order to be presented on the cell surface. Then, a T cell must be able to recognize the neo-peptide with its T cell receptor (TCR) in order to trigger an immune response. The physical feature of the neo-peptide recognized by the TCR is called an epitope. Increasing evidence suggests that the immune response to these mutation-derived antigens is very specific and critical for a successful response to immunotherapies, including immune checkpoint blockade and adoptive T cell therapy. (For background and research, see the following additional references: 1, 2, 3, 4, 5, 6, 7.)


Antigen-processing machinery in normal and cancer cells. In both normal and tumor cells, degraded bits of proteins (peptides) are transported to the endoplasmic reticulum (ER) for loading onto the MHC class I molecules. The MHC-peptide complexes then move to the cell surface where they are monitored by T cells. If an epitope is recognized by the T cell receptor (TCR), it leads to T cell activation, T cell differentiation, and ultimately death of the cells presenting that epitope. These epitopes can be specific to tumor cells (neoepitopes) or epitopes derived from normal proteins that are expressed at unusually high levels in tumor cells.

Immune Activation and Exhaustion

When a T cell recognizes an antigen with its T cell receptor, it becomes activated and begins to proliferate. The number of activated T cells in the population with that particular receptor increases, and together they can kill the invading tumor cells or other threat.

Because any immune reaction takes energy to sustain, can cause damage to healthy tissues, and takes away from the body’s ability to react to other challenges, the immune system has mechanisms to maintain balance. When a T cell becomes activated, it begins to express other receptors on its cell surface that serve as “off switches.” These switches can be triggered by other cells (such as other immune cells, healthy tissue, and even tumor cells) to shut down the killing action of the activated T cells. Furthermore, the longer a T cell is exposed to the antigen it recognizes, the weaker its ability to kill becomes — a phenomenon called exhaustion. It is thought that both of these mechanisms exist because most threats the immune system faces are acute — they occur suddenly, as in an infection, and are cured by the immune response in days or weeks. Tumors (as well as some infectious diseases) on the other hand, pose a chronic challenge, in which the immune cells are stimulated by the same antigens over longer periods of time (months or years), and so the T cells specific to that threat may become exhausted or actively suppressed by both tumor cells and healthy cells triggering their “off switches” to protect themselves.

infographic of immune activation and exhaustion

Many immunotherapies used against cancer are designed to protect or rescue T cells from this exhaustion or suppression, allowing tumor-specific immune cells to regain their fully active killing functions. We are using high-throughput sequencing of both the immune cells and the tumor cells to: 1) improve immunodiagnostics for determining what aspect of a patient’s immune system is not functioning optimally; 2) describe how the mutations in the tumor population change when it is being selectively killed by rescued immune cells; 3) understand why these immunotherapies work better in some patients than in others; and 4) devise precision combinations of immunotherapies with chemotherapy and radiation therapy to maximize the killing of tumor cells while minimizing the damage to healthy tissues in every patient.

Core Technologies

Genomic Sequencing

Mutations accumulate in cells due to environmental insults such as UV light and cigarette smoke and from sporadic DNA replication errors that occur during normal cell proliferation. Mutations that confer the ability to proliferate unchecked by the body’s normal regulatory systems are often referred to as driver mutations. Cells with such driver mutations can become abundant in the tumor population. Every time these cells divide, there is a chance that additional mutations will occur due to errors copying the DNA. Thus, in addition to driver mutations, tumor cells often accumulate random damage to many other parts of the genome, including those that do not accelerate cancer’s growth; these are called passenger mutations.

The mutational landscape of a tumor is composed of both driver and passenger mutations, which can be identified using high-throughput next-generation sequencing. Studying the number of each, their abundance in the population, and which mutations seem to have evolved together can reveal key information about the selective pressure the tumor is under (due to competing for limited resources like nutrients and oxygen, struggling to maintain essential cell processes despite rapid growth, or being attacked by the immune system) and can help us choose precise combinations of therapies to target the genetic and immunogenic weaknesses of the tumor.

diagram of mutation detection

Our current mutation-calling pipeline runs four somatic mutation callers (MuTect, Strelka, SomaticSniper, and VarScan) to increase the confidence of calling. Low-confidence mutations, such as low-coverage variants, variants in regions of low mappability (source: encodeproject.org/annotations/ENCSR636HFF/), and variants at loci with repeats or low-complexity DNA sequence (source: www.repeatmasker.org), are filtered out automatically, and selected mutations are reviewed manually for quality assurance.
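One way to picture how several callers can raise confidence is a simple support count per variant. The sketch below is illustrative only: the text above does not say how the four callers' results are combined, so the `min_callers` threshold, the function name, and the example variants are all assumptions.

```python
from collections import Counter

def consensus_calls(calls_by_caller, min_callers=2):
    """Return {variant: number_of_supporting_callers} for variants
    reported by at least `min_callers` callers (threshold is an assumption)."""
    support = Counter()
    for variants in calls_by_caller.values():
        # set() ensures each caller votes at most once per variant
        support.update(set(variants))
    return {v: n for v, n in support.items() if n >= min_callers}

# Made-up variants as (chromosome, position, reference, alternate) tuples.
calls = {
    "MuTect": {("chr1", 1000, "C", "T"), ("chr2", 500, "G", "A")},
    "Strelka": {("chr1", 1000, "C", "T")},
    "SomaticSniper": {("chr1", 1000, "C", "T"), ("chr3", 42, "A", "G")},
    "VarScan": {("chr2", 500, "G", "A")},
}
consensus = consensus_calls(calls, min_callers=2)
```

A stricter threshold trades sensitivity for confidence; variants seen by a single caller (here the chr3 call) would be the natural candidates for manual review.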

We use whole-exome sequencing, whole-genome sequencing, and targeted gene sequencing to identify the genomic factors affecting antitumor immune activity. Briefly, our refined pipeline maps raw sequence reads to the human reference genome (GRCh37/hg19); the positions of insertions, deletions, and nucleotide variations are annotated; and artifacts from library preparation are removed.

Personnel: Nadeem Riaz, Nils Weinhold, Rajarsi Mandal, Jonathan Havel, Luc Morris


We are interested in understanding the clonal composition of tumors. A clone is defined as a cluster of cells that share the same mutations, possibly due to a shared lineage. When a tumor contains many such lineages, it is called “subclonal,” and these distinctly arising subclones can accumulate new mutations that provide growth advantages, allowing them to out-grow less competitive subclones. Over time, the most competitive subclones make up a higher overall proportion of the tumor.

Not all subclones in a tumor necessarily respond to immunotherapy the same way. Some subclones may carry mutations that cause a stronger immune response than others. Therefore it is important to understand the clonal composition of tumors in order to design strategies that target enough of the tumor to perturb its growth at a clinically measurable level.

We estimate the relative frequency of cells within a tumor that carry a mutation based on genome sequencing data. For each mutation, we calculate the cancer cell fraction (CCF) based on variant allele frequency of the mutation, its copy number, as well as the sample’s purity. Analysis of CCF can help us identify subclones of cells that develop independently over the lifetime of a tumor, and deduce the relationship between the fitness of those subclones relative to others, as well as their susceptibility to immune targeting.
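The relationship between variant allele frequency, copy number, and purity can be sketched with a commonly used formula. This is a minimal illustration, not necessarily the calculation used in the pipeline described here: it assumes the normal cells are diploid at the locus and that the number of mutated copies per cancer cell (`multiplicity`) is known, with a default of 1. All parameter names are hypothetical.

```python
def cancer_cell_fraction(vaf, purity, tumor_cn, multiplicity=1, normal_cn=2):
    """Estimate the fraction of cancer cells carrying a mutation.

    vaf:          variant allele frequency observed in the sample (0..1)
    purity:       fraction of cells in the sample that are tumor cells (0..1]
    tumor_cn:     total copy number of the locus in tumor cells
    multiplicity: mutated copies per mutated cancer cell (assumed 1 by default)
    normal_cn:    copy number of the locus in normal cells (assumed 2)
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")
    # Average number of copies of the locus per cell across the mixed sample.
    copies_per_cell = purity * tumor_cn + (1 - purity) * normal_cn
    ccf = vaf * copies_per_cell / (purity * multiplicity)
    # Sampling noise can push the estimate above 1; cap it at the clonal value.
    return min(ccf, 1.0)

# Two mutations in a diploid region of a sample with 80% tumor purity.
clonal = cancer_cell_fraction(vaf=0.40, purity=0.8, tumor_cn=2)
subclonal = cancer_cell_fraction(vaf=0.20, purity=0.8, tumor_cn=2)
```

Under these assumptions the first mutation is present in essentially all cancer cells, while the second is present in roughly half of them, which would mark it as subclonal.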


In a tumor cell population, particular mutations (colored circles) are often found in subsets of cells; any given tumor cell contains some but not all of the mutations observed in the population as a whole. The efficacy of an immunotherapy that bolsters the T cell response to a particular mutated tumor protein may be strongly influenced by how much of the tumor cell population expresses that mutant protein. (Top) Immunotherapy #1 enhances the T cell response to the mutant protein A (pink), which occurs in 50% of tumor cells. Thus, when immunotherapy #1 allows these T cells to become activated and kill their targets, mutation A is eliminated from the tumor, but the other 50% of tumor cells remain. (Bottom) Immunotherapy #2 enhances the T cell response to the mutant protein B (purple), which occurs in 75% of tumor cells, so treatment results in only 25% of tumor cells (those without mutation B) persisting. By measuring the abundance of mutations in a tumor cell population over time, including during therapy, we can learn about how mutations are linked. For example, when one mutation disappears or becomes more common, which other mutations go with it? We can also determine how immunotherapies are acting on the immune response. Do some therapies only bolster T cell responses to a small number of antigens, while others support more broad T cell stimulation? Together this information about both the tumor target and the nature of the stimulated immune response can be used to more precisely design therapy for each patient.

Personnel: Nadeem Riaz, Nils Weinhold, Luc Morris

Neo-Antigen Prediction

A major obstacle to the development of a strong, effective immune response to a growing tumor is the fact that tumor cells are very similar to healthy tissue. Antigens that arise in tumor cells due to mutations (neo-antigens) allow the immune system to recognize those tumor cells as non-self and can thereby trigger a tumor-specific immune response. It is thought that the number of neo-antigens present in a tumor is a crucial factor determining whether an immunotherapy will be successful at marshaling an effective antitumor immune response. 

We are actively developing novel computational approaches to identify neo-antigens in human cancers. Our current method utilizes the same somatic mutation-calling pipeline as described above (see Genomic Sequencing), followed by neo-epitope analysis.

infographic of neoepitope prediction

Our algorithm for predicting neo-epitopes translates all missense mutations identified by the genomic mutation pipeline, generates all 9-amino acid peptides (the most frequently presented peptide length) that would contain the mutation, and uses the netMHC tool to compare the predicted rate of presentation of the mutated peptide to that of the 9-amino acid peptide with the unmutated (wild type) amino acid at the same position by that patient’s particular HLA alleles. Peptides for which the mutated version is more strongly presented than the wild type are considered potential neo-antigens.
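The peptide-generation and comparison steps can be sketched as follows. This is an illustration under stated assumptions, not the production algorithm: `predict_presentation` is a hypothetical caller-supplied function standing in for a real predictor such as netMHC (whose actual interface is not shown here), and it is assumed to return a score where higher means stronger predicted presentation by the patient's HLA alleles.

```python
def mutant_windows(wt_protein, pos, mut_aa, k=9):
    """Return (wild-type, mutant) k-mer pairs covering 0-based position `pos`."""
    mutant = wt_protein[:pos] + mut_aa + wt_protein[pos + 1:]
    first = max(0, pos - k + 1)            # earliest window that still covers pos
    last = min(pos, len(wt_protein) - k)   # latest window that fits in the protein
    return [(wt_protein[s:s + k], mutant[s:s + k]) for s in range(first, last + 1)]

def candidate_neoepitopes(pairs, predict_presentation):
    """Keep mutant peptides scored higher than their wild-type counterpart."""
    return [mut for wt, mut in pairs
            if predict_presentation(mut) > predict_presentation(wt)]

# Toy protein and a missense change at position 10 (M to W).
protein = "ACDEFGHIKLMNPQRSTVWY"
pairs = mutant_windows(protein, pos=10, mut_aa="W")

# Toy scorer for demonstration only; it has no biological meaning.
toy_score = lambda peptide: peptide.count("W")
candidates = candidate_neoepitopes(pairs, toy_score)
```

A mutation far from either end of the protein yields nine overlapping 9-mers; mutations near the ends yield fewer because some windows would run off the sequence.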

Personnel: Nadeem Riaz, Vladimir Makarov, Diego Chowell

High-Throughput Tumor Antigen Screening

To better understand the specific interplay between a patient’s mutations and the immune system, mutant peptides are systematically tested for immunogenicity — the ability to activate T cells taken from the same patient. Results of this type of antigen screening can help in the creation of more personalized immunotherapies, such as tumor-specific vaccines or adoptive T cell therapies. Furthermore, IPOP seeks to understand the relative contributions of different types of mutations and antigens to effective immune responses with the goal of making patient-specific therapies more precise.

minigene diagram

In order to maximize screening efficiency, plasmids encoding multiple tandem minigenes (TMGs) are generated. A single minigene consists of the DNA encoding a somatic mutation flanked on both sides by twelve amino acids from the wild type source protein. Up to ten minigenes are strung together to generate the TMGs used in screening. In vitro transcribed mRNA is then introduced into autologous dendritic cells (DCs) via electroporation to enable processing and HLA-presentation of the somatic mutation-containing peptides. Patient-derived T cells are co-cultured with TMG-transfected DCs. Neo-antigen peptide-induced T cell activation is quantified via detection of cytokine (e.g., interferon gamma) production using the highly sensitive ELISpot assay. Results are deconvolved by back-mutating (to wild type) each of the ten mutations contained in a reactive TMG and testing each for cytokine production in the co-culture assay described above. Intracellular cytokine staining is used for orthogonal validation of any positive hits from a minigene antigen screen.
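The construct layout described above can be sketched at the amino-acid level. This is a simplified illustration: real constructs are built as DNA, and details such as linkers, codon choice, or handling of mutations near protein ends are not covered by the text, so the truncation behavior and the function names below are assumptions.

```python
def minigene(wt_protein, pos, mut_aa, flank=12):
    """Mutated residue at 0-based `pos` with up to `flank` wild-type
    residues on each side (truncated at the protein ends)."""
    start = max(0, pos - flank)
    end = min(len(wt_protein), pos + flank + 1)
    return wt_protein[start:pos] + mut_aa + wt_protein[pos + 1:end]

def tandem_minigenes(minigenes, per_construct=10):
    """Concatenate minigenes into tandem constructs of up to `per_construct` each."""
    return ["".join(minigenes[i:i + per_construct])
            for i in range(0, len(minigenes), per_construct)]

# Toy example: twelve identical mutations in a made-up 30-residue protein.
toy_protein = "A" * 30
one = minigene(toy_protein, pos=15, mut_aa="V")
constructs = tandem_minigenes([one] * 12)
```

With twelve minigenes and a limit of ten per construct, the result is one full construct and one partial construct holding the remaining two.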

Personnel: Raghu Srivastava, Jonathan Havel, Wei Wu

High-throughput T Cell Receptor Sequencing – TCRseq

Adaptive immune cells (T cells and B cells) help us to recognize specific threats, such as microbial pathogens (e.g., bacteria, viruses, fungi) and tumors. Each T cell or B cell expresses a receptor on its surface, the T cell receptor (TCR) or B cell receptor (BCR), respectively, that can bind to a particular molecular target and differs from one immune cell to the next. When a TCR or BCR finds its target molecule, called an antigen, the T or B cell is signaled to divide and multiply. Each receptor is unique, generated by random recombination and alteration of DNA during development into a mature T or B cell, and the number of different TCRs that can be generated by one person is huge: between 10^12 and 10^20 over the course of a lifetime, with ~10^9 present in the repertoire at any given time. It is the vast diversity of these receptors that enables any one person to respond to antigens his or her immune system has never encountered before, and to raise an “army” against a particular antigen if it represents a threat.

diagram of Expansion of tumor-specific T cells

Expansion of tumor-specific T cells. A. When the TCR of a T cell binds a target antigen strongly, the T cell becomes activated and proliferates. Thus, that TCR is represented at an expanded frequency in the population. B. In the context of a tumor, a T cell whose TCR recognizes a tumor-specific antigen may be represented in higher abundance by the same mechanism. With immunogenomics, we can learn about tumors and the T cells that recognize them in parallel: How many TCRs are there? What do they have in common? Do patients with the same tumor share the same expanded TCRs? How do the proportions of these TCRs reflect and predict the changes in the abundance of tumor antigen targets?

Many of these immune cells are not circulating freely in the blood, but infiltrate and provide surveillance in tissues. This population of tissue-infiltrating lymphocytes (TILs) differs from the circulating population in that the former represents only a small sample of the total repertoire; T cells surveilling any particular tissue may be selected — on the basis of their receptors as well as growth factors and other signaling molecules — to reside in that particular organ or tissue.

TCRseq libraries represent a sample of the peripheral blood and tumor-infiltrating lymphocyte (TIL) cell repertoires. While some TCRs occur at the same rate in both populations (pink), some TCRs are relatively more abundant in the TILs than in the peripheral blood (green, purple), while some are at much lower abundance, to the point that they aren’t detected (cyan). The TCRs that are most enriched in the tumor tissue may reside there disproportionately because their TCRs bind antigens that are present only in the tumor tissue, making these sequences of interest for further study, as possible biomarkers of tumor progression or as therapeutic templates or targets (see below).

Recent advances in high-throughput next-generation sequencing let us capture the TCRs from a whole sample (TCRseq) — circulating blood cells or T cell-infiltrated tissue — and describe the population in terms of the distribution of those TCRs. Using statistics, we analyze the diversity of these populations, compare them to one another, and look for patterns across groups of patients being treated for cancer. How does the TCR repertoire inside a tumor differ from that in the circulating blood?
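As a concrete example of a repertoire diversity statistic, the sketch below computes Shannon entropy and a normalized clonality score from clonotype read counts. The text does not name the specific statistics used, so the choice of these two metrics is an assumption; they are shown only because they are common ways to summarize how evenly a repertoire is distributed.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (natural log) of clonotype frequencies."""
    total = sum(counts)
    if total == 0:
        raise ValueError("repertoire is empty")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def clonality(counts):
    """1 minus normalized entropy: near 0 for an even repertoire,
    near 1 when a few clonotypes dominate."""
    n_clonotypes = sum(1 for c in counts if c > 0)
    if n_clonotypes <= 1:
        return 1.0
    return 1.0 - shannon_entropy(counts) / math.log(n_clonotypes)

# Made-up read counts per clonotype for two samples.
even_repertoire = [10, 10, 10, 10]
expanded_repertoire = [97, 1, 1, 1]
even_score = clonality(even_repertoire)
expanded_score = clonality(expanded_repertoire)
```

Comparing such scores between blood and tumor samples, or before and after therapy, is one simple way to ask whether particular T cell clones have expanded.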

We are currently defining properties that indicate tumor-specific reactivity: What does the antitumor T cell response look like when it’s working? When it’s failing? When it has been restored through immunotherapy? These properties may be useful as multi-dimensional biomarkers to monitor tumor progression and therapeutic response. We are also using TCR repertoire sequencing to identify receptors that could be adapted for use as antitumor therapeutics.


Particular TCR sequences associated with either the progression or regression of a tumor could be used directly to develop therapeutics. (Top) The TCRs of cytolytic T cells (CTL) found to expand concomitantly with the regression of cancer could be tested as templates for engineered chimeric antigen receptor (CAR) T cells, which would be able to recognize and attack the tumor using a receptor based on that TCR. (Bottom) The regulatory T cells (Tregs) that inhibit the activity of active antitumor CTLs (thus protecting the tumor) could be blocked by immunotherapies targeting their TCRs.

TCRseq @ MSK

We perform TCR sequencing of clinical samples on-site in collaboration with the Integrated Genomics Operation, provide analysis of the raw sequencing data (where applicable), as well as supported end-user analysis (under development). IPOP currently supports the following commercial TCR library generation platforms:

  • iR Profile (TCRa and TCRb) — iRepertoire
  • SMARTer Human TCR a/b Profiling (TCRa and TCRb) — Clontech
  • ImmunoSEQ (TCRb only) — Adaptive Biotechnologies

We are actively testing and integrating new products and platforms, and developing immunotherapy-related analytical tools in collaboration with cBioPortal.

Personnel: Jennifer Sims, Jonathan Havel

Gene Expression

One of IPOP’s goals is to extract the immunogenomic information that will allow doctors to anticipate which patients are most likely to respond to immunotherapy. We study tumor phenotype, or cell behavior, which is largely determined by the levels at which each gene is expressed. In particular, we use high-throughput sequencing of RNA from tumor biopsies to study how expression of genes changes as cancer progresses, when therapy is given, and when therapy is effective. Comparing tumors from patients who respond with those from patients who do not allows us to identify any distinct sets of tumor features that can be translated into diagnostic, prognostic, and therapeutic biomarkers to be used for future patients.

The expression levels of genes also provide information about the environment in which the tumor evolves, particularly how the patient’s own immune system reacts to it. Using cutting edge computational techniques, we can integrate this information to understand what types of immune cells are successful in this process.

Differential expression

Tumors differ from one another, in part because each patient’s immune system reacts to a tumor using a unique set of cells to try to destroy it. Abnormal tumor cell behavior, specific antitumor immune activity, non-specific inflammatory immune activity, and tissue damage shape the gene expression profiles of both tumor and non-tumor cells in unique ways.

infographic of differential expression of tumors

RNA from tumor tissue is subjected to high-throughput next-generation sequencing, which gives short nucleotide “reads” as output. These reads are ordered and aligned to the human genome, giving the amount of RNA from each gene. For each gene, we can then statistically test for differentially higher or lower expression between two groups of samples (for example, the tumors of therapy-responsive patients and non-responsive patients). We can then identify differentially expressed genes, or functionally related groups of genes, that reflect different programs of expressed genes or different cell compositions between tumors.
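The per-gene comparison between two groups can be sketched with a fold change and a Welch t statistic. This is a deliberately simplified stand-in: the text does not specify the statistical test used, and dedicated RNA-seq tools model count data more carefully. The gene names and expression values below are made up.

```python
import math
from statistics import mean, variance

def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 ratio of mean expression; the pseudocount avoids division by zero."""
    return math.log2((mean(group_a) + pseudocount) / (mean(group_b) + pseudocount))

def welch_t(group_a, group_b):
    """Welch's t statistic for two groups with possibly unequal variances."""
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    if se == 0:
        return 0.0  # no variation in either group: treat as no evidence
    return (mean(group_a) - mean(group_b)) / se

# Hypothetical normalized expression: (responders, non-responders) per gene.
expression = {
    "GENE_A": ([50, 55, 60], [10, 12, 14]),
    "GENE_B": ([20, 22, 21], [21, 20, 22]),
}
results = {
    gene: {"log2fc": log2_fold_change(a, b), "t": welch_t(a, b)}
    for gene, (a, b) in expression.items()
}
# Rank genes by the magnitude of the test statistic.
ranked = sorted(results, key=lambda g: abs(results[g]["t"]), reverse=True)
```

In practice the statistic would be converted to a p-value and corrected for testing thousands of genes at once; that step is omitted here for brevity.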


One application of differential gene expression analyses is to compare the pre-treatment and post-treatment profiles of tumors that responded to immunotherapy with those that did not. We can also identify marker genes or groups of functionally related genes that, if unusually high or low prior to treatment, correspond with better responses to particular therapies. Such predictive signatures could enable a simple pre-treatment biopsy to help tailor a patient’s treatment regimen.

In IPOP, our pipeline for automating and visualizing these analyses is constantly improving. High-dimensional data visualization tools such as oncoprints and viSNE maps allow us to organize and render dozens of parameters (e.g., RNA-seq gene expression data in parallel with clinical parameters) simultaneously, without sacrificing their complexity, to enrich our understanding of the cancer immune environment.

diagram of clinical response with anti-PD-1

In one recent study, hierarchical clustering of the expression of genes across tumor biopsies from patients who were strongly responsive (+++), weakly responsive (+), or non-responsive (-) to anti-PD-1 therapy identified two subsets of genes, one of which was highly expressed among the clinically responsive patients, and one of which was highly expressed among the non-responsive patients. Enrichment of functionally related genes in these clusters can be used to infer how such a gene signature is related to these levels of immune response.

Cell composition (in silico deconvolution)

Many different immune cell types infiltrate tissues, where they perform different roles in surveilling for tumors, injuries, or infections. For example, certain types of T cells are capable of directly killing dysfunctional, tumorigenic, or infected cells, while monocytes and macrophages take up free-floating cell debris and present these potential antigens to T cells. This interaction, which requires both T cells and antigen-presenting cells, can help locally activate or suppress all the T cells that recognize the same antigens. Meanwhile, B cells produce antibodies that can rapidly spread throughout the body to neutralize a particular threat. Thus the relative abundance of different cell types can indicate which modes of tumor recognition are active, and which may be suppressed.

The type and degree of immune infiltration into tumors play an important role in the efficacy of immunotherapy. The abundance of the messenger RNA (mRNA) of particular genes in a tissue biopsy not only allows us to identify differentially expressed genes between samples, but also enables us to calculate the relative abundance of different immune cell types in the local microenvironment. Briefly, from the mRNA of the bulk sample, we can detect high expression of signature genes or enrichment of a subset of genes that are specific to one cell type, and compare it to the expression of genes specific to other cell types. We use computational algorithms such as Support Vector Machines (SVMs) or single-sample Gene Set Enrichment Analysis (ssGSEA) to translate the expression of these signatures into relative abundances of the corresponding immune cell populations.
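The idea of turning signature-gene expression into a per-sample score can be illustrated with a much simpler method than the SVM or ssGSEA approaches named above: averaging z-scored expression over each cell type's marker genes. This mean z-score is a swapped-in technique used only for illustration, and the two short marker lists and expression values are toy examples rather than validated signatures.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize one gene's expression across samples."""
    mu, sd = mean(values), pstdev(values)
    return [0.0 if sd == 0 else (v - mu) / sd for v in values]

def signature_scores(expression, signatures):
    """expression: {gene: [value per sample]}; signatures: {cell_type: [genes]}.
    Returns {cell_type: [score per sample]} as the mean z-score of marker genes."""
    z = {gene: zscores(values) for gene, values in expression.items()}
    n_samples = len(next(iter(expression.values())))
    scores = {}
    for cell_type, genes in signatures.items():
        present = [g for g in genes if g in z]
        if not present:
            continue  # no marker genes measured for this cell type
        scores[cell_type] = [mean(z[g][i] for g in present) for i in range(n_samples)]
    return scores

# Toy expression for three samples and two illustrative marker sets.
expression = {
    "CD8A": [10, 1, 5], "CD8B": [8, 2, 5],
    "CD19": [1, 9, 5], "MS4A1": [2, 10, 6],
}
signatures = {"CD8 T cell": ["CD8A", "CD8B"], "B cell": ["CD19", "MS4A1"]}
scores = signature_scores(expression, signatures)
```

Scores like these are relative: they indicate which samples show more of a signature than others, not the absolute number of cells of that type.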

Because immunotherapies perform different functions — such as maintaining immune cell activation, rescuing immune cells that were activated then became exhausted, or stimulating antitumor reactivity among immune cells that were previously unexposed to tumor antigens — understanding which types of immune cells are present (or not) in the tumor microenvironment has implications for predicting response to these immunotherapies, and choosing the right one for each patient.


From a bulk tumor sample, mRNA molecules are extracted from the mix of immune cells (colored and gray) and non-immune cells (brown, such as tumor). Running high-throughput next-generation sequencing on the mRNA mixture provides the gene expression profile of the biopsy sample (left). Using known gene expression signatures, the proportion of mRNA molecules representing each infiltrating immune population can be deconvolved (bottom), and the composition (relative abundance) of the immune cell types in the population can be inferred (right).


Personnel: Fengshen Kuo, Alexis Desrichard


T cells recognize microbial threats

T cells recognize microbial threats and cancer by binding to degraded bits of foreign proteins (peptides) presented to them by the molecules of the major histocompatibility complex (MHC). These presentation molecules are expressed on the surface of most cell types, but especially strongly on certain immune cells that provide surveillance of tissues.

The genes that encode MHC class I proteins (called the HLA class I genes in humans) are located on chromosome 6, and there are three of them: HLA-A, HLA-B, and HLA-C. Every person has two copies (alleles) of each gene (one from each parent), and since these genes are the most polymorphic (variable in DNA sequence) in the entire human genome, the six alleles each person has are often all different, and rarely do they match those of genetically unrelated individuals. There are specific alleles (e.g. HLA-A*02:01) that are more prevalent worldwide. Moreover, the frequency of HLA alleles varies across geographic regions and populations.


Frequency of select HLA-A alleles across different geographic regions. Shown are the normalized frequencies of some HLA-A alleles in diverse geographic regions. For each HLA-A allele, each colored bar represents the frequency of the allele in a particular geographic region. The data were obtained from http://www.allelefrequencies.net/.

We are examining how the HLA alleles a patient carries affect responsiveness to immunotherapy. The presentation of peptides to T cells by the MHC proteins plays a critical role in the adaptive immune response, and strongly influences how T cells respond to those peptides. For example, some MHC molecules activate T cells strongly, which is desirable if that antigen represents a threat (such as a viral infection or a dysfunctional or mutated protein produced by a tumor) but can be dangerous if the antigen is normal and occurs on healthy cells. Because potentiating the correct recognition of self versus non-self peptides by T cells is a major function of MHCs, and this distinction becomes muddled in the case of cancer, it is important to use genomic sequencing data to identify which six HLA alleles a patient has when trying to determine how that patient’s immune system will react to mutated tumor peptides.

Currently the gold standard for identifying which HLA alleles a patient has is PCR-based typing, in which the HLA locus is specifically amplified and then sequenced. As genomic sequencing has achieved higher and higher coverage, in silico HLA genotyping offers an efficient alternative that is economical when a patient’s genome is already being sequenced. Current software tools provide up to 99% accurate resolution for most clinical applications. For clinical applications that require higher accuracy, such as prediction of tumor antigen presentation by certain HLA alleles, which differ from their closest other alleles by only a few nucleotides, we are refining the computational pipelines for HLA identification using ensemble approaches, population-based weighting, and alternative assemblies of the human reference genome.
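An ensemble approach can be as simple as a per-locus vote across several typing tools. The sketch below shows an unweighted majority vote; the tool names are placeholders, the example calls are made up, and the population-based weighting mentioned above is not implemented. It is meant only to illustrate the general idea, not the pipeline being refined.

```python
from collections import Counter

def ensemble_genotype(calls_by_tool):
    """calls_by_tool: {tool: {locus: (allele1, allele2)}}.
    Returns {locus: (most common allele pair, fraction of tools agreeing)}."""
    loci = {locus for calls in calls_by_tool.values() for locus in calls}
    result = {}
    for locus in sorted(loci):
        # Sort each pair so that allele order does not split the vote.
        votes = Counter(tuple(sorted(calls[locus]))
                        for calls in calls_by_tool.values() if locus in calls)
        pair, n_votes = votes.most_common(1)[0]
        result[locus] = (pair, n_votes / sum(votes.values()))
    return result

# Hypothetical output of three typing tools for one patient.
calls = {
    "tool_1": {"HLA-A": ("HLA-A*02:01", "HLA-A*01:01")},
    "tool_2": {"HLA-A": ("HLA-A*01:01", "HLA-A*02:01")},
    "tool_3": {"HLA-A": ("HLA-A*02:06", "HLA-A*01:01")},
}
genotype = ensemble_genotype(calls)
```

The agreement fraction gives a rough flag for loci where tools disagree, such as closely related alleles that differ by only a few nucleotides, which could then be prioritized for confirmation.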

Personnel: Diego Chowell, Vladimir Makarov, Fengshen Kuo

High-dimensional Functional Immune Profiling (CyTOF)

Understanding the cellular composition of tumor and immune cells on the level of phenotypic protein markers is a critical part of investigating tumor immunology. IPOP utilizes several experimental techniques to better quantify the expression of proteins of interest in individual tumor and immune cells. Antibody-based flow cytometry allows for the precise quantification of extracellular and intracellular proteins of interest. Using fluorescence-activated cell sorting (FACS), individual immune or tumor cell populations can be further subdivided for downstream analysis including DNA and RNA sequencing.

Occasionally, investigators may wish to quantify the expression of a large number of intracellular and extracellular proteins simultaneously from a single sample. Conventional flow cytometry limits the number of parameters that can be detected simultaneously because of spectral overlap between fluorophores. To overcome this barrier, IPOP utilizes mass cytometry by time-of-flight (CyTOF) technology. CyTOF identifies intracellular and extracellular proteins using antibodies conjugated to rare earth heavy metals. After antibody-based staining, the sample is ionized and the antibody composition of single cells is subsequently identified. The primary advantage of CyTOF is its ability to analyze a robust user-defined panel of cellular targets simultaneously from a single sample using an antibody-based approach. Multi-parametric data can subsequently be analyzed using conventional flow cytometry software or more sophisticated techniques including SPADE or viSNE plots. IPOP CyTOF projects are currently done in collaboration with the Mount Sinai Human Immune Monitoring CORE (HIMC) (212-824-9354, immunemonitoring@mssm.edu).


Tumor-infiltrating lymphocytes and peripheral blood mononuclear cells from a patient with head and neck squamous carcinoma were clustered by their staining for immunophenotypic markers (clusters represent cells with similar phenotypes, where closeness on the plot indicates similarity), using the t-distributed stochastic neighbor embedding (t-SNE) algorithm. Color scale indicates CD8 protein detected, normalized across cells, which distinguishes the cell types (clusters) expressing this marker of the cytolytic T cell lineage.


Personnel: Rajarsi Mandal

Core Technology Partnerships