Dr. Shen’s research focuses on developing statistical and computational genomics methods and applying them to translational cancer research. She developed iCluster, a statistical data integration method that defines molecular subtypes of cancer and their associated biomarkers by simultaneously analyzing multiple “omic” data types that characterize genomic, epigenomic, transcriptomic, and proteomic aberrations in a tumor. The method has been widely used for integrative cancer subtype analysis in large-scale cancer genome consortium studies, including the NCI/NHGRI Cancer Genome Atlas (TCGA) and the Canada-UK Molecular Taxonomy of Breast Cancer International Consortium (METABRIC). Working with thoracic oncologists at MSKCC, she applied statistical machine learning approaches to characterize a patient’s prognostic risk from the somatic mutational profile of the tumor, and explored the notion of genomic staging of stage IV lung adenocarcinomas in real-world oncology datasets. She is also interested in tumor clonal heterogeneity analysis. Together with Dr. Venkatraman Seshan, she developed FACETS, an allele-specific copy number analysis method for exploring copy number aberrations and clonal heterogeneity within a tumor using whole-genome, whole-exome, and targeted capture sequencing data. Her recent research also includes a novel investigation of somatic variant richness using statistical methodologies developed in ecology and computational linguistics, joint work with Dr. Colin Begg and Dr. Saptarshi Chakraborty. This project uses sophisticated statistical tools to extract information from rare variants in existing databases, with a view to identifying the site of origin for cancers of unknown primary and for cancers detected from circulating cell-free DNA in the blood.