As the Klein Lab is just starting up, projects are in various stages of development. Here is a list of the first projects we are planning to work on:

  • Improving association tests through the use of priors. Our technique of choice for identifying inherited genetic variants that influence cancer phenotypes is the genome-wide association study. The advantage of this approach is that it is hypothesis-free: we make very few assumptions about the properties of the variants we are searching for. However, this is also a disadvantage, because prior knowledge about how a variant may function is ignored. We are investigating how prior information about the functional consequences of a mutation can be incorporated into genetic association studies. By integrating functional genomic data with genotype information, we hope to improve our power to discover DNA variants that are functionally relevant to cancer.
  • Genetic studies of specific cancer phenotypes. A major focus of the lab is the identification of genetic variants that influence specific cancer-related phenotypes. Performing such studies requires close collaboration with clinicians, who can evaluate patients for the study and gather their DNA, as well as with genomics core facilities that can do the genotyping. We will be performing many such studies in collaboration with various investigators, particularly those at Memorial Sloan-Kettering Cancer Center. Currently, we are working with investigators there to study the genetics of prostate, pancreatic, breast, gastric, and endometrial tumors as well as myeloproliferative disease.
  • Prediction of which SNPs may be functional. Currently, there are 12,000,000 single nucleotide polymorphisms (SNPs) known in the human genome. These SNPs, each of which represents a difference in only one of the three billion nucleotides in the genome, contribute to the genetic variation observed between humans. To design studies that identify which SNPs are functionally related to certain phenotypes, and to interpret data from such studies, it would be useful to predict which SNPs are more or less likely to have a biological effect. For instance, such predictions could serve as priors in the methods developed in the first project above. We will use computational genome analysis, including comparative genomics, to identify which SNPs appear to be functionally relevant based on their alignment with known sequence features and conserved regions. We can then use this information to generate testable hypotheses about the function of specific SNPs, and test these hypotheses in vitro.
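
To make the idea of using priors in an association test concrete, here is a minimal sketch of one well-known approach, a prior-weighted Bonferroni correction. It is only an illustration and not necessarily the method the lab will develop. The function name, the p-values, and the prior weights are all hypothetical; in practice the weights might come from functional predictions such as those described in the third project.

```python
def weighted_bonferroni(p_values, prior_weights, alpha=0.05):
    """Return indices of SNPs that pass a prior-weighted Bonferroni threshold.

    Hypothetical helper for illustration: each SNP i is declared significant
    when p_i <= alpha * w_i / m, where the weights w_i are rescaled to
    average 1 so the family-wise error rate stays controlled at alpha.
    """
    if len(p_values) != len(prior_weights):
        raise ValueError("p_values and prior_weights must have the same length")
    if any(w < 0 for w in prior_weights):
        raise ValueError("prior weights must be non-negative")
    total = sum(prior_weights)
    if total <= 0:
        raise ValueError("at least one prior weight must be positive")

    m = len(p_values)
    # Rescale the weights so that their mean is exactly 1.
    weights = [w * m / total for w in prior_weights]
    # A SNP with a larger prior weight gets a more lenient threshold.
    thresholds = [alpha * w / m for w in weights]
    return [i for i, (p, t) in enumerate(zip(p_values, thresholds)) if p <= t]


# Made-up p-values for four SNPs; the first SNP is assumed to have
# stronger prior evidence of function (weight 3 versus 1).
p_values = [0.02, 0.01, 0.2, 0.03]
weighted_hits = weighted_bonferroni(p_values, [3, 1, 1, 1])
unweighted_hits = weighted_bonferroni(p_values, [1, 1, 1, 1])
print(weighted_hits)    # [0]
print(unweighted_hits)  # [1]
```

With equal weights this reduces to the ordinary Bonferroni correction. The example also shows the trade-off: raising the weight of one SNP tightens the threshold for all the others, so a poorly chosen prior can cost power rather than add it.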