Statistical Issues in Somatic Mutation Studies of Cancer


The purpose of cancer genome sequencing studies is to determine the nature and types of alterations present in a typical cancer and to discover genes mutated at high frequencies. In the earlier whole-genome studies of this kind, about 20,000 genes were sequenced using a two-stage design: all genes were sequenced in a “discovery” set of samples, and those in which at least one alteration was found were then sequenced in an additional “validation” sample. The two-stage sampling, the rarity of mutations, and the varied size and composition of genes all contribute to an interesting and unusual testing ground for statistical methodologies. In this lecture I will present some of the statistical challenges that arise in these studies, with special emphasis on multiple testing and gene set analysis.

307 East 63rd Street, 3rd Floor Conference Room


Giovanni Parmigiani
Department of Biostatistics
Harvard University