We are interested in methods that integrate functional genomic data sets (e.g., ENCODE/Roadmap, gene expression) with GWAS to investigate the genetic basis of complex and monogenic traits. We focus both on novel locus discovery (e.g., by integrating gene expression measurements with association strength to discover new risk genes) and on fine-mapping of existing GWAS risk loci (e.g., by statistically combining functional annotations with population-specific LD patterns). See Gusev et al., bioRxiv 2015; Kichaev et al., AJHG 2015; and Kichaev et al., PLoS Genetics 2014.
With rapidly decreasing costs, sequencing is emerging as an appealing alternative to genotyping arrays for large-scale disease studies. We are interested in the design and analysis of cost-effective sequencing-based studies of tens of thousands of samples, with the goal of maximizing the power to identify disease associations for a given budget. See our extremely low-coverage sequencing GWAS paper (Pasaniuc et al. 2012)* or Mancuso et al., Nature Genetics 2015.
The genome of an admixed individual, such as an African American or Latino individual, is a mosaic of chromosomal regions originating from the ancestral populations. Inferring the ancestral origin of each of these regions is a key component of disease mapping in admixed populations. Our ongoing research addresses a wide array of problems in this area, ranging from methods for local ancestry inference to optimally powered association statistics that account for differences in the genetic makeup of the ancestral populations. See our LAMP-LD method for local ancestry inference (Baran et al. 2012)* and our MIXSCORE approach, which combines admixture and GWAS association signals to improve power (Pasaniuc et al. 2011)*.