Pathfinder: Integrating SNP, chromatin and expression data for probabilistic fine-mapping

Genome-wide association studies (GWAS) have revealed that the majority of variants associated with disease lie in noncoding regulatory sequences. More recent studies have identified thousands of QTLs associated with chromatin modifications, which in turn are known to associate with changes in gene regulation. Thus, one proposed mechanism by which these genetic variants act is through epigenetic features, such as histone modifications, which in turn have downstream effects on transcription. In this work, we propose a method that integrates information across all three levels of data – genetic, chromatin, and gene expression – in order to identify the causal variant and chromatin mark that may be influencing gene expression. We demonstrate in simulations that our probabilistic approach produces well-calibrated posterior probabilities for causality and outperforms existing methods with respect to SNP-, mark-, and overall path-mapping.

A beta version of the code can be found here.

Please contact Megan (meganroytman@ucla.edu) or Bogdan (pasaniuc@ucla.edu) with any questions or suggestions about the software.