By Huwenbo Shi, Feb 16th, 2018, see original post @Huwenbo

The paper “Explaining Missing Heritability Using Gaussian Process Regression” by Sharp et al. tackles the problem of missing heritability and the detection of higher-order interaction effects through Gaussian process regression, a technique widely used in the machine learning community. The authors obtained estimates of broad-sense heritability for a number of mouse and yeast phenotypes using an RBF kernel that models higher-order interactions, and found these estimates to be significantly larger than the narrow-sense heritability of the same phenotypes. The authors also detected several loci displaying interaction effects.

In genetics, phenotypes are modeled by the following equation

$$y_i = f(\mathbf{x}_i) + u_i + \epsilon_i,$$

where $y_i$ is the phenotype measurement of the $i$-th individual, $\mathbf{x}_i$ the genotype vector, $u_i$ a random effect term that captures relatedness among individuals, and $\epsilon_i$ the environmental noise. Here, $f$ is a function that maps the genotype vector into a real number. Under this model, heritability is defined as the proportion of variance in $y$ that is due to variation of $f(\mathbf{x})$,

$$\frac{\mathrm{Var}[f(\mathbf{x})]}{\mathrm{Var}[y]}.$$
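To make the variance-ratio definition concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the sample sizes, the allele frequency, the purely additive choice of the genetic function, and the omission of the relatedness term.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5000, 20  # hypothetical numbers of individuals and SNPs

# Genotypes coded as 0/1/2 allele counts (assumed allele frequency 0.3).
X = rng.binomial(2, 0.3, size=(n, m)).astype(float)

# A hypothetical additive genetic function f(x) = x . beta.
beta = rng.normal(0.0, 0.2, size=m)
f = X @ beta

# Environmental noise with the same variance as f, so the simulated
# heritability should be close to 0.5 (relatedness term omitted).
e = rng.normal(0.0, np.sqrt(f.var()), size=n)
y = f + e

# Heritability: proportion of phenotypic variance due to f.
h2 = f.var() / y.var()
print(round(h2, 2))
```

In real data the genetic function is unknown, so this ratio has to be estimated from a fitted model rather than computed directly.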

Different flavors of heritability exist based on the complexity of the function $f$ and the input that goes into $f$. In general, geneticists work with four types of heritability, as listed below.

- Broad-sense heritability ($H^2$): Broad-sense heritability is the proportion of variance in phenotypes that is due to all genetic variation, including both additive and epistatic effects. For $H^2$, the function $f$ can be any function that incorporates any order of interactions between genetic variants. This is the most general definition of heritability.
- Narrow-sense heritability ($h^2$): Narrow-sense heritability is the proportion of variance in phenotypes that is due to all additive genetic effects. For $h^2$, the function $f$ is a linear function that takes in first-order terms only.
- SNP heritability ($h^2_g$): SNP heritability is the proportion of variance in phenotypes that is due to additive genetic effects of a given set of SNPs. For $h^2_g$, the function $f$ is a linear function that takes in a fixed set of SNPs.
- GWAS heritability ($h^2_{GWAS}$): GWAS heritability is the proportion of variance in phenotypes that is due to additive genetic effects of GWAS hits. For $h^2_{GWAS}$, the function $f$ is a linear function that takes in GWAS hits only.

Based on the definitions of the four flavors of heritability, it follows that $h^2_{GWAS} \le h^2_g \le h^2 \le H^2$. The missing heritability problem often refers to the gap between $h^2_{GWAS}$ and the narrow-/broad-sense heritability.
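The ordering of the heritability flavors can be illustrated with a small simulation. This is only a sketch under assumptions: the effect sizes and the single pairwise interaction are made up, the in-sample R² of a least-squares fit stands in for additive heritability, and "the first three SNPs" stands in for a restricted SNP set such as GWAS hits.

```python
import numpy as np

def linear_r2(X, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(0)
n, m = 5000, 10  # hypothetical sizes
X = rng.binomial(2, 0.5, size=(n, m)).astype(float)

# Hypothetical genetic function: additive part plus one epistatic term.
additive = X @ np.full(m, 0.3)
epistatic = (X[:, 0] - 1.0) * (X[:, 1] - 1.0)
f = additive + epistatic
y = f + rng.normal(0.0, np.sqrt(0.7), size=n)

H2 = f.var() / y.var()               # broad-sense: any function of genotype
h2 = linear_r2(X, y)                 # narrow-sense proxy: all SNPs, linear
h2_subset = linear_r2(X[:, :3], y)   # linear in a restricted SNP set

print(h2_subset <= h2 < H2)
```

The gap between `h2` and `H2` here comes entirely from the interaction term, which no linear function of the genotypes can capture.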

Parametric regression problems often involve a function, governed by a set of parameters $\theta$, that maps each input $\mathbf{x}_i$ to a response $y_i$. For example, in Poisson regression with $\log \lambda_i = \theta^\top \mathbf{x}_i$, the distribution of the response variable $y_i$ is characterized by the mean parameter $\lambda_i$ and the Poisson density function.
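As a point of comparison for what "parametric" means, here is a rough sketch of fitting a Poisson regression by Newton's method. It assumes the usual log link between the linear predictor and the Poisson mean; the function name `fit_poisson_regression`, the simulated data, and the iteration count are all illustrative choices.

```python
import numpy as np

def fit_poisson_regression(X, y, n_iter=25):
    """Maximum-likelihood fit of y ~ Poisson(exp(X @ theta)) via Newton steps."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        lam = np.exp(X @ theta)           # current Poisson means
        grad = X.T @ (y - lam)            # gradient of the log-likelihood
        hess = X.T @ (X * lam[:, None])   # negative Hessian (Fisher information)
        theta += np.linalg.solve(hess, grad)
    return theta

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one input
theta_true = np.array([0.5, 0.3])                      # hypothetical parameters
y = rng.poisson(np.exp(X @ theta_true))

theta_hat = fit_poisson_regression(X, y)
print(np.round(theta_hat, 2))
```

The whole model is pinned down by the two numbers in `theta_hat`, which is exactly the restriction that Gaussian process regression relaxes.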

Gaussian Process Regression differs from parametric regression in that one does not assume any parametric form for the function $f$. Instead, a Gaussian Process prior assumes that the function values of $f$, $f(\mathbf{x}_1), \dots, f(\mathbf{x}_n)$, for a number of inputs, $\mathbf{x}_1, \dots, \mathbf{x}_n$, follow a multivariate normal distribution $\mathcal{N}(\mathbf{0}, K)$, where $K$ is the kernel matrix, which measures the similarity between samples and constrains the possible space of $f$. Because the only constraint on the kernel function is that the resulting covariance matrix is positive semi-definite, Gaussian Process Regression can model a broad range of functions.
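A quick way to build intuition for a Gaussian Process prior is to draw functions from it. The sketch below assumes a zero mean, a one-dimensional input grid, and an RBF kernel with unit length-scale; none of these choices come from the paper. A small jitter is added to the diagonal purely for numerical stability.

```python
import numpy as np

def rbf_kernel(X, length_scale=1.0):
    """Kernel matrix with entries exp(-||x - x'||^2 / (2 * length_scale^2))."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

rng = np.random.default_rng(1)
X = np.linspace(-3.0, 3.0, 50)[:, None]   # 50 inputs on a 1-D grid
K = rbf_kernel(X)

# Function values at the inputs are jointly N(0, K); sample via Cholesky.
L = np.linalg.cholesky(K + 1e-6 * np.eye(len(X)))
f_samples = L @ rng.standard_normal((len(X), 3))   # three draws, one per column

print(f_samples.shape)
```

Each column of `f_samples` is one plausible function under the prior, evaluated at the 50 inputs; a different kernel would give draws with a different character.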

The following is a list of kernel functions that are widely used (credit to Wikipedia),

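- Linear kernel: $k(\mathbf{x}, \mathbf{x}') = \mathbf{x}^\top \mathbf{x}'$
- Polynomial kernel: $k(\mathbf{x}, \mathbf{x}') = (\mathbf{x}^\top \mathbf{x}' + c)^d$
- RBF kernel: $k(\mathbf{x}, \mathbf{x}') = \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{x}' \rVert^2}{2 l^2}\right)$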
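The three kernels in the list can be written in a few lines each. The sketch below uses their standard textbook forms; the parameter names `c`, `d`, and `length_scale` and the toy genotype matrix are assumptions for illustration.

```python
import numpy as np

def linear_kernel(X):
    """Inner product between every pair of rows."""
    return X @ X.T

def polynomial_kernel(X, c=1.0, d=2):
    """(x . x' + c)^d; degree d allows interactions of up to d inputs."""
    return (X @ X.T + c) ** d

def rbf_kernel(X, length_scale=1.0):
    """exp(-||x - x'||^2 / (2 * length_scale^2))."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * length_scale**2))

# Three hypothetical individuals genotyped at three SNPs (0/1/2 coding).
X = np.array([[0.0, 1.0, 2.0],
              [0.0, 1.0, 2.0],
              [2.0, 1.0, 0.0]])

print(linear_kernel(X)[0, 2])        # inner product of individuals 0 and 2
print(polynomial_kernel(X)[0, 0])    # (5 + 1)^2
print(rbf_kernel(X)[0, 1])           # identical genotypes
```

Individuals 0 and 1 share the same genotypes, so the RBF kernel assigns them the maximum similarity of 1, while individual 2 differs at two SNPs and receives a much lower value.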

Specifying the kernel function is a fundamental step of Gaussian Process Regression. An appropriate kernel allows one to model interactions of any order among genetic variants. In the Sharp et al. paper, the authors proposed a generalized version of the RBF kernel to measure similarity between two individuals $i$ and $j$ across the genotypes of $M$ SNPs,

$$k(\mathbf{x}_i, \mathbf{x}_j) = \sigma^2 \exp\left(-\sum_{m=1}^{M} \frac{(x_{im} - x_{jm})^2}{2 l_m^2}\right),$$

where $\sigma$ is a parameter that governs the overall similarity between $\mathbf{x}_i$ and $\mathbf{x}_j$, and $l_m$ the contribution of SNP $m$ to the variation of the phenotype - a large $l_m$ suggests that SNP $m$ contributes little to the variation of the phenotype, and a small $l_m$ implies a significant contribution. By examining the magnitude of the hyperparameters $l_m$, one can infer whether a genetic locus contributes significantly to the trait.
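The role of the per-SNP parameters is easiest to see numerically. The sketch below assumes the common "automatic relevance determination" form of the RBF kernel, in which each SNP's squared genotype difference is divided by its own squared length-scale; the exact parameterization in the paper may differ, and the function name `ard_rbf_kernel` and all values are hypothetical.

```python
import numpy as np

def ard_rbf_kernel(X, sigma, length_scales):
    """sigma^2 * exp(-sum_m (x_im - x_jm)^2 / (2 * l_m^2)) for all pairs (i, j)."""
    Z = X / length_scales                  # rescale each SNP column by its l_m
    sq_norms = np.sum(Z**2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * Z @ Z.T
    return sigma**2 * np.exp(-0.5 * np.maximum(sq_dist, 0.0))

# Individual 1 differs from individual 0 at SNP 0 only;
# individual 2 differs from individual 0 at SNP 2 only.
X = np.array([[0.0, 0.0, 0.0],
              [2.0, 0.0, 0.0],
              [0.0, 0.0, 2.0]])

# SNP 2 gets a very large length-scale, i.e. it is treated as irrelevant.
length_scales = np.array([1.0, 1.0, 1e3])
K = ard_rbf_kernel(X, sigma=1.5, length_scales=length_scales)

print(K[0, 1] / K[0, 0])   # similarity drops: SNP 0 matters
print(K[0, 2] / K[0, 0])   # similarity nearly unchanged: SNP 2 is ignored
```

A difference at a SNP with a large length-scale leaves the similarity almost untouched, which is why the fitted length-scales can be read as a measure of each SNP's relevance.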

Overfitting may occur when the number of parameters to estimate is larger than the number of data points available. To avoid overfitting and improve the parsimony of the model, the authors imposed a Gamma prior over the inverse of $l_m$, $l_m^{-1}$. The Gamma prior has density function

$$p(z; a, b) = \frac{b^a}{\Gamma(a)} z^{a-1} e^{-bz}.$$

Setting the shape parameter $a$ to at most 1 removes any interior mode in the density function, resulting in a monotonically decreasing function with most of its mass concentrated near 0 (see figure below), which pushes most of the $l_m^{-1}$ to be close to zero.
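The effect of the shape parameter on the Gamma density can be checked directly. The sketch below uses the standard shape/rate form of the Gamma density; the specific shape and rate values are chosen only for illustration and are not the paper's settings.

```python
import math

def gamma_logpdf(z, shape, rate):
    """Log density of a Gamma(shape, rate) distribution at z > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(z) - rate * z)

def is_decreasing(values):
    """True if every value is strictly smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))

grid = [0.05 * k for k in range(1, 101)]   # 0.05, 0.10, ..., 5.0

# Hypothetical shapes: 0.5 and 1.0 (no interior mode) versus 3.0 (mode at 2).
dens_half = [math.exp(gamma_logpdf(z, 0.5, 1.0)) for z in grid]
dens_one = [math.exp(gamma_logpdf(z, 1.0, 1.0)) for z in grid]
dens_three = [math.exp(gamma_logpdf(z, 3.0, 1.0)) for z in grid]

print(is_decreasing(dens_half), is_decreasing(dens_one), is_decreasing(dens_three))
```

With a shape of at most 1 the density only ever falls as the value grows, so the prior favors values near zero; with a larger shape it rises to an interior mode first.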

The Gaussian Process prior allows one to analytically integrate over the space of $f$, resulting in a posterior for the parameters $\theta$,

$$p(\theta \mid \mathbf{y}, X) \propto p(\theta) \int p(\mathbf{y} \mid \mathbf{f}, \theta)\, p(\mathbf{f} \mid X, \theta)\, d\mathbf{f},$$

where $p(\theta)$ incorporates the sparsity-inducing priors. The integration step effectively averages over all possible $f(\cdot)$, removing the need to estimate each instance of $f$ separately. This step also increases power to detect loci that contribute to phenotypes.
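When both the prior on the function values and the noise are Gaussian, the integral over functions has a closed form: the phenotypes are jointly normal with covariance equal to the kernel matrix plus the noise variance on the diagonal. The sketch below evaluates that log marginal likelihood with a Cholesky factorization. The Gaussian noise model, the toy one-dimensional inputs, and the name `log_marginal_likelihood` are assumptions for illustration.

```python
import numpy as np

def log_marginal_likelihood(y, K, noise_var):
    """log N(y; 0, K + noise_var * I), computed via a Cholesky factor."""
    n = len(y)
    L = np.linalg.cholesky(K + noise_var * np.eye(n))
    # alpha = (K + noise_var * I)^{-1} y, using two triangular solves.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))     # equals -0.5 * log det
            - 0.5 * n * np.log(2.0 * np.pi))

rng = np.random.default_rng(0)
x = rng.normal(size=30)                                # hypothetical 1-D inputs
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)      # RBF kernel, unit scale
noise_var = 0.1
y = rng.normal(size=30)

lml = log_marginal_likelihood(y, K, noise_var)
print(np.isfinite(lml))
```

Adding the log prior of the kernel parameters to this quantity gives the unnormalized log posterior that a sampler would explore.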

There is no analytical solution for the posterior mode or mean of θ. However, a sampling-based approach (e.g. MCMC) can start from an initial point and move toward regions of high posterior probability, producing samples from which the posterior mode or mean can be approximated. In the Sharp et al. paper, Hybrid Monte Carlo (also known as Hamiltonian Monte Carlo), which models a particle’s trajectory, was used to make inference over θ.
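For readers unfamiliar with Hybrid Monte Carlo, here is a bare-bones sketch of the sampler on a toy target. It is not the paper's implementation: the target is a two-dimensional Gaussian with a made-up mean, and the step size, number of leapfrog steps, and the function name `hmc` are arbitrary illustrative choices.

```python
import numpy as np

def hmc(log_prob, grad_log_prob, theta0, n_samples=2000,
        step=0.2, n_leapfrog=10, seed=0):
    """Hybrid (Hamiltonian) Monte Carlo with a leapfrog integrator."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    samples = np.empty((n_samples, theta.size))
    for i in range(n_samples):
        p0 = rng.standard_normal(theta.size)     # fresh random momentum
        q, p = theta.copy(), p0.copy()
        # Simulate the particle's trajectory with leapfrog steps.
        p = p + 0.5 * step * grad_log_prob(q)
        for _ in range(n_leapfrog - 1):
            q = q + step * p
            p = p + step * grad_log_prob(q)
        q = q + step * p
        p = p + 0.5 * step * grad_log_prob(q)
        # Metropolis correction based on the change in total energy.
        h_old = -log_prob(theta) + 0.5 * p0 @ p0
        h_new = -log_prob(q) + 0.5 * p @ p
        if np.log(rng.uniform()) < h_old - h_new:
            theta = q
        samples[i] = theta
    return samples

mu = np.array([1.0, -2.0])   # hypothetical target mean

def log_prob(theta):
    return -0.5 * np.sum((theta - mu) ** 2)

def grad_log_prob(theta):
    return -(theta - mu)

samples = hmc(log_prob, grad_log_prob, theta0=[0.0, 0.0])
print(samples.mean(axis=0))
```

In the Gaussian process setting, `log_prob` would be the unnormalized log posterior of the kernel parameters, and its gradient would be needed as well.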

Once the parameters are estimated, one can use these estimates to quantify broad-sense heritability from the Gaussian process regression model. The basic idea is as follows:

- For each sample in the training data, one first predicts its phenotype using the estimated parameters.
- The variance of the predicted phenotype can be found analytically using the conditional distribution of the multivariate normal.
- The ratio between the sum of each individual’s variance and the phenotype variance gives the broad-sense heritability.
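The steps above can be sketched roughly as follows. This is one simple reading of the procedure, not necessarily the paper's estimator: the kernel hyperparameters and noise variance are fixed by hand instead of being sampled, the simulated genetic function is invented, and the heritability estimate is taken as the variance of the posterior-mean predictions divided by the phenotype variance.

```python
import numpy as np

def rbf_kernel(X, signal_var=1.0, length_scale=2.0):
    """signal_var * exp(-||x - x'||^2 / (2 * length_scale^2))."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

rng = np.random.default_rng(0)
n, m = 400, 5   # hypothetical sizes
X = rng.binomial(2, 0.5, size=(n, m)).astype(float)

# Hypothetical genetic function with one interaction and one additive term.
f_true = (X[:, 0] - 1.0) * (X[:, 1] - 1.0) + 0.5 * (X[:, 2] - 1.0)
noise_var = f_true.var()                       # simulated heritability of 0.5
y = f_true + rng.normal(0.0, np.sqrt(noise_var), size=n)
y_centered = y - y.mean()

# Conditional (posterior) distribution of f at the training genotypes.
K = rbf_kernel(X)
C = K + noise_var * np.eye(n)
f_mean = K @ np.linalg.solve(C, y_centered)          # predicted genetic values
f_var = np.diag(K - K @ np.linalg.solve(C, K))       # their posterior variances

# Ratio of predicted-value variance to phenotype variance.
H2_est = f_mean.var() / y.var()
print(0.0 < H2_est < 1.0)
```

Because the posterior mean shrinks predictions toward zero, this plug-in ratio tends to be conservative; accounting for the posterior variances in `f_var` is one way such an estimate could be refined.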