Bayesian models with dominance effects for genomic evaluation of quantitative traits

ROBIN WELLMANN; JÖRN BENNEWITZ

doi:10.1017/S0016672312000018

Published online by Cambridge University Press: 22 February 2012

ROBIN WELLMANN*: Affiliation:
Department of Animal Husbandry and Animal Breeding, University of Hohenheim, D-70599 Stuttgart, Germany
JÖRN BENNEWITZ: Affiliation:
Department of Animal Husbandry and Animal Breeding, University of Hohenheim, D-70599 Stuttgart, Germany
* Corresponding author: Department of Animal Husbandry and Animal Breeding, University of Hohenheim, D-70599 Stuttgart, Germany. Tel: +711 459 23008. Fax: +711 459 23101. e-mail: [email protected]

Summary

Genomic selection refers to the use of dense, genome-wide markers for the prediction of breeding values (BV) and subsequent selection of breeding individuals. It has become a standard tool in livestock and plant breeding for accelerating genetic gain. The core of genomic selection is the prediction of a large number of marker effects from a limited number of observations. Various Bayesian methods that successfully cope with this challenge are known. Until now, the main research emphasis has been on additive genetic effects. Dominance coefficients of quantitative trait loci (QTLs), however, can also be large, even if dominance variance and inbreeding depression are relatively small. Considering dominance might contribute to the accuracy of genomic selection and serve as a guide for choosing mating pairs with good combining abilities. A general hierarchical Bayesian model for genomic selection that can realistically account for dominance is introduced. Several submodels are proposed and compared with respect to their ability to predict genomic BV, dominance deviations and genotypic values (GV) by stochastic simulation. These submodels differ in the way the dependency between additive and dominance effects is modelled. Depending on the marker panel, the inclusion of dominance effects increased the accuracy of GV by about 17% and the accuracy of genomic BV by 2% in the offspring. Furthermore, it slowed down the decrease of the accuracies in subsequent generations. It was possible to obtain accurate estimates of GV, which enables mate selection programmes.

Type: Research Papers
Information: Genetics Research, Volume 94, Issue 1, February 2012, pp. 21-37

DOI: https://doi.org/10.1017/S0016672312000018
Copyright: Copyright © Cambridge University Press 2012

Introduction

Genomic selection refers to the use of dense, genome-wide markers for the prediction of breeding values (BV) and subsequent selection of individuals (Meuwissen et al., 2001). Results from simulations and from real validation studies revealed that the accuracy of predicted BV of individuals without own records or without progeny records can be remarkably high (Meuwissen et al., 2001; Calus et al., 2008; Luan et al., 2009; Habier et al., 2010; Hayes et al., 2009), which offers the opportunity to accurately select individuals at an early stage of their life as parents of the next generation. This technique has become a standard tool in dairy cattle breeding (Hayes et al., 2009), and its implementation in other livestock species is foreseen, e.g. in poultry breeding (Wolc et al., 2011), in pig breeding, and also in plant breeding (Piepho, 2009; Heffner et al., 2009).

The core of genomic selection is the prediction of BV from massive marker data. The most influential methods were already proposed by Meuwissen et al. (2001). These are genomic best linear unbiased prediction (G-BLUP), BayesA and BayesB, which differ in their assumptions about the distribution of marker effects. G-BLUP assumes a normal distribution of marker effects, whereas BayesA assumes a heavier-tailed Student t-distribution (Gianola et al., 2009). Since the Student t-distribution approaches the normal distribution as the degrees of freedom v increase, G-BLUP can be considered a limiting case of BayesA. Although these methods perform well in simulation studies and applications, many markers are not needed to capture the effects of quantitative trait loci (QTLs) because they are either redundant or not in linkage disequilibrium (LD) with a QTL. This is accounted for by BayesB, which assumes that marker effects come from a Student t-distribution with a certain probability and take the value 0 otherwise. A marker effect comes from the Student t-distribution if the marker is needed to capture a QTL effect. BayesB contains BayesA as the special case where the prior probability that a marker is needed equals one. An alternative is the stochastic search variable selection (SSVS) model (George & McCulloch, 1993) that was introduced to QTL mapping by Yi et al. (2003) and was applied by Meuwissen (2009) to genomic selection. It assumes that effects of single nucleotide polymorphisms (SNPs) come from a mixture of two normal distributions with different variances, and SNPs with negligible effects come from the distribution with little variance. Bayesian SSVS is similar, but assumes prior distributions for the variances of SNP effects. As a consequence, SNP effects are from a mixture of two Student t-distributions with different variances. Bayesian SSVS is also known as BayesC (Verbyla et al., 2009, 2010). It contains BayesB as the limiting case, in which the variance of one mixing distribution approaches zero. Further Bayesian models that have been proposed are the Bayesian Lasso that assumes Laplace priors (Park & Casella, 2008; Legarra et al., 2011), and a Bayesian model that uses identity by descent (IBD) probabilities (Meuwissen & Goddard, 2004). However, Calus et al. (2008) found that a marker-effect-based model that does not make use of IBD probabilities provides similar accuracies for high-density marker panels.

Nearly all published models include only additive effects (Calus, 2010), and little has been done to generalize these models for the prediction of genomic BV to explicitly account for dominance. One reason is probably that estimated BV or deregressed BV obtained from routine evaluations (Garrick et al., 2009) are used as observations in most applications of genomic selection, so dominance deviations of individuals are absent in the data. However, if individual phenotypes are available, the inclusion of dominance effects could not only increase the accuracy of genomic selection, but predicted dominance effects could also be used to find mating pairs with good combining abilities by recovering inbreeding depression and utilizing possible overdominance. Toro & Varona (2010) demonstrated that the inclusion of dominance into Bayesian models can indeed increase the accuracy of genomic selection. They assumed independent additive and dominance effects, which was in accordance with their simulation protocol. Reproducing kernel Hilbert space regression (RKHS regression) has been proposed as an alternative method for the estimation of genotypic values (GV), especially if non-additive effects such as dominance or epistasis are included in the phenotypes (Gianola et al., 2006). It assumes that for each pair of genotypes, the covariance of their GV is defined by a covariance function (de los Campos et al., 2009). The covariance function is the reproducing kernel of a Hilbert space. RKHS regression yields an estimate of the function g that maps genotypes to GV. This estimate is optimal in a well-defined sense among all functions that belong to the Hilbert space (Gianola & van Kaam, 2008). RKHS regression does not make assumptions about the mechanism that makes GV random since any symmetric and finitely positive-semidefinite function K is the reproducing kernel of a Hilbert space (Shawe-Taylor & Cristianini, 2004).

The aim of this paper is to introduce Bayesian linear regression models for genomic evaluation of quantitative traits that account for dominance effects of QTLs. These models are generalizations of Bayesian SSVS that includes only additive effects. The proposed models differ in the way the dependency between additive effects, dominance effects and allele frequencies is modelled. Plausible informative priors are chosen which are in agreement with the genetic architectures of quantitative traits suggested in the literature. We call these generalizations the BayesD models, where D stands for dominance. The paper is organized as follows. We start with a brief literature review on the dependencies between additive effects, dominance effects and allele frequencies in real populations and discuss how Bayesian models could account for these dependencies. Statistical models that realistically account for these dependencies are defined thereafter. The joint posterior distribution of unknown model parameters is given and used to derive a Markov chain for the prediction of these parameters. Moments of the random effects are derived and are used for the calculation of the hyper-parameters. We also demonstrate that kernels for RKHS regression can be derived from the assumptions of our models. The resulting RKHS estimate is the BLUP for the underlying Bayesian model. The proposed methods are applied to a simulated population. The models are compared with respect to their ability to predict dominance deviations, GV and BV.

Theory

Possibilities to model the genetics of dominance

Since the aim of this paper is to account for dominance in genomic evaluations of quantitative traits, we present different possibilities to model the joint distribution of additive and dominance effects of the markers and show how these models are able to account for the genetic architectures that are suggested in the literature. The mathematical definitions of our models are given in the next section. Note that the proofs of all numbered equations can be found in the electronic appendix. Table 1 summarizes symbols used in this paper.

Table 1. Table of symbols

Consider biallelic QTLs with alleles 0 and 1. In this section, a_j and d_j denote the additive and dominance effect of QTL j. The 1-allele has frequency p_j and the 0-allele has frequency q_j=1−p_j. The three possible genotypes 00, 01 and 11 have GV 0, a_j+d_j and 2a_j. Dominance effects can cause inbreeding depression. Inbreeding depression of a trait is the expected decrease of the phenotypic value when the inbreeding coefficient increases from 0 to 1. Positive inbreeding depression requires that the expectation of dominance effects is positive, provided that allelic effects are assumed to be random realisations from some distribution. This means that the effect of the heterozygous genotype is usually above the average effect of the two homozygous genotypes. Analogously, for a trait with outbreeding depression, the inbreeding depression is negative and the expectation of dominance effects is negative. Dominance effects also cause dominance variance. Dominance variances of up to 50% of the additive variance have been reported for traits in dairy cattle (van Tassell et al., 2000). Substantially larger dominance variances are unlikely to occur due to the U-shaped distribution of allele frequencies (Hill et al., 2008). Dominance variances have also been estimated for some traits in beef cattle (Duangjinda et al., 2001) and pigs (Serenius et al., 2006). In general, estimates of dominance variance must be interpreted with caution because they could be confounded, e.g. by maternal effects or environmental covariance of full sibs. Moreover, an accurate estimation of dominance variance requires at least 20 times as much data as the estimation of additive variance (Misztal, 1997).
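The link between dominance effects and inbreeding depression can be checked numerically for a single locus. The following Python sketch is an illustration only: it assumes the textbook genotype frequencies under an inbreeding coefficient F (heterozygosity reduced by the factor 1−F), which are not stated in the text, and the function names and numerical values are hypothetical.

```python
def genotypic_value(genotype, a, d):
    """Genotypic value of a biallelic QTL; genotype is the number of 1-alleles (0, 1 or 2)."""
    return {0: 0.0, 1: a + d, 2: 2.0 * a}[genotype]


def mean_genotypic_value(p, a, d, F=0.0):
    """Population mean for 1-allele frequency p and inbreeding coefficient F.

    Assumption: textbook genotype frequencies under inbreeding,
    i.e. the heterozygote frequency is 2pq(1 - F).
    """
    q = 1.0 - p
    freq = {
        0: q * q + F * p * q,
        1: 2.0 * p * q * (1.0 - F),
        2: p * p + F * p * q,
    }
    return sum(freq[g] * genotypic_value(g, a, d) for g in (0, 1, 2))


# Illustrative values, not taken from the paper.
p, a, d = 0.3, 1.0, 0.4

# Expected decrease of the mean when F goes from 0 to 1.
depression = mean_genotypic_value(p, a, d, F=0.0) - mean_genotypic_value(p, a, d, F=1.0)
print(depression)  # agrees with 2 * p * q * d up to rounding
```

Under these assumptions the single-locus depression is 2p_jq_jd_j, so its sign is the sign of the dominance effect, in line with the statement that positive inbreeding depression requires a positive expectation of dominance effects.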

Figure 1 gives an overview of different possibilities to model the joint distribution of additive effects and dominance effects. The figure shows samples drawn from the proposed joint prior distribution of additive and dominance effects of markers with allele frequency q_j=p_j=0·5 for different scenarios, where additive effects are assumed to be Student t-distributed with v=2·5 degrees of freedom. Note that the models presented in the next section are more general than this illustrative example.

Fig. 1. Samples drawn from the joint prior distribution of additive and dominance effects of markers with allele frequency q_j =p_j =0·5, where additive effects are Student t-distributed with v=2·5 degrees of freedom. The distribution specifications of BayesD1–BayesD3 are given in Section 2(ii).

The simplest possibility is to assume independence of additive and dominance effects, where dominance effects have the same distribution as additive effects. It can be seen in Fig. 1 that this prior assumes that for a QTL with a large dominance effect, the additive effect is likely to be small in magnitude. This is because the t-distribution is heavy tailed, so it allows ‘large’ effects to occur. But large effects are rare events, so under independence, the probability is small that the additive effect is also large when the dominance effect is large. Thus, dominance is mainly due to overdominant alleles. But this is out of line with standard genetics theory (Kacser & Burns, 1981; Charlesworth & Willis, 2009) because theory predicts that recessive deleterious alleles rather than overdominant alleles are the primary cause of inbreeding depression. This model is therefore not considered further.

A second possibility (BayesD1) is to assume that the joint distribution of additive and dominance effects is elliptical with Student t-distributed margins. Roughly speaking, this model assumes that additive and dominance effects of a QTL are of the same magnitude. Figure 1 shows that overdominance is quite common under this assumption. The importance of overdominance cannot yet be accurately quantified (Charlesworth & Willis, 2009), so this model may be appropriate for populations in which many overdominant QTLs are expected to exist. For most applications, however, the prior possibly allows for too much overdominance.

Therefore, we consider as a third possibility (BayesD2) the case that absolute additive effects and dominance coefficients are independent. The dominance coefficient $\delta_j = d_j / |a_j|$ of a QTL is the ratio between the dominance effect and the absolute additive effect. If δ_j>1 or δ_j<−1 then the QTL is overdominant (or underdominant). If δ_j=1 or δ_j=−1 then the QTL is dominant or recessive. For δ_j=0, the QTL is completely additive and a QTL with −1<δ_j<0 or 0<δ_j<1 shows incomplete dominance or recessivity. As demonstrated in Fig. 1, the model is able to make the prior assumption that the probability is small that a dominance effect is much larger in magnitude than the additive effect, so overdominance is a rare but not negligible event. This is in accordance with Bennewitz & Meuwissen (2010), who used results from QTL experiments and found that dominance coefficients were normally distributed with small but positive mean. Here, we also assume a normal distribution for the dominance coefficient.
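The classification by dominance coefficient described above can be written as a small helper. This is a minimal Python sketch; the function names, the returned labels and the numerical tolerance are hypothetical choices made for illustration.

```python
def dominance_coefficient(a, d):
    """Ratio of the dominance effect to the absolute additive effect."""
    if a == 0:
        raise ValueError("the dominance coefficient is undefined for a = 0")
    return d / abs(a)


def classify(delta, tol=1e-12):
    """Label a QTL according to its dominance coefficient delta."""
    if abs(delta) <= tol:
        return "completely additive"
    if abs(abs(delta) - 1.0) <= tol:
        return "dominant or recessive"
    if abs(delta) > 1.0:
        return "overdominant or underdominant"
    return "incomplete dominance or recessivity"


# Illustrative effects (a, d), not taken from the paper.
for a, d in [(2.0, 0.0), (-2.0, 2.0), (1.0, 1.5), (1.0, -0.5)]:
    print(a, d, classify(dominance_coefficient(a, d)))
```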

Caballero & Keightley (1994) showed that the dominance effect depends on the additive effect such that QTLs with large absolute additive effect are likely to have large dominance coefficients. Moreover, they concluded tentatively that mutations of small effect show highly variable degrees of dominance with an average dominance coefficient close to zero. Deleterious alleles with large effect are usually close to recessivity due to the hyperbolic relationship between enzyme activity and flux (Kacser & Burns, 1981). The prior distribution of BayesD2 cannot account for this. Therefore, we consider as a fourth possibility (BayesD3) the case that markers with large absolute additive effects tend to be associated with large dominance coefficients. That is, for additive effects of markers that are small in magnitude, the average dominance coefficient is close to 0, and for additive effects of markers that are large in magnitude the average dominance coefficient is close to 1. This is also in accordance with García-Dorado et al. (1999), who suggested an average dominance coefficient of 0·8 for non-severe deleterious mutations and of 0·94–0·98 for new lethal mutants.

In addition to the dependency between dominance effects and absolute additive effects, the dependency between the allele frequencies and the signs of the allelic effects could also be considered. They are dependent because selection has likely shifted allele frequencies away from values for which the expected allele-frequency change per one generation of selection is high. From this argument, a characterization of the dependency can be derived as follows: since selection for (or against) a recessive allele is inefficient if its frequency is low (Falconer & Mackay, 1996; Fig. 2.2), recessive alleles likely have low frequencies. Recessiveness of the 1-allele implies that d_j<0 if a_j>0 and d_j>0 if a_j<0. Since p_j<0·5 is likely to hold, we have sign(a_j)=−sign((q_j−p_j)d_j) in both cases. Now consider selection for (or against) a dominant allele. If the 1-allele is dominant, then the other allele is recessive. From the above argument it follows that the other allele likely has a low frequency (q_j<0·5). It follows that p_j>0·5. Thus, dominant alleles likely have high frequencies. Dominance of the 1-allele implies that d_j>0 if a_j>0 and d_j<0 if a_j<0. Since p_j>0·5 is likely to hold, we have again sign(a_j)=−sign((q_j−p_j)d_j) with high probability. The models BayesD2 and BayesD3 account for this. Alternatively, the equation could be derived from the contribution 2p_jq_j(a_j+(q_j−p_j)d_j)² of the QTL to the additive variance because it is unlikely that a QTL has a frequency p_j for which the contribution of the QTL to the additive variance is large. The contribution is small if a_j≈−(q_j−p_j)d_j. This also shows that sign(a_j)=sign(−(q_j−p_j)d_j) likely holds for the majority of the alleles. Wellmann & Bennewitz (2011) obtained plausible estimates for parameters of the joint distribution of additive effects and dominance effects for the trait productive life (PL) in dairy cattle only if the model accounts for this. From this argument it also follows that alleles with large effect are not likely to have intermediate frequencies, except for overdominant alleles (Falconer & Mackay, 1996; p. 27ff.) and some pleiotropic alleles (e.g. DGAT1 in Holstein cattle, see Grisart et al., 2004).
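The variance argument can be illustrated numerically. The following Python sketch evaluates the contribution 2p_jq_j(a_j+(q_j−p_j)d_j)² quoted above for one locus; the function name and the numerical values are hypothetical and serve only to show that the contribution vanishes when a_j=−(q_j−p_j)d_j.

```python
def additive_variance_contribution(p, a, d):
    """Contribution 2pq(a + (q - p)d)^2 of one QTL to the additive variance."""
    q = 1.0 - p
    alpha = a + (q - p) * d  # effect entering the additive variance
    return 2.0 * p * q * alpha ** 2


# Illustrative values, not taken from the paper.
p, d = 0.2, 0.5
q = 1.0 - p

a_balanced = -(q - p) * d  # sign(a) = -sign((q - p) d)
a_opposite = (q - p) * d   # same magnitude, opposite sign

print(additive_variance_contribution(p, a_balanced, d))  # essentially zero
print(additive_variance_contribution(p, a_opposite, d))  # clearly positive
```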

The linear regression model

In this section, the general regression model and various submodels are defined. The submodels differ in the joint prior distributions of the additive and dominance effects of the markers as motivated in the previous section. We consider a linear model of the form

$y = X\beta + Z_A \tilde{a} + Z_D \tilde{d} + Zu + E,$

where the vector y consists of n observations and β is a vector of fixed effects that would usually include the intercept of the model. The matrix X has full column rank. The vector $u \sim \mathcal{N}_p(0, \Sigma)$ is normally distributed with covariance matrix Σ and independent of the marker effects, and Z is a known n×p matrix. The vector $\tilde{a} = (\tilde{a}_1, \ldots, \tilde{a}_M)^{\rm T}$ contains the additive effects of the markers and $\tilde{d} = (\tilde{d}_1, \ldots, \tilde{d}_M)^{\rm T}$ consists of the dominance effects of the markers, where M is the number of markers. The errors E_1, …, E_n are independent and normally distributed with variance σ², i.e. $E \mid \sigma^2 \sim \mathcal{N}_n(0, \sigma^2 I)$.
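To make the structure of the model concrete, the following Python sketch builds design matrices from marker genotypes and simulates phenotypes. The coding of Z_A (number of 1-alleles) and Z_D (indicator of heterozygosity) is an assumption chosen to match the genotypic values 0, a_j+d_j and 2a_j of the three genotypes; the text does not specify the coding. All numerical values are illustrative, and the term Zu is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)
n, M = 8, 5  # individuals and markers (illustrative sizes)

# Number of 1-alleles carried by each individual at each marker.
genotypes = rng.integers(0, 3, size=(n, M))

# Assumed coding: additive covariate 0/1/2, dominance covariate 1 for heterozygotes.
Z_A = genotypes.astype(float)
Z_D = (genotypes == 1).astype(float)

X = np.ones((n, 1))      # intercept only
beta = np.array([10.0])  # hypothetical fixed effect
a_tilde = rng.normal(0.0, 0.5, size=M)  # hypothetical additive marker effects
d_tilde = rng.normal(0.0, 0.2, size=M)  # hypothetical dominance marker effects
sigma2 = 1.0

genetic = Z_A @ a_tilde + Z_D @ d_tilde
E = rng.normal(0.0, np.sqrt(sigma2), size=n)
y = X @ beta + genetic + E  # the term Zu is left out in this sketch
```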

Additive effects and dominance effects of the markers are random variables. Randomness of allelic effects is most easily understood by imagining that the population is a random sample from all hypothetical populations to which the method could be applied, and the trait is randomly chosen among all traits with similar genetic architecture that could be analysed. That is, for each hypothetical population the effect of a marker at a particular position in the genome is a random realization from some distribution. We assume biallelic markers with alleles 0 and 1. Take $\tilde{\theta}_j$ to be the effect of marker j. We have $\tilde{\theta}_j = (\tilde{a}_j, \tilde{d}_j)$ if dominance effects are included in the model and $\tilde{\theta}_j = \tilde{a}_j$ otherwise. It is assumed that the distribution of $\tilde{\theta}_j$ is a mixture of two distributions that differ only by a scaling factor ε, so conditionally on a Bernoulli distributed indicator variable $\gamma_j \sim \mathcal{B}(1, p_{LD})$ we can write

$\tilde{\theta}_j \mid \gamma_j \sim \gamma_j \mathcal{F} + (1 - \gamma_j)\,\varepsilon \mathcal{F},$

where the parameter 0⩽ε≪1 is chosen small and the distribution $\mathcal{F}$ is specified below. Thus, if γ_j=1, then the marker effect comes from the distribution with large variance. This occurs with probability p_LD=E(γ_j). Markers j with γ_j=1 are those needed to capture the effects of QTLs. The a priori expected number of markers that are needed is Mp_LD.

We can write $\tilde{\theta}_j = \kappa_j \theta_j$, where κ_j=(1−γ_j)ε+γ_j and $\theta_j \sim \mathcal{F}$ is called the ‘putative’ effect of marker j. In the remaining part of the paper, (a_j,d_j)=θ_j (or a_j=θ_j) denote the putative additive effect and the putative dominance effect of marker j. Thus, if marker j is needed to capture a QTL effect (γ_j=1), then $\tilde{a}_j = a_j$ and $\tilde{d}_j = d_j$. Otherwise, the putative marker effect is regressed towards zero and we have $\tilde{a}_j = \varepsilon a_j$ and $\tilde{d}_j = \varepsilon d_j$.
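The scaling of putative effects by κ_j can be sketched in a few lines of Python. The values of M, p_LD and ε are illustrative, and the standard normal draws are only a stand-in for the distribution of the putative effects, which is specified separately in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
M, p_LD, eps = 1000, 0.1, 0.01  # illustrative values, not from the paper

# gamma_j = 1 marks markers needed to capture QTL effects.
gamma = rng.binomial(1, p_LD, size=M)

# kappa_j = (1 - gamma_j) * eps + gamma_j takes the value 1 or eps.
kappa = (1 - gamma) * eps + gamma

# Stand-in for putative effects theta_j = (a_j, d_j); NOT the prior of the paper.
theta = rng.normal(size=(M, 2))

# Marker effects: unchanged if gamma_j = 1, shrunk by eps otherwise.
theta_tilde = kappa[:, None] * theta

print(gamma.sum(), "markers drawn as needed; a priori expectation:", M * p_LD)
```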

The distribution of the putative marker effect θ_j is specified next. It is the distribution of the effects of markers needed to capture the effects of QTLs. As mentioned in the previous section, the sign of the additive effect sign(a_j), the absolute additive effect |a_j| and the dominance effect d_j depend on each other in a complicated way. The model assumes that the absolute additive effect |a_j| has a folded t-distribution, since conditionally on an inverse chi-square distributed parameter τ_j² it has a half-normal distribution. The prior is therefore

$\begin{aligned} \tau_j^2 \mid \gamma_j &\sim {\rm Inv}\text{-}\chi^2(v, s^2), \\ |a_j| \mid \tau_j^2, \gamma_j &\sim |\mathcal{N}(0, \tau_j^2)|. \end{aligned}$
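The two-stage prior of the absolute additive effect can be sampled directly. The following Python sketch assumes the usual scaled inverse chi-square parameterization, in which a draw is obtained as v·s² divided by a chi-square variate with v degrees of freedom; the function name and the parameter values are hypothetical.

```python
import numpy as np


def sample_abs_additive(v, s2, size, rng):
    """Draw tau_j^2 ~ scaled Inv-chi^2(v, s2) and |a_j| given tau_j^2 as half-normal."""
    tau2 = v * s2 / rng.chisquare(v, size)          # scaled inverse chi-square draws
    abs_a = np.abs(rng.normal(0.0, np.sqrt(tau2)))  # folded normal with variance tau2
    return tau2, abs_a


rng = np.random.default_rng(42)
# v = 2.5 as in the illustrative figure; s2 = 1.0 is an arbitrary scale.
tau2, abs_a = sample_abs_additive(v=2.5, s2=1.0, size=10_000, rng=rng)
print(np.median(abs_a), abs_a.max())  # the heavy tail shows up as a few very large draws
```

Marginally over τ_j², the draws of |a_j| follow the folded t-distribution mentioned in the text.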

The putative dominance effect d_j may depend on the absolute additive effect |a_j| in order to allow for a prior for which overdominance is a rare event. It may also depend on τ_j² in order to account for the fact that additive and dominance effects are similar in magnitude. Conditionally on |a_j| and τ_j², the dominance effect is normally distributed with mean μ_d(|a_j|)=E(d_j | |a_j|) and variance σ_d²(|a_j|,τ_j²)=Var(d_j | |a_j|,τ_j²). That is, we define the prior distribution of the putative dominance effects as

$d_j \mid |a_j|, \tau_j^2, \gamma_j \sim \mathcal{N}\left(\mu_d(|a_j|), \sigma_d^2(|a_j|, \tau_j^2)\right).$

The probability ${\rm pos}_j(d_j) = P(a_j > 0 \mid d_j)$ that the additive effect a_j is positive, given d_j, may be different for each marker and depends on the frequency of the 1-allele and thus on the coding of the marker. Let v_j=1 if a_j>0 and v_j=0 otherwise. Thus, sign(a_j)=2v_j−1 and v_j has the prior

$v_j \mid d_j, |a_j|, \tau_j^2, \gamma_j \sim \mathcal{B}(1, {\rm pos}_j(d_j)).$

As motivated in the previous section, the sign of the additive effect depends on the dominance effect and the allele frequency such that

${\rm sign}(a_j) = -{\rm sign}(q_j - p_j)\,{\rm sign}(d_j)$

likely holds for the majority of QTLs. It is convenient to assume that this equation also holds with high probability for the marker effects. Therefore, we assume that the probability that a_j is positive, given d_j, equals

$P(a_j > 0 \mid d_j) = {\rm pos}_j(d_j) = \frac{1 - w_j\,{\rm sign}(d_j)}{2},$

where w_j∊(−1,1) may depend on the frequency of marker j. For example, we may choose w_j=0, w_j=0·9 sign(q_j−p_j) or w_j=q_j−p_j. This parameter has the interpretation

$w_j = 1 - 2P({\rm sign}(a_j) = {\rm sign}(d_j)). \qquad (1)$
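The sign prior can be illustrated as follows. This Python sketch implements pos_j(d_j) and draws signs of additive effects for the example choice w_j=0·9 sign(q_j−p_j); the function names and the numerical values are hypothetical.

```python
import numpy as np


def pos(d, w):
    """P(a_j > 0 | d_j) = (1 - w_j * sign(d_j)) / 2."""
    return (1.0 - w * np.sign(d)) / 2.0


def sample_sign(d, w, rng):
    """Draw sign(a_j) = 2 * v_j - 1 with v_j ~ Bernoulli(pos_j(d_j))."""
    d = np.asarray(d, dtype=float)
    v = rng.random(d.shape) < pos(d, w)
    return 2 * v.astype(int) - 1


# Illustrative marker with a rare 1-allele, so that q_j - p_j > 0.
p_j = 0.2
q_j = 1.0 - p_j
w_j = 0.9 * np.sign(q_j - p_j)

d = np.array([0.5, -0.5, 0.3])  # hypothetical dominance effects
rng = np.random.default_rng(0)
signs = sample_sign(d, w_j, rng)

# With w_j = 0.9 the additive effect has the same sign as d_j
# with probability (1 - w_j) / 2 = 0.05, as implied by equation (1).
print(pos(d, w_j), signs)
```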

For the variance of the errors, we use the prior

$\sigma^2 \sim {\rm Inv}\text{-}\chi^2(v^*, s^{*2}) \quad {\rm or} \quad p(\sigma^2) \propto 1 \quad {\rm or} \quad p(\sigma^2) \propto \frac{1}{\sigma^2}.$

For the improper uniform prior let v*=−2, s*=0, and for the improper prior $p(\sigma^2) \propto \frac{1}{\sigma^2}$ let v*=0. Random effects are a priori independent, i.e. $u, \sigma^2, (\tilde{a}_1, \tilde{d}_1, \gamma_1, \tau_1^2), \ldots, (\tilde{a}_M, \tilde{d}_M, \gamma_M, \tau_M^2)$ are independent.
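The parameter choices for the two improper priors can be read off the density of the scaled inverse chi-square distribution. The following lines write this out, assuming the usual textbook parameterization of Inv-χ²(v*, s*²), which is not stated explicitly in the text:

```latex
\begin{align*}
p(\sigma^2 \mid v^*, s^{*2}) &\propto (\sigma^2)^{-(v^*/2 + 1)}
  \exp\left(-\frac{v^* s^{*2}}{2\sigma^2}\right), \\
v^* = -2,\ s^* = 0: \quad p(\sigma^2) &\propto (\sigma^2)^{0} = 1, \\
v^* = 0: \quad p(\sigma^2) &\propto (\sigma^2)^{-1} = \frac{1}{\sigma^2}.
\end{align*}
```

In both cases the exponential factor equals one, so only the power of σ² remains.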

We consider the following submodels. The first submodel (BayesD0) contains only additive effects, so μ_d(|a_j|)=0, σ_d²(|a_j|, τ_j²)=0, and w_j=0. Since w_j=0, the putative additive effects a_j have a t distribution with v degrees of freedom, mean 0 and variance ${\rm var}(a_j) = E(\tau_j^2) = s^2 \frac{v}{v-2}$ if v>2. For v⩽2, the variance does not exist. If v is large, then the putative additive effects are approximately normally distributed. Note that BayesC of Verbyla et al. (2010) appears as the special case where ε>0. Most of the Bayesian models mentioned in the introduction are also special cases or limiting cases of BayesD0.

In model BayesD1, additive effects and dominance effects are conditionally independent given τ_j² with w_j=0, μ_d(|a_j|)=μ_D and σ_d²(|a_j|,τ_j²)=s_D²τ_j², where s_D>0. As a consequence, the distribution of the putative marker effect θ_j=(a_j,d_j) is elliptical.

In model BayesD2, absolute additive effects |a_j| and dominance coefficients $\delta_j = \frac{d_j}{|a_j|} \sim \mathcal{N}(\mu_\Delta, \sigma_\Delta^2)$ are independent. Thus, μ_d(|a_j|)=|a_j|μ_Δ and σ_d²(|a_j|,τ_j²)=a_j²σ_Δ².

In model BayesD3, additive effects and dominance coefficients are dependent such that large additive effects are associated with large dominance coefficients. More precisely, we assume $\delta_j \mid |a_j| \sim \mathcal{N}\left(\mu_\Delta\left(\frac{|a_j|}{s}\right), \sigma_\Delta^2\right)$, where $\mu_\Delta(x) = \frac{x}{s_\Delta + x}$ with s_Δ>0. Note that the function μ_Δ increases with μ_Δ(0)=0 and $\lim_{x \to \infty} \mu_\Delta(x) = 1$, so dominance coefficients of marker effects that are small in magnitude are centred at zero and dominance coefficients of marker effects that are large in magnitude are centred at one. Thus, $\mu_d(|a_j|) = |a_j|\,\mu_\Delta\left(\frac{|a_j|}{s}\right)$, and σ_d²(|a_j|,τ_j²)=a_j²σ_Δ².
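The four submodels differ only in the conditional mean and variance of the putative dominance effect, which can be collected in one function. The following Python sketch is an illustration under that reading; the function names, the keyword names and the default hyper-parameter values are hypothetical.

```python
def mu_delta_D3(x, s_Delta):
    """Mean dominance coefficient of BayesD3: x / (s_Delta + x)."""
    return x / (s_Delta + x)


def conditional_moments(model, abs_a, tau2, *, mu_D=0.0, s_D=1.0,
                        mu_Delta=0.0, sigma_Delta=1.0, s=1.0, s_Delta=1.0):
    """Return (mu_d, sigma_d^2) of d_j given |a_j| and tau_j^2 for one submodel."""
    if model == "BayesD0":  # additive effects only
        return 0.0, 0.0
    if model == "BayesD1":  # d_j depends on tau_j^2 but not on |a_j|
        return mu_D, s_D ** 2 * tau2
    if model == "BayesD2":  # dominance coefficient independent of |a_j|
        return abs_a * mu_Delta, abs_a ** 2 * sigma_Delta ** 2
    if model == "BayesD3":  # mean dominance coefficient increases with |a_j|
        return abs_a * mu_delta_D3(abs_a / s, s_Delta), abs_a ** 2 * sigma_Delta ** 2
    raise ValueError(f"unknown model: {model}")


# Illustrative comparison for a small and a large absolute additive effect.
for abs_a in (0.1, 5.0):
    print(abs_a, conditional_moments("BayesD3", abs_a, tau2=1.0, sigma_Delta=0.3))
```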

The joint posterior distribution and the Markov chain

In this section, we present the joint posterior distribution for construction of the Markov chain. Since the Markov chain cannot be used for ε=0, we assume ε>0 in this section. In applications, the parameter ε would usually be chosen as small as possible in order to approximate a BayesB-type model. Alternatively, it could be chosen such that the accuracies of predicted BV and GV are maximized.

Let ξ=(β, u, σ²). We have p(ξ)∝p(u)p(σ²) because the assumption that β is a fixed effect is equivalent to the assumption that it has the flat prior p(β) ∝1. The joint posterior distribution is

$\eqalign{p\lpar \tilde{\theta }\comma \xi \comma \gamma \comma \tau ^{\setnum{2}} \vert y\rpar \propto \tab p\lpar y\vert \tilde{\theta }\comma \xi \rpar p\lpar u\rpar p\lpar \sigma ^{\setnum{2}} \rpar\cr\tab\times \prod\limits_{j \equals \setnum{1}}^{M} \,p\lpar {\tilde{\theta }}_{j} \vert \tau _{j}^{\setnum{2}} \comma \gamma _{j} \rpar p\lpar \tau _{j}^{\setnum{2}} \rpar p\lpar \gamma _{j} \rpar \comma$

where the likelihood function is

$p\lpar y\vert \tilde{\theta }\comma \xi \rpar \tab \propto \tab \lpar \sigma ^{\setnum{2}} \rpar ^{ \minus n\sol \setnum{2}} \exp\left( { \minus {{E^{\,\rm T} E\.} \over {2\sigma ^{\setnum{2}} }}} \right)\comma$

with $E \equals y \minus X\beta \minus Zu \minus Z_{A} \tilde{a} \minus Z_{D} \tilde{d}$ . An explicit representation of the conditional prior distribution $p\lpar {\tilde{\theta }}_{j} \vert \tau _{j}^{\setnum{2}} \comma \gamma _{j} \rpar$ of the marker effects is needed in order to derive a Markov chain that uses this parameterization. We have

(2)

$p(\tilde{\theta}_{j} \mid \tau_{j}^{2}, \gamma_{j}) = \frac{2 g_{j}(\tilde{a}_{j}, \tilde{d}_{j})}{\kappa_{j} \sqrt{2\pi \tau_{j}^{2}}} \exp\left( -\frac{\tilde{a}_{j}^{2}}{2\kappa_{j}^{2} \tau_{j}^{2}} \right) \psi_{\kappa_{j}, \tau_{j}^{2}}(\tilde{a}_{j}, \tilde{d}_{j}),$

where κ_j=(1−γ_j)ε+γ_j,

$g_{j}(\tilde{a}_{j}, \tilde{d}_{j}) = \begin{cases} \dfrac{1}{2}, & \text{for BayesD0--BayesD1}, \\ \dfrac{1 - w_{j}\, {\rm sign}(\tilde{a}_{j})\, {\rm sign}(\tilde{d}_{j})}{2}, & \text{for BayesD2--BayesD3}, \end{cases}$

and

$\psi_{\kappa_{j}, \tau_{j}^{2}}(\tilde{a}_{j}, \tilde{d}_{j}) = \begin{cases} 1, & \text{for BayesD0}, \\ \dfrac{1}{\kappa_{j} \sqrt{2\pi \sigma_{d}^{2}}} \exp\left( -\dfrac{\left( \tilde{d}_{j}/\kappa_{j} - \mu_{d} \right)^{2}}{2\sigma_{d}^{2}} \right), & \text{for BayesD1--BayesD3}. \end{cases}$

Here, we used the abbreviation

$\sigma_{d}^{2} = \sigma_{d}^{2}\left( \frac{|\tilde{a}_{j}|}{\kappa_{j}}, \tau_{j}^{2} \right), \quad \mu_{d} = \mu_{d}\left( \frac{|\tilde{a}_{j}|}{\kappa_{j}} \right).$
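As an illustration, the scale factor κ_j and the BayesD0 case of equation (2) can be sketched in Python. For BayesD0 we have g_j=1/2 and ψ=1, so the conditional prior reduces to a normal density with mean zero and variance κ_j²τ_j². The function names below are hypothetical and are not taken from the authors' software.

```python
import math


def kappa(gamma: int, eps: float) -> float:
    """Scale factor kappa_j = (1 - gamma_j) * eps + gamma_j for gamma_j in {0, 1}."""
    return (1 - gamma) * eps + gamma


def bayesd0_prior_density(a_tilde: float, tau2: float, gamma: int, eps: float) -> float:
    """Equation (2) with g_j = 1/2 and psi = 1 (BayesD0 only).

    This is the N(0, kappa_j^2 * tau_j^2) density evaluated at a_tilde.
    """
    k = kappa(gamma, eps)
    norm_const = k * math.sqrt(2.0 * math.pi * tau2)
    return math.exp(-a_tilde ** 2 / (2.0 * k ** 2 * tau2)) / norm_const
```

With γ_j=0 and ε=0·01, the density is concentrated 100 times more tightly around zero than with γ_j=1, which is how a small ε approximates a BayesB-type model.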

The Markov chain for the prediction of the model parameters is generated by Gibbs sampling with a Metropolis–Hastings step. The algorithm is described in Appendix A. Starting with initial values, the parameters are sampled from their full conditional posterior distributions. The lengthy but straightforward proofs of the full conditional posterior distributions are given in the electronic appendix.

Moments of the random effects

Moments of the random effects given in this section are needed for the calculation of hyper-parameters. They are also needed for the calculation of the RKHS kernel, which is defined in the next section. For the second moment of a_j to exist, we assume v>2. Since the putative marker effect θ_j and γ_j are independent, we have $E(\tilde{a}_{j}^{k}) = E(\kappa_{j}^{k}) E(a_{j}^{k})$ and $E(\tilde{d}_{j}^{k}) = E(\kappa_{j}^{k}) E(d_{j}^{k})$ for k∊{1, 2}. Moreover, we have $E(\tilde{a}_{j} \tilde{d}_{j}) = E(\kappa_{j}^{2}) E(a_{j} d_{j})$ and $E(|\tilde{a}_{j}|) = E(\kappa_{j}) E(|a_{j}|)$, where E(κ_j^k)=(1−p_LD)ε^k+p_LD. These formulae depend on moments of the putative marker effects. They can be calculated as

(3)

$E(a_{j}) = -w_{j} E\left( |a_{j}| \left( 1 - 2\phi\left( \frac{-\mu_{d}(|a_{j}|)}{\sigma_{d}(|a_{j}|, \tau_{j}^{2})} \right) \right) \right),$

$E(|a_{j}|) = \lambda \sqrt{E(a_{j}^{2})},$

$E(a_{j}^{2}) = s^{2} \frac{v}{v - 2},$

$E(d_{j}) = E(\mu_{d}(|a_{j}|)),$

$E(d_{j}^{2}) = E(\sigma_{d}^{2}(|a_{j}|, \tau_{j}^{2})) + E(\mu_{d}(|a_{j}|)^{2}),$

$E(a_{j} d_{j}) = -w_{j} E(|a_{j}| |d_{j}|),$

$E(|a_{j}| |d_{j}|) = E\left( |a_{j}|\, \mu_{d}(|a_{j}|)\, K\left( \frac{\sigma_{d}(|a_{j}|, \tau_{j}^{2})}{\mu_{d}(|a_{j}|)} \right) \right),$

where (Psarakis & Panaretos, 1990):

$\begin{aligned} \lambda &= 2\sqrt{\frac{v - 2}{\pi}}\, \frac{\Gamma\left( \frac{v + 1}{2} \right)}{\Gamma\left( \frac{v}{2} \right)(v - 1)}, \\ K(x) &= x\sqrt{\frac{2}{\pi}} \exp\left( -\frac{1}{2x^{2}} \right) + \left( 1 - 2\phi\left( -\frac{1}{x} \right) \right), \end{aligned}$

φ is the cumulative distribution function of a standard normal distribution, and Γ is the Gamma function. The formulae can be further simplified for the considered submodels. Simplified formulae are given in Table 2, where t~t_v has a t-distribution with v degrees of freedom.
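The closed-form ingredients of these moments, namely E(κ_j^k), λ, K(x), E(a_j²) and E(|a_j|), can be evaluated with a short Python sketch. The function names are hypothetical. The ratio of Gamma functions is computed on the log scale, which is an implementation choice made here to avoid overflow for large v.

```python
import math


def expected_kappa_power(p_ld: float, eps: float, k: int) -> float:
    """E(kappa_j^k) = (1 - p_LD) * eps^k + p_LD."""
    return (1.0 - p_ld) * eps ** k + p_ld


def std_normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function (phi in the text)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lam(v: float) -> float:
    """lambda = 2 sqrt((v-2)/pi) Gamma((v+1)/2) / (Gamma(v/2) (v-1)), for v > 2."""
    if v <= 2.0:
        raise ValueError("v must exceed 2 for the second moment to exist")
    log_gamma_ratio = math.lgamma((v + 1.0) / 2.0) - math.lgamma(v / 2.0)
    return 2.0 * math.sqrt((v - 2.0) / math.pi) * math.exp(log_gamma_ratio) / (v - 1.0)


def k_function(x: float) -> float:
    """K(x) = x sqrt(2/pi) exp(-1/(2 x^2)) + (1 - 2 phi(-1/x)), for x > 0."""
    tail = 1.0 - 2.0 * std_normal_cdf(-1.0 / x)
    return x * math.sqrt(2.0 / math.pi) * math.exp(-1.0 / (2.0 * x * x)) + tail


def second_moment_a(s2: float, v: float) -> float:
    """E(a_j^2) = s^2 v / (v - 2)."""
    return s2 * v / (v - 2.0)


def expected_abs_a(s2: float, v: float) -> float:
    """E(|a_j|) = lambda sqrt(E(a_j^2))."""
    return lam(v) * math.sqrt(second_moment_a(s2, v))
```

The remaining expectations in equation (3) are taken over |a_j| and τ_j² and depend on the submodel-specific functions μ_d and σ_d, so they are not covered by this sketch.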

Table 2. Model-specific expectations

Prediction of genotypic values

Take Ω={0, 1, 2}^M to be the set of all possible multilocus genotypes at the M markers. For x∊Ω, x_j is the number of 1-alleles at a particular marker j. According to the regression model described above, an individual with genotype x is assumed to have GV

$g_{\rm GV}(x) = \sum_{j=1}^{M} \left( \tilde{a}_{j} + (2 - x_{j}) \tilde{d}_{j} \right) x_{j},$

provided that Z_A is the gene content matrix with entries 0, 1, 2 and Z_D is the indicator matrix for heterozygosity. That is, the GV is the sum of all additive effects and dominance effects that are carried by the individual. According to Falconer & Mackay (1996), the GV can be partitioned into a BV, a dominance deviation, and a contribution to the overall mean that is equal for all individuals. The BV is

$g_{\rm BV}(x) = \sum_{j=1}^{M} \left( \tilde{a}_{j} + (q_{j} - p_{j}) \tilde{d}_{j} \right) (x_{j} - 2p_{j}),$

and the dominance deviation is

$g_{\rm DV}(x) = \sum_{j=1}^{M} \left( -\tilde{d}_{j} x_{j} (x_{j} - 1 - 2p_{j}) - 2p_{j}^{2} \tilde{d}_{j} \right).$

These are the formulae of Falconer & Mackay (1996, Table 7.3), except that summation is over the markers rather than over the QTLs.
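The partition can be checked numerically with a minimal Python sketch. It assumes that p_j is the frequency of the 1-allele, that q_j=1−p_j, and that the common contribution to the mean is Σ_j (2p_j ã_j + 2p_j q_j d̃_j), the Hardy–Weinberg expectation of the GV. The function names are hypothetical.

```python
from typing import Sequence


def genotypic_value(x: Sequence[int], a: Sequence[float], d: Sequence[float]) -> float:
    """g_GV(x) = sum_j (a_j + (2 - x_j) d_j) x_j."""
    return sum((aj + (2 - xj) * dj) * xj for xj, aj, dj in zip(x, a, d))


def breeding_value(x, a, d, p) -> float:
    """g_BV(x) = sum_j (a_j + (q_j - p_j) d_j)(x_j - 2 p_j), with q_j = 1 - p_j."""
    return sum((aj + (1 - 2 * pj) * dj) * (xj - 2 * pj)
               for xj, aj, dj, pj in zip(x, a, d, p))


def dominance_deviation(x, a, d, p) -> float:
    """g_DV(x) = sum_j (-d_j x_j (x_j - 1 - 2 p_j) - 2 p_j^2 d_j)."""
    return sum(-dj * xj * (xj - 1 - 2 * pj) - 2 * pj ** 2 * dj
               for xj, dj, pj in zip(x, d, p))


def mean_contribution(a, d, p) -> float:
    """Part of the GV that is equal for all individuals (assumed HWE mean)."""
    return sum(2 * pj * aj + 2 * pj * (1 - pj) * dj for aj, dj, pj in zip(a, d, p))
```

For every genotype x, genotypic_value(x, a, d) equals the sum of breeding_value, dominance_deviation and mean_contribution, so ranking individuals by EGV or by EBV can give different orders only through the dominance deviation.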

Different methods are considered for the prediction of BV, dominance deviations and GV: the Bayesian methods explained above and an RKHS method. For the Bayesian methods, additive and dominance effects of the markers are predicted as posterior means from the MCMC algorithm. The predicted values are then inserted into the above equations to get the estimated genomic genotypic value EGV(x), the estimated genomic breeding value EBV(x) and the estimated genomic dominance deviation EDV(x).

In our paper, an RKHS method is used only for the prediction of GV. The kernel is derived from the assumptions of the regression model. Recall that the known genotypes are fixed explanatory variables. Randomness of GV arises from the randomness of allelic effects, so the function g_GV(⋅) is random. Since RKHS regression assumes that GV are normally distributed with mean zero (Gianola & van Kaam, 2008), but $E(g_{\rm GV}(x)) = \sum_{j=1}^{M} \left( E(\tilde{a}_{j}) + (2 - x_{j}) E(\tilde{d}_{j}) \right) x_{j} \ne 0$, we cannot estimate g_GV(⋅) directly. Instead, we estimate g(x)=g_GV(x)−E(g_GV(x)) with RKHS regression from the observations that are diminished by the expected GV. Then, an estimate of the GV is obtained as $\widehat{g}_{\rm GV}(x) = \widehat{g}(x) + E(g_{\rm GV}(x))$. For convenience, g(x) is also said to be a GV. The model assumptions imply that the covariance between GV of individuals with genotypes x and $\tilde{x}$ is

(4)

$\begin{aligned} {\rm Cov}(g(x), g(\tilde{x})) = {} & \sum_{j=1}^{M} x_{j} \tilde{x}_{j}\, {\rm Var}(\tilde{a}_{j}) + \sum_{j=1}^{M} x_{j} \tilde{x}_{j} (2 - \tilde{x}_{j})(2 - x_{j})\, {\rm Var}(\tilde{d}_{j}) \\ & + \sum_{j=1}^{M} x_{j} \tilde{x}_{j} (4 - x_{j} - \tilde{x}_{j})\, {\rm Cov}(\tilde{a}_{j}, \tilde{d}_{j}). \end{aligned}$

This equation defines the covariance matrix K for the GV of the phenotyped and genotyped individuals in the estimation set. Formulae for the calculation of the variances and covariances that appear on the right-hand side of the equation are given in the previous section. Then, the covariance matrix K can be calculated and used as a genomic relationship matrix for BLUP-prediction of GV. The resulting estimates are RKHS estimates (de los Campos et al., 2009). Take $\widehat{\bf g}$ to be the vector of predicted GV for individuals in the estimation set. It is used for the prediction of GV of non-phenotyped individuals as follows. For an individual with genotype x₀, the vector K₀ of covariances between the GV of this individual and the GV of all individuals in the estimation set can be calculated from equation (4). The estimated GV of the new individual is $\widehat{g}(x_{0}) = K_{0}^{T} K^{-1} \widehat{\bf g}$. Since x₀ was arbitrary, this equation defines an estimate $\widehat{g}$ of g. Note that the function $K(x, \tilde{x}) = {\rm Cov}(g(x), g(\tilde{x}))$ is symmetric and finitely positive-semidefinite because it is defined via a covariance function. The name RKHS regression results from the fact that a Hilbert space exists for which the covariance function $K(x, \tilde{x})$ is a reproducing kernel. The Hilbert space is defined in Appendix C. The estimated function $\widehat{g}$ belongs to this Hilbert space and is optimal in a well-defined sense among all functions that belong to this Hilbert space (Gianola & van Kaam, 2008).
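The kernel of equation (4) is simple to evaluate once the per-marker variances and covariances are available. The following Python sketch uses hypothetical function names and takes Var(ã_j), Var(d̃_j) and Cov(ã_j, d̃_j) as given inputs. It builds the kernel matrix K for the estimation set and the vector K₀ for a new genotype, and leaves the linear solve for K₀ᵀK⁻¹ĝ to a linear algebra library.

```python
from typing import List, Sequence


def gv_kernel(x: Sequence[int], x_tilde: Sequence[int],
              var_a: Sequence[float], var_d: Sequence[float],
              cov_ad: Sequence[float]) -> float:
    """Cov(g(x), g(x_tilde)) according to equation (4)."""
    total = 0.0
    for xj, yj, va, vd, c in zip(x, x_tilde, var_a, var_d, cov_ad):
        total += xj * yj * va                          # additive part
        total += xj * yj * (2 - yj) * (2 - xj) * vd    # dominance part
        total += xj * yj * (4 - xj - yj) * c           # additive-dominance covariance
    return total


def kernel_matrix(genotypes: List[Sequence[int]], var_a, var_d, cov_ad) -> List[List[float]]:
    """Matrix K of covariances between the GV of all individuals in the estimation set."""
    return [[gv_kernel(xi, xk, var_a, var_d, cov_ad) for xk in genotypes]
            for xi in genotypes]


def kernel_vector(x0: Sequence[int], genotypes, var_a, var_d, cov_ad) -> List[float]:
    """Vector K_0 of covariances between a new genotype x0 and the estimation set."""
    return [gv_kernel(x0, xi, var_a, var_d, cov_ad) for xi in genotypes]
```

Because each term is symmetric in x and x̃, the resulting matrix is symmetric by construction. Homozygous genotypes (x_j∊{0, 2}) contribute nothing to the dominance term.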

This RKHS estimate is also the BLUP. Since all RKHS estimates are linear, it is the best estimate that can be obtained with RKHS regression, provided that the model assumptions are satisfied by the data and that the first two moments of the random effects are assumed known. In the applications, we calculated the kernel from the assumptions of BayesD2, so the RKHS estimate is nothing but the BLUP of BayesD2 (as opposed to the best predictor).

Calculation of hyper-parameters

For the calculation of the constant hyper-parameters, we assume whenever possible that additive variance, dominance variance and inbreeding depression of a random mating population are completely explained by the markers. This is not optimal because in fact, markers from low-density panels are not able to explain the total dominance variance. Thus, higher accuracies could be achieved when the parameters are chosen by a grid search or by cross validation. But we think that this is the natural way to choose the parameters. If the contribution of LD to the additive variance and the dominance variance is neglected, then expected additive variance V_AM, dominance variance V_DM and inbreeding depression $\mathcal{I}_{M}$ explained by markers are

$\begin{aligned} V_{\rm AM} &= E\left( \sum_{j \in \mathcal{M}} h_{j} \left( \tilde{a}_{j} + (q_{j} - p_{j}) \tilde{d}_{j} \right)^{2} \right), \\ V_{\rm DM} &= E\left( \sum_{j \in \mathcal{M}} h_{j}^{2} \tilde{d}_{j}^{2} \right), \\ \mathcal{I}_{M} &= E\left( \sum_{j \in \mathcal{M}} h_{j} \tilde{d}_{j} \right), \end{aligned}$

where h_j=2p_jq_j is the heterozygosity of marker j in the case of Hardy–Weinberg equilibrium. Note that these formulae assume that the matrix Z_A in the model specification denotes the gene content matrix and Z_D is the indicator matrix for heterozygosity. The expectations can be calculated, which gives

(5)

$\begin{aligned} V_{\rm AM} &= M E(\kappa_{j}^{2}) \left( \overline{h_{o}}\, E(a_{j}^{2}) - 2 E(|a_{j}| |d_{j}|)\, \tilde{\gamma}_{M} + \gamma_{M} E(d_{j}^{2}) \right), \\ V_{\rm DM} &= M E(\kappa_{j}^{2})\, \overline{h_{o}^{2}}\, E(d_{j}^{2}), \\ \mathcal{I}_{M} &= M E(\kappa_{j})\, \overline{h_{o}}\, E(d_{j}), \end{aligned}$

where $\overline{h_{o}}$ is the average heterozygosity, $\overline{h_{o}^{2}}$ is the average squared heterozygosity, $\gamma_{M} = \frac{1}{M} \sum_{j \in \mathcal{M}} h_{j} (q_{j} - p_{j})^{2}$ and $\tilde{\gamma}_{M} = \frac{1}{M} \sum_{j \in \mathcal{M}} h_{j} (q_{j} - p_{j}) w_{j}$.
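Equation (5) separates the allele-frequency summaries from the moments of the putative effects, which a short Python sketch makes explicit. The function and argument names are hypothetical, q_j is taken as 1−p_j, and the moments E(a_j²), E(d_j), E(d_j²) and E(|a_j||d_j|) are supplied by the caller because they depend on the submodel.

```python
from typing import Sequence, Tuple


def marker_summaries(p: Sequence[float], w: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (mean h, mean h^2, gamma_M, gamma_tilde_M) from allele frequencies p and weights w."""
    m = len(p)
    h = [2.0 * pj * (1.0 - pj) for pj in p]            # h_j = 2 p_j q_j
    h_bar = sum(h) / m
    h2_bar = sum(hj * hj for hj in h) / m
    gamma_m = sum(hj * (1.0 - 2.0 * pj) ** 2 for hj, pj in zip(h, p)) / m
    gamma_tilde_m = sum(hj * (1.0 - 2.0 * pj) * wj for hj, pj, wj in zip(h, p, w)) / m
    return h_bar, h2_bar, gamma_m, gamma_tilde_m


def explained_components(m: int, e_kappa: float, e_kappa2: float,
                         h_bar: float, h2_bar: float,
                         gamma_m: float, gamma_tilde_m: float,
                         e_a2: float, e_d: float, e_d2: float,
                         e_abs_a_abs_d: float) -> Tuple[float, float, float]:
    """Return (V_AM, V_DM, I_M) according to equation (5)."""
    v_am = m * e_kappa2 * (h_bar * e_a2 - 2.0 * e_abs_a_abs_d * gamma_tilde_m + gamma_m * e_d2)
    v_dm = m * e_kappa2 * h2_bar * e_d2
    i_m = m * e_kappa * h_bar * e_d
    return v_am, v_dm, i_m
```

With the choice w_j=q_j−p_j, the two summaries γ_M and γ̃_M coincide.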

For the model without dominance effects (BayesD0), the parameter s² is obtained from the condition V_AM=V_A, which gives

(6)

$s^{2} = \frac{V_{A}}{M\, \overline{h_{o}}\, E(\kappa_{j}^{2})} \cdot \frac{v - 2}{v}.$
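Equation (6) follows from equation (5) with E(d_j²)=0 and E(|a_j||d_j|)=0, together with E(a_j²)=s²v/(v−2). A minimal Python sketch is given below. The function name is hypothetical, and the average heterozygosity passed in is an illustrative input rather than a value from the simulation.

```python
def bayesd0_scale(v_a: float, m: int, h_bar: float,
                  p_ld: float, eps: float, v: float) -> float:
    """s^2 from equation (6): V_A / (M * h_bar * E(kappa^2)) * (v - 2) / v."""
    if v <= 2.0:
        raise ValueError("v must exceed 2")
    e_kappa2 = (1.0 - p_ld) * eps ** 2 + p_ld   # E(kappa_j^2)
    return v_a / (m * h_bar * e_kappa2) * (v - 2.0) / v
```

Substituting the returned s² back into V_AM=M E(κ_j²) h̄_o s² v/(v−2) recovers V_A, which is a convenient check on an implementation.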

For the calculation of the fixed hyper-parameters for models that include dominance effects, it is assumed that V_A, V_D and $\mathcal{I}$ are known or have been estimated. Three parameters are chosen such that the conditions V_AM=V_A, V_DM=V_D and $\mathcal{I}_{M} = \mathcal{I}$ hold. These are s², μ_D and s_D² for BayesD1, s², μ_Δ, σ_Δ² for BayesD2, and s², s_Δ, σ_Δ² for BayesD3. The formulae for the calculation of these parameters are lengthy and are given in Appendix B.

Application

Simulation

A Fisher–Wright diploid population with population size N=1000 was simulated by sampling individuals for breeding with replacement for 5000 generations. Thereafter, the effective population size decreased for 400 generations from 1000 to 100 with a fast decrease in the most recent generations according to $N_{e,t-400} = 100 + 900\, \frac{1 - {\rm e}^{0{\cdot}005t - 2}}{1 - {\rm e}^{-2}}$. This formula was chosen in order to reproduce the LD-pattern that is observed in cattle (compare with the estimated historic N_e of cattle breeds, given in Villa-Angulo et al., 2009). The total population size remained constant. This was achieved by reducing the number n_mt of males and increasing the number n_ft of females such that $N_{e,t} = \frac{4 n_{{\rm m}t} n_{{\rm f}t}}{n_{{\rm m}t} + n_{{\rm f}t}}$.
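The decline in effective population size and the implied numbers of males and females can be sketched in Python. Here t is read as the number of generations since the start of the decline (t=0, …, 400), and the numbers of males and females are obtained by solving n_m+n_f=N together with the formula for N_e. Both the function names and the use of continuous rather than rounded numbers of animals are assumptions of this sketch.

```python
import math
from typing import Tuple


def effective_size(t: float) -> float:
    """N_e after t generations of decline: 1000 at t = 0, 100 at t = 400."""
    return 100.0 + 900.0 * (1.0 - math.exp(0.005 * t - 2.0)) / (1.0 - math.exp(-2.0))


def sex_numbers(n_e: float, n_total: float = 1000.0) -> Tuple[float, float]:
    """Solve n_m + n_f = n_total and 4 n_m n_f / (n_m + n_f) = n_e with n_m <= n_f."""
    discriminant = n_total * n_total - n_e * n_total
    if discriminant < 0.0:
        raise ValueError("n_e cannot exceed n_total")
    n_m = (n_total - math.sqrt(discriminant)) / 2.0
    return n_m, n_total - n_m
```

For N_e=100 and N=1000 this gives roughly 26 males and 974 females, whereas N_e=1000 corresponds to equal numbers of both sexes.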

The genome consisted of one chromosome of 1 Morgan with a mutation rate of 5×10⁻⁸. We simulated ten populations. For each population, ten traits with the same characteristics were simulated, so in total there were 100 replicates. In generation 0, 50 SNP for each trait were randomly selected among the about 84 000 segregating SNP to become QTLs. This corresponds to 1500 QTLs in a 30 Morgan genome. The additive and dominance effects of each QTL were sampled according to Scenario 3 fitted to PL in Wellmann & Bennewitz (2011). Thus, the QTL effects were sampled from the same distribution for all traits. Alleles with large effect tend to be partially recessive or dominant with heterozygous effect above the average effect of the two homozygotes. But alleles of small effect show highly variable dominance coefficients. The sign of the additive effect a_j was chosen dependent on the allele frequency such that the contribution of the QTL to the additive variance is small, i.e. sign(a_j)=−sign((q_j−p_j)d_j). After additive and dominance effects had been sampled, they were rescaled by the same factor to obtain a heritability of h²=V_A=0·15 for each trait. Realized dominance variance and inbreeding depression varied considerably between traits because only one chromosome was simulated. Average dominance variance and inbreeding depression were $\overline{V}_{D} = 0{\cdot}07$ and $\overline{\mathcal{I}} = 0{\cdot}6$.

Starting with generation 1, the population was maintained without selection for five generations with N_e,t=100 but N_t=1000. The markers were identified in generation 1 based on the minor allele frequency (MAF) and on the distance to neighbouring markers. We considered three marker sets with 1500, 3000 and 6000 markers. This corresponds to 45 000, 90 000 and 180 000 markers in a 30 Morgan genome. Markers with high MAF were favoured. Average r²-values between adjacent markers were similar for all marker sets. They ranged from 0·36 to 0·40. The marker effects were predicted once in generation 1 from 1000 individuals. Expected dominance variance V_D=0·072 and inbreeding depression $\mathcal{I}$=0·59 were used as parameters to predict marker effects. Thus, the same hyper-parameters were used for all traits. Predicted marker effects were used to calculate estimated BV, dominance values and GV in generations 1–5. According to the scaling argument introduced by Meuwissen (2009), the results can be extended to a population with a 30 Morgan genome and 30 000 individuals in the estimation set.

Analysis of the simulated data sets

We compared the accuracies that are obtained by the BayesD methods with G-BLUP, BayesA, BayesC and RKHS regression. Recall that BayesC does not contain dominance effects, BayesD1 assumes conditionally independent additive effect and dominance effects, BayesD2 assumes independent absolute additive effects and dominance coefficients, and BayesD3 assumes dependent absolute additive effects and dominance coefficients such that large additive effects are associated with large dominance coefficients.

G-BLUP and BayesA assume that all markers have a non-negligible effect, so p_LD=1. For all other methods, we have chosen the scaling factor ε=0·01 and p_LD depending on the marker panel. For marker panel k=1,2,3 with M=750×2^k markers, we used p_LD=0·8×0·35^k. That is, we expected a priori that on average p_LD M/50=12×0·7^k markers are needed to capture the effect of one QTL for marker panel k. The formula for the calculation of p_LD was chosen such that the expected number of required markers approaches the number of QTL when the size of the marker panel approaches the total number of SNP in the genome. For BayesD2 and BayesD3, we used w_j=q_j−p_j. This parameter determines the probability that additive effect and dominance effect have the same sign for marker j. The degrees of freedom also had to be specified for each method. We used v=100 for G-BLUP, v=2·1 for BayesA, and v=2·5 for all other methods. The prior for the error variance was p(σ²)∝1/σ². For simplicity, Xβ=1μ and Zu=0 were assumed. For BayesD1, the MCMC chains were run for 50 000 cycles and for all other methods, they were run for 10 000 cycles. The first 50% of the cycles were discarded. We used ten cycles in the Metropolis–Hastings step to sample the additive effects. Marker effects were estimated by the posterior means. The RKHS-regression estimate of the GV was calculated as described in Section (v). The reproducing kernel is defined by equation (4) and the underlying regression model is the same as for BayesD2, so the RKHS estimate has the property that it is the BLUP of BayesD2.
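The panel-specific settings follow directly from the formulae above. A small Python sketch (with a hypothetical function name, and 50 QTLs per chromosome as stated in the simulation protocol) reproduces the arithmetic:

```python
from typing import Tuple


def panel_settings(k: int, n_qtl: int = 50) -> Tuple[int, float, float]:
    """Return (M, p_LD, expected markers per QTL) for marker panel k = 1, 2, 3."""
    m = 750 * 2 ** k                      # number of markers in panel k
    p_ld = 0.8 * 0.35 ** k                # prior probability that a marker is needed
    markers_per_qtl = p_ld * m / n_qtl    # equals 12 * 0.7^k
    return m, p_ld, markers_per_qtl


for k in (1, 2, 3):
    m, p_ld, per_qtl = panel_settings(k)
    print(f"panel {k}: M={m}, p_LD={p_ld:.5f}, markers per QTL={per_qtl:.3f}")
```

The expected number of markers per QTL shrinks by a factor of 0·7 each time the panel size doubles.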

Results

Table 3 shows the accuracies of predicted BV, dominance deviations (DV) and GV for generations 2 and 5, where 3000 markers per chromosome are used. It also shows the regressions b_BV, b_DV and b_GV of true BV, dominance deviations and GV on estimated values. Among all BayesD methods, BayesD3 yielded the highest accuracies and the regressions of true on predicted values are close to one. BayesD2 performed very similarly and was only slightly worse. In generation 2, BayesC yielded a 7% higher accuracy of predicted BV than G-BLUP and BayesD3 resulted in 10% higher accuracy. The accuracy of G-BLUP decreased by 10% from generation 2 to generation 5, whereas the accuracy of BayesD3 decreased only by 5%. As a consequence, BayesC had an 11% higher accuracy in generation 5, and BayesD3 had a 15% higher accuracy than G-BLUP. The methods differ very much in their ability to predict GV. The accuracy of GV in generation 2 was 15% less than the accuracy of BV if G-BLUP was used for the prediction, but it was only 3% less if BayesD3 was used. The accuracy of GV of BayesD3 exceeded that of G-BLUP by 26% in generation 2 and by 33% in generation 5. The accuracy of GV obtained from RKHS regression exceeded that of G-BLUP by 4% in generation 2 and by 5% in generation 5. The choice of p_LD and v did not affect the RKHS estimate (not shown). The results are visualized in Fig. 2. The figure shows the accuracies of predicted BV, dominance deviations and GV in generations 1–5 for the set with 3000 markers. It can be seen that the accuracy of the dominance deviation is below the accuracy of the BV. Interestingly, the accuracy of the dominance deviation decreases only very little from generation 1 to generation 5. This suggests that markers that capture the dominance effect of a QTL must be in high LD and therefore in close proximity of the QTL.
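The two summary statistics reported here can be computed from vectors of true and predicted values as in the following Python sketch. It assumes, as is customary in genomic evaluation, that the accuracy is the Pearson correlation between true and predicted values and that b is the slope of the regression of true on predicted values. The function names are hypothetical.

```python
import math
from typing import Sequence


def _centered(values: Sequence[float]) -> list:
    """Deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def accuracy(true: Sequence[float], predicted: Sequence[float]) -> float:
    """Pearson correlation between true and predicted values."""
    t, p = _centered(true), _centered(predicted)
    cross = sum(ti * pi for ti, pi in zip(t, p))
    return cross / math.sqrt(sum(ti * ti for ti in t) * sum(pi * pi for pi in p))


def regression_true_on_predicted(true: Sequence[float], predicted: Sequence[float]) -> float:
    """Slope b of the regression of true on predicted values."""
    t, p = _centered(true), _centered(predicted)
    return sum(ti * pi for ti, pi in zip(t, p)) / sum(pi * pi for pi in p)
```

A slope close to one indicates that the predictions are neither inflated nor shrunken relative to the true values, which is a property separate from the accuracy itself.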

Fig. 2. Accuracy of predicted BV (a), dominance deviations (b) and GV (c) for generations 1–5 and 3000 markers per chromosome.

Table 3. Accuracies of predicted BV, dominance deviations (DV) and GV in generations 2 and 5, regressions b_BV, b_DV and b_GV of true on predicted values, and computation times per cycle relative to BayesA

Table 3 also shows that the BayesD methods required about twice as much computation time per cycle as the methods without dominance effects. This was expected because twice as many effects need to be predicted. The Metropolis–Hastings step that was needed to sample the additive effects in methods BayesD2 and BayesD3 increased the computation time only slightly. Figure 3 shows the mean accuracy of predicted GV in generations 1–5 when the sampler was stopped after 10, 100, 1000 or 10 000 cycles, and the first 50% of the cycles were discarded. GV were predicted from 3000 markers per chromosome. It can be seen that BayesA, BayesC, BayesD2 and BayesD3 approximately reached convergence after 1000 iterations, whereas BayesD1 may not have reached convergence, even after 10 000 iterations. The same applies for BV and dominance deviations.

Fig. 3. Mean accuracy of predicted GV in generations 1–5 calculated from 10^x iterations of the sampler for 3000 markers per chromosome.

Figure 4 shows the mean accuracy of predicted BV, dominance deviations and GV in generations 1–5 for the different marker panels. Markers from high-density panels had a smaller MAF on average than markers from low-density panels, so they were less informative. This explains why the accuracies of G-BLUP, BayesA and BayesC increase only slightly for high-density panels. It can be seen that for an accurate prediction of dominance deviations and GV, high-density marker panels are needed. The likely reason for the increased accuracy of dominance deviations for high-density marker panels is that the QTLs are on average in higher LD with a marker. For each QTL, we calculated the maximum r²-value between the QTL and a marker. The average maximum r²-values were 0·49, 0·64, and 0·80 for the different marker panels.

Fig. 4. Accuracy of predicted BV (a), dominance deviations (b), and GV (c) for marker panels with 1500, 3000 and 6000 markers per chromosome. The average maximum r² values of a QTL with a marker are shown on the x-axis for the different panels.

Discussion

New Bayesian models for the prediction of genomic BV, dominance deviations and GV have been introduced. The BayesD models outperformed BayesA and BayesC for the simulated data. BayesD3 and BayesD2, which both assume dependent additive and dominance effects, performed best. We showed that these methods not only enable an accurate prediction of BV, dominance deviations and GV, but they also lead to a smaller decrease of the accuracy of genomic BV in subsequent generations. Accuracies of genomic BV of BayesD methods were larger than the accuracies of methods that do not account for dominance. However, for an accurate prediction of dominance deviations and GV, high-density marker panels are needed. Computation time per predicted random effect was similar to BayesA. Interestingly, BayesD1 produced rather high accuracies even though the assumptions of this model were violated in the simulated data. For example, it did not take into account that dominance effects decreased rather than increased the additive variance of the population. However, it yielded smaller accuracies than the other BayesD methods and it showed a slower convergence of the Markov chain.

G-BLUP showed the strongest decrease of the accuracies in subsequent generations. This was expected because G-BLUP assumes normally distributed additive marker effects. This distribution is not heavy tailed. Therefore, more markers would be needed to capture the effect of one large QTL. These additional markers partly have a greater distance to the QTL, so recombinations can cause a greater drop in accuracy. BayesD3 showed the smallest decrease. Moreover, for the BayesD methods, the accuracy of the dominance deviations decreased only very little in subsequent generations. This suggests that a marker must be in strong LD with a QTL in order to capture the dominance effect of the QTL because otherwise, recombinations are likely to cause a faster decrease of the accuracies. As additive and dominance effects are dependent, and since both of them affect the BV of an individual, this could explain why the inclusion of dominance effects also slows the decrease in accuracy of genomic BV in subsequent generations (Fig. 2).

Since the BayesD2 estimate and the RKHS estimate (which is the BLUP of BayesD2) were derived from the same statistical model, it was expected that both methods provide similar accuracies. However, the increase in accuracies of GV over G-BLUP was much smaller for RKHS regression than for the Bayesian models. The reason is probably that RKHS regression could provide the best predictor if the GV are normally distributed, but this assumption was violated even though a relatively large number of QTLs was simulated. For RKHS regression, the increase in accuracy over G-BLUP was larger than the increases reported by other authors. Ober et al. (2011) found that the accuracy in the validation set obtained with universal kriging was 0·013 larger than the accuracy of a genomic BLUP method. These authors chose the kernel from the family of Matérn covariance functions and additive variance and dominance variance were equal in the simulation. In our study, RKHS regression yielded a 0·027 higher accuracy than G-BLUP, although the dominance variance was only about half as large as the additive variance in our simulation. This suggests that a model-based definition of the kernel can increase the accuracies of RKHS regression estimates. A kernel that accounts for non-additive effects and uses SNP information was also proposed by Gianola & de los Campos (2008) in analogy to the model of Henderson (1985), which relies on the assumptions of Cockerham (1954) and Kempthorne (1954). In contrast to our model that assumes that the known genotypes are fixed and randomness of GV is due to randomness of the allelic effects, Cockerham and Kempthorne assumed implicitly that randomness of GV arises from randomness of the genotypes, and the function g that maps genotypes to GV is unknown and fixed. The joint distribution of the genotypes induces a covariance between GV that depends on the genetic architecture of the trait, i.e. on the unknown function g. Although it is often stated in the literature that g can be arbitrary, from these arguments it follows that RKHS regression makes strong assumptions about the genetic architecture of the trait because GV are inferred from the covariance structure of the GV, and the covariance structure depends on the genetic architecture. Gianola & de los Campos (2008) stated that the choice of the kernel is indeed absolutely critical for attaining good predictions in RKHS regression.

If the number of markers is large and the prior probability p_LD of a marker for being needed to capture a QTL effect is small, then the cumulative effect of all SNP drawn from the distribution with small variance can explain a considerable part of the variance even if ε=0·01. But it is desirable that the model assumes only a small polygenic component if the marker density is large, so ε should be small in this case. However, a value ε≪0·01 considerably slows convergence of the Markov chain, so one has to compromise. Alternatively, a BayesB-type algorithm that allows for ε=0 may be used for this model.

The conditional variances τ_j² were introduced only in order to obtain a hierarchical model that facilitates estimation of the marker effects with an MCMC algorithm. As noted by Gianola et al. (2009) for BayesA, these parameters cannot be estimated precisely from the data because the model does not allow Bayesian learning on these parameters. However, since they have no biological interpretation, estimates of these parameters are not needed.

There are several possibilities to further generalize our model. For example, the parameter s² could be different for each marker. That is, each marker could have its own variance. This could make sense in order to account for prior knowledge about a QTL that is in LD with the marker, or in order to account for the joint distribution of additive effects, dominance effects and allele frequencies. In the latter case, s_j² would depend on the allele frequency of marker j. We assumed a folded t-distribution for the absolute values of the putative additive effects. As a consequence, we had to choose v>2 because otherwise the additive effects would have infinite variance. Since the degrees of freedom v control the thickness of the tails of a t-distribution, the choice of v could have a large effect on the accuracies. In this paper, this parameter was chosen by a grid search using cross validation. Alternatively, v could be treated as random and sampled with a Metropolis–Hastings step. Priors are proposed that assign a small probability to large values of v, but exclude v<2 (e.g. $p(v) \propto \frac{1}{v^{2}}$, see Rosa et al., 2004). The t-distribution could be modified to become a generalized hyperbolic distribution in order to force the variance of the additive effects to exist even for v<2.

Different models may be appropriate for application depending on the objective of a study. If the aim of a study is the exploration of the genetic architecture of a quantitative trait then allelic effects may be predicted with different models and the assumptions of the model with the best predictive ability are likely to give a good description of the genetic architecture. However, it is unknown how epistasis would affect the predictive ability of the models, so inclusion of epistasis would be the logical next step. Alternatively, the genetic architecture could be explored by evaluation of the posterior distribution. In this case, BayesD1 may be the method of choice because it makes weak prior assumptions. Xu (2003) demonstrated that improper priors with heavy tails for additive and dominance effects produce clearer signals of QTL than the normal distribution. Therefore, small values of v and p_LD could be preferable for QTL detection because this results in a more heavy-tailed distribution. If the aim of a study is the prediction of genomic BV or GV, then the model with best predictive ability should be chosen, provided that the computation time is acceptable. This is likely to be BayesD2 or BayesD3 because these models give the best fit to the genetic architectures that are suggested in the literature. Our simulation study confirms the superiority of BayesD2 and BayesD3. Similar joint distributions of additive effects and dominance effects were assumed for the simulation protocol and for BayesD3. If, contrary to our assumptions, the true joint distribution is very different for a trait (e.g. a trait with many overdominant alleles), then of course BayesD2 and BayesD3 would not be superior.

New methods have been introduced that enable computationally feasible, simultaneous and accurate prediction of BV, dominance deviations and GV for high-density marker sets. The number of females genotyped with a low-density marker set in dairy cattle breeds is increasing rapidly. High-density marker genotypes or even whole-genome sequences of sires and grandsires are becoming available for imputation. That is, high-density marker sets can be used for the prediction of BV and GV even though most individuals are only genotyped with a low-density marker set (Meuwissen & Goddard, 2010a, 2010b). Thus, the data needed to estimate dominance effects are becoming available. The conclusions drawn in this study are based on simulation experiments. The simulation protocol was designed to realistically model the dependencies between additive and dominance effects of QTLs for quantitative traits, following the suggestions of Wellmann & Bennewitz (2011). Once real genomic data from traits with precise information on the genetic architecture, including dominance effects, become available, they should be used to validate the proposed models in the spirit of Hayes et al. (2010).

R. W. was supported by a grant from the Deutsche Forschungsgemeinschaft, DFG. The authors thank Christine Baes for language correction. The manuscript has benefited from the critical comments of the anonymous reviewers.

Appendix A: The Markov Chain

In this section, the Markov chain used for the prediction of the model parameters is described. In each cycle, the fixed effect β, the random effect u, the marker effects $\tilde{\theta}_{1}, \ldots, \tilde{\theta}_{M}$, the indicator variables γ₁, …, γ_M, the conditional variances $\tau_{1}^{2}, \ldots, \tau_{M}^{2}$, and the error variance σ² are sampled (in this order). Details are described below.

Sampling of the fixed effect β

The full conditional posterior distribution of β is

(7)

$\beta \mid \tilde{\theta}, u, \sigma^{2}, \gamma, \tau^{2}, y \sim \mathcal{N}\left(\widehat{\beta},\, \sigma^{2}(X^{\mathrm{T}}X)^{-1}\right),$

where

$\widehat{\beta} = (X^{\mathrm{T}}X)^{-1}X^{\mathrm{T}}(y - Zu - Z_{A}\tilde{a} - Z_{D}\tilde{d}).$

In the special case where β=μ contains only the intercept, i.e. Xβ=1μ, the full conditional posterior simplifies to

$\mu \mid \tilde{\theta}, u, \sigma^{2}, \gamma, \tau^{2}, y \sim \mathcal{N}\left(\frac{1}{n}1^{\mathrm{T}}(y - Zu - Z_{A}\tilde{a} - Z_{D}\tilde{d}),\, \frac{\sigma^{2}}{n}\right).$
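For the intercept-only case, the draw reduces to a single univariate normal sample. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `sample_intercept` and the toy numbers are hypothetical, and `residual` stands for the vector y − Zu − Z_A ã − Z_D d̃ computed elsewhere.

```python
import math
import random

def sample_intercept(residual, sigma2, rng=random):
    """Draw mu from N(mean(residual), sigma2 / n).

    `residual` is assumed to hold y - Zu - Z_A a - Z_D d, i.e. the data
    with every effect except the intercept already subtracted.
    """
    n = len(residual)
    mean = sum(residual) / n
    return rng.gauss(mean, math.sqrt(sigma2 / n))

# Toy check with made-up numbers: the draws should centre on mean(residual).
rng = random.Random(1)
residual = [2.1, 1.9, 2.3, 1.7]
draws = [sample_intercept(residual, sigma2=0.04, rng=rng) for _ in range(5000)]
avg_draw = sum(draws) / len(draws)
```

With several fixed effects, the same idea would require a multivariate normal draw with covariance σ²(XᵀX)⁻¹, which this sketch does not cover.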

Sampling of the random effect u

The full conditional posterior distribution of u is

(8)

$u \mid \tilde{\theta}, \beta, \sigma^{2}, \gamma, \tau^{2}, y \sim \mathcal{N}_{p}\left(\bar{u},\, \sigma^{2}(Z^{\mathrm{T}}Z + \sigma^{2}\Sigma^{-1})^{-1}\right),$

where

$\bar{u} = (Z^{\mathrm{T}}Z + \sigma^{2}\Sigma^{-1})^{-1}Z^{\mathrm{T}}(y - X\beta - Z_{A}\tilde{a} - Z_{D}\tilde{d}).$

Sampling of the marker effects $\tilde{\theta}_{1}, \ldots, \tilde{\theta}_{M}$

We have

(9)

$p(\tilde{a}_{j} \mid \tilde{a}_{-j}, \tilde{d}, \xi, \gamma, \tau^{2}, y) \propto f(\tilde{a}_{j})\, g_{j}(\tilde{a}_{j}, \tilde{d}_{j})\, \psi(\tilde{a}_{j}),$

where

$\psi(\tilde{a}_{j}) = \psi_{\kappa_{j},\tau_{j}^{2}}(\tilde{a}_{j}, \tilde{d}_{j}) \propto \begin{cases} 1 & \text{for BayesD0 and BayesD1,} \\ \dfrac{1}{|\tilde{a}_{j}|}\exp\left(-\dfrac{\left(\tilde{d}_{j} - |\tilde{a}_{j}|\,\mu_{\Delta}\right)^{2}}{2\sigma_{\Delta}^{2}\tilde{a}_{j}^{2}}\right) & \text{for BayesD2,} \\ \dfrac{1}{|\tilde{a}_{j}|}\exp\left(-\dfrac{\left(\tilde{d}_{j} - |\tilde{a}_{j}|\,\mu_{\Delta}\!\left(\frac{|\tilde{a}_{j}|}{\kappa_{j}s}\right)\right)^{2}}{2\sigma_{\Delta}^{2}\tilde{a}_{j}^{2}}\right) & \text{for BayesD3,} \end{cases}$

and $f(\tilde{a}_{j}) = f_{\mathcal{N}(\mu_{f}, \sigma_{f}^{2})}(\tilde{a}_{j})$ is the density of a normal distribution with mean and variance

$\mu_{f} = \frac{y'^{\mathrm{T}} Z_{A(j)}}{Z_{A(j)}^{\mathrm{T}} Z_{A(j)} + \sigma^{2}/(\kappa_{j}^{2}\tau_{j}^{2})},$

$\sigma_{f}^{2} = \frac{\sigma^{2}}{Z_{A(j)}^{\mathrm{T}} Z_{A(j)} + \sigma^{2}/(\kappa_{j}^{2}\tau_{j}^{2})},$

where

$y' = y - X\beta - Zu - Z_{A(-j)}\tilde{a}_{-j} - Z_{D}\tilde{d}.$

Since $g_{j}(\tilde{a}_{j}, \tilde{d}_{j}) = \mathrm{pos}_{j}(\tilde{d}_{j})$ if $\tilde{a}_{j} > 0$ and $g_{j}(\tilde{a}_{j}, \tilde{d}_{j}) = 1 - \mathrm{pos}_{j}(\tilde{d}_{j})$ if $\tilde{a}_{j} < 0$, a distribution with density proportional to $h(\tilde{a}_{j}) = f(\tilde{a}_{j})\, g_{j}(\tilde{a}_{j}, \tilde{d}_{j})$ is a mixture of two truncated normal distributions. Random numbers $\tilde{a}_{\mathrm{cand}}$ from h are needed as candidate values for the Metropolis–Hastings step that samples $\tilde{a}_{j}$. The probability $p_{\mathrm{pos}}$ that a random variable with this distribution is positive equals

$p_{\mathrm{pos}} = \frac{g_{j}(1, \tilde{d}_{j})\int_{0}^{\infty} f(\tilde{a}_{j})\, d\tilde{a}_{j}}{g_{j}(1, \tilde{d}_{j})\int_{0}^{\infty} f(\tilde{a}_{j})\, d\tilde{a}_{j} + g_{j}(-1, \tilde{d}_{j})\int_{-\infty}^{0} f(\tilde{a}_{j})\, d\tilde{a}_{j}} = \frac{g_{j}(1, \tilde{d}_{j})(1 - F(0))}{g_{j}(1, \tilde{d}_{j})(1 - F(0)) + g_{j}(-1, \tilde{d}_{j})F(0)},$

where F is the cumulative distribution function with density f. Sampling of $\tilde{a}_{\mathrm{cand}}$ from h proceeds as follows:

Sample $I_{\mathrm{pos}}$ from $\mathcal{B}(1, p_{\mathrm{pos}})$.
If $I_{\mathrm{pos}} = 0$ then sample U from $\mathcal{U}[0, F(0)]$, else sample U from $\mathcal{U}[F(0), 1]$.
Return $\tilde{a}_{\mathrm{cand}} = F^{-1}(U)$.
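The three steps above can be sketched with the inverse-CDF method from Python's standard library. This is an illustrative sketch rather than the authors' code: the function name `sample_truncated_mixture` and the arguments `g_pos` and `g_neg` (standing for g_j(1, d̃_j) and g_j(−1, d̃_j)) are hypothetical, and the sketch ignores extreme cases in which F(0) is numerically 0 or 1.

```python
import math
import random
from statistics import NormalDist

def sample_truncated_mixture(mu, sigma2, g_pos, g_neg, rng=random):
    """Draw from a density proportional to f(x) * g(x), where f is the
    N(mu, sigma2) density and g equals g_pos for x > 0 and g_neg for x < 0."""
    dist = NormalDist(mu, math.sqrt(sigma2))
    f0 = dist.cdf(0.0)                       # F(0)
    w_pos = g_pos * (1.0 - f0)               # unnormalised mass on x > 0
    w_neg = g_neg * f0                       # unnormalised mass on x < 0
    p_pos = w_pos / (w_pos + w_neg)
    if rng.random() < p_pos:                 # I_pos = 1
        u = rng.uniform(f0, 1.0)
    else:                                    # I_pos = 0
        u = rng.uniform(0.0, f0)
    # Keep u strictly inside (0, 1) so that inv_cdf is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return dist.inv_cdf(u)

# Toy check with made-up numbers: the share of positive draws should be
# close to p_pos.
rng = random.Random(7)
draws = [sample_truncated_mixture(0.5, 1.0, g_pos=0.8, g_neg=0.2, rng=rng)
         for _ in range(20000)]
frac_pos = sum(x > 0 for x in draws) / len(draws)
```

The same routine would apply to the dominance effect d̃_j, whose full conditional is also a mixture of two truncated normals. Inverse-CDF sampling loses accuracy far in the tails, so a production sampler might prefer a dedicated truncated-normal algorithm.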

Here, $\mathcal{U}[a, b]$ denotes the uniform distribution on the interval [a, b]. The Metropolis–Hastings algorithm can be used to sample $\tilde{a}_{j}$ from the full conditional posterior as follows (Chib & Greenberg, 1995):

Sample $\tilde{a}_{j}$ from h
For (i in 1:maxIt) {
  Sample $\tilde{a}_{\mathrm{cand}}$ from h
  Let $\alpha = \min\left(\frac{\psi(\tilde{a}_{\mathrm{cand}})}{\psi(\tilde{a}_{j})}, 1\right)$
  With probability α let $\tilde{a}_{j} = \tilde{a}_{\mathrm{cand}}$
}
Return $\tilde{a}_{j}$

Note that the Metropolis–Hastings step is not needed if $\psi(\tilde{a}_{j}) \propto 1$.
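Because candidates are drawn from h ∝ f·g and the target is proportional to f·g·ψ, the factors f and g cancel in the Metropolis–Hastings ratio, which is why only ψ appears in α. A minimal sketch of this loop for the BayesD2 form of ψ is given below; it is an illustration under assumptions, not the authors' implementation. All function names, the default `max_it=5`, and the argument `pos_d` (standing for pos_j(d̃_j)) are hypothetical, and ψ is evaluated on the log scale for numerical stability.

```python
import math
import random
from statistics import NormalDist

def draw_from_h(mu_f, sigma2_f, pos_d, rng):
    """Candidate from h = f * g: N(mu_f, sigma2_f) reweighted by pos_d on
    a > 0 and by 1 - pos_d on a < 0 (inverse-CDF method)."""
    dist = NormalDist(mu_f, math.sqrt(sigma2_f))
    f0 = dist.cdf(0.0)
    w_pos, w_neg = pos_d * (1.0 - f0), (1.0 - pos_d) * f0
    if rng.random() < w_pos / (w_pos + w_neg):
        u = rng.uniform(f0, 1.0)
    else:
        u = rng.uniform(0.0, f0)
    return dist.inv_cdf(min(max(u, 1e-12), 1.0 - 1e-12))

def log_psi_bayesd2(a, d, mu_delta, sigma2_delta):
    """log psi for BayesD2, up to an additive constant."""
    return (-math.log(abs(a))
            - (d - abs(a) * mu_delta) ** 2 / (2.0 * sigma2_delta * a * a))

def sample_additive_effect(mu_f, sigma2_f, pos_d, d, mu_delta, sigma2_delta,
                           max_it=5, rng=random):
    """Independence Metropolis-Hastings with proposal h and target h * psi."""
    a = draw_from_h(mu_f, sigma2_f, pos_d, rng)
    for _ in range(max_it):
        cand = draw_from_h(mu_f, sigma2_f, pos_d, rng)
        log_ratio = (log_psi_bayesd2(cand, d, mu_delta, sigma2_delta)
                     - log_psi_bayesd2(a, d, mu_delta, sigma2_delta))
        # Accept with probability min(psi(cand) / psi(a), 1);
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < min(log_ratio, 0.0):
            a = cand
    return a

# Toy run with made-up numbers; pos_d = 1 forces positive additive effects.
rng = random.Random(3)
draws = [sample_additive_effect(0.3, 0.05, pos_d=1.0, d=0.1,
                                mu_delta=0.2, sigma2_delta=0.09, rng=rng)
         for _ in range(1000)]
```

For BayesD3, only `log_psi_bayesd2` would need to change, with μ_Δ replaced by a function of |ã_j|/(κ_j s).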

It remains to be shown how $\tilde{d}_{j}$ is sampled for models that include dominance effects. The full conditional posterior distribution of $\tilde{d}_{j}$ is

(10)

$p(\tilde{d}_{j} \mid \tilde{a}, \tilde{d}_{-j}, \xi, \gamma, \tau^{2}, y) \propto \tilde{f}(\tilde{d}_{j})\, g_{j}(\tilde{a}_{j}, \tilde{d}_{j}),$

where $\tilde{f}(\tilde{d}_{j}) = f_{\mathcal{N}(\mu_{\tilde{f}}, \sigma_{\tilde{f}}^{2})}(\tilde{d}_{j})$ is the density of a normal distribution with mean and variance

$\mu_{\tilde{f}} = \frac{y'^{\mathrm{T}} Z_{D(j)} + \sigma^{2}\mu_{d}/(\sigma_{d}^{2}\kappa_{j})}{Z_{D(j)}^{\mathrm{T}} Z_{D(j)} + \sigma^{2}/(\sigma_{d}^{2}\kappa_{j}^{2})}, \qquad \sigma_{\tilde{f}}^{2} = \frac{\sigma^{2}}{Z_{D(j)}^{\mathrm{T}} Z_{D(j)} + \sigma^{2}/(\sigma_{d}^{2}\kappa_{j}^{2})},$

where

$y' = y - X\beta - Zu - Z_{A}\tilde{a} - Z_{D(-j)}\tilde{d}_{-j}.$

Since $g_{j}(\tilde{a}_{j}, \tilde{d}_{j}) = (1 - w_{j}\,\mathrm{sign}(\tilde{a}_{j}))/2$ if $\tilde{d}_{j} > 0$ and $g_{j}(\tilde{a}_{j}, \tilde{d}_{j}) = (1 + w_{j}\,\mathrm{sign}(\tilde{a}_{j}))/2$ if $\tilde{d}_{j} < 0$, the full conditional posterior distribution of $\tilde{d}_{j}$ is a mixture of two truncated normal distributions. We have

$\tilde{p}_{\mathrm{pos}} = P(\tilde{d}_{j} > 0 \mid \mathrm{ELSE}) = \frac{g_{j}(\tilde{a}_{j}, 1)(1 - \tilde{F}(0))}{g_{j}(\tilde{a}_{j}, 1)(1 - \tilde{F}(0)) + g_{j}(\tilde{a}_{j}, -1)\tilde{F}(0)},$

where $\tilde{F}$ is the cumulative distribution function with density $\tilde{f}$. Sampling of $\tilde{d}_{j}$ proceeds as follows:

Sample $I_{\mathrm{pos}}$ from $\mathcal{B}(1, \tilde{p}_{\mathrm{pos}})$.
If $I_{\mathrm{pos}} = 0$ then sample U from $\mathcal{U}[0, \tilde{F}(0)]$, else sample U from $\mathcal{U}[\tilde{F}(0), 1]$.
Return $\tilde{d}_{j} = \tilde{F}^{-1}(U)$.

Sampling of γ₁, …, γ_M

The full conditional posterior distribution of γ_j is

(11)

$\gamma_{j} \mid \tilde{\theta}, \xi, \gamma_{-j}, \tau^{2}, y \sim \mathcal{B}\left(1,\, \frac{\omega_{1}\, p_{\mathrm{LD}}}{\omega_{1}\, p_{\mathrm{LD}} + \omega_{0}(1 - p_{\mathrm{LD}})}\right),$

where

$\omega_{0} = \frac{1}{\varepsilon}\exp\left(-\frac{\tilde{a}_{j}^{2}}{2\varepsilon^{2}\tau_{j}^{2}}\right)\psi_{\varepsilon,\tau_{j}^{2}}(\tilde{a}_{j}, \tilde{d}_{j}), \qquad \omega_{1} = \exp\left(-\frac{\tilde{a}_{j}^{2}}{2\tau_{j}^{2}}\right)\psi_{1,\tau_{j}^{2}}(\tilde{a}_{j}, \tilde{d}_{j}).$

Updating of γ₁, …, γ_M is only needed if p_LD < 1. After sampling of γ_j, κ_j needs to be updated.
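The indicator update can be sketched as below for the case ψ ∝ 1 (BayesD0 and BayesD1), with ψ left as an optional argument. This is a hedged illustration, not the authors' code: the function names are hypothetical, and the rule κ_j = 1 if γ_j = 1 and κ_j = ε otherwise is an assumption read off from the way ε and 1 enter ω₀ and ω₁.

```python
import math
import random

def prob_gamma_one(a, tau2, eps, p_ld, psi=lambda a, kappa: 1.0):
    """P(gamma_j = 1 | rest) following equation (11).

    `psi(a, kappa)` stands for psi_{kappa, tau_j^2}(a_j, d_j); the default
    constant 1 corresponds to BayesD0 and BayesD1.
    """
    omega0 = math.exp(-a * a / (2.0 * eps * eps * tau2)) / eps * psi(a, eps)
    omega1 = math.exp(-a * a / (2.0 * tau2)) * psi(a, 1.0)
    return omega1 * p_ld / (omega1 * p_ld + omega0 * (1.0 - p_ld))

def sample_gamma(a, tau2, eps, p_ld, rng=random):
    """Draw gamma_j and return it together with the updated kappa_j."""
    gamma = 1 if rng.random() < prob_gamma_one(a, tau2, eps, p_ld) else 0
    kappa = 1.0 if gamma == 1 else eps   # assumed update rule for kappa_j
    return gamma, kappa

# An effect of exactly zero favours the small-variance component,
# a large effect favours the large-variance component.
p_small = prob_gamma_one(0.0, tau2=1.0, eps=0.01, p_ld=0.5)
p_large = prob_gamma_one(1.0, tau2=1.0, eps=0.01, p_ld=0.5)
```

For very large |ã_j| both weights can underflow to zero; a more careful version would work with log-weights.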

Sampling of $\tau_{1}^{2}, \ldots, \tau_{M}^{2}$

For BayesD1, the full conditional posterior distribution of $\tau_{j}^{2}$ is

(12)

$\tau_{j}^{2} \mid \tilde{\theta}, \xi, \gamma, \tau_{-j}^{2}, y \sim \mathrm{Inv}\text{-}\chi^{2}\left(v + 2,\, \frac{\frac{\tilde{a}_{j}^{2}}{\kappa_{j}^{2}} + \left(\frac{\tilde{d}_{j}/\kappa_{j} - \mu_{D}}{s_{D}}\right)^{2} + v s^{2}}{v + 2}\right).$

Otherwise, the full conditional posterior distribution of $\tau_{j}^{2}$ is

(13)

$\tau_{j}^{2} \mid \tilde{\theta}, \xi, \gamma, \tau_{-j}^{2}, y \sim \mathrm{Inv}\text{-}\chi^{2}\left(v + 1,\, \frac{\frac{\tilde{a}_{j}^{2}}{\kappa_{j}^{2}} + v s^{2}}{v + 1}\right).$
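A draw from equation (13) can be sketched as follows, assuming that Inv-χ²(ν, s²) denotes the scaled inverse chi-square distribution with ν degrees of freedom and scale s², so that a draw equals ν·s²/X with X ~ χ²(ν). That parameterization, the function names and the toy numbers are assumptions made for illustration.

```python
import random

def sample_scaled_inv_chi2(df, scale, rng=random):
    """Draw from Inv-chi^2(df, scale) as df * scale / X with X ~ chi^2(df)."""
    chi2 = rng.gammavariate(df / 2.0, 2.0)   # chi^2(df) = Gamma(shape df/2, scale 2)
    return df * scale / chi2

def sample_tau2(a, kappa, v, s2, rng=random):
    """Full conditional of tau_j^2 for models other than BayesD1, equation (13)."""
    df = v + 1.0
    scale = (a * a / (kappa * kappa) + v * s2) / df
    return sample_scaled_inv_chi2(df, scale, rng)

# Toy check with made-up numbers: for a = 0, v = 10 and s2 = 1 the mean of
# the draws should be close to df * scale / (df - 2) = 10 / 9.
rng = random.Random(11)
draws = [sample_tau2(a=0.0, kappa=1.0, v=10.0, s2=1.0, rng=rng)
         for _ in range(20000)]
mean_tau2 = sum(draws) / len(draws)
```

The BayesD1 case in equation (12) and the error variance σ² would use the same `sample_scaled_inv_chi2` helper with their own degrees of freedom and scale.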

Sampling of the error variance σ²

The full conditional posterior distribution of σ² is

(14)

$\sigma^{2} \mid \tilde{\theta}, \beta, u, \gamma, \tau^{2}, y \sim \mathrm{Inv}\text{-}\chi^{2}\left(n + v^{\ast},\, \frac{E^{\mathrm{T}}E + v^{\ast} s^{\ast 2}}{n + v^{\ast}}\right),$

where $E = y - X\beta - Zu - Z_{A}\tilde{a} - Z_{D}\tilde{d}$.

Appendix B: Calculation of hyper-parameters

For BayesD1 with conditionally independent additive and dominance effects, the parameters s², μ_D and s_D need to be specified. We have

(15)

$\mu_{D} = \frac{\mathcal{I}}{M\bar{h}_{\circ}E(\kappa_{j})}, \qquad s^{2} = \frac{V_{A} - \frac{\gamma_{M}}{\overline{h_{\circ}^{2}}}V_{D}}{M\bar{h}_{\circ}E(\kappa_{j}^{2})}\,\frac{v - 2}{v}, \qquad s_{D}^{2} = \frac{\frac{V_{D}}{M\overline{h_{\circ}^{2}}E(\kappa_{j}^{2})} - \mu_{D}^{2}}{s^{2}}\,\frac{v - 2}{v}.$

For BayesD2 with independent absolute additive effects and dominance coefficients, the parameters s², μ_Δ and $\sigma_{\Delta}^{2}$ are obtained as follows. Since

(16)

$\frac{\sigma_{\Delta}^{2}}{\mu_{\Delta}^{2}} = \frac{V_{D}}{\mathcal{I}^{2}}\,\frac{M\bar{h}_{\circ}^{2}E(\kappa_{j})^{2}\lambda^{2}}{\overline{h_{\circ}^{2}}\,E(\kappa_{j}^{2})} - 1,$

we can calculate $K(\sigma_{\Delta}/\mu_{\Delta})$. We have

(17)

$\mu_{\Delta} = \frac{-b \pm \sqrt{b^{2} - 4ac}}{2a},$

where

$a = V_{A} - \frac{\gamma_{M}V_{D}}{\overline{h_{\circ}^{2}}}, \qquad b = \frac{2E(\kappa_{j}^{2})\,\mathcal{I}^{2}K\left(\frac{\sigma_{\Delta}}{\mu_{\Delta}}\right)\tilde{\gamma}_{M}}{\lambda^{2}M\bar{h}_{\circ}^{2}E(\kappa_{j})^{2}}, \qquad c = \frac{-\mathcal{I}^{2}E(\kappa_{j}^{2})}{\lambda^{2}M\bar{h}_{\circ}E(\kappa_{j})^{2}}.$

Then $\sigma_{\Delta}^{2} = \mu_{\Delta}^{2}\,\frac{\sigma_{\Delta}^{2}}{\mu_{\Delta}^{2}}$ can be calculated. Finally, s² is obtained from

$s^{2} = \frac{V_{D}}{(\sigma_{\Delta}^{2} + \mu_{\Delta}^{2})M\overline{h_{\circ}^{2}}\,E(\kappa_{j}^{2})}\,\frac{v - 2}{v}.$

For BayesD3 with dependent absolute additive effects and dominance coefficients, the parameters s², s_Δ and $\sigma_{\Delta}^{2}$ are calculated as follows. First, the parameter s_Δ is determined such that the following equation holds:

(18)

$\bar{h}_{\circ}\frac{v}{v - 2} = \frac{V_{A}\overline{h_{\circ}^{2}} - V_{D}\gamma_{M}}{\mathcal{I}^{2}}\,\frac{E(\kappa_{j})^{2}\bar{h}_{\circ}^{2}}{E(\kappa_{j}^{2})\overline{h_{\circ}^{2}}}\,M E\big(|t|\,\mu_{\Delta}(|t|)\big)^{2} + 2\tilde{\gamma}_{M}E\left(t^{2}\mu_{\Delta}(|t|)\,K\left(\frac{\sigma_{\Delta}}{\mu_{\Delta}(|t|)}\right)\right),$

where

$\sigma_{\Delta}^{2} = \left(\frac{V_{D}}{\mathcal{I}^{2}}\,\frac{E(\kappa_{j})^{2}\bar{h}_{\circ}^{2}}{E(\kappa_{j}^{2})\overline{h_{\circ}^{2}}}\,M E\big(|t|\,\mu_{\Delta}(|t|)\big)^{2} - E\left(t^{2}\mu_{\Delta}(|t|)^{2}\right)\right) \times \frac{v - 2}{v}.$

This was done by a grid search, where the expectations were estimated from simulated random variables $t \sim t_{v}$. Then $\sigma_{\Delta}^{2}$ can be calculated and s² is obtained from

$s^{2} = \frac{\mathcal{I}^{2}}{M^{2}E(\kappa_{j})^{2}\bar{h}_{\circ}^{2}E\big(|t|\,\mu_{\Delta}(|t|)\big)^{2}}.$

Appendix C: Hilbert space for RKHS regression

This section defines the Hilbert space $\mathcal{H}$ for which a symmetric and positive semidefinite kernel K is reproducing. Consider the linear space S that consists of all linear combinations of functions $K(x, \cdot)\colon \Omega \to \mathbb{R}$ with x ∈ Ω. Each function f ∈ S can be written as $f = \sum_{j=1}^{N} a_{j}K(x_{j}, \cdot)$ with $N \in \mathbb{N}$, pairwise different x₁, …, x_N ∈ Ω and $a \in \mathbb{R}^{N}$. For two functions $g_{1} = \sum_{j=1}^{N_{1}} a_{j}K(x_{j}, \cdot)$ and $g_{2} = \sum_{k=1}^{N_{2}} b_{k}K(\tilde{x}_{k}, \cdot)$, the inner product is defined as

$\langle g_{1}, g_{2}\rangle = \sum_{j=1}^{N_{1}}\sum_{k=1}^{N_{2}} a_{j}b_{k}K(x_{j}, \tilde{x}_{k}).$

S is a pre-Hilbert space with inner product 〈·, ·〉. The closure of S under the inner product 〈·, ·〉 is a Hilbert space $\mathcal{H}$. It is called the native space of the kernel K. Note that $\mathcal{H}$ can be larger than S because it also includes all limiting functions of Cauchy sequences. The kernel K is a reproducing kernel of $\mathcal{H}$. For proofs see the literature on Hilbert spaces, e.g. Shawe-Taylor & Cristianini (2004).
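The inner product on S and the reproducing property 〈K(x, ·), f〉 = f(x) can be checked numerically on a small example. The sketch below is illustrative only: it assumes Ω is the real line and uses a Gaussian kernel, neither of which is prescribed by the text, and all function names and numbers are hypothetical.

```python
import math

def gaussian_kernel(x, z, bandwidth=1.0):
    """A symmetric positive semidefinite kernel on the real line (assumed)."""
    return math.exp(-((x - z) ** 2) / (2.0 * bandwidth ** 2))

def inner_product(coef1, points1, coef2, points2, kernel=gaussian_kernel):
    """<g1, g2> = sum_j sum_k a_j b_k K(x_j, x~_k) for two kernel expansions."""
    return sum(a * b * kernel(x, z)
               for a, x in zip(coef1, points1)
               for b, z in zip(coef2, points2))

def evaluate(coef, points, x, kernel=gaussian_kernel):
    """f(x) for f = sum_j a_j K(x_j, .)."""
    return sum(a * kernel(xj, x) for a, xj in zip(coef, points))

# f is a made-up element of S with three terms.
a, xs = [0.5, -1.2, 2.0], [0.0, 1.0, 2.5]
x0 = 0.7
lhs = inner_product([1.0], [x0], a, xs)   # <K(x0, .), f>
rhs = evaluate(a, xs, x0)                 # f(x0)
norm_sq = inner_product(a, xs, a, xs)     # <f, f>, non-negative for a PSD kernel
```

Here `lhs` and `rhs` agree up to rounding, which is the reproducing property restricted to S.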

References

Bennewitz, J. & Meuwissen, T. H. E. (2010). The distribution of QTL additive and dominance effects in porcine F2 crosses. Journal of Animal Breeding and Genetics 127, 171–179.

Caballero, A. & Keightley, P. D. (1994). A pleiotropic nonadditive model of variation in quantitative traits. Genetics 138, 883–900.

Calus, M. P. L. (2010). Genomic breeding value prediction: methods and procedures. Animal 4, 157–164.

Calus, M. P. L., Meuwissen, T. H. E., de Roos, A. P. W. & Veerkamp, R. F. (2008). Accuracy of genomic selection using different methods to define haplotypes. Genetics 178, 553–561.

Charlesworth, D. & Willis, J. H. (2009). The genetics of inbreeding depression. Nature Reviews Genetics 10, 783–796.

Chib, S. & Greenberg, E. (1995). Understanding the Metropolis–Hastings algorithm. The American Statistician 49, 327–335.

Cockerham, C. C. (1954). An extension of the concept of partitioning hereditary variance for analysis of covariances among relatives when epistasis is present. Genetics 39, 859–882.

de los Campos, G., Gianola, D. & Rosa, G. J. M. (2009). Reproducing kernel Hilbert spaces regression: a general framework for genetic evaluation. Journal of Animal Science 87, 1883–1887.

Duangjinda, M., Bertrand, J. K., Misztal, I. & Druet, T. (2001). Estimation of additive and nonadditive genetic variances in Hereford, Gelbvieh, and Charolais by Method R. Journal of Animal Science 79, 2997–3001.

Falconer, D. S. & Mackay, T. F. C. (1996). Introduction to Quantitative Genetics. London, UK: Longman.

García-Dorado, A., López-Fanjul, C. & Caballero, A. (1999). Properties of spontaneous mutations affecting quantitative traits. Genetic Research Cambridge 74, 341–350.

Garrick, D. J., Taylor, J. F. & Fernando, R. L. (2009). Deregressing estimated breeding values and weighting information for genomic regression analyses. Genetics Selection Evolution 41, 55.

George, E. I. & McCulloch, R. E. (1993). Variable selection via Gibbs sampling. Journal of the American Statistical Association 88, 881–889.

Gianola, D. & de los Campos, G. (2008). Inferring genetic values for quantitative traits non-parametrically. Genetic Research 90, 525–540.

Gianola, D., Fernando, R. L. & Stella, A. (2006). Genomic-assisted prediction of genetic value with semiparametric procedures. Genetics 173, 1761–1776.

Gianola, D. & van Kaam, J. B. C. H. M. (2008). Reproducing Kernel Hilbert spaces regression methods for genomic assisted prediction of quantitative traits. Genetics 178, 2289–2303.

Gianola, D., de los Campos, G., Hill, W. G., Manfredi, E. & Fernando, R. (2009). Additive genetic variability and the Bayesian alphabet. Genetics 183, 347–363.

Grisart, B., Farnir, F., Karim, L., Cambisano, N., Kim, J. J., Kvasz, A., Mni, M., Simon, P., Frère, J.-M., Coppieters, W. & Georges, M. (2004). Genetic and functional confirmation of the causality of the DGAT1 K232A quantitative trait nucleotide in affecting milk yield and composition. Proceedings of the National Academy of Sciences of the United States of America 101, 2398–2403.

Habier, D., Tetens, J., Seefried, F.-R., Lichtner, P. & Thaller, G. (2010). The impact of genetic relationship information on genomic breeding values in German Holstein cattle. Genetics Selection Evolution 42, 5.

Hayes, B. J., Bowman, P. J., Chamberlain, A. J. & Goddard, M. E. (2009). Invited review: genomic selection in dairy cattle: progress and challenges. Journal of Dairy Science 92, 433–443.

Hayes, B. J., Pryce, J., Chamberlain, A. J., Bowman, P. J. & Goddard, M. E. (2010). Genetic architecture of complex traits and accuracy of genomic prediction: coat colour, milk-fat percentage, and type in Holstein cattle as contrasting model traits. PLoS Genetics 6, e1001139.

Heffner, E. L., Sorrells, M. E. & Jannink, J.-L. (2009). Genomic selection for crop improvement. Crop Science 49, 1–12.

Henderson, C. R. (1985). Best linear unbiased prediction of nonadditive genetic merits in noninbred populations. Journal of Animal Science 60, 111–117.

Hill, W. G., Goddard, M. E. & Visscher, P. M. (2008). Data and theory point to mainly additive genetic variance for complex traits. PLoS Genetics 4, e1000008.

Kacser, H. & Burns, J. A. (1981). The molecular basis of dominance. Genetics 97, 639–666.

Kempthorne, O. (1954). The correlation between relatives in a random mating population. Proceedings of the Royal Society of London. Series B 143, 103–113.

Legarra, A., Robert-Granié, C., Croiseau, P., Guillaume, F. & Fritz, S. (2011). Improved Lasso for genomic selection. Genetic Research Cambridge 93, 77–87.

Luan, T., Woolliams, J. A., Lien, S., Kent, M., Svendsen, M. & Meuwissen, T. H. E. (2009). The accuracy of genomic selection in Norwegian Red cattle assessed by cross-validation. Genetics 183, 1119–1126.

Meuwissen, T. H. E. (2009). Accuracy of breeding values of ‘unrelated’ individuals predicted by dense SNP genotyping. Genetics Selection Evolution 41, 35.

Meuwissen, T. H. E., Hayes, B. J. & Goddard, M. E. (2001). Prediction of total genetic value using genome-wide dense marker maps. Genetics 157, 1819–1829.

Meuwissen, T. H. E. & Goddard, M. E. (2004). Mapping multiple QTL using linkage disequilibrium and linkage analysis information and multitrait data. Genetics Selection Evolution 36, 261–279.

Meuwissen, T. H. E. & Goddard, M. E. (2010a). Accurate prediction of genetic values for complex traits by whole genome resequencing. Genetics 185, 623–631.

Meuwissen, T. H. E. & Goddard, M. E. (2010b). The use of family relationships and linkage disequilibrium to impute phase and missing genotypes in up to whole-genome sequence density genotypic data. Genetics 185, 1441–1449.

Misztal, I. (1997). Estimation of variance components with large-scale dominance models. Journal of Dairy Science 80, 965–974.

Ober, U., Erbe, M., Long, N., Porcu, E., Schlather, M. & Simianer, H. (2011). Predicting genetic values: a Kernel-based best linear unbiased prediction with genomic data. Genetics 188, 695–708.

Park, T. & Casella, G. (2008). The Bayesian Lasso. Journal of the American Statistical Association 103, 681–686.

Piepho, H. P. (2009). Ridge regression and extensions for genomewide selection in maize. Crop Science 49, 1165–1176.

Psarakis, S. & Panaretos, J. (1990). The folded t distribution. Communications in Statistics – Theory and Methods 19, 2717–2734.

Rosa, G. J. M., Gianola, D. & Padovani, C. R. (2004). Bayesian longitudinal data analysis with mixed models and thick-tailed distributions using MCMC. Journal of Applied Statistics 31, 855–873.

Serenius, T., Stalder, K. J. & Puonti, M. (2006). Impact of dominance effects on sow longevity. Journal of Animal Breeding and Genetics 123, 355–361.

Shawe-Taylor, J. & Cristianini, N. (2004). Kernel Methods for Pattern Analysis. Cambridge: Cambridge University Press.

Toro, M. A. & Varona, L. (2010). A note on mate allocation for dominance handling in genomic selection. Genetics Selection Evolution 42, 33.

van Tassell, C. P., Misztal, I. & Varona, L. (2000). Method R estimates of additive genetic, dominance genetic, and permanent environmental fraction of variance for yield and health traits of Holsteins. Journal of Dairy Science 83, 1873–1877.

Verbyla, K. L., Hayes, B. J., Bowman, P. J. & Goddard, M. E. (2009). Accuracy of genomic selection using stochastic search variable selection in Australian Holstein Friesian dairy cattle. Genetics Research Cambridge 91, 307–311.

Verbyla, K. L., Bowman, P. J., Hayes, B. J. & Goddard, M. E. (2010). Sensitivity of genomic selection to using different prior distributions. BMC Proceedings 4 (Suppl. 1), S5.

Villa-Angulo, R., Matukumalli, L. K., Gill, C. A., Choi, J., Van Tassell, C. P. & Grefenstette, J. J. (2009). High-resolution haplotype block structure in the cattle genome. BMC Genetics 10, 19.

Wellmann, R. & Bennewitz, J. (2011). The contribution of dominance to the understanding of quantitative genetic variation. Genetics Research Cambridge 93, 139–154.

Wolc, A., Stricker, C., Arango, J., Settar, P., Fulton, J. E., O'Sullivan, N. P., Preisinger, R., Habier, D., Fernando, R., Garrick, D. J., Lamont, S. J. & Dekkers, J. C. M. (2011). Breeding value prediction for production traits in layer chickens using pedigree or genomic relationships in a reduced animal model. Genetics Selection Evolution 43, 5.

Xu, S. (2003). Estimating polygenic effects using markers of the entire genome. Genetics 163, 789–801.

Yi, N., George, V. & Allison, D. B. (2003). Stochastic search variable selection for identifying multiple quantitative trait loci. Genetics 164, 1129–1138.

Table 1. Table of symbols

Fig. 1. Samples drawn from the joint prior distribution of additive and dominance effects of markers with allele frequency q_j = p_j = 0·5, where additive effects are Student t-distributed with v = 2·5 degrees of freedom. The distribution specifications of BayesD1–BayesD3 are given in Section 2(ii).

Table 2. Model-specific expectations

Fig. 2. Accuracy of predicted BV (a), dominance deviations (b) and GV (c) for generations 1–5 and 3000 markers per chromosome.

Table 3. Accuracies of predicted BV, dominance deviations (DV) and GV in generations 2 and 5, regressions b_BV, b_DV and b_GV of true on predicted values, and computation times per cycle relative to BayesA

Fig. 3. Mean accuracy of predicted GV in generations 1–5 calculated from 10^x iterations of the sampler for 3000 markers per chromosome.

Fig. 4. Accuracy of predicted BV (a), dominance deviations (b), and GV (c) for marker panels with 1500, 3000 and 6000 markers per chromosome. The average maximum r² values of a QTL with a marker are shown on the x-axis for the different panels.
