Variation in actual relationship as a consequence of Mendelian sampling and linkage

W.G. HILL; B.S. WEIR

doi:10.1017/S0016672310000480

Published online by Cambridge University Press: 12 January 2011

W.G. HILL*: Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, West Mains Road, Edinburgh EH9 3JT, UK
B.S. WEIR: Department of Biostatistics, University of Washington, Box 357232, Seattle, WA 98195-7232, USA
* Corresponding author. Tel: +44-(0)131-650 5705. Fax: +44-(0)131-650 6564. e-mail: [email protected]

Summary

Although the expected relationship or proportion of genome shared by pairs of relatives can be obtained from their pedigrees, the actual quantities deviate as a consequence of Mendelian sampling and depend on the number of chromosomes and map length. Formulae have been published previously for the variance of actual relationship for a number of specific types of relatives but no general formula for non-inbred individuals is available. We provide here a unified framework that enables the variances for distant relatives to be easily computed, showing, for example, how the variance of sharing for great grandparent–great grandchild, great uncle–great nephew, half uncle–nephew and first cousins differ, even though they have the same expected relationship. Results are extended in order to include differences in map length between sexes, no recombination in males and sex linkage. We derive the magnitude of skew in the proportion shared, showing the skew becomes increasingly large the more distant the relationship. The results obtained for variation in actual relationship apply directly to the variation in actual inbreeding as both are functions of genomic coancestry, and we show how to partition the variation in actual inbreeding between and within families. Although the variance of actual relationship falls as individuals become more distant, its coefficient of variation rises, and so, exacerbated by the skewness, it becomes increasingly difficult to distinguish different pedigree relationships from the actual fraction of the genome shared.

Type: Research Papers
Information: Genetics Research, Volume 93, Issue 1, February 2011, pp. 47-64

Copyright: Copyright © Cambridge University Press 2011

1. Introduction

Characterizing the relationship between pairs of individuals continues to be of importance in many areas of population and quantitative genetics. Variation in genome sharing identical by descent (ibd) over the genome depends both on the pedigree and the extent to which alleles at different loci are jointly ibd. The degree of relationship might be inferred from pedigree information or it can be estimated from genetic information (Weir et al., 2006; Visscher et al., 2006; Yu et al., 2006), but in either case there is variation in relationship measures. A recent development has been to utilize this variability in the actual relationship to estimate the components of variance for quantitative traits from the variation in resemblance among full sibs, i.e. family members who have the same pedigree relationship (Visscher et al., 2006).

By making assumptions about the mapping function, the variation in the proportion of the genome shared ibd, or actual relationship, can be computed for different pedigrees. Formulae have been published for autosomal loci of lineal descendants (Stam & Zeven, 1981; Hill, 1993a), sibs (Hill, 1993b) and other relatives, including cousins (Guo, 1995). Formulae have also been given for the variation of identity of full sibs for both alleles at each site (Visscher et al., 2006) and for sex-linked loci (Visscher, 2009).

These analyses are solely concerned with the variances of the distributions of sharing. The distribution itself or other functions of it have also been obtained. In particular, Donnelly (1983) computed the probability that the proportion shared with an ancestor exceeded zero. Bickeboller & Thompson (1996a, b) obtained approximations for the distribution of the proportion shared between half-sibs and between offspring and parent. The full distribution has been obtained by Stefanov and colleagues for lineal descendants (Stefanov, 2000, 2004) and for half sibs (Ball & Stefanov, 2005). Their results generally take the form of a set of equations and computer routines for numerical evaluation.

With the advent of dense genome mapping, it has become possible to estimate the actual proportion of the genome shared for pairs of relatives and to compare the observed with expected values. This has been done for full sibs by Visscher et al. (2006, 2007), and there was generally good agreement between observed and expected sharing.

Mapping with multiple markers enables relatives to be identified among samples from the population. The ability to correctly assign relationship, to distinguish between second and third cousins, for example, depends on the sampling variance of the actual proportion of genome shared and the additional sampling due to the use of a limited number of markers. Such data arise in genome-wide association studies, for example, where up to millions of single nucleotide polymorphism (SNP) markers are genotyped on thousands of individuals, and the relationship structure of the data is an important component in determining the reliability of conclusions on trait gene identification. Genetic variances of quantitative traits can be estimated by taking advantage of the variation in genome sharing to account for phenotypic similarity both within families of full sibs (including dizygotic twins) (Visscher et al., 2006, 2007) and between families utilizing information on distant relatives not available from known relationships (Yang et al., 2010). Quantifying the degree of relationship is also an important aspect of genotype data cleaning in genome-wide association studies (Laurie et al., 2010), for guarding against incorrect annotation of family membership or for modifying tests of marker trait association (Choi et al., 2009).

Genomic selection, which utilizes dense mapping for identifying sharing of genes among relatives and which has major application, so far mainly in plant and animal breeding, depends on there being variability in genome sharing of relatives that have the same pedigree relationship (Meuwissen et al., 2001). It may be based directly on the actual genomic relationship matrix or with weighting dependent on the variance in the trait associated with particular genomic regions (Goddard, 2009). These activities require an appreciation of the extent of the variation in genome sharing by identity and have motivated this study.

Our objective in this paper is to consider moments of the distribution of allele sharing, and to obtain formulae that can be applied simply to any kind and degree of relationship, including direct descendants and those of half- and of full sibs. The distributions can be highly skewed, particularly when the relationship is low, and hence we also obtain formulae for the magnitude of skew of relationship. Although we restrict the analysis to the relationship among non-inbred individuals, the results apply directly to the variation in actual inbreeding of offspring of consanguineous matings and we show how to apply them.

2. General formulae for variance of genome sharing of non-inbred individuals

(i) Background theory

At any locus individuals may share zero, one or two pairs of alleles ibd with probabilities $k_0$, $k_1$ or $k_2$. The actual ibd status can be indicated by $\u{k}_m$, $m = 0, 1, 2$, where $\u{k}_m = 1$ if the individuals share exactly m pairs of alleles ibd and $\u{k}_m = 0$ otherwise. The probabilities $k_m$ depend on the pedigree structure and are the expected values of the $\u{k}_m$. As exactly one of the $\u{k}_m$ is equal to 1 at any locus and as squaring an indicator does not change its value, their variances and covariances are

$\begin{aligned}
\mu_2(\u{k}_m) &= \mathrm{Var}(\u{k}_m) = k_m(1 - k_m), \quad m = 0, 1, 2, \\
\mathrm{Cov}(\u{k}_m, \u{k}_{m'}) &= -k_m k_{m'}, \quad m \ne m'.
\end{aligned}$

Less detailed measures of relationship are the co-ancestry or kinship coefficient, $\theta = \tfrac{1}{2}k_2 + \tfrac{1}{4}k_1$, the probability that an allele drawn at random from one individual is ibd to a random allele from the other, and the relationship $R = 2\theta = k_2 + \tfrac{1}{2}k_1$. This equals Wright's (1922) relationship for non-inbred individuals and is also called the ‘numerator relationship’. We shall primarily use R here as we are considering an analysis of genome sharing, for R is the probability that a random allele identified in one individual is present ibd in the other. We have previously considered variation in actual coancestry (Cockerham & Weir, 1983; Weir et al., 2005) and thus in relationship. The actual relationship is $\u{R} = \u{k}_2 + \tfrac{1}{2}\u{k}_1$ and this has variance

(1)

$\mu_2(\u{R}) = \mathrm{Var}(\u{R}) = k_2 + \tfrac{1}{4}k_1 - \left(k_2 + \tfrac{1}{2}k_1\right)^2 = k_2 + \tfrac{1}{4}k_1 - R^2.$

The quantity $\tfrac{1}{4}k_2 + \tfrac{1}{16}k_1$ was written as Δ by Cockerham & Weir (1983) and is the probability that two pairs of alleles at the same locus are ibd.

The inbreeding coefficient F is the probability that the two alleles carried by an individual are ibd. We have discussed the variation in actual inbreeding (Weir et al., 1980; Cockerham & Weir, 1983), with the variation in the two-allele measures θ and F expressed as a function of the ibd probability of a set of two, three or four alleles. We shall also discuss coefficients of variation of actual identity. For example,

$\mathrm{CV}(\u{R}) = \sqrt{\mathrm{Var}(\u{R})}/E(\u{R}) = \sqrt{\mathrm{Var}(\u{R})}/R.$

In Table 1, we list values for the ks, R and their single-locus variances and covariances for some common relationships. We now consider the variances and covariances of the actual identities when they are averaged over the genome, assuming that they have the same expected values at all loci. The results for single loci also apply if the loci are completely linked and are therefore a limiting case of the genome-average results.
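The single-locus quantities in eqn (1) and the CV definition are simple enough to evaluate directly. The following is a minimal sketch, not code from the paper: the function name `single_locus_moments` is hypothetical, and the full-sib probabilities (k₀ = ¼, k₁ = ½, k₂ = ¼) are the standard pedigree values, assumed here for illustration.

```python
from fractions import Fraction as Fr
from math import sqrt

def single_locus_moments(k0, k1, k2):
    """Return (R, Var(R), CV(R)) at one locus from the ibd probabilities."""
    assert k0 + k1 + k2 == 1               # the three states are exhaustive
    R = k2 + Fr(1, 2) * k1                 # expected relationship
    var_R = k2 + Fr(1, 4) * k1 - R * R     # eqn (1)
    cv_R = sqrt(var_R) / R                 # CV = standard deviation / mean
    return R, var_R, cv_R

# (k0, k1, k2) for a few relationships (illustrative inputs)
relatives = {
    "full sibs": (Fr(1, 4), Fr(1, 2), Fr(1, 4)),
    "half sibs": (Fr(1, 2), Fr(1, 2), Fr(0)),
    "first cousins": (Fr(3, 4), Fr(1, 4), Fr(0)),
}
for name, ks in relatives.items():
    R, var_R, cv_R = single_locus_moments(*ks)
    print(f"{name}: R = {R}, Var(R) = {var_R}, CV(R) = {cv_R:.3f}")
```

Exact fractions are used for R and its variance so that the values can be compared with tabulated ones without rounding.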

When we consider the variation in sharing of relatives over the genome, we require the average over pairs of loci i, j of the covariances of the actual sharing indicators $\u{k}_i$, $\u{k}_j$ for 0, 1 or 2 pairs of alleles. For a set of r loci $\u{k} = \frac{1}{r}\sum_{i=1}^{r}\u{k}_i$ and

$E(\u{k}^2) = \frac{1}{r^2}E\left(\sum_i \u{k}_i^2 + \sum_{i \ne j}\u{k}_i\u{k}_j\right).$

Table 1. Expectations and variances for actual identity at individual loci

Combining the two terms in this sum and subtracting the square of the mean gives

$\mathrm{Var}(\u{k}) = \frac{1}{r^2}E\left(\sum_i\sum_j \u{k}_i\u{k}_j\right) - k^2$

and similar arguments apply to higher moments discussed later.

(ii) Lineal descendants

If g generations separate two individuals, one being a lineal descendant of the other, $k_2 = 0$ and $k_1 = (\tfrac{1}{2})^{g-1}$. For a parent and offspring pair (g=1, e.g. A and D in Fig. 1), $\u{k}_1 = k_1 = 1$ and $\mathrm{Var}(\u{k}_1) = 0$. For linked gametic loci i, j the only way both values can be equal to one in subsequent generations (e.g. G, J) is if there has been no recombination in the descent from ancestor to descendant. The expected value of their product is therefore

$E(\u{k}_{1i}\u{k}_{1j}) = \left(\tfrac{1}{2}(1 - c_{ij})\right)^{g-1},$

where $c_{ij}$ is the recombination fraction between loci. For convenience, we will drop the ij subscript on $c_{ij}$. The covariance of these two variables is

(2)

$\mathrm{Cov}(\u{k}_{1i}, \u{k}_{1j}) = \left(\tfrac{1}{2}(1 - c)\right)^{g-1} - \left(\tfrac{1}{4}\right)^{g-1}.$

Note that this covariance is zero if the loci are unlinked and c=0·5, or if one individual is the offspring of the other and g=1. Setting c=0 gives the variance $k_{1i}(1 - k_{1i})$ as the two loci are then transmitted as a unit.

Fig. 1. Examples of relationship

* If M and N are also full sibs.

For allele sharing over the whole genome, suppose there are infinitely many loci along a chromosome of length l and further suppose Haldane's (1919) mapping function holds so that $(1 - c) = \tfrac{1}{2}(1 + e^{-2d})$, where d is the map length between loci i, j. Therefore, from eqn (2),

$\mathrm{Cov}(\u{k}_{1i}, \u{k}_{1j}) = \left(\tfrac{1}{4}\right)^{g-1}\left[(1 + e^{-2d})^{g-1} - 1\right].$
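The two forms of the lineal covariance, in terms of the recombination fraction c (eqn (2)) and of the map distance d under Haldane's function, can be checked against each other numerically. This is an illustrative sketch only; the function names are hypothetical, and d is assumed to be measured in Morgans.

```python
from math import exp, isclose

def haldane_c(d):
    """Recombination fraction for map distance d (Morgans), Haldane's function."""
    return 0.5 * (1.0 - exp(-2.0 * d))

def cov_lineal(c, g):
    """Eqn (2): covariance of sharing indicators at two loci, g generations apart."""
    return (0.5 * (1.0 - c)) ** (g - 1) - 0.25 ** (g - 1)

def cov_lineal_map(d, g):
    """The same covariance written in terms of the map distance d."""
    return 0.25 ** (g - 1) * ((1.0 + exp(-2.0 * d)) ** (g - 1) - 1.0)

g, d = 3, 0.2                       # arbitrary illustrative values
print(cov_lineal(haldane_c(d), g), cov_lineal_map(d, g))

# Special cases noted in the text
print(cov_lineal(0.5, g))           # unlinked loci
print(cov_lineal(0.3, 1))           # parent and offspring, g = 1
k1 = 0.5 ** (g - 1)
print(cov_lineal(0.0, g), k1 * (1 - k1))   # complete linkage gives k1(1 - k1)
```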

The variance of allele sharing over the whole chromosome is the average of all the covariances and this can be calculated as an integral by letting x, y be the positions of pairs of loci:

(3)

$\mathrm{Var}_{\mathrm{Lin},g}(\u{k}_1) = \frac{2}{l^2}\left(\frac{1}{4}\right)^{g-1}\int_{x=0}^{l}\int_{y=0}^{x}\left[(1 + e^{-2(x-y)})^{g-1} - 1\right]\mathrm{d}y\,\mathrm{d}x$

(Stam & Zeven, 1981; Hill, 1993a).

As we use this function repeatedly and more generally subsequently, we define

(4a)

$\phi_n(l) = \frac{2}{l^2}\left(\frac{1}{4}\right)^{n}\int_{x=0}^{l}\int_{y=0}^{x}\left[(1 + e^{-2(x-y)})^{n} - 1\right]\mathrm{d}y\,\mathrm{d}x$

(4b)

$\phi_n(l) = \begin{cases}
\dfrac{1}{2l^2}\left(\dfrac{1}{4}\right)^{n}\displaystyle\sum_{r=1}^{n}\binom{n}{r}\left[\dfrac{2rl - 1 + e^{-2rl}}{r^2}\right], & n \ge 1, \\
0, & n = 0
\end{cases}$

(Hill, 1993a). At the limits, for $l \to 0$, $\phi_n(l) \to (\tfrac{1}{2})^n[1 - (\tfrac{1}{2})^n]$ and for $l \to \infty$, $\phi_n(l) \to 0$. The variance of the chromosome-sharing variable $\u{k}_1$ for lineal relatives g generations apart can then be expressed as $\mathrm{Var}_{\mathrm{Lin},g}(\u{k}_1, l) = \phi_{g-1}(l)$. Also $\mathrm{Var}_{\mathrm{Lin},g}(\u{R}, l) = \tfrac{1}{4}\mathrm{Var}_{\mathrm{Lin},g}(\u{k}_1, l)$ and $\mathrm{Var}_{\mathrm{Lin},g}(\u{\theta}, l) = \tfrac{1}{16}\mathrm{Var}_{\mathrm{Lin},g}(\u{k}_1, l)$. The coefficient of variation (CV) of $\u{k}_1$ is given by

$\begin{aligned}
\mathrm{CV}_{\mathrm{Lin},g}(\u{k}_1, l) &= 2^{g-1}\sqrt{\phi_{g-1}(l)} \\
&= \frac{1}{l}\left\{\frac{1}{2}\sum_{r=1}^{g-1}\binom{g-1}{r}\left[\frac{2rl - 1 + e^{-2rl}}{r^2}\right]\right\}^{1/2}, \quad g \ge 2
\end{aligned}$

(Visscher, 2009) and is the same for $\u{R}$ and $\u{\theta}$. For a whole genome comprising K chromosomes of lengths $l_1, l_2, \ldots, l_K$ and total map length $L = \sum_{i=1}^{K} l_i$, the variance is

(5)

$\mathrm{Var}_{\mathrm{Lin},g}(\u{k}_1) = \frac{1}{L^2}\sum_{i=1}^{K} l_i^2\,\phi_{g-1}(l_i).$
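The closed form (4b) can be compared with a direct numerical evaluation of the integral (4a), and then combined over chromosomes as in eqn (5). The sketch below is illustrative: the function names are hypothetical, the midpoint rule is simply one convenient quadrature, and the genome of 20 chromosomes of 1·5 Morgans each is an invented example, not data from the paper.

```python
from math import comb, exp, sqrt

def phi(n, l):
    """Closed form (4b) for phi_n(l); taken as zero when n <= 0."""
    if n <= 0:
        return 0.0
    s = sum(comb(n, r) * (2 * r * l - 1 + exp(-2 * r * l)) / r**2
            for r in range(1, n + 1))
    return 0.25**n * s / (2 * l**2)

def phi_numeric(n, l, steps=2000):
    """Midpoint-rule evaluation of the double integral (4a).

    The integrand depends only on t = x - y, and locus pairs at
    separation t span a length (l - t), so the double integral equals
    the single integral of (l - t) * f(t) over 0 <= t <= l.
    """
    h = l / steps
    acc = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        acc += (l - t) * ((1 + exp(-2 * t))**n - 1) * h
    return 2 / l**2 * 0.25**n * acc

def var_lineal_genome(g, lengths):
    """Eqn (5): variance of k1 sharing over chromosomes of the given map lengths."""
    L = sum(lengths)
    return sum(l**2 * phi(g - 1, l) for l in lengths) / L**2

print(phi(3, 1.0), phi_numeric(3, 1.0))   # closed form vs quadrature

lengths = [1.5] * 20                      # hypothetical genome (Morgans)
for g in (2, 3, 4):
    var = var_lineal_genome(g, lengths)
    cv = sqrt(var) / 0.5**(g - 1)         # E(k1) = (1/2)^(g-1)
    print(g, var, cv)
```

With K chromosomes of equal length l, eqn (5) reduces to the single-chromosome value divided by K, which is a convenient sanity check on any implementation.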

We now evaluate the variance of genome sharing or relationship among collateral relatives and their descendants using eqns (3) and (4). Results are summarized in Box 1.

Box 1. Summary of formulae for variances of genome sharing. $R = \left(\tfrac{1}{2}\right)^{g}$

A. Unilineal relatives ($k_2 = 0$ and $\mathrm{Var}(\u{R}, l) = \tfrac{1}{4}\mathrm{Var}(\u{k}_1, l)$)

Lineal descendants

$\mathrm{Var}_{\mathrm{Lin},g}(\u{k}_1, l) = \phi_{g-1}(l).$

Examples: g=1 for parent–offspring (when $\mathrm{Var}_{\mathrm{Lin},g}(\u{k}_1, l) = 0$), g=2 for grandparent–grandoffspring.

Half-sibs and their descendants

$\mathrm{Var}_{\mathrm{HS},g}(\u{k}_1, l) = 4\phi_g(l) - 2\phi_{g-1}(l) + \tfrac{1}{2}\phi_{g-2}(l).$

Examples: g=2 for half sibs, g=3 for half uncle-nephew, g=4 for half cousins.

Descendants of full sibs

Uncle–nephew and nephew's descendants

$\mathrm{Var}_{\mathrm{UN},g}(\u{k}_1, l) = 8\phi_{g+1}(l) - 4\phi_g(l) + \tfrac{1}{2}\phi_{g-1}(l) + \tfrac{1}{4}\phi_{g-2}(l).$

Examples: g=2 for uncle-nephew, g=3 for great uncle-great nephew.

Cousins and descendants

$\mathrm{Var}_{\mathrm{C},g}(\u{k}_1, l) = 8\phi_{g+1}(l) - 4\phi_g(l) + \tfrac{3}{2}\phi_{g-1}(l) - \tfrac{1}{2}\phi_{g-2}(l) + \tfrac{1}{8}\phi_{g-3}(l).$

Examples: g=3 for (first) cousins, g=5 for second cousins or cousins twice removed.

B. Bilineal relatives ($k_2 \ne 0$)

Full sibs

$\mathrm{Var}_{\mathrm{FS}}(\u{R}, l) = 2\phi_2(l) - \phi_1(l).$

$\begin{aligned}
\mathrm{Var}_{\mathrm{FS}}(\u{k}_2, l) &= \mathrm{Var}_{\mathrm{FS}}(\u{k}_0, l) = 16\phi_4(l) - 16\phi_3(l) + 8\phi_2(l) - 2\phi_1(l), \\
\mathrm{Var}_{\mathrm{FS}}(\u{k}_1, l) &= 4\mathrm{Var}_{\mathrm{FS}}(\u{k}_2, l) - 4\mathrm{Var}_{\mathrm{FS}}(\u{R}, l), \\
\mathrm{Cov}_{\mathrm{FS}}(\u{k}_2, \u{k}_1, l) &= \mathrm{Cov}_{\mathrm{FS}}(\u{k}_1, \u{k}_0, l) = -2\mathrm{Var}_{\mathrm{FS}}(\u{k}_2, l) + 2\mathrm{Var}_{\mathrm{FS}}(\u{R}, l), \\
\mathrm{Cov}_{\mathrm{FS}}(\u{k}_2, \u{k}_0, l) &= \mathrm{Var}_{\mathrm{FS}}(\u{k}_2, l) - 2\mathrm{Var}_{\mathrm{FS}}(\u{R}, l).
\end{aligned}$

Double first cousins

$\mathrm{Var}_{\mathrm{DFC}}(\u{R}, l) = 4\phi_4(l) - 2\phi_3(l) + \tfrac{3}{4}\phi_2(l) - \tfrac{1}{4}\phi_1(l),$

$\begin{aligned}
\mathrm{Var}_{\mathrm{DFC}}(\u{k}_2, l) &= 64\phi_8(l) - 64\phi_7(l) + 40\phi_6(l) - 20\phi_5(l) + \tfrac{33}{4}\phi_4(l) - \tfrac{5}{2}\phi_3(l) + \tfrac{5}{8}\phi_2(l) - \tfrac{1}{8}\phi_1(l), \\
\mathrm{Var}_{\mathrm{DFC}}(\u{k}_1, l) &= 4\mathrm{Var}_{\mathrm{DFC}}(\u{k}_2, l), \\
\mathrm{Var}_{\mathrm{DFC}}(\u{k}_0, l) &= \mathrm{Var}_{\mathrm{DFC}}(\u{k}_2, l) + 2\mathrm{Var}_{\mathrm{DFC}}(\u{R}, l), \\
\mathrm{Cov}_{\mathrm{DFC}}(\u{k}_2, \u{k}_1, l) &= -2\mathrm{Var}_{\mathrm{DFC}}(\u{k}_2, l) + \mathrm{Var}_{\mathrm{DFC}}(\u{R}, l), \\
\mathrm{Cov}_{\mathrm{DFC}}(\u{k}_2, \u{k}_0, l) &= \mathrm{Var}_{\mathrm{DFC}}(\u{k}_2, l) - \mathrm{Var}_{\mathrm{DFC}}(\u{R}, l), \\
\mathrm{Cov}_{\mathrm{DFC}}(\u{k}_1, \u{k}_0, l) &= -2\mathrm{Var}_{\mathrm{DFC}}(\u{k}_2, l) - \mathrm{Var}_{\mathrm{DFC}}(\u{R}, l).
\end{aligned}$
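The unilineal formulae of Box 1 can be evaluated side by side to see how relatives with the same expected relationship differ in variance. The sketch below does this for the four relationships with R = 1/8 named in the Summary. It is an illustration under assumptions: the function names are hypothetical, φ is taken as zero for n ≤ 0, and the single chromosome of 1 Morgan is an arbitrary choice.

```python
from math import comb, exp

def phi(n, l):
    """Closed form (4b) for phi_n(l); taken as zero when n <= 0."""
    if n <= 0:
        return 0.0
    s = sum(comb(n, r) * (2 * r * l - 1 + exp(-2 * r * l)) / r**2
            for r in range(1, n + 1))
    return 0.25**n * s / (2 * l**2)

# Var(k1) for the unilineal classes of Box 1, part A
def var_lineal(g, l):
    return phi(g - 1, l)

def var_half_sib(g, l):
    return 4 * phi(g, l) - 2 * phi(g - 1, l) + 0.5 * phi(g - 2, l)

def var_uncle(g, l):
    return (8 * phi(g + 1, l) - 4 * phi(g, l)
            + 0.5 * phi(g - 1, l) + 0.25 * phi(g - 2, l))

def var_cousin(g, l):
    return (8 * phi(g + 1, l) - 4 * phi(g, l) + 1.5 * phi(g - 1, l)
            - 0.5 * phi(g - 2, l) + 0.125 * phi(g - 3, l))

l = 1.0   # hypothetical chromosome length in Morgans
g = 3     # all four relationships below have R = (1/2)^3
results = {
    "great grandparent-great grandchild": var_lineal(g, l),
    "half uncle-nephew": var_half_sib(g, l),
    "great uncle-great nephew": var_uncle(g, l),
    "first cousins": var_cousin(g, l),
}
for name, v in results.items():
    # Var(R) = Var(k1) / 4 for unilineal relatives
    print(f"{name}: Var(k1) = {v:.5f}, Var(R) = {v / 4:.5f}")
```

As the chromosome length shrinks towards zero, all four functions should approach the single-locus value k₁(1 − k₁) = 3/16, which gives a quick check on the coefficients.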

(iii) Half-sibs and their descendants

(a) General formulation

Just as for lineal relatives, half-sibs (e.g. D and E in Fig. 1) and their descendants can have only one or zero pairs of ibd alleles at a locus. Formulae for variances of sharing ibd for half-sibs were given by Hill (1993b) and Guo (1995), but we generalize these here in order to include subsequent generations.

The probability that half-sibs share one pair of alleles is $E(\u{k}_1) = k_1 = \tfrac{1}{2}$ and the probability that they share zero pairs is $k_0 = \tfrac{1}{2}$, so $\mathrm{Var}(\u{k}_1) = \tfrac{1}{4}$. Half-sibs share one pair of alleles at each of loci i, j only if they both receive the same non-recombinant or the same recombinant haplotype from their common parent. Therefore,

(6)

$E(\u{k}_{1i}\u{k}_{1j}) = \tfrac{1}{2}(1 - c)^2 + \tfrac{1}{2}c^2$

and the covariance of the allele-sharing indicators is

(7)

$\mathrm{Cov}(\u{k}_{1i}, \u{k}_{1j}) = \tfrac{1}{2}(1 - c)^2 + \tfrac{1}{2}c^2 - \tfrac{1}{4} = \tfrac{1}{4}(1 - 2c)^2$

showing that the covariance of the $\u{k}$s for unlinked loci is zero.

When we consider relationships across generations, for example, half-uncle nephew, the probability that these share haplotypes is $\tfrac{1}{2}(1 - c)$ times the probability that the half-sibs share haplotypes. For half-sibs and other relatives who are not lineal descendants, the probability of sharing is not simply proportional to powers of $(1 - c)$ but involves others such as $c^2$, as shown in eqn (6). In order to generalize formulae across generations, we find it convenient to express all powers of c in terms of $b = \tfrac{1}{2}(1 - c)$ as

(8)

$c^n = \left[1 - 2\left(\tfrac{1}{2}(1 - c)\right)\right]^n = \sum_{i=0}^{n}\binom{n}{i}(-2b)^i.$

Therefore, from eqn (6), for half-sibs

$E(\u{k}_{1i}\u{k}_{1j}) = 4b^2 - 2b + \tfrac{1}{2}.$

This is a specific example of expressions which appear in all succeeding analyses, and so we consider the general form

(9)

$E(\u{k}_{1i}\u{k}_{1j}) = \sum_n a_n b^n.$

For unlinked loci, $b = \tfrac{1}{4}$, the $\u{k}$s are independent and (9) gives the product of the expected values of $\u{k}_{1i}$ and $\u{k}_{1j}$, so

$\mathrm{Cov}(\u{k}_{1i}, \u{k}_{1j}) = \sum_n a_n\left[b^n - \left(\tfrac{1}{4}\right)^n\right].$

Expressed in terms of map positions x, y for these loci, $b = \tfrac{1}{4}(1 + e^{-2(x-y)})$ and

$\mathrm{Cov}(\u{k}_{1i}, \u{k}_{1j}) = \sum_n a_n\left(\tfrac{1}{4}\right)^n\left[\left(1 + e^{-2(x-y)}\right)^n - 1\right].$

Using eqns (3) and (4), we obtain

(10)

$\mathrm{Var}(\u{k}_1, l) = \sum_n a_n\phi_n(l).$

Applying this methodology to half-sibs,

$\mathrm{Var}_{\mathrm{HS}}(\u{k}_1, l) = 4\phi_2(l) - 2\phi_1(l) + \tfrac{1}{2}\phi_0(l) = 4\phi_2(l) - 2\phi_1(l),$

because $\phi_0(l) = 0$. Also

(11)

$\mathrm{Var}_{\mathrm{HS}}(\u{R}, l) = \phi_2(l) - \tfrac{1}{2}\phi_1(l).$
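The route from a two-locus sharing probability in c to a variance, via eqns (8) to (10), is mechanical and can be sketched in a few lines. The helper names below are hypothetical. The closed form used as a cross-check, (4l − 1 + e^{−4l})/(32l²), is not quoted from the paper: it follows from integrating eqn (7) directly under Haldane's function, for which 1 − 2c = e^{−2d}.

```python
from fractions import Fraction as Fr
from math import comb, exp

def c_poly_to_b_poly(c_coeffs):
    """Rewrite sum_m q_m c^m as sum_n a_n b^n using c = 1 - 2b (eqn (8))."""
    a = [Fr(0)] * len(c_coeffs)
    for m, q in enumerate(c_coeffs):
        for i in range(m + 1):
            a[i] += Fr(q) * comb(m, i) * (-2) ** i
    return a

def phi(n, l):
    """Closed form (4b) for phi_n(l); taken as zero when n <= 0."""
    if n <= 0:
        return 0.0
    s = sum(comb(n, r) * (2 * r * l - 1 + exp(-2 * r * l)) / r**2
            for r in range(1, n + 1))
    return 0.25**n * s / (2 * l**2)

def var_from_coeffs(a, l):
    """Eqn (10): Var(k1, l) = sum_n a_n phi_n(l)."""
    return sum(float(a_n) * phi(n, l) for n, a_n in enumerate(a))

# Eqn (6) expanded in powers of c: 1/2 (1-c)^2 + 1/2 c^2 = 1/2 - c + c^2
half_sib_c = [Fr(1, 2), Fr(-1), Fr(1)]
a = c_poly_to_b_poly(half_sib_c)        # coefficients a_0, a_1, a_2
print(a)

l = 1.0                                  # hypothetical chromosome length (Morgans)
var_k1 = var_from_coeffs(a, l)
direct = (4 * l - 1 + exp(-4 * l)) / (32 * l**2)   # direct integral of eqn (7)
print(var_k1, direct, var_k1 / 4)        # last value is Var(R), eqn (11)
```

The same two helpers apply to any other relationship once its two-locus sharing probability has been written as a polynomial in c.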

(b) Half-uncle nephew and descendants

The probability that half-uncle and nephew (e.g. D and H in Fig. 1; or, implicit here and subsequently, half-aunt and nephew or niece, etc.) share one pair of alleles ibd is $k_1 = \tfrac{1}{4}$. They share a pair of alleles ibd at loci i and j only if H receives from its parent E the non-recombinant haplotype that carries alleles from B, the common parent of D and E. Therefore

$E(\u{k}_{1i}\u{k}_{1j}) = \tfrac{1}{2}(1 - c)\left[\tfrac{1}{2}(1 - c)^2 + \tfrac{1}{2}c^2\right] = 4b^3 - 2b^2 + \tfrac{1}{2}b$

and immediately, by using (9) and (10),

$\mathrm{Var}_{\mathrm{HUN}}(\u{k}_1, l) = 4\phi_3(l) - 2\phi_2(l) + \tfrac{1}{2}\phi_1(l).$

We generalize the formulae with reference to pairs of relatives that are g generations apart, i.e. their pedigree relationship is $(\tfrac{1}{2})^g$. Thus, g=2 for half sibs (and grandparent–grandoffspring, as above), g=3 for half-uncle nephew and g=4 for half-cousins (G and H in Fig. 1) and for half-great uncle nephew (D and K). The one-locus allele sharing indicator has expectation $E(\u{k}_1) = (\tfrac{1}{2})^{g-1}$ and those for two loci reduce by a proportion $\tfrac{1}{2}(1 - c)$ each generation as the g meioses are independent. Hence

$\mathrm{Var}_{\mathrm{HS},g}(\u{k}_1, l) = 4\phi_g(l) - 2\phi_{g-1}(l) + \tfrac{1}{2}\phi_{g-2}(l).$

Setting g=2 and noting that $\phi_0(l) = 0$ provides the half-sib result. Note also that the variances are the same for any collateral and lineal offspring of half-sibs that have the same relationship, e.g. half-cousins and half great uncle–great nephew.
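The generalized half-sib formula makes it easy to trace how the variance and the coefficient of variation of actual relationship change with distance. The sketch below is illustrative only: the function names and the genome of ten 1-Morgan chromosomes are hypothetical, and chromosomes are combined with the length weights of eqn (5) on the assumption that they segregate independently.

```python
from math import comb, exp, sqrt

def phi(n, l):
    """Closed form (4b) for phi_n(l); taken as zero when n <= 0."""
    if n <= 0:
        return 0.0
    s = sum(comb(n, r) * (2 * r * l - 1 + exp(-2 * r * l)) / r**2
            for r in range(1, n + 1))
    return 0.25**n * s / (2 * l**2)

def var_half_sib_k1(g, l):
    """Var(k1) on one chromosome for half-sibs and their descendants."""
    return 4 * phi(g, l) - 2 * phi(g - 1, l) + 0.5 * phi(g - 2, l)

def var_half_sib_R_genome(g, lengths):
    """Var(R) over a genome, weighting chromosomes by l_i^2 / L^2."""
    L = sum(lengths)
    var_k1 = sum(l**2 * var_half_sib_k1(g, l) for l in lengths) / L**2
    return var_k1 / 4            # Var(R) = Var(k1) / 4 when k2 = 0

lengths = [1.0] * 10             # hypothetical genome (Morgans)
rows = []
for g in range(2, 6):            # half sibs, half uncle-nephew, ...
    R = 0.5**g
    var_R = var_half_sib_R_genome(g, lengths)
    rows.append((g, R, var_R, sqrt(var_R) / R))
for g, R, var_R, cv in rows:
    print(f"g={g}: R={R:.4f}, Var(R)={var_R:.6f}, CV(R)={cv:.3f}")
```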

(iv) Lineal descendants of full-sibs

We now discuss the relationships between full sibs and their lineal descendants and among these descendants, where it is still the case that only one or zero pairs of alleles might be ibd, i.e. $k_2 = 0$. We defer to the next section a treatment of full sibs and of bilineal relatives in general where $k_2 > 0$. Note, however, that since the maternal and paternal transmissions are independent,

$\mathrm{Var}_{\mathrm{FS}}(\u{R}, l) = 2\phi_2(l) - \phi_1(l),$

i.e. twice that for half-sibs (eqn (11)) (Hill, 1993b; Guo, 1995).

(a) Uncle–nephew

In Fig. 1, E and F are full sibs and I is the offspring of F and a nephew of E. At any locus, E and I can share one or zero pairs of alleles with probabilities $k_1 = k_0 = \tfrac{1}{2}$. They can share a pair of alleles ibd at loci i and j in two ways: either I receives a non-recombinant haplotype from F, and E, F both carry copies of that haplotype which might themselves be both recombinant or non-recombinant from one of their parents; or I receives a recombinant haplotype from F, and E, F receive ibd alleles at i from one parent and ibd alleles at j from the other. So

(12)

$\eqalign{ {E\lpar }\u {k} _{\setnum{1}i} \u {k} _{\setnum{1}j} \rpar \tab \equals \lpar 1 \minus c\rpar \left[ {{\textstyle{1 \over 2}}\lpar 1 \minus c\rpar ^{\setnum{2}} \plus {\textstyle{1 \over 2}}c^{\setnum{2}} } \right] \plus {\textstyle{1 \over 4}}c \cr \tab \equals 8b^{\setnum{3}} \minus 4b^{\setnum{2}} \plus {\textstyle{1 \over 2}}b \plus {\textstyle{1 \over 4}}.\cr}$

Integrating over a chromosome of length l and using (9) and (10)

$\eqalign{\tab {\rm Var}_{{\rm UN}}\, \lpar \u {k} _{\setnum{1}} \comma l\rpar \equals 8\phi _{\setnum{3}} \left( l \right) \minus 4\phi _{\setnum{2}} \left( l \right) \plus {\textstyle{1 \over 2}}\phi _{\setnum{1}} \left( l \right)\comma \cr \tab {\rm Var}_{{\rm UN}}\, \lpar \u {R} \comma l\rpar \equals 2\phi _{\setnum{3}} \left( l \right) \minus \phi _{\setnum{2}} \left( l \right) \plus {\textstyle{1 \over 8}}\phi _{\setnum{1}} \left( l \right). \cr}$

These results are not the same as those for half-sibs, even though the single-locus probabilities k ₀, k ₁ are the same, nor are they twice the values for half-uncle nephew.

(b) Uncle and descendants of a nephew

For great-uncle nephew (e.g. E and L in Fig. 1) and further descendants of the nephew, results are obtained immediately from (12) as the expressions are multiplied by further coefficients b. Hence, if they are g generations apart

${\rm Var}_{{\rm UN\comma }g}\, \lpar \u {R} \comma l\rpar \equals 2\phi _{g \plus \setnum{1}} \left( l \right) \minus \phi _{g} \left( l \right) \plus {\textstyle{1 \over 8}}\phi _{g \minus \setnum{1}} \left( l \right) \plus {\textstyle{1 \over {16}}}\phi _{g \minus \setnum{2}}\, \lpar l \rpar.$

This reduces to the uncle–nephew case (where $R \equals {\textstyle{1 \over 4}}$ ) for g=2 and to full sibs for g=1 (provided we set ɸ_n(l)=0, n⩽0).
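The algebra behind eqn (12) and the general-g formula can be checked numerically. The sketch below confirms that the two forms of the uncle–nephew joint probability agree when b=½(1−c), and evaluates Var_UN,g. It takes φ_n(l) as eqn (17b) with all a_i=1; the helper names are hypothetical and not from the paper.

```python
from math import comb, exp


def phi(n, l):
    """phi_n(l): eqn (17b) with all a_i = 1; zero for n <= 0."""
    if n <= 0:
        return 0.0
    total = sum(
        comb(n, r) * (2 * l / r - 1 / r**2 + exp(-2 * l * r) / r**2)
        for r in range(1, n + 1)
    )
    return 0.25**n * total / (2 * l**2)


def e_un_from_c(c):
    """First line of eqn (12), written in the recombination fraction c."""
    return (1 - c) * (0.5 * (1 - c) ** 2 + 0.5 * c**2) + 0.25 * c


def e_un_from_b(c):
    """Second line of eqn (12), written in b = (1 - c) / 2."""
    b = 0.5 * (1 - c)
    return 8 * b**3 - 4 * b**2 + 0.5 * b + 0.25


def var_un_R(g, l):
    """Var of actual relationship for an uncle and a descendant g generations away."""
    return (2 * phi(g + 1, l) - phi(g, l)
            + phi(g - 1, l) / 8 + phi(g - 2, l) / 16)


for c in (0.0, 0.1, 0.25, 0.5):
    print(c, e_un_from_c(c), e_un_from_b(c))
```

With unlinked loci (c=½) the joint probability equals k₁²=¼, so the covariance vanishes as it should.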

(c) Cousins

In Fig. 1, E and F are full sibs, and so their respective offspring H and I are (first or full) cousins. They may share one or zero pairs of alleles ibd with probabilities $k_{\setnum{1}} \equals {\textstyle{1 \over 4}}$ and $k_{\setnum{0}} \equals {\textstyle{3 \over 4}}$ . The haplotypes that they receive from their sibling parents may each be non-recombinant, with probability (1−c)², in which case they carry ibd alleles at each locus with probability $\lsqb {\textstyle{1 \over 4}}\lpar 1 \minus c\rpar ^{\setnum{2}} \plus {\textstyle{1 \over 4}}c^{\setnum{2}} \rsqb$ . Alternatively, the haplotypes that they receive from their sibling parents may each be recombinant, with probability c ², in which case they carry ibd alleles at each locus with probability ${\textstyle{1 \over 8}}$ . Therefore,

(13)

$\eqalign{ {\rm Pr\lpar }\u {k} _{\setnum{1}i} \u {k} _{\setnum{1}j} \rpar \tab\equals {\textstyle{1 \over 2}}\lpar 1 \minus c\rpar ^{\setnum{2}} \lsqb {\textstyle{1 \over 2}}\lpar 1 \minus c\rpar ^{\setnum{2}} \plus {\textstyle{1 \over 2}}c^{\setnum{2}} \rsqb \plus {\textstyle{1 \over 8}}c^{\setnum{2}} \cr \tab \equals 8b^{\setnum{4}} \minus 4b^{\setnum{3}} \plus {\textstyle{3 \over 2}}b^{\setnum{2}} \minus {\textstyle{1 \over 2}}b \plus {\textstyle{1 \over 8}} \cr}$

and hence

(14)

$\eqalign{\tab {\rm Var}_{{\rm FC}}\, \lpar \u {k} _{\setnum{1}} \comma l\rpar \equals 8\phi _{\setnum{4}} \left( l \right) \minus 4\phi _{\setnum{3}} \left( l \right) \plus {\textstyle{3 \over 2}}\phi _{\setnum{2}} \left( l \right) \minus {\textstyle{1 \over 2}}\phi _{\setnum{1}} \left( l \right)\comma \cr \tab {\rm Var}_{{\rm FC}}\, \lpar \u {R} \comma l\rpar \equals 2\phi _{\setnum{4}} \left( l \right) \minus \phi _{\setnum{3}} \left( l \right) \plus {\textstyle{3 \over 8}}\phi _{\setnum{2}} \left( l \right) \minus {\textstyle{1 \over 8}}\phi _{\setnum{1}} \left( l \right). \cr}$

Note that the variances differ from those for great uncle–great nephew, although they have the same relationship parameters k ₁ and R.

(d) Descendants of cousins

In Fig. 1, H and L are cousins once removed. An individual shares a haplotype with the offspring of a cousin only if the cousin transmits it without recombination. Hence, the joint probability of sharing is b times that for cousins. Setting g=3 for cousins $\lpar R \equals {\textstyle{1 \over 8}}\rpar$ , so g=4 for cousins once removed, g=5 for second cousins and for cousins twice removed and g=7 for third cousins. The variances are

$\eqalign{{\rm Var}_{{\rm C\comma }g}\, \lpar \u {k} _{\setnum{1}} \comma l\rpar\equals \tab 8\phi _{g \plus \setnum{1}} \left( l \right) \minus 4\phi _{g} \left( l \right) \plus {\textstyle{3 \over 2}}\phi _{g \minus \setnum{1}} \left( l \right)\cr\tab \minus {\textstyle{1 \over 2}}\phi _{g \minus \setnum{2}} \left( l \right) \plus {\textstyle{1 \over 8}}\phi _{g \minus \setnum{3}} \left( l \right)\comma \cr{\rm Var}_{{\rm C\comma }g}\, \lpar \u {R} \comma l\rpar \equals \tab 2\phi _{g \plus \setnum{1}} \left( l \right) \minus \phi _{g} \left( l \right) \plus {\textstyle{3 \over 8}}\phi _{g \minus \setnum{1}} \left( l \right)\cr\tab \minus {\textstyle{1 \over 8}}\phi _{g \minus \setnum{2}} \left( l \right) \plus {\textstyle{1 \over {32}}}\phi _{g \minus \setnum{3}} \left( l \right)\comma \cr}$

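and also ${\rm Var}_{{\rm C\comma }\setnum{1}}\, \lpar \u {R} \comma l\rpar \equals {\rm Var}_{{\rm FS}}\, \lpar \u {R} \comma l\rpar$ .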
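The cousin results can be checked in the same way. The following sketch verifies that the two lines of eqn (13) agree, and evaluates the general formula for cousins and their descendants; g=1 should recover the full-sib variance and g=3 should recover eqn (14). φ_n(l) is again taken as eqn (17b) with all a_i=1, and the function names are illustrative assumptions.

```python
from math import comb, exp


def phi(n, l):
    """phi_n(l): eqn (17b) with all a_i = 1; zero for n <= 0."""
    if n <= 0:
        return 0.0
    total = sum(
        comb(n, r) * (2 * l / r - 1 / r**2 + exp(-2 * l * r) / r**2)
        for r in range(1, n + 1)
    )
    return 0.25**n * total / (2 * l**2)


def e_fc_from_c(c):
    """First line of eqn (13), in the recombination fraction c."""
    return 0.5 * (1 - c) ** 2 * (0.5 * (1 - c) ** 2 + 0.5 * c**2) + c**2 / 8


def e_fc_from_b(c):
    """Second line of eqn (13), in b = (1 - c) / 2."""
    b = 0.5 * (1 - c)
    return 8 * b**4 - 4 * b**3 + 1.5 * b**2 - 0.5 * b + 0.125


def var_c_R(g, l):
    """Var of actual relationship for cousins (g = 3) and their descendants."""
    return (2 * phi(g + 1, l) - phi(g, l) + 3 * phi(g - 1, l) / 8
            - phi(g - 2, l) / 8 + phi(g - 3, l) / 32)


print(var_c_R(3, 1.0), var_c_R(4, 1.0), var_c_R(5, 1.0))
```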

(v) Bilineal relatives

(a) General methodology

Bilineal relatives can receive identical alleles from each of the two different pedigrees. Full sibs have two parents in common and each may transmit identical alleles to the sibs. Double first cousins have two pairs of grandparents in common, and each pair may transmit identical alleles to the cousins. It is convenient to refer to the two pedigrees as ‘maternal’ and ‘paternal’, although this may not be the case for double first cousins. In Fig. 1, E and F are full sibs and can receive identical alleles from each of their parents B and C. If M and N are also full sibs, then H and I are double first cousins who may receive ibd alleles from both sets of grandparents, namely B, C and the parents of M, N.

Using superscripts m, p for maternal and paternal events in order to extend the previous definitions of actual identity indicators, the required indicators can be partitioned as

$\eqalign{\tab \u {k} _{\setnum{2}} \equals \u {k}_{\setnum{1}}^{\hskip3pt\rm m} \u {k} _{\setnum{1}}^{\hskip3pt\rm p} \comma \cr \tab \u {k} _{\setnum{1}} \equals \u {k} _{\setnum{1}}^{\hskip3pt\rm m} \lpar {1 \minus \u {k} _{\setnum{1}}^{\hskip3pt\rm p} } \rpar \plus \lpar {1 \minus \u {k} _{\setnum{1}}^{\hskip3pt\rm m} } \rpar\u {k} _{\setnum{1}}^{\hskip3pt\rm p} \comma \cr \tab \u {k} _{\setnum{0}} \equals \lpar {1 \minus \u {k} _{\setnum{1}}^{\hskip3pt\rm m} } \rpar\lpar {1 \minus \u {k} _{\setnum{1}}^{\hskip3pt\rm p} } \rpar. \cr}$

As we assume no inbreeding, $\u {k} _{\setnum{1}}^{\hskip3pt\rm m}$ and $\u {k} _{\setnum{1}}^{\hskip3pt\rm p}$ are independent and have expected values denoted α^m=k ₁^m and α^p=k ₁^p. Therefore, k ₂=α^mα^p, k ₁=α^m(1−α^p)+(1−α^m)α^p and k ₀=(1−α^m)(1−α^p). For full sibs, for example, $\alpha ^{\rm m} \equals \alpha ^{\rm p} \equals {\textstyle{1 \over 2}}$ , $k_{\setnum{2}} \equals k_{\setnum{0}} \equals {\textstyle{1 \over 4}}$ and $k_{\setnum{1}} \equals {\textstyle{1 \over 2}}$ .

Hence, the variance of the actual relationship, $\u {R} \equals {\textstyle{1 \over 2}}\left( {\u {k} _{\setnum{1}}^{\hskip3pt\rm m} \plus \u {k} _{\rm \setnum{1}}^{\hskip3pt\rm p} } \right)$ , can be written in an alternative form to eqn (1) as

(15)

${\rm Var}\lpar \u {R} \rpar \equals {\textstyle {1 \over 4}}\lsqb \alpha ^{\rm m} \lpar 1 \minus \alpha ^{\rm m} \rpar \plus \alpha ^{\rm p} \lpar 1 \minus \alpha ^{\rm p} \rpar \rsqb.$

The sharing of either or both maternal and paternal alleles can extend to each of the two loci, i and j, and we introduce the expected products

$\beta ^{\rm m} \equals {E}\lpar \u {k} _{\setnum{1}i} ^{\hskip4pt\rm m} \u {k} _{\setnum{1}j} ^{\hskip4pt\rm m} \rpar \comma \quad \beta ^{\rm p} \equals {E}\lpar \u {k} ^{\hskip3pt\rm p} _{\setnum{1}i} \u {k} ^{\hskip3pt\rm p} _{\setnum{1}j} \rpar.$

For full sibs, these values are each the same as for sharing of alleles transmitted from their common parent to half-sibs (eqn (6)), $\beta ^{\rm m} \equals \beta ^{\rm p} \equals {\textstyle{1 \over 2}}\lsqb \lpar 1 \minus c\rpar ^{\setnum{2}} \plus c^{\setnum{2}} \rsqb$ .

As maternal and paternal alleles are inherited independently,

${E\lpar }\u {k} _{\setnum{1}i}^{\hskip4pt\rm m} \u {k} _{\setnum{1}i}^{\hskip4pt\rm p} \rpar \equals {E\lpar }\u {k} _{\setnum{1}i}^{\hskip4pt\rm m} \u {k} _{\setnum{1}j}^{\hskip4pt\rm p} \rpar \equals {E\lpar }\u {k} _{\setnum{1}j}^{\hskip4pt\rm m} \u {k} _{\setnum{1}i}^{\hskip4pt\rm p} \rpar \equals {E\lpar }\u {k} _{\setnum{1}j}^{\hskip4pt\rm m} \u {k} _{\setnum{1}j}^{\hskip4pt\rm p} \rpar \equals \alpha ^{\rm m} \alpha ^{\rm p}.$

The expected product of sharing two pairs of alleles at two loci for bilineal relatives is

$\eqalign{ E\lpar {\u {k} _{\setnum{2}i} \u {k} _{\setnum{2}j} } \rpar \tab \equals {E}\lpar {\u {k} _{\setnum{1}i}^{\hskip4pt\rm m} \u {k} _{\setnum{1}i}^{\hskip4pt\rm p} \u {k} _{\setnum{1}j}^{\hskip4pt\rm m} \u {k} _{\setnum{1}j}^{\hskip4pt\rm p} } \rpar \cr \tab \equals {E}\lpar {\u {k} _{\setnum{1}i}^{\hskip4pt\rm m} \u {k} _{\setnum{1}j}^{\hskip4pt\rm m} } \rpar\ {E}\lpar {\u {k} _{\setnum{1}i}^{\hskip4pt\rm p} \u {k} _{\setnum{1}j}^{\hskip4pt\rm p} } \rpar \equals \beta ^{\rm m} \beta ^{\rm p} \cr}$

and the covariance of the double-sharing indicators is

$\eqalign{{\rm Cov}\lpar {\u {k} _{\setnum{2}i} \comma \u {k} _{\setnum{2}j} } \rpar \equals \tab {E}\lpar {\u {k} _{\setnum{2}i} \u {k} _{\setnum{2}j} } \rpar \minus {E}\lpar {\u {k} _{\setnum{2}i} } \rpar{E}\lpar {\u {k} _{\setnum{2}j} } \rpar \equals \beta ^{\rm m} \beta ^{\rm p} \minus \lpar \alpha ^{\rm m} \alpha ^{\rm p} \rpar ^{\setnum{2}}}.$

For the other covariances, we note that terms such as $\lsqb {E}\lpar \u {k} _{i}^{\hskip3pt\rm m} \u {k} _{j}^{\hskip3pt\rm p} \rpar \minus {E}\lpar \u {k} _{i}^{\hskip3pt\rm m} \rpar {E}\lpar \u {k} _{j}^{\hskip3pt\rm p} \rpar \rsqb$ contribute zero, whereas terms such as $\lsqb {E}\lpar \u {k} _{i}^{\hskip3pt\rm m} \u {k} _{j}^{\hskip3pt\rm m} \rpar \minus {E}\lpar \u {k} _{i}^{\hskip3pt\rm m} \rpar {E}\lpar \u {k} _{j}^{\hskip3pt\rm m} \rpar \rsqb$ contribute (β_ij^m−α_i^mα_j^m). The remaining covariances are obtained similarly. The covariance ${\rm Cov}\lpar \u {k} _{\setnum{1}i} \comma \u {k} _{\setnum{1}j} \rpar$ comprises four terms: from sharing of both paternal alleles but neither maternal allele and vice versa, and from sharing of paternal but not maternal alleles at the first locus and of maternal but not paternal alleles at the second locus and vice versa. It is convenient to define ω^m=β^m−(α^m)² and ω^p=β^p−(α^p)².

We obtain

${\rm Cov}\lpar {\u {k} _{\setnum{2}i} \comma \u {k} _{\setnum{2}j} } \rpar \equals \lpar \alpha ^{\rm m} \rpar ^{\setnum{2}} \omega ^{\rm p} \plus \lpar \alpha ^{\rm p} \rpar ^{\setnum{2}} \omega ^{\rm m} \plus \omega ^{\rm m} \omega ^{\rm p}$

and also

(16)

$\eqalign{\tab {\rm Cov}\lpar {\u {k} _{\setnum{1}i} \comma \u {k} _{\setnum{1}j} } \rpar \equals \lpar 1 \minus 2\alpha ^{\rm m} \rpar ^{\setnum{2}} \omega ^{\rm p} \plus \lpar 1 \minus 2\alpha ^{\rm p} \rpar ^{\setnum{2}} \omega ^{\rm m} \plus 4\omega ^{\rm m} \omega ^{\rm p} \comma \cr \tab {\rm Cov}\lpar {\u {k} _{\setnum{0}i} \comma \u {k} _{\setnum{0}j} } \rpar \equals \lpar 1 \minus \alpha ^{\rm m} \rpar ^{\setnum{2}} \omega ^{\rm p} \plus \lpar 1 \minus \alpha ^{\rm p} \rpar ^{\setnum{2}} \omega ^{\rm m} \plus \omega ^{\rm m} \omega ^{\rm p} \comma \cr \tab {\rm Cov}\lpar {\u {k} _{\setnum{2}i} \comma \u {k} _{\setnum{1}j} } \rpar \plus {\rm Cov}\lpar {\u {k} _{\setnum{1}i} \comma \u {k} _{\setnum{2}j} } \rpar \equals 2\alpha ^{\rm m} \lpar 1 \minus 2\alpha ^{\rm m} \rpar \omega ^{\rm p} \cr\tab\quad\plus 2\alpha ^{\rm p} \lpar 1 \minus 2\alpha ^{\rm p} \rpar \omega ^{\rm m} \minus 4\omega ^{\rm m} \omega ^{\rm p} \comma \cr \tab {\rm Cov}\lpar {\u {k} _{\setnum{2}i} \comma \u {k} _{\setnum{0}j} } \rpar \plus {\rm Cov}\lpar {\u {k} _{\setnum{0}i} \comma \u {k} _{\setnum{2}j} } \rpar \equals \minus 2\alpha ^{\rm m} \lpar 1 \minus \alpha ^{\rm m} \rpar \omega ^{\rm p} \cr\tab\quad\minus 2\alpha ^{\rm p} \lpar 1 \minus \alpha ^{\rm p} \rpar \omega ^{\rm m} \plus 2\omega ^{\rm m} \omega ^{\rm p} \comma \cr \tab {\rm Cov}\lpar {\u {k} _{\setnum{1}i} \comma \u {k} _{\setnum{0}j} } \rpar \plus {\rm Cov}\lpar {\u {k} _{\setnum{0}i} \comma \u {k} _{\setnum{1}j} } \rpar \equals \minus 2\lpar 1 \minus \alpha ^{\rm m} \rpar \lpar 1 \minus 2\alpha ^{\rm m} \rpar \omega ^{\rm p} \cr\tab\quad\minus 2\lpar 1 \minus \alpha ^{\rm p} \rpar \lpar 1 \minus 2\alpha ^{\rm p} \rpar \omega ^{\rm m} \minus 4\omega ^{\rm m} \omega ^{\rm p}. \cr}\hskip-30pt$

Note that these six expressions sum to zero, as $\u {k} _{\setnum{0}} \plus \u {k} _{\setnum{1}} \plus \u {k} _{\setnum{2}} \equals 1$ at each locus. For unlinked loci, where β^m=(α^m)² and β^p=(α^p)², all the expressions in (16) are zero. For completely linked loci, where β^m=α^m and β^p=α^p, the covariances reduce to the variances and covariances of the single-locus indicators.
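These properties of the bilineal covariances are easy to confirm with exact rational arithmetic. The sketch below codes the expression for Cov(k₂ᵢ, k₂ⱼ) together with the five expressions of eqn (16), then checks the zero-sum property and the completely linked limit for one arbitrary choice of α and ω. The function name and the numerical inputs are illustrative assumptions.

```python
from fractions import Fraction as F


def bilineal_covariances(am, ap, wm, wp):
    """Two-locus covariances of the actual k2, k1, k0 indicators.

    am, ap are alpha^m, alpha^p; wm, wp are omega^m, omega^p.
    Returned in the order used in the text: (k2,k2), (k1,k1), (k0,k0),
    then the paired sums for (k2,k1), (k2,k0) and (k1,k0).
    """
    c22 = am**2 * wp + ap**2 * wm + wm * wp
    c11 = (1 - 2 * am) ** 2 * wp + (1 - 2 * ap) ** 2 * wm + 4 * wm * wp
    c00 = (1 - am) ** 2 * wp + (1 - ap) ** 2 * wm + wm * wp
    c21 = 2 * am * (1 - 2 * am) * wp + 2 * ap * (1 - 2 * ap) * wm - 4 * wm * wp
    c20 = -2 * am * (1 - am) * wp - 2 * ap * (1 - ap) * wm + 2 * wm * wp
    c10 = (-2 * (1 - am) * (1 - 2 * am) * wp
           - 2 * (1 - ap) * (1 - 2 * ap) * wm - 4 * wm * wp)
    return c22, c11, c00, c21, c20, c10


# Arbitrary illustrative values for partially linked loci
covs = bilineal_covariances(F(1, 2), F(1, 4), F(1, 10), F(1, 30))
print(sum(covs))  # exact rational sum of the six expressions

# Complete linkage: omega = alpha * (1 - alpha) on each side
am, ap = F(1, 2), F(1, 4)
linked = bilineal_covariances(am, ap, am * (1 - am), ap * (1 - ap))
k2 = am * ap
print(linked[0], k2 * (1 - k2))  # Cov(k2i, k2j) against the one-locus Var(k2)
```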

Averaging over just two loci, i, j:

$\u {R} \equals {\textstyle{1 \over 2}}\lpar \u {k} _{\setnum{2}i} \plus \u {k} _{\setnum{2}j} \rpar \plus {\textstyle{1 \over 4}}\lpar \u {k} _{\setnum{1}i} \plus \u {k} _{\setnum{1}j} \rpar.$

Using one-locus results and the two-locus covariances in this case

${\rm Var }\lpar \u {R} \rpar \equals {\textstyle{1 \over 8}}\lsqb \omega ^{\rm m} \plus \omega ^{\rm p} \plus \alpha ^{\rm m} \lpar 1 \minus \alpha ^{\rm m} \rpar \plus \alpha ^{\rm p} \lpar 1 \minus \alpha ^{\rm p} \rpar \rsqb.$

As expected, this does not involve the product $\omega ^{\rm m} \omega ^{\rm p}$ (or, equivalently, β^m β^p) because the maternal and paternal alleles are transmitted independently. For unlinked loci, the variance is half the single-locus value shown in eqn (15).

(b) Full sibs

For full sibs, α^m=α^p=α=½, $\beta ^{\rm m} \equals \beta ^{\rm p} \equals {\textstyle{1 \over 2}}\lsqb \lpar 1 \minus c\rpar ^{\setnum{2}} \plus c^{\setnum{2}} \rsqb \equals 4b^{\setnum{2}} \minus 2b \plus {\textstyle{1 \over 2}}$ . Therefore $\beta ^{\rm m} \beta ^{\rm p} \equals 16b^{\setnum{4}} \minus 16b^{\setnum{3}} \plus 8b^{\setnum{2}} \minus 2b \plus {\textstyle{1 \over 4}}$ , which equals 1/16 when $c \equals {\textstyle{1 \over 2}}$ . Using ${\rm Cov}\lpar {\u {k} _{\setnum{2}i} \comma \u {k} _{\setnum{2}j} } \rpar \equals \beta ^{\rm m} \beta ^{\rm p} \minus \lpar \alpha ^{\rm m} \alpha ^{\rm p} \rpar ^{\setnum{2}}$ from eqns (16) and integrating over a chromosome of length l:

$\eqalign{\tab {\rm Var}_{{\rm FS}} \,\lpar {\u {k} _{\setnum{2}} \comma l} \rpar \equals {\rm Var}_{{\rm FS}}\, \lpar {\u {k} _{\setnum{0}} \comma l} \rpar \equals 16\phi _{\setnum{4}} \lpar l \rpar \minus 16\phi _{\setnum{3}} \lpar l \rpar \cr\tab\quad\hskip48pt\plus 8\phi _{\setnum{2}} \lpar l \rpar \minus 2\phi _{\setnum{1}} \lpar l \rpar\comma \cr \tab{\rm Var}_{{\rm FS}}\, \lpar {\u {k} _{\setnum{1}} \comma l} \rpar \equals 64\phi _{\setnum{4}} \lpar l \rpar \minus 64\phi _{\setnum{3}} \lpar l \rpar \plus 24\phi _{\setnum{2}} \lpar l \rpar \minus 4\phi _{\setnum{1}} \lpar l \rpar\comma\cr\tab{\rm Cov}_{{\rm FS}}\, \lpar {\u {k} _{\setnum{2}} \comma \u {k} _{\setnum{1}} \comma l} \rpar \equals {\rm Cov}_{{\rm FS}}\, \lpar {\u {k} _{\setnum{1}} \comma \u {k} _{\setnum{0}} \comma l} \rpar \equals \minus 32\phi _{\setnum{4}} \lpar l \rpar \cr\tab\quad\hskip64pt\plus 32\phi _{\setnum{3}} \lpar l \rpar \minus 12\phi _{\setnum{2}} \lpar l \rpar \plus 2\phi _{\setnum{1}} \lpar l \rpar\comma \cr \tab{\rm Cov}_{{\rm FS}}\, \lpar {\u {k} _{\setnum{2}} \comma \u {k} _{\setnum{0}} \comma l} \rpar \equals 16\phi _{\setnum{4}} \lpar l \rpar \minus 16\phi _{\setnum{3}} \lpar l \rpar \plus 4\phi _{\setnum{2}} \lpar l \rpar. \cr}$

An alternative summary of these expressions is given in Box 1. The variance of the actual relationship for full sibs can be obtained from these results, and is ${\rm Var}_{{\rm FS}}\, \lpar \u {R} \comma l\rpar \equals 2\phi _{\setnum{2}} \left( l \right) \minus \phi _{\setnum{1}} \left( l \right)$ , i.e. twice that for half-sibs, as noted previously. The variance of $\u {k} _{\setnum{2}}$ was derived by Visscher et al. (2006), who also pointed out that ${\rm Cov}_{{\rm FS}}\, \lpar \u {R} \comma \u {k} _{\setnum{2}} \comma l\rpar \equals {\rm Var}_{{\rm FS}}\, \lpar \u {R} \comma l\rpar$ . The regression of $\u {k} _{\setnum{2}}$ on $\u {R}$ is therefore 1·0. The genetic covariance of phenotypes of quantitative traits of relatives (ignoring epistasis) is given by RV _A+k ₂V _D, where V _A and V _D are the additive and dominance variances (Falconer & Mackay, 1996) and traditionally pedigree relationships are used. Estimates of the additive genetic and dominance variances free of environmental covariances for quantitative traits can be obtained by regressing the resemblance of trait values of full sibs on their actual genome shared, $\u {R} V_{\rm A} \plus \u {k} _{\setnum{2}} V_{\rm D}$ , if dense markers are available. The estimates of V _A and V _D are therefore highly correlated, however (Visscher et al., 2006).
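The statement that the regression is 1·0 follows directly from the full-sib variances and covariances given above, since the actual relationship is the actual k₂ plus half the actual k₁. A minimal numerical sketch, taking φ_n(l) as eqn (17b) with all a_i=1 and using hypothetical function names:

```python
from math import comb, exp


def phi(n, l):
    """phi_n(l): eqn (17b) with all a_i = 1; zero for n <= 0."""
    if n <= 0:
        return 0.0
    total = sum(
        comb(n, r) * (2 * l / r - 1 / r**2 + exp(-2 * l * r) / r**2)
        for r in range(1, n + 1)
    )
    return 0.25**n * total / (2 * l**2)


def full_sib_moments(l):
    """Return (Var R, Cov(R, k2), Var k2) for full sibs on one chromosome."""
    p1, p2, p3, p4 = (phi(n, l) for n in (1, 2, 3, 4))
    var_k2 = 16 * p4 - 16 * p3 + 8 * p2 - 2 * p1
    var_k1 = 64 * p4 - 64 * p3 + 24 * p2 - 4 * p1
    cov_k2_k1 = -32 * p4 + 32 * p3 - 12 * p2 + 2 * p1
    var_R = var_k2 + 0.25 * var_k1 + cov_k2_k1   # Var(k2 + k1/2)
    cov_R_k2 = var_k2 + 0.5 * cov_k2_k1          # Cov(k2 + k1/2, k2)
    return var_R, cov_R_k2, var_k2


var_R, cov_R_k2, var_k2 = full_sib_moments(1.5)
print(cov_R_k2 / var_R)                          # regression of k2 on R
print(var_R, 2 * phi(2, 1.5) - phi(1, 1.5))      # two routes to Var_FS(R)
```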

(c) Double first cousins

For double first cousins $\alpha ^{\rm m} \equals \alpha ^{\rm p} \equals {\textstyle{1 \over 4}}$ and, utilizing the results for descendants of first cousins (eqn (13)), ${E}\lpar \u {k} _{\setnum{1}i} \u {k} _{\setnum{1}j} \rpar \equals 8b^{\setnum{4}} \minus 4b^{\setnum{3}} \plus {\textstyle{3 \over 2}}b^{\setnum{2}} \minus {\textstyle{1 \over 2}}b \plus {\textstyle{1 \over 8}}$ , it follows that

$\eqalign{{\rm Var}_{{\rm DFC}}\, \lpar \u {k} _{\setnum{2}} \comma l\rpar \equals\tab 64\phi _{\setnum{8}} \lpar l\rpar \minus 64\phi _{\setnum{7}} \lpar l\rpar \plus 40\phi _{\setnum{6}} \lpar l\rpar \minus 20\phi _{\setnum{5}} \lpar l\rpar \cr\tab\plus {\textstyle{{33} \over 4}}\phi _{\setnum{4}} \lpar l\rpar \minus {\textstyle{5 \over 2}}\phi _{\setnum{3}} \lpar l\rpar \plus {\textstyle{5 \over 8}}\phi _{\setnum{2}} \lpar l\rpar \minus {\textstyle{1 \over 8}}\phi _{\setnum{1}} \lpar l\rpar. \cr}$

The other variances and covariances can be expressed simply (Box 1) in terms of ${\rm Var}_{{\rm DFC}}\, \lpar \u {k} _{\setnum{2}} \comma l\rpar$ and ${\rm Var}_{{\rm FC}}\, {\rm \lpar }\u {k} _{\setnum{1}} \comma l \rpar \equals 8\phi _{\setnum{4}} \left( l \right) \minus 4\phi _{\setnum{3}} \left( l \right)\plus {\textstyle{3 \over 2}}\phi _{\setnum{2}} \left( l \right) \minus {\textstyle{1 \over 2}}\phi _{\setnum{1}} \left( l \right)$ .

The variance of the actual relationship is double that of first cousins:

$\eqalign{{\rm Var}_{{\rm DFC}}\, \lpar \u {R} \comma l\rpar \equals\tab 2{\rm Var}_{{\rm FC}} \lpar \u {R} \comma l\rpar \equals 4\phi _{\setnum{4}} \left( l \right) \minus 2\phi _{\setnum{3}} \left( l \right) \cr\tab\plus {\textstyle{3 \over 4}}\phi _{\setnum{2}} \left( l \right) \minus {\textstyle{1 \over 4}}\phi _{\setnum{1}} \left( l \right). \cr}$

Also ${\rm Cov}_{{\rm DFC}}\, \lpar \u {R} \comma \u {k} _{\setnum{2}} \comma l\rpar \equals {\textstyle{1 \over 2}}{\rm Var}_{{\rm DFC}}\, \lpar \u {R} \comma l\rpar$ , and so the regression of $\u {k} _{\setnum{2}}$ on $\u {R}$ is one-half.
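Two parts of the double first cousin result lend themselves to an exact check. The sketch below squares the first-cousin polynomial of eqn (13) to recover the coefficients of φ₈ to φ₁, and then uses the symmetric case of eqn (16) (α^m=α^p=α, ω^m=ω^p=ω) to obtain the two-locus regression of the actual k₂ on the actual relationship. Function names and the value chosen for ω are illustrative assumptions.

```python
from fractions import Fraction as F


def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest power first."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, c in enumerate(q):
            out[i + j] += a * c
    return out


# beta for first cousins (eqn (13)) as coefficients of b^0 .. b^4
beta_fc = [F(1, 8), F(-1, 2), F(3, 2), F(-4), F(8)]
beta_sq = poly_mul(beta_fc, beta_fc)   # E(k2i k2j) for double first cousins


def regression_k2_on_R(alpha, omega):
    """Two-locus regression of actual k2 on actual R in the symmetric case."""
    cov22 = 2 * alpha**2 * omega + omega**2
    cov11 = 2 * (1 - 2 * alpha) ** 2 * omega + 4 * omega**2
    cov21_sum = 4 * alpha * (1 - 2 * alpha) * omega - 4 * omega**2
    cov_R_k2 = cov22 + cov21_sum / 4            # Cov(k2 + k1/2, k2)
    cov_R_R = cov22 + cov11 / 4 + cov21_sum / 2  # Cov(R_i, R_j)
    return cov_R_k2 / cov_R_R


print(beta_sq[:0:-1])                      # coefficients of b^8 down to b^1
print(regression_k2_on_R(F(1, 4), F(1, 50)))  # double first cousins
print(regression_k2_on_R(F(1, 2), F(1, 50)))  # full sibs
```

The regression does not depend on ω in this symmetric case, so the same value holds after integrating over the chromosome.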

(d) Mothers full sibs, fathers first cousins

The method that we have established allows for asymmetry in the two pedigrees that lead to sets of identical alleles for a pair of relatives. If, for example, the mothers are full sibs and the fathers are first cousins $\alpha ^{\rm m} \equals {\textstyle{1 \over 2}}$ , $\alpha ^{\rm p} \equals {\textstyle{1 \over 4}}$ , $\beta ^{\rm m} \equals 4b^{\setnum{2}} \minus 2b \plus {\textstyle{1 \over 2}}$ and $\beta ^{\rm p} \equals 8b^{\setnum{4}} \minus 4b^{\setnum{3}} \plus {\textstyle{3 \over 2}}b^{\setnum{2}} \minus {\textstyle{1 \over 2}}b \plus {\textstyle{1 \over 8}}$ . The results then follow.

(vi) Sex-related phenomena

(a) Differences in map length between sexes

In the analysis we have assumed that the map distance is the same in both sexes. Typically, however, the sexes differ in map length, i.e. in the rate of recombination per unit of physical length of the genome. For humans, the autosomal map length is approximately 44 M in females and 28 M in males (Kong et al., 2004), with the male/female ratio ranging among autosomes from 57 to 85%, typically differing more for the longer chromosomes. We quantify the impact on the variation in genome sharing of the sex through which transmission occurs.

It would be possible to restructure the analysis and specify a ratio of map to physical length for each chromosome and integrate an extension to eqn (4) over physical rather than map length. To maintain the same notation as previously, however, we simply assume that the sex-averaged map length for a particular chromosome is l, but the map length in females is given by l(1+λ) and in males by l(1−λ). Initially we take a more general approach, and assume that the map length for transmissions at generation i is given by la _i and that recombination fractions between any pair of sites are functions of la _i. Thus, for a pair of loci d M apart on the sex-averaged linkage map and assuming Haldane's mapping function, their recombination fraction is ${\textstyle{1 \over 2}}\lpar 1 \minus {\rm e}^{ \minus \setnum{2}da_{i} } \rpar$ , 0<d<l, at generation i. We consider lineal relationships.

Equations (4 a) and (4b) for φ_n(l) can now be generalized:

(17a)

$\eqalign{\phi _{n}^{ \circ } \lpar l\semi a_{\setnum{1}} {\rm \comma }\ldots{\rm \comma }a_{n} \rpar \tab\equals {2 \over {l^{\setnum{2}} }}\left( {{1 \over 4}} \right)^{n} \int_{x \equals \setnum{0}}^{l} {\int_{y \equals \setnum{0}}^{x} {\left[ {\prod\limits_{i \equals \setnum{1}}^{n} {\lpar 1 \plus {\rm e}^{ \minus \setnum{2}\lpar x \minus y\rpar a_{i} } \rpar \minus 1} } \right]} } {\rm d}y \ {\rm d}x\cr \tab \equals {2 \over {l^{\setnum{2}} }}\left( {1 \over 4} \right)^{n} \int_{x \equals \setnum{0}}^{l} \int_{y \equals \setnum{0}}^{x} {\left[ \mathop\sum\limits_{\delta _{\setnum{1}} \equals \setnum{0}}^{\setnum{1}} \mathop\sum\limits_{\delta _{\setnum{2}} \equals \setnum{0}}^{\setnum{1}}\cdots \mathop\sum\limits_{\delta _{n} \equals \setnum{0}}^{\setnum{1}} {{\rm e}^{ \minus \setnum{2}\lpar x \minus y\rpar \mathop\sum_{{i \equals \setnum{1}}}^{n} {a_{i} } \delta _{{_{i}} }}} \right]} {\rm d}y \ {\rm d}x\comma\quad\mathop\sum_{i \equals \setnum{1}}^{n} {\delta _{i} } \ne 0\comma}$

(17b)

$\hskip64pt\equals {1 \over {2l^{\setnum{2}} }}\left( {{1 \over 4}} \right)^{n} \left[ {\mathop\sum\limits_{\delta _{\setnum{1}} \equals \setnum{0}}^{\setnum{1}} {\mathop\sum\limits_{\delta _{\setnum{2}} \equals \setnum{0}}^{\setnum{1}} \cdots {\mathop\sum\limits_{\delta _{n} \equals \setnum{0}}^{\setnum{1}} {\left( {{{2l} \over {\sum\nolimits_{i \equals \setnum{1}}^{n} {a_{i} } \delta _{_{i} } }} \minus {1 \over {\lpar \sum\nolimits_{i \equals \setnum{1}}^{n} {a_{i} } \delta _{_{i} } \rpar ^{\setnum{2}} }} \plus {{{\rm e}^{ \minus \setnum{2}l\mathop\sum\nolimits_{{i \equals \setnum{1}}}^{n} {a_{i} } \delta _{{_{i} }} } } \over {\lpar \sum\nolimits_{i \equals \setnum{1}}^{n} {a_{i} } \delta _{_{i} } \rpar ^{\setnum{2}} }}} \right)} } } } \right]\comma \quad \mathop\sum_{i \equals \setnum{1}}^{n} {\delta _{i} } \ne 0\comma$

If a _i=1 for all i, eqns (17a) reduce to (4a) and (17b) to (4b).

Although (17b) can be used directly, we now simplify for the case where there are just two values of a _i, namely 1±λ. Assume that m of the n transmissions are through males, with n−m correspondingly through females, and extend the definition of φ_n(l) accordingly as φ*_n,m(l, λ). The sequence in which male or female transmissions occur does not matter. The expansion of the summations in (17b) involves terms in which $r \equals \sum\nolimits_{i \equals \setnum{1}}^{n} {\delta _{i} }$ of the a _i appear in the sum $\sum\nolimits_{i \equals \setnum{1}}^{n} {a_{i} } \delta _{i}$ ; of these r there are, say, s transmissions through males, where max(0, r−n+m)⩽s⩽min(m, r). Hence $\sum\nolimits_{i \equals \setnum{1}}^{n} {a_{i} } \delta _{i} \equals r \plus \lpar r \minus 2s\rpar \lambda \equals \rho$ , say, and

(18)

$\eqalign{\phi \ast _{n\comma m} \lpar l\comma \lambda \rpar \equals\tab {1 \over {2l^{\setnum{2}} }}\left( {{1 \over 4}} \right)^{n} \!\mathop\sum\limits_{r \equals \setnum{1}}^{n} {\mathop\sum\limits_{s \equals \max \lpar \setnum{0}\comma r \minus n \plus m\rpar }^{\min \lpar m\comma r\rpar } {\left(\! {\matrix{ m \cr s \cr} } \right)} } \left(\! {\matrix{ {n \minus m} \cr {r \minus s} \cr} } \right)\cr\tab\times\left[ {{{2l} \over \rho } \minus {1 \over {\rho ^{\setnum{2}} }} \plus {{{\rm e}^{ \minus \setnum{2}l\rho } } \over {\rho ^{\setnum{2}} }}} \right]\comma \cr}$

i.e. ρ replaces r in (4) and hypergeometric coefficients in s are introduced. The n generations here do not include that of the initial transmission from parent to offspring, but those starting with the subsequent transmission to grandoffspring, so

${\rm Var}_{{\rm Lin\comma }g\comma m}\, {\rm \lpar }\u {k} _{\setnum{1}} \comma l\comma \lambda {\rm \rpar } \equals \phi \ast _{g \minus \setnum{1}\comma m}\, \left( {l\comma \lambda } \right).$

For collateral relatives and their offspring, the general formulation can be extended. For example, for a pair of paternal half-sibs

$\eqalign{{\rm Var}_{{\rm HS}}\, \lpar \u {R} \comma l\comma \lambda \rpar \tab\equals \phi _{\setnum{2}} \left( {l\lpar 1 \minus \lambda \rpar } \right) \minus {\textstyle{1 \over 2}}\phi _{\setnum{1}} \left( {l\lpar 1 \minus \lambda \rpar } \right)\cr\tab \equals \phi \ast _{\setnum{2}\comma \setnum{2}} \lpar l\comma \lambda \rpar \minus {\textstyle{1 \over 2}}\phi \ast _{\setnum{1}\comma \setnum{1}} \lpar l\comma \lambda \rpar \cr}$

and for a pair of half-cousins, whose mothers were paternal half-sibs,

${\rm Var}_{{\rm HC}}\, \lpar \u {R} \comma l\comma \lambda \rpar \equals \phi \ast _{\setnum{4}\comma \setnum{2}} \lpar l\comma \lambda \rpar \minus {\textstyle{1 \over 2}}\phi \ast _{\setnum{3}\comma \setnum{1}} \lpar l\comma \lambda \rpar \plus {\textstyle{1 \over 8}}\phi \ast _{\setnum{2}\comma \setnum{0}} \lpar l\comma \lambda \rpar.$

As both sexes of parents contribute to resemblance among full sibs, the differences in map length have much impact only in later generations.

For humans, λ averages approximately 0·25, and we illustrate the calculations for a chromosome with l=1 M. For n=2, i.e. great grandparent–great grandoffspring (with the sex of the great grandparent irrelevant), φ_2,0*(1, 0·25)=φ₂(1·25)=0·0833, φ_2,1*(1, 0·25)=0·0954 and φ_2,2*(1, 0·25)=φ₂(0·75)=0·1088. The corresponding standard deviations (SDs) of k ₁ are 0·289, 0·309 and 0·330, describing subsequent transmissions twice through females, once through each sex, and twice through males, respectively. For n=4, m=0, 1, …, 4, φ_n,m*(1, 0·25)=0·0197, 0·0214, 0·0231, 0·0251 and 0·0272, respectively.

It is straightforward to evaluate eqn (18) directly. These examples illustrate, however, that linear interpolations can provide good approximations. One alternative is to interpolate on φ using (1−m/n)φ_n(l(1+λ))+(m/n)φ_n(l(1−λ)), which for the example above for n=4 and m=1, 2 and 3 gives 0·0216, 0·0235 and 0·0253, respectively. Another is to interpolate on l using φ_n(l(1+(1−2m/n)λ)), for which corresponding values are 0·0212, 0·0229 and 0·0249.
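A minimal Python sketch of eqn (18) and of the two interpolations follows, under the text's assumptions (Haldane's mapping function, female map length l(1+λ), male map length l(1−λ), with λ<1). The bracketed term is written as in eqn (17b), i.e. 2l/ρ − 1/ρ² + e^{−2lρ}/ρ². The function names are illustrative and not the authors'.

```python
from math import comb, exp


def phi_star(n, m, l, lam):
    """Eqn (18): n transmissions, m of them through males.

    l is the sex-averaged map length in Morgans; lam is lambda (< 1).
    """
    total = 0.0
    for r in range(1, n + 1):
        for s in range(max(0, r - n + m), min(m, r) + 1):
            rho = r + (r - 2 * s) * lam          # s male and r - s female terms
            weight = comb(m, s) * comb(n - m, r - s)
            total += weight * (2 * l / rho - 1 / rho**2
                               + exp(-2 * l * rho) / rho**2)
    return 0.25**n * total / (2 * l**2)


def phi(n, l):
    """phi_n(l): the equal-sexes case, obtained by setting lambda to zero."""
    return phi_star(n, 0, l, 0.0)


def interp_on_phi(n, m, l, lam):
    """Linear interpolation between the all-female and all-male values."""
    return (1 - m / n) * phi(n, l * (1 + lam)) + (m / n) * phi(n, l * (1 - lam))


def interp_on_l(n, m, l, lam):
    """Evaluate phi_n at a map length interpolated between the sexes."""
    return phi(n, l * (1 + (1 - 2 * m / n) * lam))


for m in range(3):
    print(m, phi_star(2, m, 1.0, 0.25), phi_star(2, m, 1.0, 0.25) ** 0.5)
```

With all transmissions through one sex the function reduces to φ_n evaluated at that sex's map length, which provides a simple consistency check.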

(b) Sex limited recombination

For species such as Drosophila melanogaster there is no recombination in males, so autosomes are transmitted intact to the offspring and the variance in sharing with and among their descendants is increased. The probability that a parental pair of genes is transmitted to an offspring is ${\textstyle{1 \over 2}}\lpar 1 \minus c\rpar$ through a female and ${\textstyle{1 \over 2}}$ through a male. If m of the n=g−1 transmissions to descendants after the first generation (as the sex of the ancestor is not relevant) are through a male,

$\eqalign{\tab {\rm cov}\lpar {\u {k} _{\setnum{1}i} \comma \u {k} _{\setnum{1}j} } \rpar\equals \lpar {\textstyle{1 \over 2}}\rpar ^{m} \lsqb {\textstyle{1 \over 2}}\lpar 1 \minus c\rpar \rsqb ^{n \minus m} \minus \lpar {\textstyle{1 \over 4}}\rpar ^{n} \cr\tab\quad \equals \lpar {\textstyle{1 \over 2}}\rpar ^{m} \lcub \lsqb {\textstyle{1 \over 2}}\lpar 1 \minus c\rpar \rsqb ^{n \minus m} \minus \lpar {\textstyle{1 \over 4}}\rpar ^{n \minus m} \rcub \plus \lpar {\textstyle{1 \over 4}}\rpar ^{n \minus m} \lsqb \lpar {\textstyle{1 \over 2}}\rpar ^{m} \minus \lpar {\textstyle{1 \over 4}}\rpar ^{m} \rsqb. \cr}$

Hence,

$\eqalign{{\rm Var}_{{\rm Lin\lpar SL\rpar \ }g\comma m}\, \lpar \u {k} _{\setnum{1}} \comma l\rpar\equals\tab \lpar {\textstyle{1 \over 2}}\rpar ^{m} \lcub \phi _{g \minus \setnum{1} \minus m} \lpar l\rpar \cr\tab\plus\lpar {\textstyle{1 \over 4}}\rpar ^{g \minus \setnum{1} \minus m} \lsqb 1 \minus \lpar {\textstyle{1 \over 2}}\rpar ^{m} \rsqb \rcub. \cr}$

To take another example: for full sibs, the probability of sharing is ${\textstyle{1 \over 2}}$ for genes from their father and ${\textstyle{1 \over 2}}\lsqb \lpar 1 \minus c\rpar ^{\setnum{2}} \plus c^{\setnum{2}} \rsqb$ from their mother. Therefore, by summing components for maternal and paternal half-sibs, ${\rm Var}_{{\rm FS\lpar SL\rpar }} \lpar \u {R} \comma l\rpar \equals \phi _{\setnum{2}} \lpar l\rpar \minus {\textstyle{1 \over 2}}\phi _{\setnum{1}} \lpar l\rpar \plus {\textstyle{1 \over {16}}}$ .
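The sex-limited results can be sketched numerically as follows. The code takes φ_n(l) as eqn (17b) with all a_i=1 and implements the lineal formula and the full-sib formula given above; the function names are hypothetical. Two limiting cases serve as checks: with no male transmissions the lineal variance is φ_{g−1}(l), and with all n=g−1 transmissions through males it equals the completely linked value k₁(1−k₁) with k₁=(½)^{g−1}.

```python
from math import comb, exp


def phi(n, l):
    """phi_n(l): eqn (17b) with all a_i = 1; zero for n <= 0."""
    if n <= 0:
        return 0.0
    total = sum(
        comb(n, r) * (2 * l / r - 1 / r**2 + exp(-2 * l * r) / r**2)
        for r in range(1, n + 1)
    )
    return 0.25**n * total / (2 * l**2)


def var_lin_sl_k1(g, m, l):
    """Lineal relatives g generations apart, m of the g - 1 later
    transmissions through non-recombining males."""
    n = g - 1
    return 0.5**m * (phi(n - m, l) + 0.25 ** (n - m) * (1 - 0.5**m))


def var_fs_sl_R(l):
    """Full sibs: recombining mother plus a father transmitting intact autosomes."""
    return phi(2, l) - 0.5 * phi(1, l) + 1 / 16


for m in range(4):
    print(m, var_lin_sl_k1(4, m, 1.0))
print(var_fs_sl_R(1.0), 2 * phi(2, 1.0) - phi(1, 1.0))
```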

(c) Sex chromosomes

Previous formulae apply for the autosomes and we now consider the sex chromosomes (assuming mammalian X, Y sex determination and ignoring the pseudo-autosomal region). For the Y chromosome, father and son share a genome exactly and there is no variation in sharing. Father and son do not share an X chromosome, and so for lineal descendants any male–male transmission in the pathway results in no sharing of descendant with the ancestor. A daughter receives a copy of her father's X chromosome without sampling, and so any male to female transmission reduces by one the number of generations of sampling in eqn (2). Son and daughter receive an X from their mother with recombination as for the autosomes. We consider only the case of full sibs in detail, but sampling variances for genome sharing on the X chromosome can be deduced for any relationship. Visscher (2009) gives further discussion for sex-linked chromosomes.

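We retain the k ₁^m, k ₁^p notation for the ibd of maternal or paternal alleles, adding a subscript to indicate X-linkage. For two full brothers, k _1X^p is not defined and $k_{\setnum{1}\rm X}^{\rm m}\equals{\textstyle{1 \over 2}}$ , the same as k ₁ for half-sibs; $\u {k} _{\setnum{2}{\rm X}} \equals 0$ , $\u {k} _{\setnum{1}{\rm X}} \equals \u {k} _{\setnum{1}{\rm X}}^{\hskip3pt\rm m}$ and $\u {k} _{\setnum{0}{\rm X}} \equals 1 \minus \u {k} _{\setnum{1}{\rm X}}^{\hskip3pt\rm m}$ . Integrating over the X chromosome of length l _X gives ${\rm Var}_{{\rm BB}}\, \lpar \u {k} _{\setnum{1}{\rm X}} \rpar \equals 4\phi _{\setnum{2}} \lpar l_{\rm X} \rpar \minus 2\phi _{\setnum{1}} \lpar l_{\rm X} \rpar$ , using the autosomal result for half-sibs (11). For a sister and brother, k _1X^p is still not defined and k _1X^m is as for half-sibs with a value of ${\textstyle{1 \over 2}}$ . Hence, ${\rm Var}_{{\rm BS}}\, \lpar \u {k} _{\setnum{1}{\rm X}} \rpar \equals {\rm Var}_{{\rm BB}}\, \lpar \u {k} _{\setnum{1}{\rm X}} \rpar$ . For two sisters, $\u {k} _{\setnum{1}{\rm X}}^{\hskip3pt\rm p} \equals 1$ and $\u {k} _{\setnum{1}{\rm X}}^{\hskip3pt\rm m}$ is as for half-sibs. From the previous results, $\u {k} _{\setnum{2}{\rm X}} \equals \u {k} _{\setnum{1}{\rm X}}^{\hskip3pt\rm m}$ , $\u {k} _{\setnum{1}{\rm X}} \equals 1 \minus \u {k} _{\setnum{1}{\rm X}}^{\hskip3pt\rm m}$ and $\u {k} _{\setnum{0}{\rm X}} \equals 0$ ; therefore, ${\rm Var}_{{\rm SS}}\, \lpar \u {k} _{\setnum{2}{\rm X}} \rpar \equals {\rm Var}_{{\rm SS}}\, \lpar \u {k} _{\setnum{1}{\rm X}} \rpar \equals \minus {\rm Cov}_{{\rm SS}}\, \lpar \u {k} _{\setnum{2}{\rm X}} \comma \u {k} _{\setnum{1}{\rm X}} \rpar \equals 4\phi _{\setnum{2}} \lpar l_{\rm X} \rpar \minus 2\phi _{\setnum{1}} \lpar l_{\rm X} \rpar$ .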

(vii) Examples

Examples of the SDs of actual proportion of genome shared ( $\u {k} _{\setnum{2}} \plus {\textstyle{1 \over 2}}\u {k} _{\setnum{1}}$ ) as a function of map length for single chromosomes are given in Fig. 2a for descendants of full sibs. It is noticeable that there remains a substantial variation even for the longest chromosomes illustrated (4 M), i.e. longer than most chromosomes in most species. Although the SD becomes smaller as the individuals become less related, the CV becomes larger (Fig. 2b) (Visscher, 2009). Indeed the CV exceeds unity for all but close relationships, even for chromosomes of map length 2 M.

Fig. 2. (a) SD and (b) CV of actual relationship (proportion of genome shared, ${\u {\it R}} \equals \u {\it k} _{\setnum{2}} \plus {\textstyle{1 \over 2}}\u {\it k} _{\setnum{1}}$ ), for a single chromosome as a function of map length and relationship for full sibs (FS) and their descendants: uncle nephew (UN), cousins (C), cousins once removed (C1R), second cousins (2C), second cousins once removed (2C1R) and third cousins (3C).

Comparisons between lineal descendants and those of half- and full sibs are given in Fig. 3 for two examples of relationship. With complete linkage the variance depends only on relationship (Table 1). Although the differences are quite small, the variance declines less rapidly with increasing map length for lineal descendants than for those involving half sibs, which in turn show a faster decline than descendants of full sibs (Fig. 3). This is presumably because the latter can be ibd at a pair of loci on a pair of recombinant chromosomes: terms in c² appear in eqns (6) and (13), for example, but not in (2). Great uncle–great nephew and first cousins, which have the same relationship, differ in the variance of sharing, but not very much (Fig. 3).

Fig. 3. SD of actual relationship (proportion of genome shared, $\u {\it R} \equals \u {\it k} _{\setnum{2}} \plus {\textstyle{1 \over 2}}\u {\it k} _{\setnum{1}}$ ), for a single chromosome as a function of map length and relationship for three different pedigrees for two different pedigree relationships: R=0·125: great grandparent–great grandoffspring (GGPGGO), half-uncle–nephew (HUN), great uncle–great nephew (GUGN), cousins (C); and R=0·03125: greatgreatgreat grandparent–GGGGoffspring (G4PG4O), half–cousins once removed (HC1R) and second cousins (2C).

For a mammalian or avian genome with multiple chromosomes, the variation and skew are reduced. Taking data for human autosomes from Kong et al. (2004), we assumed that the 22 chromosomes could be grouped into five classes each of 2–8 chromosomes, each member of which was of similar map and genome length, as follows: (1–2) 2·75 M, (3–6) 2·10 M, (7–12) 1·75 M, (13–20) 1·25 M, (21–22) 0·75 M. Results are given in Table 2 for a wide range of relationships. The results are, however, little different from what would be expected from the same number of chromosomes each of the average map length, as shown by an example in the last column of Table 2 and as pointed out previously (Hill, 1993a; Visscher, 2009). The average chromosomal length is about 1·6 Morgans, so with 22 chromosomes, the SD, CV and skew of sharing are approximately 20% of those for individual chromosomes.
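The arithmetic behind the model genome can be checked with a few lines. This is a small sketch using only the chromosome classes listed above; the variable names are illustrative. It assumes, as the text does for the equal-length approximation, that chromosomes contribute independently, so that the SD, CV and skew of the genome-wide proportion scale with 1/√n.

```python
import math

# (number of chromosomes, map length in Morgans) for each class listed in the text
classes = [(2, 2.75), (4, 2.10), (6, 1.75), (8, 1.25), (2, 0.75)]

n_chrom = sum(n for n, _ in classes)            # number of autosomes in the model
total_map = sum(n * l for n, l in classes)      # total map length in Morgans
mean_map = total_map / n_chrom                  # average map length per chromosome

# For n independent, equally sized chromosomes the genome-wide proportion is a
# mean of n identically distributed terms, so SD, CV and skew shrink by 1/sqrt(n).
scale = 1.0 / math.sqrt(n_chrom)

print(n_chrom, round(total_map, 1), round(mean_map, 2), round(scale, 3))
```

The scaling factor comes out a little above one-fifth, in line with the "approximately 20%" quoted in the text.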

Table 2. SD of actual relationship ($\u{R} = \u{k}_2 + {\textstyle{1 \over 2}}\u{k}_1$) for a model human genome for different pedigree relationships (R=2θ)

^a P–O, parent–offspring; GnP–GnO, great(n)grandparent–great(n)grandoffspring; H, half; UN, uncle–nephew; GUGN, great uncle–great nephew; C, 2C, 3C, first, second, third cousin; 1R, once removed.

^b,c SD($\u{R}$) computed assuming 22 chromosomes: ^b with differing map lengths, total 35·9 M (see text); ^c each of length 35·9/22=1·63 M.

3. Skew of the distribution of genome sharing

(i) Methods

The methods that we have used for evaluating the variance of actual identity can be extended for dealing with higher moments, although the algebra becomes increasingly prohibitive. Here, we consider the magnitude of skew, initially giving formulae for individual genes.

The third central moment of an allele sharing indicator variable $\u{k}_m$, m=0, 1, 2, is

$\eqalign{ E\lsqb \lpar \u {k} _{m} \minus k_{m} \rpar ^{\setnum{3}} \rsqb \tab \equals {E}\lpar \u {k} _{m}^{\setnum{3}} \rpar \minus 3k_{m} {\rm Var}\lpar \u {k} _{m} \rpar \minus k_{m}^{\setnum{3}} \cr \tab \equals k_{m} \lpar 1 \minus k_{m} \rpar \lpar 1 \minus 2k_{m} \rpar \cr}$

and the corresponding skew coefficient is

$\gamma _{\setnum{1}} \lpar \u {k} _{m} \rpar \equals {{\mu _{\setnum{3}} \lpar \u {k} _{m} \rpar } \over {\lsqb \mu _{\setnum{2}} \lpar \u {k} _{m} \rpar \rsqb ^{\setnum{3}\sol \setnum{2}} }} \equals {{\lpar 1 \minus 2k_{m} \rpar } \over {\sqrt {k_{m} \lpar 1 \minus k_{m} \rpar } }}.$

The $\u{k}_m$s are symmetrically distributed if $k_m$ equals 0·5 and positively skewed if it is less than 0·5. The third central moment of the actual relationship can be shown to be

$\mu _{\setnum{3}} \lpar \u {R} \rpar \equals E{\rm \lsqb }\lpar \u {R} \minus R\rpar ^{\setnum{3}} \rsqb \equals \lpar 1 \minus 2R\rpar \lsqb R\lpar 1 \minus R\rpar \minus {\textstyle{3 \over 8}}k_{\setnum{1}} \rsqb$

For lineal descendants, i.e. k₂=0, $\gamma_1(\u{R}) = \gamma_1(\u{k}_1)$ and the distribution of actual relationship or co-ancestry is symmetric if k₁=0·5, e.g. grandparent–grandoffspring, half-sibs and uncle–nephew. The distribution of $\u{R}$ is also symmetric for full sibs.
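The single-locus results above can be verified by direct enumeration, since at one locus the actual relationship takes only the values 1, ½ and 0 with probabilities k₂, k₁ and k₀. The sketch below is illustrative only (the function names are not from the paper); it compares the enumerated third central moment with the formula (1−2R)[R(1−R)−⅜k₁].

```python
import math

def locus_moments(k2, k1):
    """Mean, variance and third central moment of the single-locus actual
    relationship, which equals 1, 1/2 or 0 with probabilities k2, k1, k0."""
    k0 = 1.0 - k1 - k2
    outcomes = [(1.0, k2), (0.5, k1), (0.0, k0)]
    mean = sum(v * p for v, p in outcomes)
    var = sum((v - mean) ** 2 * p for v, p in outcomes)
    mu3 = sum((v - mean) ** 3 * p for v, p in outcomes)
    return mean, var, mu3

def mu3_formula(k2, k1):
    """Third central moment of actual relationship from the formula in the text."""
    R = k2 + 0.5 * k1
    return (1.0 - 2.0 * R) * (R * (1.0 - R) - 0.375 * k1)

def skew_coefficient(k2, k1):
    """Skew coefficient mu3 / mu2^(3/2) of the single-locus actual relationship."""
    _, var, mu3 = locus_moments(k2, k1)
    return mu3 / var ** 1.5

# Grandparent-grandoffspring (k2=0, k1=1/2) and full sibs (k2=1/4, k1=1/2) are symmetric;
# first cousins (k2=0, k1=1/4) are positively skewed.
for name, k2, k1 in [("GP-GO", 0.0, 0.5), ("FS", 0.25, 0.5), ("C", 0.0, 0.25)]:
    print(name, locus_moments(k2, k1)[2], mu3_formula(k2, k1))
```

For k₂=0 the skew coefficient returned here equals (1−2k₁)/√[k₁(1−k₁)], because halving an indicator variable does not change its skew.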

For evaluating the skew in genome sharing, we extend the methods used to compute the variance in actual relationship, but in view of the complexity of the analysis, restrict it to the case of lineal descendants (i.e. k₂=0 at all loci). Thus, we evaluate $E(\u{k}_1^3)$ as an average over r loci, where r becomes infinitely large:

$E\lpar \u {k} _{\setnum{1}}^{\setnum{3}} \rpar \equals {1 \over {r^{\setnum{3}} }}E\left( \mathop\sum\limits_{h} {\mathop\sum\limits_{i} {\mathop\sum\limits_{j} {\u {k} _{\setnum{1}h} \u {k} _{\setnum{1}i} \u {k} _{\setnum{1}j} } } } \right)$

Consider the expected value of allele sharing $E(\u{k}_{1h}\u{k}_{1i}\u{k}_{1j})$ at three loci h, i, j so ordered along a chromosome. A three-locus haplotype is transmitted intact from parent to offspring with probability ${\textstyle{1 \over 2}}(1 - c_1)(1 - c_2)$, where c₁ and c₂ are the recombination fractions between loci h, i and i, j, respectively. The probability is ${\textstyle{1 \over 8}}$ if the loci are unlinked. The probability of allele sharing for three linked loci between two individuals, one of which is a g-generation lineal descendant of the other, is therefore

(19)

$E\lpar \u {k} _{\setnum{1}h} \u {k} _{\setnum{1}i} \u {k} _{\setnum{1}j} \rpar \equals \left( {{\textstyle{1 \over 2}}} \right)^{g \minus \setnum{1}} \lpar 1 \minus c_{\setnum{1}} \rpar ^{g \minus \setnum{1}} \lpar 1 \minus c_{\setnum{2}} \rpar ^{g \minus \setnum{1}}.$

This equation extends the two-locus result in eqn (2) and can be evaluated over each chromosome by invoking Haldane's mapping function to write recombination fractions in terms of map lengths and integrating:

(20)

$\openup3\eqalign{\hskip-3pt E\lpar \u {k} _{\setnum{1}h} \u {k} _{\setnum{1}i} \u {k} _{\setnum{1}j} \rpar \minus k_{\setnum{1}}^{\setnum{3}} \equals\tab {6 \over {l^{\setnum{3}} }}\left( {{1 \over 2}} \right)^{\setnum{3}\lpar g \minus \setnum{1}\rpar }\!\!\int_{\setnum{0}}^{l}\!\! {\int_{\setnum{0}}^{x}\!\! {\int_{\setnum{0}}^{y}\!\! {\lsqb \lpar 1 \plus {\rm e}^{ \minus \setnum{2}\lpar x \minus y\rpar } \rpar ^{g \minus \setnum{1}}}}}\cr\tab\times \lpar 1 \plus {\rm e}^{ \minus \setnum{2}\lpar y \minus z\rpar } \rpar ^{g \minus \setnum{1}} \minus 1\rsqb \, {\rm d}z \, {\rm d}y \, {\rm d}x.$
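The step from eqn (19) to eqn (20) can be written out explicitly. The following is a short worked derivation under the assumptions already stated in the text (Haldane's mapping function, loci at ordered positions z ≤ y ≤ x on a chromosome of length l); the position symbols are those of eqn (20), and the factor 6/l³ arises because ordered triples occupy one-sixth of the cube [0, l]³.

```latex
% Haldane's mapping function for two loci a map distance d apart:
c = \tfrac{1}{2}\left(1 - e^{-2d}\right)
\quad\Longrightarrow\quad
1 - c = \tfrac{1}{2}\left(1 + e^{-2d}\right).

% Substituting d = x - y for c_1 and d = y - z for c_2 in eqn (19):
\left(\tfrac{1}{2}\right)^{g-1}(1-c_1)^{g-1}(1-c_2)^{g-1}
  = \left(\tfrac{1}{2}\right)^{3(g-1)}
    \left(1 + e^{-2(x-y)}\right)^{g-1}\left(1 + e^{-2(y-z)}\right)^{g-1}.

% With k_1 = (1/2)^{g-1}, the unlinked value is k_1^3 = (1/2)^{3(g-1)},
% which is the "-1" inside the bracket of eqn (20) after factoring.
% Averaging over ordered positions uses
\int_0^l\!\int_0^x\!\int_0^y \mathrm{d}z\,\mathrm{d}y\,\mathrm{d}x = \frac{l^3}{6}.
```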

As the analysis has also to deal with descendants of collateral relatives, we generalize the integration, illustrating the process for half-sibs. The probability that a pair of half-sibs share an allele ibd at each of the three loci is

$E(\u{k}_{1h}\u{k}_{1i}\u{k}_{1j}) = {\textstyle{1 \over 2}}\{[(1 - c_1)(1 - c_2)]^2 + [(1 - c_1)c_2]^2 + [c_1(1 - c_2)]^2 + [c_1 c_2]^2\}.$

In order to evaluate this expression, we expand it in terms of (1−c ₁) and (1−c ₂):

(21)

$\openup3\scale96%\eqalign{ {E}\lpar \u {k} _{\setnum{1}h} \u {k} _{\setnum{1}i} \u {k} _{\setnum{1}j} \rpar \equals \tab {\textstyle{1 \over 2}}\lsqb 4\lpar 1 \minus c_{\setnum{1}} \rpar ^{\setnum{2}} \lpar 1 \minus c_{\setnum{2}} \rpar ^{\setnum{2}} \minus 4\lpar 1 \minus c_{\setnum{1}} \rpar ^{\setnum{2}} \lpar 1 \minus c_{\setnum{2}} \rpar ^{\setnum{1}}\cr\tab \minus 4\lpar 1 \minus c_{\setnum{1}} \rpar ^{\setnum{1}} \lpar 1 \minus c_{\setnum{2}} \rpar ^{\setnum{2}} \plus 4\lpar 1 \minus c_{\setnum{1}} \rpar ^{\setnum{1}} \lpar 1 \minus c_{\setnum{2}} \rpar ^{\setnum{1}}\cr \tab \plus 2\lpar 1 \minus c_{\setnum{1}} \rpar ^{\setnum{2}} \lpar 1 \minus c_{\setnum{2}} \rpar ^{\setnum{0}} \plus 2\lpar 1 \minus c_{\setnum{1}} \rpar ^{\setnum{0}} \lpar 1 \minus c_{\setnum{2}} \rpar ^{\setnum{2}} \cr \tab \minus 2\lpar 1 \minus c_{\setnum{1}} \rpar ^{\setnum{1}} \lpar 1 \!\minus c_{\setnum{2}} \rpar ^{\setnum{0}}\! \minus 2\lpar 1 \minus c_{\setnum{1}} \rpar ^{\setnum{0}} \lpar 1 \minus c_{\setnum{2}} \rpar ^{\setnum{1}} \plus 1\rsqb. \cr}$
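The expansion in eqn (21) is easy to confirm numerically. The sketch below (illustrative function names, plain Python) evaluates the half-sib three-locus sharing probability both in its direct four-term form and in the expanded polynomial in (1−c₁) and (1−c₂), and checks the two limiting cases of complete linkage and free recombination.

```python
def half_sib_direct(c1, c2):
    """Half-sib three-locus sharing probability as the sum over the four
    gamete classes (non-recombinant, recombinant in either or both intervals)."""
    a, b = 1.0 - c1, 1.0 - c2
    return 0.5 * ((a * b) ** 2 + (a * c2) ** 2 + (c1 * b) ** 2 + (c1 * c2) ** 2)

def half_sib_expanded(c1, c2):
    """The same probability written as a polynomial in a = 1 - c1 and b = 1 - c2,
    term by term as in eqn (21)."""
    a, b = 1.0 - c1, 1.0 - c2
    return 0.5 * (4 * a**2 * b**2 - 4 * a**2 * b - 4 * a * b**2 + 4 * a * b
                  + 2 * a**2 + 2 * b**2 - 2 * a - 2 * b + 1)

# Compare the two forms on a grid of recombination fractions in [0, 1/2].
grid = [i / 20.0 for i in range(11)]
max_diff = max(abs(half_sib_direct(c1, c2) - half_sib_expanded(c1, c2))
               for c1 in grid for c2 in grid)
print(max_diff, half_sib_direct(0.0, 0.0), half_sib_direct(0.5, 0.5))
```

The limiting values are ½ for completely linked loci (the single-locus k₁) and ⅛ for unlinked loci (k₁³).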

As some terms have different exponents for (1−c ₁) and (1−c ₂), we redefine the integral more generally than shown in eqn (20), and the exponents are not generation numbers per se. We express (1−c ₁)^m(1−c ₂)ⁿ in terms of map distances and define

$\openup3\eqalign{ \rmPhi _{m\comma n} \lpar l\rpar \tab\equals{6 \over {l^{\setnum{3}} }}\left( {{1 \over 2}} \right)^{m \plus n} \int_{\setnum{0}}^{l}\!\! {\int_{\setnum{0}}^{x} \!\!{\int_{\setnum{0}}^{y} {\lsqb \lpar 1 \plus {\rm e}^{ \minus \setnum{2}\lpar x \minus y\rpar } \rpar ^{m}} } }\cr\tab\quad\times \lpar 1 \plus {\rm e}^{ \minus \setnum{2}\lpar y \minus z\rpar } \rpar ^{n} \minus 1\rsqb\, {\rm d}z \, {\rm d}y \, {\rm d}x$

(22)

$\quad\equals \left( {{1 \over 2}} \right)^{m \plus n} \left\{ 1 \plus \mathop\sum\limits_{i \equals \setnum{1}}^{m} {\left(\! {\matrix{ m \cr i \cr} } \right)} {{2i^{\setnum{2}} l^{\setnum{2}} \minus 2il \plus 1 \minus {\rm e}^{ \minus \setnum{2}il} } \over {8i^{\setnum{3}} }}}\right.\cr\tab\quad\plus \mathop\sum\limits_{j \equals \setnum{1}}^{n} {\left( {\matrix{ n \cr j \cr} } \right)} {{2j^{\setnum{2}} l^{\setnum{2}} \minus 2jl \plus 1 \minus {\rm e}^{ \minus \setnum{2}jl} } \over {8j^{\setnum{3}} }}\cr\tab\quad\plus \mathop\sum\limits_{i \equals \setnum{1}}^{\min \lpar m\comma n\rpar } {\left( {\matrix{ m \cr i \cr} } \right)} \left( {\matrix{ n \cr i \cr} } \right){{\lpar il \minus 1\rpar \lpar 1 \minus {\rm e}^{ \minus \setnum{2}il} \rpar } \over {4i^{\setnum{3}} }} \cr \tab\quad \plus \mathop\sum\limits_{i \equals \setnum{1}}^{m} {\mathop\sum\limits_{j \equals \setnum{1}\comma i \ne j}^{n} {\left( {\matrix{ m \cr i \cr} } \right)} \left( {\matrix{ n \cr j \cr} } \right)}\cr\tab\quad\times {{2ijl \minus i \minus j \plus \lpar i^{\setnum{2}} {\rm e}^{ \minus \setnum{2}jl} \minus j^{\setnum{2}} {\rm e}^{ \minus \setnum{2}il} \rpar \sol \lpar i \minus j\rpar } \over {8i^{\setnum{2}} j^{\setnum{2}} }}\bigg\} \comma \cr}$

where the summation terms are included only when the upper limits exceed zero. Note that Φ_m,n(l)=Φ_n,m(l). Despite its complex appearance, eqn (22) is quick and easy to compute.
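As a cross-check that does not depend on the closed form, Φ_m,n(l) can also be evaluated numerically from its integral definition. The sketch below is an assumption-laden illustration rather than the authors' procedure: it rewrites the triple integral in terms of the two inter-locus distances u = x−y and v = y−z, which leaves a double integral with weight (l−u−v) over u+v ≤ l, and applies a simple midpoint rule. The function names and grid size are hypothetical. For n=0 the result is compared with the single-sum part of the closed form.

```python
import math

def phi_numeric(m, n, l, grid=200):
    """Phi_{m,n}(l) from its integral definition (with the '-1' inside the
    integrand), using a midpoint rule over inter-locus distances u and v."""
    h = l / grid
    # Factors (1 + exp(-2u))^m and (1 + exp(-2v))^n at the cell midpoints
    fu = [(1.0 + math.exp(-2.0 * (i + 0.5) * h)) ** m for i in range(grid)]
    fv = [(1.0 + math.exp(-2.0 * (j + 0.5) * h)) ** n for j in range(grid)]
    total = 0.0
    for i in range(grid):
        for j in range(grid - 1 - i):          # midpoints with u + v < l
            weight = l - (i + j + 1) * h       # length of admissible positions
            total += weight * (fu[i] * fv[j] - 1.0)
    return 6.0 / l**3 * 0.5 ** (m + n) * total * h * h

def phi_m0_closed(m, l):
    """Closed form for n = 0: only the single sum over i contributes."""
    s = sum(math.comb(m, i)
            * (2 * i**2 * l**2 - 2 * i * l + 1 - math.exp(-2 * i * l)) / (8 * i**3)
            for i in range(1, m + 1))
    return 6.0 / l**3 * 0.5 ** m * s

l = 1.6   # roughly the average human chromosome map length quoted in the text
g = 3     # great grandparent-great grandoffspring
lineal_excess = 0.5 ** (g - 1) * phi_numeric(g - 1, g - 1, l)
print(phi_numeric(1, 0, l), phi_m0_closed(1, l), lineal_excess)
```

With complete linkage (l close to zero) the numerical value approaches 1−(½)^{m+n}, which gives a quick sanity check on any implementation.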

For lineal descendants that are g generations apart, the increase in the joint allele sharing probability over that for unlinked loci is therefore

${E}\lpar \u {k} _{\setnum{1}h} \u {k} _{\setnum{1}i} \u {k} _{\setnum{1}j} \rpar \minus k_{\setnum{1}}^{\setnum{3}} \equals \lpar {\textstyle{1 \over 2}}\rpar ^{g \minus \setnum{1}} {\rm \ }\rmPhi _{g \minus \setnum{1}\comma g \minus \setnum{1}} \lpar l\rpar$

and for half-sibs, from eqn (21), it is

$\eqalign{ E\lpar \u {k} _{\setnum{1}h} \u {k} _{\setnum{1}i} \u {k} _{\setnum{1}j} \rpar \minus k_{\setnum{1}}^{\setnum{3}} \equals\tab {\textstyle{1 \over 2}}\lsqb 4\rmPhi _{\setnum{2}\comma\! \setnum{2}} \lpar l\rpar \minus 4\rmPhi _{\setnum{2}\comma\! \setnum{1}} \lpar l\rpar \minus 4\rmPhi _{\setnum{1}\comma\! \setnum{2}} \lpar l\rpar \cr\tab\plus 4\rmPhi _{\setnum{1}\comma\! \setnum{1}} \lpar l\rpar \plus 2\rmPhi _{\setnum{2}\comma\! \setnum{0}} \lpar l\rpar \plus 2\rmPhi _{\setnum{0}\comma\! \setnum{2}} \lpar l\rpar \cr \tab \minus 2\rmPhi _{\setnum{1}\comma\! \setnum{0}} \lpar l\rpar \minus 2\rmPhi _{\setnum{0}\comma\! \setnum{1}} \lpar l\rpar \plus \rmPhi _{\setnum{0}\comma\! \setnum{0}} \lpar {\rm}l\rpar \rsqb \cr \tab \hskip-9pt\equals {\textstyle{1 \over 2}}\lsqb 4\rmPhi _{\setnum{2}\comma\! \setnum{2}} \lpar l\rpar \minus 8\rmPhi _{\setnum{2}\comma\! \setnum{1}} \lpar l\rpar \plus 4\rmPhi _{\setnum{1}\comma\! \setnum{1}} \lpar l\rpar \cr\tab\plus 4\rmPhi _{\setnum{2}\comma\! \setnum{0}} \lpar l\rpar \minus 4\rmPhi _{\setnum{1}\comma\! \setnum{0}} \lpar l\rpar \plus \rmPhi _{\setnum{0}\comma\! \setnum{0}} \lpar l\rpar \rsqb \cr}$

For subsequent generations, e.g. half-cousins, the formulae can be simply extended by methods similar to those used previously for pairs of loci and therefore have the same basic form. These and other results, including those for full sibs and their descendants, are given in Box 2.

Box 2. Summary of formulae for skew of genome sharing

Lineal descendants, where g=2 is grandparent–grandoffspring (k ₂=0)

$E\lpar \u {k} _{\setnum{1}h} \u {k} _{\setnum{1}i} \u {k} _{\setnum{1}j} \rpar \minus k_{\setnum{1}}^{\setnum{3}} \equals \lpar {\textstyle{1 \over 2}}\rpar ^{g \minus \setnum{1}} \rmPhi _{g \minus \setnum{1}\comma g \minus \setnum{1}} \lpar l\rpar.$

Half-sibs and their descendants, where g=2 for half-sibs (k ₂=0)

$E(\u{k}_{1h}\u{k}_{1i}\u{k}_{1j}) - k_1^3 = ({\textstyle{1 \over 2}})^{g-1}[4\Phi_{g,g}(l) - 8\Phi_{g,g-1}(l) + 4\Phi_{g-1,g-1}(l) + 4\Phi_{g,g-2}(l) - 4\Phi_{g-1,g-2}(l) + \Phi_{g-2,g-2}(l)].$

Full sibs and their descendants

The actual relationship and also $\u{k}_1$ for full sibs are symmetrically distributed (Table 1) although the non-central moments are non-zero. The third moment of $\u{k}_2$ and of $\u{k}_0$ for full sibs is

$\eqalign{ E\lpar \u {k} _{\setnum{2}h} \u {k} _{\setnum{2}i} \u {k} _{\setnum{2}j} \rpar \minus k_{\setnum{2}}^{\setnum{3}} \equals \tab E\lpar \u {k} _{\setnum{0}h} \u {k} _{\setnum{0}i} \u {k} _{\setnum{0}j} \rpar \minus k_{\setnum{0}}^{\setnum{3}} \equals {\textstyle{1 \over 4}}\lsqb 16\rmPhi _{\setnum{4}\comma \setnum{4}} \lpar l\rpar \minus 64\rmPhi _{\setnum{4}\comma\! \setnum{3}} \lpar l\rpar \plus 64\rmPhi _{\setnum{4}\comma\! \setnum{2}} \lpar l\rpar \cr \tab \plus 64\rmPhi _{\setnum{3}\comma\! \setnum{3}} \lpar l\rpar \minus 32\rmPhi _{\setnum{4}\comma\! \setnum{1}} \lpar l\rpar \minus 128\rmPhi _{\setnum{3}\comma\! \setnum{2}} \lpar l\rpar \plus 8\rmPhi _{\setnum{4}\comma\! \setnum{0}} \lpar l\rpar \plus 64\rmPhi _{\setnum{3}\comma\! \setnum{1}} \lpar l\rpar \plus 64\rmPhi _{\setnum{2}\comma\! \setnum{2}} \lpar l\rpar \cr \tab \minus 16\rmPhi _{\setnum{3}\comma\! \setnum{0}} \lpar l\rpar \minus 64\rmPhi _{\setnum{2}\comma\! \setnum{1}} \lpar l\rpar \plus 16\rmPhi _{\setnum{2}\comma\! \setnum{0}} \lpar l\rpar \plus 16\rmPhi _{\setnum{1}\comma\! \setnum{1}} \lpar l\rpar \minus 8\rmPhi _{\setnum{1}\comma\! \setnum{0}} \lpar l\rpar \plus \rmPhi _{\setnum{0}\comma\! \setnum{0}} \lpar l\rpar \rsqb. \cr}$

Uncle–nephew (g=2) and descendants (k ₂=0)

$E(\u{k}_{1h}\u{k}_{1i}\u{k}_{1j}) - k_1^3 = ({\textstyle{1 \over 2}})^{g}[16\Phi_{g+1,g+1}(l) - 48\Phi_{g+1,g}(l) + 24\Phi_{g+1,g-1}(l) + 40\Phi_{g,g}(l) + 4\Phi_{g,g-2}(l) - 44\Phi_{g,g-1}(l) + 13\Phi_{g-1,g-1}(l) - 4\Phi_{g-1,g-2}(l) + \Phi_{g-2,g-2}(l)].$

Cousins (g=3) and descendants (k ₂=0)

$\eqalign{ E\lpar \u {k} _{\setnum{1}h} \u {k} _{\setnum{1}i} \u {k} _{\setnum{1}j} \rpar \minus k_{\setnum{1}}^{\setnum{3}} \equals \tab \left( {{\textstyle{1 \over 2}}} \right)^{g} \lsqb 8\rmPhi _{g \plus \setnum{1}\comma g \plus \setnum{1}} \lpar l\rpar \minus 48\rmPhi _{g \plus \setnum{1}\comma g} \lpar l\rpar \plus 56\rmPhi _{g \plus \setnum{1}\comma g \minus \setnum{1}} \lpar l\rpar \plus 72\rmPhi _{g\comma g} \lpar l\rpar \minus 32\rmPhi _{g \plus \setnum{1}\comma g \minus \setnum{2}} \lpar l\rpar \cr \tab\minus 160\rmPhi _{g\comma g \minus \setnum{1}} \lpar l\rpar \plus 8\rmPhi _{g \plus \setnum{1}\comma g \minus \setnum{3}} \lpar l\rpar \plus 80\rmPhi _{g\comma g \minus \setnum{2}} \lpar l\rpar \plus 87\rmPhi _{g \minus \setnum{1}\comma g \minus \setnum{1}} \lpar l\rpar \minus 16\rmPhi _{g\comma g \minus \setnum{3}} \lpar l\rpar\cr \tab \minus 84\rmPhi _{g \minus \setnum{1}\comma g \minus \setnum{2}} \lpar l\rpar \plus 16\rmPhi _{g \minus \setnum{1}\comma g \minus \setnum{3}} \lpar l\rpar \plus 20\rmPhi _{g \minus \setnum{2}\comma g \minus \setnum{2}} \lpar l\rpar \minus 8\rmPhi _{g \minus \setnum{2}\comma g \minus \setnum{3}} \lpar l\rpar \plus \rmPhi _{g \minus \setnum{3}\comma g \minus \setnum{3}} \lpar l\rpar \rsqb {\rm \ }{\rm.} \cr}$

For multiple chromosomes that have the same genome content and map length, the skew and variances would be the same for each, and the skewness for whole-genome actual allele sharing would decrease with the square root of the number of chromosomes.
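The square-root scaling follows from the additivity of second and third central moments for independent variables. A brief worked derivation, assuming n chromosomes of equal length that segregate independently, with μ₂ and μ₃ the per-chromosome central moments and γ₁ the per-chromosome skew coefficient (the bar notation for the genome-wide mean is introduced here for illustration only):

```latex
% Genome-wide sharing as the mean of n independent per-chromosome values:
\bar{R} = \frac{1}{n}\sum_{c=1}^{n} R_c ,
\qquad
\mu_2(\bar{R}) = \frac{\mu_2}{n},
\qquad
\mu_3(\bar{R}) = \frac{\mu_3}{n^{2}} .

% Hence the skew coefficient of the genome-wide value is
\gamma_1(\bar{R})
  = \frac{\mu_3 / n^{2}}{\left(\mu_2 / n\right)^{3/2}}
  = \frac{\gamma_1}{\sqrt{n}} .
```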

(ii) Examples

The magnitude of skew, expressed as the skew coefficient, is illustrated for single chromosomes in Fig. 4 a for a wide range of descendants of full sibs and in Fig. 4 b for alternative ancestry. The magnitude of the skew rises as relationships become more distant, as expected since it is (1−2k)/√[k(1−k)] for single or completely linked loci. Thus, for second cousins, for example, the skew coefficient exceeds 2 even for long chromosomes.

Fig. 4. Skewness of actual relationship (proportion of genome shared) for a single chromosome as a function of map length and relationship for (a) descendants of full sibs (as Fig. 2), and (b) for different pedigrees for two different degrees of relationships (as Fig. 3). For full sibs and uncle–nephew there is no skew.

4. Variation in actual inbreeding

If an individual's parents are related, it is inbred. At a locus i, the actual inbreeding $\u{F}_i$ takes values of 0 (alleles not ibd) or 1 (alleles ibd). It has expectation $E(\u{F}_i)=F$, where F is the pedigree inbreeding, which in turn equals the co-ancestry, $\theta = {\textstyle{1 \over 2}}R$, of its parents. The variance of $\u{F}_i$ in a population of similarly inbred but independent individuals is F(1−F). Slate et al. (2004) analyse the correlation between multi-locus heterozygosity, a function of actual inbreeding, and the pedigree inbreeding, and show how weak this correlation is. Their analysis does not incorporate linkage, however.

For the genome as a whole, the actual inbreeding $\u{F}$ of an individual is the proportion of its genome which is ibd, with $E(\u{F})=F$. Linkage affects variation in the actual relationship of individuals with the same pedigree relationship and also therefore increases variation in the actual inbreeding of their offspring. We use an example to show how it can be computed. Individuals E and F in Fig. 1 are full sibs, and so if they had mated to produce an offspring X, the expected inbreeding coefficient of X would be 0·25. If B is a male, then M is a paternal half sib of X, N is a maternal half sib of X, and their offspring H and I are cousins. The gametes transmitted by E to H and to X have the same random distribution as do those transmitted by F to I and X. Hence, the distribution of $\u{F}$ of X is identical to the distribution of $\u{k}_1$ of H and I, who are cousins in this example. From eqn (14) or Box 1 (descendants of full sibs with g=3), ${\rm Var}_{\rm FS}(\u{F}, l) = {\rm Var}_{\rm FC}(\u{k}_1, l) = 8\phi_4(l) - 4\phi_3(l) + {\textstyle{3 \over 2}}\phi_2(l) - {\textstyle{1 \over 2}}\phi_1(l)$, which also equals $4{\rm Var}_{\rm FC}(\u{R}, l)$ and $16{\rm Var}_{\rm FC}(\u{\theta}, l)$. Skew coefficients for the actual inbreeding can be obtained similarly.

The arguments do not depend (although the detailed results do) on the relationship among the parents, and can be regarded as a consequence of extending the co-ancestry concept to identity at multiple loci. We are using a quantity, the ‘genomic coancestry’, which for a pair of individuals Y and Z is the proportion of the genome shared ibd between a random gamete from Y and a random gamete from Z. Thus, genomic coancestry describes genomes transmitted from individuals, whereas genome sharing (k) describes genomes that are in individuals. Actual inbreeding depends on the genomic coancestry of the two gametes one individual receives; genome sharing and actual relationship depend on the genomic coancestry of the gametes two different individuals receive. For example, the variation of $\u{F}$ of offspring of cousin matings is the same as that of $\u{k}_1$ of second cousins, as both are the variance in the genomic coancestry of cousins.

The results for variances, SD, CV and skew of actual relationship given in the Figures and Tables can therefore also be applied directly to actual inbreeding. For example, from Table 2 the SD of $\u{F}$ of offspring of full sib matings in humans is 2×0·0218=0·0436 (from item C) and 0·0240 (from item 2C) for offspring of cousins, with the CV of the latter being 0·0240/0·0625=0·384.
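The conversion from SD of actual relationship to SD of actual inbreeding is a factor of two, because the variance of actual inbreeding of the offspring is four times the variance of actual relationship of the corresponding non-inbred pair. A tiny sketch of the arithmetic, using only the values quoted in the text (the variable names are illustrative):

```python
# SD of actual relationship for cousins, as quoted from Table 2 (item C)
sd_R_cousins = 0.0218

# SD of actual inbreeding for offspring of full sib matings:
# Var(F) = 4 Var(R of cousins), so the SD doubles.
sd_F_full_sib_mating = 2 * sd_R_cousins

# Offspring of cousin matings: SD quoted in the text, and pedigree inbreeding F
sd_F_cousin_mating = 0.0240
F_cousin_mating = 0.0625
cv_F_cousin_mating = sd_F_cousin_mating / F_cousin_mating

print(round(sd_F_full_sib_mating, 4), round(cv_F_cousin_mating, 3))
```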

The above result applies to the variation in actual inbreeding among a group of unrelated individuals whose parents all have the same pedigree, e.g. are full sibs. In any population there is variation in pedigree inbreeding which also contributes to the total variance in actual inbreeding. The expected variation and distribution of shared segments in any population therefore depend on the population size and mating system, and relevant results for closed populations have been published (Bennett, 1954; Franklin, 1977; Stam, 1980; Weir et al., 1980).

The variation in actual inbreeding can be partitioned into two components: that between families, i.e. the covariance in actual inbreeding of (e.g. full sib) family members, and that within families, i.e. the variation in actual inbreeding among (e.g. full sib) family members. When we consider just pedigree inbreeding the variance between families is the variance of the co-ancestry from pedigree of the parents, which equals one-quarter of the variance of the pedigree relationship of the parents, and there is no variation in pedigree inbreeding within families.

Hence, for full sib matings, for example, ${\rm VarB}_{{\rm FS}}\, \lpar \u {I} \comma l\rpar \equals {\textstyle{1 \over 4}}{\rm Var}_{{\rm FS}} \lpar \u {R} \comma l\rpar$ . The variance within families can be obtained by difference, and so from the above results for full sib matings,

$\eqalign{ {\rm VarW}_{{\rm FS}}\, \lpar \u {I} {\rm \comma }l\rpar \tab \equals {\rm Var}_{{\rm FS}}\, \lpar \u {I} \comma l\rpar \minus {\rm VarB}_{{\rm FS}}\, \lpar \u {I} \comma l\rpar \cr \tab \equals {\rm 4Var}_{\rm C} \,\lpar \u {R} \comma l\rpar \minus {\textstyle{{\rm 1} \over {\rm 4}}}{\rm Var}_{{\rm FS}}\, \lpar \u {R} \comma l\rpar. \cr}$

This can also be regarded as the variance in genomic coancestry of full sibs less the variance in genomic co-ancestry between their parents.

As an example, using results from Table 2 for the human genome as a whole, ${\rm Var}_{\rm FS}(\u{I}, L)$=4(0·0218)²=0·00191, ${\rm VarB}_{\rm FS}(\u{I}, L)$=(0·0392)²/4=0·00038 and ${\rm VarW}_{\rm FS}(\u{I}, L)$=0·00152, with corresponding SD equal to 0·0436, 0·0196 and 0·0390, respectively. In Table 3, we list relevant relationships and results. It is seen that the variation is substantial and is primarily within families (exclusively within families for selfing and parent–offspring matings of non-inbred individuals). For example, for cousin matings of humans, the mean F is 0·0625 and the SD within families is predicted to be 0·0214.
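The partition for full sib matings can be reproduced from the two SDs quoted from Table 2. The following sketch is illustrative (variable names are not from the paper); because it starts from SDs rounded to four decimal places, the last digit of some results may differ slightly from the values given in the text.

```python
import math

sd_R_cousins = 0.0218    # SD of actual relationship of cousins (Table 2, as quoted)
sd_R_full_sibs = 0.0392  # SD of actual relationship of full sibs (Table 2, as quoted)

# Total variance of actual inbreeding among offspring of full sib matings
var_total = 4 * sd_R_cousins ** 2
# Between-family component: one-quarter of the variance of actual relationship of the parents
var_between = sd_R_full_sibs ** 2 / 4
# Within-family component, obtained by difference
var_within = var_total - var_between

sd_total, sd_between, sd_within = (math.sqrt(v) for v in (var_total, var_between, var_within))
share_within = var_within / var_total  # proportion of the variance that lies within families

print(round(var_total, 5), round(var_between, 5), round(var_within, 5))
print(round(sd_total, 4), round(sd_between, 4), round(sd_within, 4), round(share_within, 2))
```

On these inputs roughly four-fifths of the variance in actual inbreeding lies within families, which is the sense in which the text describes the variation as primarily within families.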

Table 3. SD of actual inbreeding for a model human genome^a for matings of relatives

^a Differing map lengths as in Table 2, except as ^c.

^b Relationship of non-inbred offspring with the same genomic coancestry as the inbred offspring.

^c For a model maize genome of 10 chromosomes each of 1 M.

Estimation of inbreeding depression is usually undertaken by regression of phenotype on pedigree inbreeding. The method can be enhanced by using dense marker data to infer the proportion of the offspring genotype that is ibd from the parents and hence actual inbreeding (Slate et al., 2004). By undertaking the analysis within families, confounding environmental effects can be eliminated, with the method being analogous to that of Visscher et al. (2006) for estimating heritability within families, but focused on means rather than variances. The design is likely to be most useful for species such as pigs that have large families. Christensen et al. (1996) undertook such an analysis, but had only 21 markers available for estimating actual inbreeding (which they refer to as ‘realized inbreeding’).

5. Discussion

We have shown how to compute the variation and skew in the proportion of genomes shared for diverse kinds of relatives. As theoretical papers have shown previously (Hill, 1993a, b; Guo, 1995; Visscher, 2009), and as anticipated by analyses of junctions and the distribution as a whole, the variance can be high, illustrated most clearly by the coefficient of variation (Fig. 2 b) and skew (Fig. 4) for increasingly distant relatives.

As the CV is large for single chromosomes each of the average length of those of humans (c. 1·6 M), exceeding two for second cousins or more distant relatives (Fig. 2 b), there is substantial overlap in the amount of sharing between quite different pedigree relationship classes. Further, there is substantial positive skew in the distribution over the whole genome for these and more distant relatives, such that individuals with low pedigree relationship may share much more genome than expected.

In identifying distant relatives in a sample of individuals on which dense SNP data are available, information on potential relationship is available both from estimates of the mean proportion shared and from the variation among chromosomes. That this variation is substantial is illustrated by the CVs of actual relationship (Fig. 2 b), which can greatly exceed unity. Distant relatives are expected to share little or none of the genome of a common ancestor ibd for some chromosomes and a non-negligible amount for others. Indeed, our results for variance in sharing of single chromosomes among pairs of individuals also apply to the variation in sharing among chromosomes of the same length between the same individuals. How best to use such information has not, in so far as we know, been investigated.

The problem of inferring pedigree relationship from actual relationship (as measured by genome shared) is illustrated in Fig. 5 using the human model genome example. Information on, for example, the distribution of the lengths of shared segments, which will tend to be shorter for distant relatives, also needs to be taken into account, following, for example, the work of Fisher (1954 and earlier), Bennett (1953), Stam (1980) and Thompson (2008) which is based, inter alia, on analysis of junctions. Although the distribution of lengths of shared genome that include the end of the chromosome can be computed, there is no general approach that is simple to apply. While it is quite clear that developing methodology using the distributions of chromosome lengths and the numbers of chromosomes for which there is no sharing would be of some interest and potential practical value in establishing pedigree relationship, for example, in forensic situations, such an analysis is beyond the scope of this paper.

Fig. 5. Distribution of actual genome sharing ($\u{k}_1$) for samples of ‘human’ genomes for different degrees of pedigree relationship of descendants of full sibs (as Fig. 2) (10 000 replicates each).
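A distribution of this kind can be approximated by simulation. The sketch below is a deliberately simplified stand-in, not the procedure used for Fig. 5: it treats a single chromosome and the simplest lineal pedigree (grandparent–grandoffspring) rather than a whole genome and descendants of full sibs, and it assumes crossovers occur as a Poisson process at rate one per Morgan (Haldane's model). The function names, seed and replicate count are illustrative. The simulated variance is compared with (2l−1+e^{−2l})/(8l²), which follows from the two-locus covariance ¼e^{−2d} for this relationship.

```python
import math
import random

def simulate_k1_grandparent(l, rng):
    """Fraction of a parental gamete of map length l (Morgans) that derives
    from one particular grandparent; this is the actual k1 between that
    grandparent and the grandoffspring on this chromosome."""
    pos = 0.0
    from_gp = rng.random() < 0.5      # strand at the start of the chromosome
    shared = 0.0
    while True:
        nxt = min(pos + rng.expovariate(1.0), l)  # next crossover or chromosome end
        if from_gp:
            shared += nxt - pos
        if nxt >= l:
            return shared / l
        pos = nxt
        from_gp = not from_gp         # a crossover switches grandparental strand

def var_k1_grandparent(l):
    """Variance of actual k1 for grandparent-grandoffspring under Haldane's model."""
    return (2.0 * l - 1.0 + math.exp(-2.0 * l)) / (8.0 * l * l)

rng = random.Random(1)
l = 1.6                                # roughly the average human chromosome length
draws = [simulate_k1_grandparent(l, rng) for _ in range(20000)]
sim_mean = sum(draws) / len(draws)
sim_var = sum((d - sim_mean) ** 2 for d in draws) / (len(draws) - 1)
print(round(sim_mean, 3), round(sim_var, 4), round(var_k1_grandparent(l), 4))
```

Extending such a simulation to other pedigrees and to several chromosomes of differing length would require tracking each transmitted gamete through the pedigree, which is not attempted here.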

Inferring the presence of genes of large effect under selection from shared segments of the genome or for mapping disease genes by comparing allele sharing proportions between affected and unaffected individuals has potential importance, but our results do not give much ground for optimism in its use because the sampling error is so high.

Estimates using dense markers of the variance in actual genome sharing of human full sibs were obtained by Visscher et al. (2006, 2007), and, in general, there was good agreement: for example, the observed mean and SD of the proportion of the autosomal genome shared ($\u{k}_2 + {\textstyle{1 \over 2}}\u{k}_1$) were 0·498±0·036 compared with expectation 0·5±0·039, and the corresponding figures for $\u{k}_2$ were 0·248±0·040 observed and 0·25±0·044 expected. The discrepancy was explained by the fact that identical sections could be missed as a limited number of microsatellite markers were used in these studies, averaging 400–600 per individual for the whole genome (Visscher et al., 2006, 2007). We offer further illustration in Fig. 6, using data kindly supplied by Dr M. Marazita. Coefficients of ibd were estimated using SNP data obtained for a whole-genome association analysis of dental caries. Relationship classes were inferred from pedigree information with software developed by Dr Cecelia Laurie and the methods of this paper were used for calculating the SDs of $\u{k}_0$ and $\u{k}_1$. For each pair of related individuals in the study (pedigree R>1/32), the estimated ibd coefficients ($\u{k}_0$ and $\u{k}_1$) were plotted, along with predicted ‘error bars’ of two SDs each side of the expected values. For display purposes, these bars were offset from the line k₀+k₁=1 in the cases for which k₂=0.
We did not perform any statistical tests for inferred relationships; the error bars reflect only Mendelian sampling and linkage, and the effects of using sample allele frequencies on variation in estimated ibd coefficients will be discussed elsewhere.

Fig. 6. Estimated ibd coefficients, $\u{k}_0$ and $\u{k}_1$, from SNP data for individuals with known pedigree relationship (PO denotes parent–offspring, DFC double first cousins, other symbols as Figs 2 and 3), together with predicted ‘error bars’ of two SD about expectation. Bars are offset from k₀+k₁=1 if k₂=0.

The main objective of this paper was to provide general formulae for computing the variance of shared sites. Obviously there are many other avenues to pursue, but these require different techniques.

We are grateful to Peter Visscher for many helpful comments on previous drafts and to Jinliang Wang for a useful suggestion. This work was supported in part by NIH grants R01 GM075091 and HGU0044446, and by the USS. David Crosslin, University of Washington, plotted the figures. Mary L. Marazita, University of Pittsburgh, consented to inclusion of Fig. 6 that displays results from her study of Dental Caries (supported by NIH grants U01-DE018904 and R01-DE014899, and NIH contract HHSN268200782096C to the Center for Inherited Disease Research for genotyping) as part of the GENEVA project (Cornelis et al., 2010). The paper is dedicated to the memory of Piet Stam for his pioneering work in multi-locus ibd.

References

Ball, F. & Stefanov, V. T. (2005). Evaluation of identity-by-descent probabilities for half-sibs on continuous genome. Mathematical Biosciences 196, 215–225.

Bennett, J. H. (1953). Junctions in inbreeding. Genetica 26, 392–406.

Bennett, J. H. (1954). The distribution of heterogeneity upon inbreeding. Journal of the Royal Statistical Society, Series B 16, 88–99.

Bickeboller, H. & Thompson, E. A. (1996 a). Distribution of genome shared IBD by half-sibs: approximation by the Poisson clumping heuristic. Theoretical Population Biology 50, 66–90.

Bickeboller, H. & Thompson, E. A. (1996 b). The probability distribution of the amount of an individual's genome surviving to the following generation. Genetics 143, 1043–1049.

Choi, Y., Wijsman, E. & Weir, B. S. (2009). Case-control association testing in the presence of unknown relationships. Genetic Epidemiology 33, 668–678.

Christensen, K., Fredholm, M., Wintero, A. K., Jorgensen, J. N. & Andersen, S. (1996). Joint effect of 21 marker loci and effect of realized inbreeding on growth in pigs. Animal Science 62, 541–546.

Cockerham, C. C. & Weir, B. S. (1983). Variance of actual inbreeding. Theoretical Population Biology 23, 85–109.

Cornelis, M. C., Agrawal, A., Cole, J. W., Hansel, N. H., Barnes, K. C., Beaty, T. H., Bennett, S. N., Bierut, L. J., Boerwinkle, E., Doheny, K. F., Feenstra, B., Feingold, E., Fornage, M., Haiman, C. A., Harris, E. L., Hayes, M. G., Heit, J. A., Hu, F. B., Kang, J. H., Laurie, C. C., Ling, H., Teri, A., Manolio, T. A., Marazita, M. L., Mathias, R. A., Mirel, D. B., Paschall, J., Pasquale, L. R., Pugh, E. W., Rice, J. P., Udren, J., van Dam, R. M., Wang, X., Wiggs, J. L., Williams, K. & Yu, K. (2010). The Gene, Environment Association Studies Consortium (GENEVA): Maximizing the knowledge obtained from GWAS by collaboration across studies of multiple conditions. Genetic Epidemiology 34, 364–372.

Donnelly, K. P. (1983). The probability that related individuals share some section of the genome identical by descent. Theoretical Population Biology 23, 34–64.

Falconer, D. S. & Mackay, T. F. C. (1996). Introduction to Quantitative Genetics, 4th ed. Harlow, Essex: Longman.

Fisher, R. A. (1954). A fuller theory of ‘Junctions’ in inbreeding. Heredity 8, 187–197.

Franklin, I. R. (1977). The distribution of the proportion of the genome which is homozygous by descent in inbred individuals. Theoretical Population Biology 11, 60–80.

Goddard, M. (2009). Genomic selection: prediction of accuracy and maximisation of long term response. Genetica 136, 245–257.

Guo, S.-W. (1995). Proportion of genome shared identical by descent by relatives: concept, computation, and applications. American Journal of Human Genetics 56, 1468–1476.

Haldane, J. B. S. (1919). The combination of linkage values, and the calculation of distances between the loci of linked factors. Journal of Genetics 8, 299–309.

Hill, W. G. (1993 a). Variation in genetic composition in backcrossing programs. Journal of Heredity 84, 212–213.

Hill, W. G. (1993 b). Variation in genetic identity within kinships. Heredity 71, 652–653.

Kong, X., Murphy, K., Raj, T., He, C., White, P. S. & Matise, T. C. (2004). A combined physical-linkage map of the human genome. American Journal of Human Genetics 75, 1143–1148.

Laurie, C. C., Doheny, K. F., Mirel, D. B., Pugh, E. W., Bierut, L. J., Bhangale, T., Boehm, F., Caporaso, N. E., Edenberg, H. J., Gabriel, S. B., Harris, E. L., Hu, F. B., Jacobs, K. B., Kraft, P., Landi, M. T., Lumley, T., Manolio, T., McHugh, C., Painter, I., Paschall, J., Rice, J. P., Rice, K. M., Zheng, X. & Weir, B. S., for the GENEVA Investigators. (2010). Quality control and quality assurance in genotypic data for genome-wide association studies. Genetic Epidemiology 34, 591–602.

Meuwissen, T. H. E., Hayes, B. J. & Goddard, M. E. (2001). Prediction of total genetic value using genome-wide dense marker maps. Genetics 157, 1819–1829.

Slate, J., David, P., Dodds, K. G., Veenvliet, B. A., Glass, B. C., Broad, T. E. & McEwan, J. C. (2004). Understanding the relationship between the inbreeding coefficient and multilocus heterozygosity: theoretical expectations and empirical data. Heredity 93, 255–265.

Stam, P. (1980). The distribution of the fraction of the genome identical by descent in finite populations. Genetical Research 35, 131–155.

Stam, P. & Zeven, A. C. (1981). The theoretical proportion of the donor genome in near-isogenic lines of self fertilizers bred by backcrossing. Euphytica 30, 227–238.

Stefanov, V. T. (2000). Distribution of genome shared identical by descent by two individuals in grandparent-type relationship. Genetics 156, 1403–1410.

Stefanov, V. T. (2004). Distribution of the amount of genetic material from a chromosome segment surviving to the following generation. Journal of Applied Probability 41, 345–354.

Thompson, E. A. (2008). The IBD process along four chromosomes. Theoretical Population Biology 73, 369–373.

Visscher, P. M. (2009). Whole genome approaches to quantitative genetics. Genetica 136, 351–358.

Visscher, P. M., Macgregor, S., Benyamin, B., Zhu, G., Gordon, S., Medland, S., Hill, W. G., Hottenga, J.-J., Willemsen, G., Boomsma, D. I., Liu, Y.-Z., Deng, H.-W., Montgomery, G. W. & Martin, N. G. (2007). Genome partitioning of genetic variation for height from 11,214 sibling pairs. American Journal of Human Genetics 81, 1104–1110.

Visscher, P. M., Medland, S. E., Ferreira, M. A. R., Morley, K. I., Zhu, G., Cornes, B. K., Montgomery, G. W. & Martin, N. G. (2006). Assumption-free estimation of heritability from genome-wide identity-by-descent sharing between full siblings. PLoS Genetics 2, e41. doi: 10.1371/journal.pgen.0020041.

Weir, B. S., Anderson, A. D. & Hepler, A. B. (2006). Genetic relatedness analysis: modern data and new challenges. Nature Reviews Genetics 7, 771–780.

Weir, B. S., Avery, P. J. & Hill, W. G. (1980). Effect of mating structure on variation in inbreeding. Theoretical Population Biology 18, 396–429.

Weir, B. S., Cardon, L. R., Anderson, A. D., Nielsen, D. M. & Hill, W. G. (2005). Measures of human population structure show heterogeneity among genomic regions. Genome Research 15, 1468–1476.

Wright, S. (1922). Coefficients of inbreeding and relationship. American Naturalist 56, 330–338.

Yang, J., Benyamin, B., McEvoy, B. P., Gordon, S., Henders, A. K., Nyholt, D. R., Madden, P. A., Heath, A. C., Martin, N. G., Montgomery, G. W., Goddard, M. E. & Visscher, P. M. (2010). Common SNPs explain a large proportion of the heritability for human height. Nature Genetics 42, 565–569.

Yu, J. M., Pressoir, G., Briggs, W. H., Bi, I. V., Yamasaki, M., Doebley, J. F., McMullen, M. D., Gaut, B. S., Nielsen, D. M., Holland, J. B., Kresovich, S. & Buckler, E. S. (2006). A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nature Genetics 38, 203–208.

Table 1. Expectations and variances for actual identity at individual loci

Fig. 1. Examples of relationship

Fig. 2. (a) SD and (b) CV of actual relationship (proportion of genome shared, $\breve{R} = \breve{k}_{2} + \tfrac{1}{2}\breve{k}_{1}$), for a single chromosome as a function of map length and relationship for full sibs (FS) and their descendants: uncle nephew (UN), cousins (C), cousins once removed (C1R), second cousins (2C), second cousins once removed (2C1R) and third cousins (3C).

Fig. 3. SD of actual relationship (proportion of genome shared, $\breve{R} = \breve{k}_{2} + \tfrac{1}{2}\breve{k}_{1}$), for a single chromosome as a function of map length and relationship for different pedigrees at each of two pedigree relationships: R=0·125: great grandparent–great grandoffspring (GGPGGO), half-uncle–nephew (HUN), great uncle–great nephew (GUGN), cousins (C); and R=0·03125: great-great-great grandparent–great-great-great grandoffspring (G4PG4O), half-cousins once removed (HC1R) and second cousins (2C).

Table 2. SD of actual relationship ($\breve{R} = \breve{k}_{2} + \tfrac{1}{2}\breve{k}_{1}$) for a model human genome for different pedigree relationships (R=2θ)

Table 3. SD of actual inbreeding for a model human genome for matings of relatives

Fig. 5. Distribution of actual genome sharing ($\breve{k}_{1}$) for samples of ‘human’ genomes for different degrees of pedigree relationship of descendants of full sibs (as Fig. 2) (10 000 replicates each).



Box 1. Summary of formulae for variances of genome sharing. $R = \left(\tfrac{1}{2}\right)^{g}$

A. Unilineal relatives (k ₂=0 and $\mathrm{Var}(\breve{R}, l) = \tfrac{1}{4}\mathrm{Var}(\breve{k}_{1}, l)$)

B. Bilineal relatives (k ₂≠0)
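
The relation stated for unilineal relatives in part A can be checked in one line. The sketch below uses only the definition of actual relationship as actual k₂ plus half of actual k₁, writing actual (realised) quantities with a breve, and takes g to be the exponent in the box heading. The first-cousin values are included as an illustration, since cousins are listed in Fig. 3 with R=0·125.

```latex
% Unilineal relatives: k_2 = 0 at every locus, so actual relationship
% is exactly half of actual k_1 and the variance scales by (1/2)^2.
\[
\breve{R} = \breve{k}_{2} + \tfrac{1}{2}\breve{k}_{1} = \tfrac{1}{2}\breve{k}_{1}
\quad\Longrightarrow\quad
\mathrm{Var}(\breve{R}, l) = \left(\tfrac{1}{2}\right)^{2}\mathrm{Var}(\breve{k}_{1}, l)
 = \tfrac{1}{4}\mathrm{Var}(\breve{k}_{1}, l)
\]

% Illustration: first cousins, g = 3.
\[
R = \left(\tfrac{1}{2}\right)^{3} = 0.125, \qquad
k_{1} = 2R = 0.25, \qquad
k_{0} = 1 - k_{1} = 0.75
\]
```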