Estimating genomic relationships of metafounders across and within breeds using maximum likelihood, pseudo-expectation–maximization maximum likelihood and increase of relationships
by
Andres Legarra,
Matias Bermann,
Quanshun Mei,
Ole F. Christensen
2024 Volume 56, Issue 1
<jats:title>Abstract</jats:title><jats:sec>
<jats:title>Background</jats:title>
The theory of "metafounders" proposes a unified framework for relationships across base populations within breeds (e.g. unknown parent groups) and across breeds (crosses), in a way that is compatible with genomic relationships. Considering metafounders might be advantageous in pedigree best linear unbiased prediction (BLUP) or single-step genomic BLUP. Existing methods to estimate relationships across metafounders <jats:inline-formula><jats:alternatives><jats:tex-math>$${\varvec{\Gamma}}$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Γ</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula> are not well adapted to highly unbalanced data, genotyped individuals far from base populations, or many unknown parent groups (within breed per year of birth).
</jats:sec><jats:sec>
<jats:title>Methods</jats:title>
We derive likelihood methods to estimate <jats:inline-formula><jats:alternatives><jats:tex-math>$${\varvec{\Gamma}}$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Γ</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula>. For a single metafounder, summary statistics of pedigree and genomic relationships allow deriving a cubic equation with the real root being the maximum likelihood (ML) estimate of <jats:inline-formula><jats:alternatives><jats:tex-math>$${\varvec{\Gamma}}$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Γ</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula>. This equation is tested with Lacaune sheep data. For several metafounders, we split the first derivative of the complete likelihood into a term related to <jats:inline-formula><jats:alternatives><jats:tex-math>$${\varvec{\Gamma}}$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Γ</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula>, and a second term related to Mendelian sampling variances. Approximating the first derivative by its first term results in a pseudo-EM algorithm that iteratively updates the estimate of <jats:inline-formula><jats:alternatives><jats:tex-math>$${\varvec{\Gamma}}$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Γ</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula> by the corresponding block of the <jats:bold>H</jats:bold>-matrix. The method extends to complex situations with groups defined by year of birth, modelling the increase of <jats:inline-formula><jats:alternatives><jats:tex-math>$${\varvec{\Gamma}}$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Γ</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula> using estimates of the rate of increase of inbreeding (<jats:inline-formula><jats:alternatives><jats:tex-math>$$\Delta F$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Δ</mml:mi>
<mml:mi>F</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula>), resulting in an expanded <jats:inline-formula><jats:alternatives><jats:tex-math>$${\varvec{\Gamma}}$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Γ</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula> and in a pseudo-EM+<jats:inline-formula><jats:alternatives><jats:tex-math>$$\Delta F$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Δ</mml:mi>
<mml:mi>F</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula> algorithm. We compare these methods with the generalized least squares (GLS) method using two kinds of simulated data: complex crosses of two breeds in equal or asymmetrical proportions, and two breeds with 10 groups per year of birth within breed. We simulate genotyping either in all generations or only in the last ones.
</jats:sec><jats:sec>
<jats:title>Results</jats:title>
For a single metafounder, the ML estimate for the Lacaune data corresponded to the maximum of the likelihood. For simulated data, when genotypes were spread across all generations, both GLS and pseudo-EM(+<jats:inline-formula><jats:alternatives><jats:tex-math>$$\Delta F$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Δ</mml:mi>
<mml:mi>F</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula>) methods were accurate. With genotypes only available in the most recent generations, the GLS method was biased, whereas the pseudo-EM(+<jats:inline-formula><jats:alternatives><jats:tex-math>$$\Delta F$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Δ</mml:mi>
<mml:mi>F</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula>) approach yielded more accurate and unbiased estimates.
</jats:sec><jats:sec>
<jats:title>Conclusions</jats:title>
We derived ML, pseudo-EM and pseudo-EM+<jats:inline-formula><jats:alternatives><jats:tex-math>$$\Delta F$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Δ</mml:mi>
<mml:mi>F</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula> methods to estimate <jats:inline-formula><jats:alternatives><jats:tex-math>$${\varvec{\Gamma}}$$</jats:tex-math><mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
<mml:mrow>
<mml:mi>Γ</mml:mi>
</mml:mrow>
</mml:math></jats:alternatives></jats:inline-formula> in many realistic settings. Estimates are accurate in real and simulated data and have a low computational cost.
</jats:sec>
gsejournal.biomedcentral.com (publisher)
Open Access Publication
In DOAJ
In ISSN ROAD
In Keepers Registry
ISSN-L: 0999-193X