Estimating genomic relationships of metafounders across and within breeds using maximum likelihood, pseudo-expectation–maximization maximum likelihood and increase of relationships

by Andres Legarra, Matias Bermann, Quanshun Mei, Ole F. Christensen

Published in Genetics Selection Evolution by Springer Science and Business Media LLC.

2024   Volume 56, Issue 1

Abstract

Background: The theory of "metafounders" proposes a unified framework for relationships across base populations within breeds (e.g. unknown parent groups) and across breeds (crosses), together with a sensible compatibility with genomic relationships. Considering metafounders might be advantageous in pedigree best linear unbiased prediction (BLUP) or single-step genomic BLUP. Existing methods to estimate the relationships across metafounders, Γ, are not well adapted to highly unbalanced data, genotyped individuals far from base populations, or many unknown parent groups (within breed per year of birth).

Methods: We derive likelihood methods to estimate Γ. For a single metafounder, summary statistics of pedigree and genomic relationships allow deriving a cubic equation whose real root is the maximum likelihood (ML) estimate of Γ. This equation is tested with Lacaune sheep data. For several metafounders, we split the first derivative of the complete likelihood into a first term related to Γ and a second term related to Mendelian sampling variances. Approximating the first derivative by its first term results in a pseudo-expectation–maximization (pseudo-EM) algorithm that iteratively updates the estimate of Γ by the corresponding block of the H-matrix. The method extends to complex situations with groups defined by year of birth, modelling the increase of Γ using estimates of the rate of increase of inbreeding (ΔF), resulting in an expanded Γ and in a pseudo-EM+ΔF algorithm. We compare these methods with the generalized least squares (GLS) method using simulated data: complex crosses of two breeds in equal or unsymmetrical proportions, and two breeds with 10 groups per year of birth within breed. We simulate genotyping in all generations or only in the last ones.

Results: For a single metafounder, the ML estimate for the Lacaune data corresponded to the maximum of the likelihood. For simulated data, when genotypes were spread across all generations, both the GLS and the pseudo-EM(+ΔF) methods were accurate. With genotypes available only in the most recent generations, the GLS method was biased, whereas the pseudo-EM(+ΔF) approach yielded more accurate and unbiased estimates.

Conclusions: We derived ML, pseudo-EM and pseudo-EM+ΔF methods to estimate Γ in many realistic settings. Estimates are accurate in real and simulated data and have a low computational cost.
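
As a rough illustration of how a rate of increase of inbreeding could drive relationships across year-of-birth groups, the sketch below applies the classical recursion F_t = F_{t-1} + ΔF(1 - F_{t-1}) to γ/2, that is, it treats half the self-relationship of a metafounder as an inbreeding-like coefficient. This is an assumption made for illustration only, not the expansion of Γ derived in the paper, and the names `expand_gamma`, `gamma0` and `delta_f` are hypothetical.

```python
from typing import List


def expand_gamma(gamma0: float, delta_f: float, n_groups: int) -> List[float]:
    """Self-relationships for consecutive year-of-birth groups.

    Assumption (not taken from the paper): gamma / 2 follows the classical
    inbreeding recursion F_t = F_{t-1} + delta_f * (1 - F_{t-1}), which
    gives gamma_t = gamma_{t-1} + delta_f * (2 - gamma_{t-1}).
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    values = [gamma0]
    for _ in range(n_groups - 1):
        prev = values[-1]
        # Move a fraction delta_f of the remaining distance to the upper bound 2.
        values.append(prev + delta_f * (2.0 - prev))
    return values


if __name__ == "__main__":
    # Hypothetical inputs: 10 groups, starting value 0.5, delta_f of 1% per step.
    for year, gamma in enumerate(expand_gamma(0.5, 0.01, 10)):
        print(year, round(gamma, 4))
```

Under this assumption the values increase monotonically towards 2, and one starting value plus ΔF determines the whole sequence, so a single parameter per breed would need to be estimated instead of one per group.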

Archived Files and Locations

application/pdf   2.1 MB
file_hmeqvbavmnh45ktbbyv3bdlzaa
gsejournal.biomedcentral.com (publisher)
web.archive.org (webarchive)
Type  article-journal
Stage   published
Date   2024-05-02
Language   en
Container Metadata
Open Access Publication
In DOAJ
In ISSN ROAD
In Keepers Registry
ISSN-L:  0999-193X