A rapid, accurate approach to inferring pedigrees in endogamous populations [article]

Cole M Williams, Brooke Scelza, Christopher R Gignoux, Brenna M Henn
2020 bioRxiv pre-print
Accurate reconstruction of pedigrees from genetic data remains a challenging problem. Pedigree inference algorithms are often trained only on urban European-descent families, which are 'outbred' compared to many other global populations. Relationship categories can be difficult to distinguish (e.g. half-sibships versus avuncular) without external information. Furthermore, published software cannot accommodate endogamous populations where there may be reticulations within a pedigree (i.e. inbreeding) or elevated haplotype sharing. We design a simple, rapid algorithm which initially uses only high-confidence first-degree relationships to seed a machine learning step based on the number of identical-by-descent segments. Additionally, we define a new statistic to polarize individuals to ancestor versus descendant generation. We test our approach in a sample of 700 individuals from an endogamous population in northern Namibia. Due to a culture of concurrent relationships in this population, there is a high proportion of half-sibships. We accurately identify first- through third-degree relationships for all categories, including half-sibships, half-avuncular-ships etc. Accurate reconstruction of pedigrees holds promise for tracing allele frequency trajectories, improved phasing and other population genomic questions.
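To illustrate why categories such as half-sibships and avuncular pairs are hard to separate, the following minimal sketch bins a kinship coefficient into a degree of relatedness using the widely used power-of-two cutoffs (as in KING-style inference). It is not the algorithm described in this abstract: the function name `degree_from_kinship` and the table `EXPECTED_KINSHIP` are hypothetical, and the values in the table are textbook expectations for outbred pedigrees, not results from the paper.

```python
def degree_from_kinship(phi):
    """Bin a kinship coefficient into a degree of relatedness.

    Illustrative only: uses power-of-two cutoffs (0.354, 0.177, 0.0884,
    0.0442), not the method from the abstract above.
    Returns 0 for duplicates/monozygotic twins, 1-3 for first- through
    third-degree relatives, and None for anything more distant.
    """
    if phi > 2 ** -1.5:  # above ~0.354: duplicate sample or MZ twin
        return 0
    for degree in (1, 2, 3):
        # The lower bound for degree d is 2^-(d + 1.5).
        if phi > 2 ** -(degree + 1.5):
            return degree
    return None


# Textbook expected kinship coefficients in an outbred pedigree
# (assumed for illustration; endogamy inflates these values).
EXPECTED_KINSHIP = {
    "parent-offspring": 0.25,
    "full siblings": 0.25,
    "half siblings": 0.125,
    "avuncular": 0.125,
    "grandparent-grandchild": 0.125,
    "first cousins": 0.0625,
}

if __name__ == "__main__":
    for name, phi in EXPECTED_KINSHIP.items():
        print(f"{name}: degree {degree_from_kinship(phi)}")
```

Half siblings, avuncular pairs and grandparent-grandchild pairs all land in the same second-degree bin, because their expected sharing is identical. Separating them therefore needs additional signal, such as the number of identical-by-descent segments that the abstract mentions.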
doi:10.1101/2020.02.25.965376 fatcat:gbrtf2yepbawjgi5tiplebzsnm