Increasingly efficient methods for inferring the ancestral origin of genome regions are needed to gain insights into genetic function and history as biobanks grow in scale. Here we describe two near-linear time algorithms to learn ancestry harnessing the strengths of a Positional Burrows-Wheeler Transform. SparsePainter is a faster, sparse replacement of previous model-based 'chromosome painting' algorithms to identify recently shared haplotypes, whilst PBWTpaint uses further approximations to obtain lightning-fast estimation optimized for genome-wide relatedness estimation. The computational efficiency gains of these tools for fine-scale local ancestry inference offer the possibility to analyse large-scale genomic datasets using different approaches. Application to the UK Biobank shows that haplotypes better represent ancestries than principal components, whilst linkage-disequilibrium of ancestry identifies signals of recent changes to population-specific selection for many genomic regions associated with immune responses, suggesting avenues for understanding the pathogen-immune system interplay on a historical timescale.

Original publication

DOI

10.1038/s41467-025-57601-3

Type

Journal

Nat Commun

Publication Date

20/03/2025

Volume

16

Keywords

Humans; Haplotypes; Algorithms; Linkage Disequilibrium; Polymorphism, Single Nucleotide; Genome, Human; Selection, Genetic; United Kingdom; Genome-Wide Association Study; Genetics, Population; Immunity; Models, Genetic