kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R

Aslett, Louis J. M.; Christ, Ryan R.


Authors

Louis J. M. Aslett

Ryan R. Christ

Abstract

Background: Approximating the recent phylogeny of N phased haplotypes at a set of variants along the genome is a core problem in modern population genomics and central to performing genome-wide screens for association, selection, introgression, and other signals. The Li & Stephens (LS) model provides a simple yet powerful hidden Markov model for inferring the recent ancestry at a given variant, represented as an N×N distance matrix based on posterior decodings.

Results: We provide a high-performance engine that makes these posterior decodings readily accessible with minimal pre-processing via kalis, an easy-to-use package for the statistical programming language R. kalis enables investigators to rapidly resolve the ancestry at loci of interest, and enables developers to build a range of variant-specific ancestral inference pipelines on top of it. kalis exploits both multi-core parallelism and modern CPU vector instruction sets to scale to hundreds of thousands of genomes.

Conclusions: The distance matrices accessible via kalis enable local ancestry, selection, and association studies in modern large-scale genomic datasets.
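
As an illustration of how an N×N distance matrix can arise from posterior decodings of a Li & Stephens style copying model, the following is a minimal pure-Python sketch. It is not the kalis implementation and does not use the kalis API. The function names `ls_posterior` and `ls_distance`, the constant switch probability `rho`, the mismatch probability `mu`, the uniform prior over donors, and the choice of negative log posterior as the distance are all assumptions made for this toy example.

```python
import math


def _emissions(haps, i, donors, locus, mu):
    """Probability of recipient i's allele at `locus` given each donor (assumed mismatch rate mu)."""
    return [1.0 - mu if haps[j][locus] == haps[i][locus] else mu for j in donors]


def ls_posterior(haps, target, rho=0.05, mu=0.01):
    """Toy Li & Stephens forward-backward pass.

    Returns an N x N matrix post where post[i][j] is the posterior probability
    that recipient i copies from donor j at locus `target`; post[i][i] is 0.
    Assumes a constant switch probability rho and a uniform prior over donors.
    """
    n, n_loci = len(haps), len(haps[0])
    post = [[0.0] * n for _ in range(n)]
    for i in range(n):
        donors = [j for j in range(n) if j != i]
        m = len(donors)
        prior = 1.0 / m

        # Forward recursion from locus 0 up to the target locus.
        e = _emissions(haps, i, donors, 0, mu)
        fwd = [prior * e[k] for k in range(m)]
        for locus in range(1, target + 1):
            total = sum(fwd)
            e = _emissions(haps, i, donors, locus, mu)
            fwd = [e[k] * ((1.0 - rho) * fwd[k] + rho * prior * total) for k in range(m)]
            scale = sum(fwd)
            fwd = [f / scale for f in fwd]  # rescale to avoid underflow

        # Backward recursion from the last locus down to the target locus.
        bwd = [1.0] * m
        for locus in range(n_loci - 1, target, -1):
            e = _emissions(haps, i, donors, locus, mu)
            weighted = [e[k] * bwd[k] for k in range(m)]
            total = sum(weighted)
            bwd = [(1.0 - rho) * weighted[k] + rho * prior * total for k in range(m)]
            scale = sum(bwd)
            bwd = [b / scale for b in bwd]

        # Posterior is proportional to forward times backward.
        joint = [fwd[k] * bwd[k] for k in range(m)]
        z = sum(joint)
        for k, j in enumerate(donors):
            post[i][j] = joint[k] / z
    return post


def ls_distance(post):
    """Turn posteriors into distances via -log(p); the diagonal is set to 0 (an assumed convention)."""
    n = len(post)
    return [[0.0 if i == j else -math.log(post[i][j]) for j in range(n)] for i in range(n)]


# Four toy haplotypes over six biallelic loci (hypothetical data).
toy_haps = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],
]
toy_post = ls_posterior(toy_haps, target=2)
toy_dist = ls_distance(toy_post)
```

In this toy data, haplotypes 0 and 1 differ at a single locus, so haplotype 1 receives the largest posterior copying probability for recipient 0 and therefore the smallest distance. The per-recipient loop is independent across recipients, which is the kind of structure that lends itself to the multi-core and vectorised computation the abstract mentions.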

Citation

Aslett, L. J. M., & Christ, R. R. (2024). kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R. BMC Bioinformatics, 25(1), Article 86. https://doi.org/10.1186/s12859-024-05688-8

Journal Article Type: Article
Acceptance Date: Feb 1, 2024
Online Publication Date: Feb 28, 2024
Publication Date: Feb 28, 2024
Deposit Date: Apr 17, 2024
Publicly Available Date: Apr 17, 2024
Journal: BMC Bioinformatics
Electronic ISSN: 1471-2105
Publisher: BioMed Central
Peer Reviewed: Yes
Volume: 25
Issue: 1
Article Number: 86
DOI: https://doi.org/10.1186/s12859-024-05688-8
Keywords: Li & Stephens model, High performance computation, R package, Hidden Markov model, Probabilistic haplotype model, Genomics
Public URL: https://durham-repository.worktribe.com/output/2292412
