Dr Louis Aslett louis.aslett@durham.ac.uk
Associate Professor
kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R
Aslett, Louis J. M.; Christ, Ryan R.
Authors
Ryan R. Christ
Abstract
Background: Approximating the recent phylogeny of N phased haplotypes at a set of variants along the genome is a core problem in modern population genomics and central to performing genome-wide screens for association, selection, introgression, and other signals. The Li & Stephens (LS) model provides a simple yet powerful hidden Markov model for inferring the recent ancestry at a given variant, represented as an N×N distance matrix based on posterior decodings.
Results: We provide a high-performance engine to make these posterior decodings readily accessible with minimal pre-processing via an easy-to-use package, kalis, in the statistical programming language R. kalis enables investigators to rapidly resolve the ancestry at loci of interest and developers to build a range of variant-specific ancestral inference pipelines on top. kalis exploits both multi-core parallelism and modern CPU vector instruction sets to enable scaling to hundreds of thousands of genomes.
Conclusions: The resulting distance matrices accessible via kalis enable local ancestry, selection, and association studies in modern large-scale genomic datasets.
Citation
Aslett, L. J. M., & Christ, R. R. (2024). kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R. BMC Bioinformatics, 25(1), Article 86. https://doi.org/10.1186/s12859-024-05688-8
Journal Article Type | Article |
---|---|
Acceptance Date | Feb 1, 2024 |
Online Publication Date | Feb 28, 2024 |
Publication Date | Feb 28, 2024 |
Deposit Date | Apr 17, 2024 |
Publicly Available Date | Apr 17, 2024 |
Journal | BMC Bioinformatics |
Electronic ISSN | 1471-2105 |
Publisher | BioMed Central |
Peer Reviewed | Peer Reviewed |
Volume | 25 |
Issue | 1 |
Article Number | 86 |
DOI | https://doi.org/10.1186/s12859-024-05688-8 |
Keywords | Li & Stephens model, High performance computation, R package, Hidden Markov model, Probabilistic haplotype model, Genomics |
Public URL | https://durham-repository.worktribe.com/output/2292412 |
Files
Published Journal Article (PDF, 2.1 Mb)
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/