Menno J. Jong
SambaR: An R package for fast, easy and reproducible population‐genetic analyses of biallelic SNP data sets
Jong, Menno J.; Jong, Joost F.; Hoelzel, A. Rus; Janke, Axel
Abstract
SNP data sets can be used to infer a wealth of information about natural populations, including information about their structure, genetic diversity, and the presence of loci under selection. However, SNP data analysis can be a time-consuming and challenging process, not least because at present many different software packages are needed to execute and depict the wide variety of mainstream population-genetic analyses. Here, we present SambaR, an integrative and user-friendly R package which automates and simplifies quality control and population-genetic analyses of biallelic SNP data sets. SambaR allows users to perform mainstream population-genetic analyses and to generate a wide variety of ready-to-publish graphs with a minimum number of commands (fewer than 10). These wrapper commands call functions of existing packages (including adegenet, ape, LEA, poppr, pcadapt and StAMPP) as well as new tools uniquely implemented in SambaR. We tested SambaR on SNP data sets available online and found that SambaR can process data sets of over 100,000 SNPs and hundreds of individuals within hours, given sufficient computing power. Newly developed tools implemented in SambaR facilitate optimization of filter settings and objective interpretation of ordination analyses, enhance comparability of diversity estimates from reduced representation library SNP data sets, and generate reduced SNP panels and structure-like plots with Bayesian population assignment probabilities. SambaR facilitates rapid population-genetic analyses of biallelic SNP data sets by removing three major time sinks: file handling, software learning, and data plotting. In addition, SambaR provides a convenient platform for SNP data storage and management, as well as several new utilities, including guidance in setting appropriate data filters. The SambaR source script, manual and example data set are distributed through GitHub: https://github.com/mennodejong1986/SambaR.
Citation
Jong, M. J., Jong, J. F., Hoelzel, A. R., & Janke, A. (2021). SambaR: An R package for fast, easy and reproducible population‐genetic analyses of biallelic SNP data sets. Molecular Ecology Resources, 21(4). https://doi.org/10.1111/1755-0998.13339
Journal Article Type | Article |
---|---|
Acceptance Date | Jan 12, 2021 |
Online Publication Date | Jan 27, 2021 |
Publication Date | 2021 |
Deposit Date | Sep 20, 2021 |
Publicly Available Date | Sep 20, 2021 |
Journal | Molecular Ecology Resources |
Print ISSN | 1755-098X |
Electronic ISSN | 1755-0998 |
Publisher | Wiley |
Peer Reviewed | Peer Reviewed |
Volume | 21 |
Issue | 4 |
DOI | https://doi.org/10.1111/1755-0998.13339 |
Public URL | https://durham-repository.worktribe.com/output/1241113 |
Files
Published Journal Article
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/
Copyright Statement
This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.