Lorna Richardson
MGnify: the microbiome sequence data analysis resource in 2023
Richardson, Lorna; Allen, Ben; Baldi, Germana; Beracochea, Martin; Bileschi, Maxwell L; Burdett, Tony; Burgin, Josephine; Caballero-Pérez, Juan; Cochrane, Guy; Colwell, Lucy J; Curtis, Tom; Escobar-Zepeda, Alejandra; Gurbich, Tatiana A; Kale, Varsha; Korobeynikov, Anton; Raj, Shriya; Rogers, Alexander B; Sakharova, Ekaterina; Sanchez, Santiago; Wilkinson, Darren J; Finn, Robert D
Authors
Ben Allen
Germana Baldi
Martin Beracochea
Maxwell L Bileschi
Tony Burdett
Josephine Burgin
Juan Caballero-Pérez
Guy Cochrane
Lucy J Colwell
Tom Curtis
Alejandra Escobar-Zepeda
Tatiana A Gurbich
Varsha Kale
Anton Korobeynikov
Shriya Raj
Alexander B Rogers
Ekaterina Sakharova
Santiago Sanchez
Professor Darren Wilkinson darren.j.wilkinson@durham.ac.uk
Robert D Finn
Abstract
The MGnify platform (https://www.ebi.ac.uk/metagenomics) facilitates the assembly, analysis and archiving of microbiome-derived nucleic acid sequences. The platform provides access to taxonomic assignments and functional annotations for nearly half a million analyses covering metabarcoding, metatranscriptomic, and metagenomic datasets, which are derived from a wide range of different environments. Over the past 3 years, MGnify has not only grown in terms of the number of datasets contained but also increased the breadth of analyses provided, such as the analysis of long-read sequences. The MGnify protein database now exceeds 2.4 billion non-redundant sequences predicted from metagenomic assemblies. This collection is now organised into a relational database making it possible to understand the genomic context of the protein through navigation back to the source assembly and sample metadata, marking a major improvement. To extend beyond the functional annotations already provided in MGnify, we have applied deep learning-based annotation methods. The technology underlying MGnify's Application Programming Interface (API) and website has been upgraded, and we have enabled the ability to perform downstream analysis of the MGnify data through the introduction of a coupled Jupyter Lab environment.
Citation
Richardson, L., Allen, B., Baldi, G., Beracochea, M., Bileschi, M., Burdett, T., …Finn, R. (2023). MGnify: the microbiome sequence data analysis resource in 2023. Nucleic Acids Research, 51(D1), D753-D759. https://doi.org/10.1093/nar/gkac1080
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Nov 1, 2022 |
Online Publication Date | Dec 7, 2022 |
Publication Date | Jan 6, 2023 |
Deposit Date | Dec 3, 2023 |
Publicly Available Date | Dec 6, 2023 |
Journal | Nucleic Acids Research |
Print ISSN | 0305-1048 |
Electronic ISSN | 1362-4962 |
Publisher | Oxford University Press |
Peer Reviewed | Peer Reviewed |
Volume | 51 |
Issue | D1 |
Pages | D753-D759 |
DOI | https://doi.org/10.1093/nar/gkac1080 |
Public URL | https://durham-repository.worktribe.com/output/1980428 |
Files
Published Journal Article (PDF, 1.9 Mb)
Licence
http://creativecommons.org/licenses/by/4.0/
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/
Copyright Statement
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.