Stefanie Warnat-Herresthal
Scalable prediction of acute myeloid leukemia using high-dimensional machine learning and blood transcriptomics
Warnat-Herresthal, Stefanie; Perrakis, Konstantinos; Taschler, Bernd; Becker, Matthias; Baßler, Kevin; Beyer, Marc; Günther, Patrick; Schulte-Schrepping, Jonas; Seep, Lea; Klee, Kathrin; Ulas, Thomas; Haferlach, Torsten; Mukherjee, Sach; Schultze, Joachim L.
Authors
Dr Konstantinos Perrakis konstantinos.perrakis@durham.ac.uk
Assistant Professor
Bernd Taschler
Matthias Becker
Kevin Baßler
Marc Beyer
Patrick Günther
Jonas Schulte-Schrepping
Lea Seep
Kathrin Klee
Thomas Ulas
Torsten Haferlach
Sach Mukherjee
Joachim L. Schultze
Abstract
Acute myeloid leukemia (AML) is a severe, mostly fatal hematopoietic malignancy. We were interested in whether transcriptomic-based machine learning could predict AML status without requiring expert input. Using 12,029 samples from 105 different studies, we present a large-scale study of machine learning-based prediction of AML in which we address key questions relating to the combination of machine learning and transcriptomics and their practical use. We find data-driven, high-dimensional approaches—in which multivariate signatures are learned directly from genome-wide data with no prior knowledge—to be accurate and robust. Importantly, these approaches are highly scalable with low marginal cost, essentially matching human expert annotation in a near-automated workflow. Our results support the notion that transcriptomics combined with machine learning could be used as part of an integrated -omics approach wherein risk prediction, differential diagnosis, and subclassification of AML are achieved by genomics while diagnosis could be assisted by transcriptomic-based machine learning.
Citation
Warnat-Herresthal, S., Perrakis, K., Taschler, B., Becker, M., Baßler, K., Beyer, M., Günther, P., Schulte-Schrepping, J., Seep, L., Klee, K., Ulas, T., Haferlach, T., Mukherjee, S., & Schultze, J. L. (2020). Scalable prediction of acute myeloid leukemia using high-dimensional machine learning and blood transcriptomics. iScience, 23(1), Article 100780. https://doi.org/10.1016/j.isci.2019.100780
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Dec 12, 2019 |
Online Publication Date | Dec 18, 2019 |
Publication Date | Jan 24, 2020 |
Deposit Date | Jun 10, 2020 |
Publicly Available Date | Jun 18, 2020 |
Journal | iScience |
Publisher | Cell Press |
Peer Reviewed | Peer Reviewed |
Volume | 23 |
Issue | 1 |
Article Number | 100780 |
DOI | https://doi.org/10.1016/j.isci.2019.100780 |
Public URL | https://durham-repository.worktribe.com/output/1262894 |
Related Public URLs | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6992905/ |
Files
Published Journal Article
(3.9 MB)
PDF
Publisher Licence URL
http://creativecommons.org/licenses/by-nc-nd/4.0/
Copyright Statement
© 2020 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).