S. Bonner
Using Hadoop To Implement a Semantic Method Of Assessing The Quality Of Research Medical Datasets
Bonner, S.; Antoniou, G.; Moss, L.; Kureshi, I.; Corsair, D.; Tachmazidis, I.; Chin, Alvin; Xu, Wei; Wang, Fei
Authors
G. Antoniou
L. Moss
I. Kureshi
D. Corsair
I. Tachmazidis
Alvin Chin
Wei Xu
Fei Wang
Abstract
In this paper, a system for storing and querying medical RDF data using Hadoop is developed. This approach enables us to create an inherently parallel framework that will scale the workload across a cluster. Unlike existing solutions, our framework uses highly optimised joining strategies to enable the completion of eight separate SPARQL queries, comprising over eighty distinct joins, in only two Map/Reduce iterations. Results are presented comparing an optimised version of our solution against Jena TDB, demonstrating the superior performance of our system and its viability for assessing the quality of medical data.
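The abstract does not spell out the joining strategies, so the following is only a rough illustration of the general idea behind evaluating a SPARQL-style join in a single Map/Reduce pass. It is a minimal sketch in plain Python that simulates the map, shuffle, and reduce phases in memory rather than running on Hadoop. The triples, predicate names (`hasReading`, `heartRate`), and function names are hypothetical, and the technique shown is a generic reduce-side join, not the authors' optimised strategy.

```python
from collections import defaultdict

# Hypothetical toy RDF triples (subject, predicate, object); not taken from the paper.
TRIPLES = [
    ("patient1", "hasReading", "reading1"),
    ("patient1", "hasReading", "reading2"),
    ("patient2", "hasReading", "reading3"),
    ("reading1", "heartRate", "72"),
    ("reading2", "heartRate", "310"),
    ("reading3", "heartRate", "65"),
]


def map_phase(triples):
    """Emit (join_key, tagged_value) pairs for two triple patterns,
    ?p hasReading ?r  and  ?r heartRate ?v, keyed on the shared variable ?r."""
    for s, p, o in triples:
        if p == "hasReading":
            yield o, ("patient", s)   # ?r is the object of this pattern
        elif p == "heartRate":
            yield s, ("value", o)     # ?r is the subject of this pattern


def shuffle(pairs):
    """Group mapper output by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """For each join key, pair every patient binding with every value binding."""
    for reading, values in sorted(groups.items()):
        patients = [v for tag, v in values if tag == "patient"]
        rates = [v for tag, v in values if tag == "value"]
        for patient in patients:
            for rate in rates:
                yield patient, reading, int(rate)


def run_join(triples):
    """Run the simulated single-pass join and return the joined bindings."""
    return list(reduce_phase(shuffle(map_phase(triples))))


if __name__ == "__main__":
    for row in run_join(TRIPLES):
        print(row)
```

Because both patterns share the variable `?r`, keying the mapper output on it lets one reduce step resolve the join. Queries with many joins on different variables would normally need more passes, which is why completing over eighty joins in two iterations, as the abstract reports, depends on a more careful strategy than this sketch.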
Citation
Bonner, S., Antoniou, G., Moss, L., Kureshi, I., Corsair, D., Tachmazidis, I., Chin, A., Xu, W., & Wang, F. (2014, August). Using Hadoop To Implement a Semantic Method Of Assessing The Quality Of Research Medical Datasets. Presented at The 2014 International Conference on Big Data Science and Computing - BigDataScience '14, Beijing, China.
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | The 2014 International Conference on Big Data Science and Computing - BigDataScience '14. |
Start Date | Aug 4, 2014 |
End Date | Aug 7, 2014 |
Publication Date | Aug 7, 2014 |
Deposit Date | May 15, 2015 |
Publicly Available Date | Mar 3, 2016 |
Publisher | Association for Computing Machinery (ACM) |
Series Title | ACM international conference proceedings series |
Book Title | Proceedings of the 3rd ASE International Conference on Big Data Science and Computing : 2014, Beijing, China : BigDataScience '14. |
DOI | https://doi.org/10.1145/2640087.2644163 |
Public URL | https://durham-repository.worktribe.com/output/1152986 |
Files
Accepted Conference Proceeding (PDF, 576 Kb)
Copyright Statement
© 2014 ACM. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 3rd ASE International Conference on Big Data Science and Computing : 2014, Beijing, China : BigDataScience '14. New York, USA: Association for Computing Machinery (ACM), Article No. 7, http://doi.acm.org/10.1145/2640087.2644163