Using Hadoop To Implement a Semantic Method Of Assessing The Quality Of Research Medical Datasets
Bonner, S.; Antoniou, G.; Moss, L.; Kureshi, I.; Corsair, D.; Tachmazidis, I.; Chin, Alvin; Xu, Wei; Wang, Fei
Authors
S. Bonner
G. Antoniou
L. Moss
I. Kureshi
D. Corsair
I. Tachmazidis
Alvin Chin
Wei Xu
Fei Wang
Abstract
This paper develops a system for storing and querying medical RDF data using Hadoop. The approach yields an inherently parallel framework that scales the workload across a cluster. Unlike existing solutions, our framework uses highly optimised joining strategies to complete eight separate SPARQL queries, comprising over eighty distinct joins, in only two Map/Reduce iterations. Results comparing an optimised version of our solution against Jena TDB demonstrate the superior performance of our system and its viability for assessing the quality of medical data.
Citation
Bonner, S., Antoniou, G., Moss, L., Kureshi, I., Corsair, D., Tachmazidis, I., …Wang, F. (2014). Using Hadoop To Implement a Semantic Method Of Assessing The Quality Of Research Medical Datasets. In Proceedings of the 3rd ASE International Conference on Big Data Science and Computing : 2014, Beijing, China : BigDataScience '14. https://doi.org/10.1145/2640087.2644163
Conference Name | The 2014 International Conference on Big Data Science and Computing - BigDataScience '14. |
---|---|
Conference Location | Beijing, China |
Start Date | Aug 4, 2014 |
End Date | Aug 7, 2014 |
Publication Date | Aug 7, 2014 |
Deposit Date | May 15, 2015 |
Publicly Available Date | Mar 3, 2016 |
Publisher | Association for Computing Machinery (ACM) |
Series Title | ACM international conference proceedings series |
Book Title | Proceedings of the 3rd ASE International Conference on Big Data Science and Computing : 2014, Beijing, China : BigDataScience '14. |
DOI | https://doi.org/10.1145/2640087.2644163 |
Files
Accepted Conference Proceeding
(576 Kb)
PDF
Copyright Statement
© 2014 ACM. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 3rd ASE International Conference on Big Data Science and Computing : 2014, Beijing, China : BigDataScience '14. New York, USA: Association for Computing Machinery (ACM), Article No. 7, http://doi.acm.org/10.1145/2640087.2644163