Using Hadoop To Implement a Semantic Method Of Assessing The Quality Of Research Medical Datasets

Bonner, S.; Antoniou, G.; Moss, L.; Kureshi, I.; Corsair, D.; Tachmazidis, I.; Chin, Alvin; Xu, Wei; Wang, Fei

Authors

S. Bonner

G. Antoniou

L. Moss

I. Kureshi

D. Corsair

I. Tachmazidis

Alvin Chin

Wei Xu

Fei Wang



Abstract

In this paper, a system for storing and querying medical RDF data using Hadoop is developed. This approach enables us to create an inherently parallel framework that will scale the workload across a cluster. Unlike existing solutions, our framework uses highly optimised joining strategies to enable the completion of eight separate SPARQL queries, comprising over eighty distinct joins, in only two Map/Reduce iterations. Results are presented comparing an optimised version of our solution against Jena TDB, demonstrating the superior performance of our system and its viability for assessing the quality of medical data.
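As a rough illustration of the kind of join the abstract refers to, the sketch below simulates a generic reduce-side join of two triple patterns in plain Python. It is not the authors' implementation and does not reproduce their optimised joining strategies. The predicates `hasRecord` and `hasValue`, the sample triples, and all function names are hypothetical, and the map, shuffle, and reduce steps run in-process rather than on a Hadoop cluster.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not data from the paper.
TRIPLES = [
    ("patient1", "hasRecord", "record1"),
    ("patient2", "hasRecord", "record2"),
    ("record1", "hasValue", "7.2"),
    ("record2", "hasValue", "5.9"),
]


def map_phase(triples):
    """Emit (join_key, tagged_value) pairs for two triple patterns.

    Pattern A: ?patient hasRecord ?record  (join variable is the object)
    Pattern B: ?record  hasValue  ?value   (join variable is the subject)
    """
    for s, p, o in triples:
        if p == "hasRecord":
            yield o, ("A", s)
        elif p == "hasValue":
            yield s, ("B", o)


def shuffle(pairs):
    """Group mapper output by key, as the MapReduce shuffle step would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """For each join key, pair every pattern-A binding with every pattern-B binding."""
    results = []
    for key in sorted(groups):
        left = [v for tag, v in groups[key] if tag == "A"]
        right = [v for tag, v in groups[key] if tag == "B"]
        for patient in left:
            for value in right:
                results.append((patient, key, value))
    return results


def join(triples):
    """Run the simulated map, shuffle, and reduce steps in sequence."""
    return reduce_phase(shuffle(map_phase(triples)))


if __name__ == "__main__":
    for row in join(TRIPLES):
        print(row)
```

Each join performed this way costs one Map/Reduce pass, which gives a sense of why completing over eighty joins in two iterations requires joining strategies beyond this naive pattern.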

Citation

Bonner, S., Antoniou, G., Moss, L., Kureshi, I., Corsair, D., Tachmazidis, I., Chin, A., Xu, W., & Wang, F. (2014, August). Using Hadoop To Implement a Semantic Method Of Assessing The Quality Of Research Medical Datasets. Presented at The 2014 International Conference on Big Data Science and Computing - BigDataScience '14., Beijing, China

Presentation Conference Type: Conference Paper (published)
Conference Name: The 2014 International Conference on Big Data Science and Computing - BigDataScience '14.
Start Date: Aug 4, 2014
End Date: Aug 7, 2014
Publication Date: Aug 7, 2014
Deposit Date: May 15, 2015
Publicly Available Date: Mar 3, 2016
Publisher: Association for Computing Machinery (ACM)
Series Title: ACM international conference proceedings series
Book Title: Proceedings of the 3rd ASE International Conference on Big Data Science and Computing : 2014, Beijing, China : BigDataScience '14.
DOI: https://doi.org/10.1145/2640087.2644163
Public URL: https://durham-repository.worktribe.com/output/1152986

Files

Accepted Conference Proceeding (576 Kb)
PDF

Copyright Statement
© 2014 ACM. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 3rd ASE International Conference on Big Data Science and Computing : 2014, Beijing, China : BigDataScience '14. New York, USA: Association for Computing Machinery (ACM), Article No. 7, http://doi.acm.org/10.1145/2640087.2644163




