Robust and Efficient Large-Large Table Outer Joins on Distributed Infrastructures
Cheng, L.; Kotoulas, S.; Ward, T.; Theodoropoulos, G.
Authors
L. Cheng
S. Kotoulas
T. Ward
G. Theodoropoulos
Contributors
Fernando Silva (Editor)
Inês Dutra (Editor)
Vítor Santos Costa (Editor)
Abstract
Outer joins are ubiquitous in many workloads but are sensitive to load-balancing problems. Current approaches mitigate such problems caused by data skew by using (partial) replication. However, contemporary replication-based approaches (1) introduce overhead, since they usually result in redundant data movement, (2) are sensitive to parameter tuning and value of data skew and (3) typically require that one side is small. In this paper, we propose a novel parallel algorithm, Redistribution and Efficient Query with Counters (REQC), aimed at robustness in terms of size of join sides, variation in skew and parameter tuning. Experimental results demonstrate that our algorithm is faster, more robust and less demanding in terms of network bandwidth, compared to the state-of-the-art.
Citation
Cheng, L., Kotoulas, S., Ward, T., & Theodoropoulos, G. (2014). Robust and Efficient Large-Large Table Outer Joins on Distributed Infrastructures. In F. Silva, I. Dutra, & V. S. Costa (Eds.), Euro-Par 2014 Parallel Processing: 20th International Conference, Porto, Portugal, August 25-29, 2014; proceedings (pp. 258-269). Springer Verlag. https://doi.org/10.1007/978-3-319-09873-9_22
Publication Date | Aug 1, 2014 |
---|---|
Deposit Date | Apr 21, 2016 |
Publicly Available Date | May 4, 2016 |
Publisher | Springer Verlag |
Pages | 258-269 |
Series Title | Lecture notes in computer science |
Book Title | Euro-Par 2014 Parallel Processing : 20th International Conference, Porto, Portugal, August 25-29, 2014 ; proceedings. |
ISBN | 9783319098722 |
DOI | https://doi.org/10.1007/978-3-319-09873-9_22 |
Public URL | https://durham-repository.worktribe.com/output/1672281 |
Additional Information | Series: Lecture Notes in Computer Science |
Files
Accepted Book Chapter (PDF, 281 Kb)
Copyright Statement
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-09873-9_22