Robust and Efficient Large-Large Table Outer Joins on Distributed Infrastructures

Cheng, L.; Kotoulas, S.; Ward, T.; Theodoropoulos, G.


Authors

L. Cheng

S. Kotoulas

T. Ward

G. Theodoropoulos



Contributors

Fernando Silva
Editor

Inês Dutra
Editor

Vítor Santos Costa
Editor

Abstract

Outer joins are common in many workloads but are sensitive to load-balancing problems. Current approaches use (partial) replication to mitigate the load imbalance caused by data skew. However, contemporary replication-based approaches (1) introduce overhead, since they usually result in redundant data movement, (2) are sensitive to parameter tuning and to the degree of data skew, and (3) typically require that one side of the join be small. In this paper, we propose a novel parallel algorithm, Redistribution and Efficient Query with Counters (REQC), aimed at robustness with respect to the size of the join sides, variation in skew, and parameter tuning. Experimental results demonstrate that our algorithm is faster, more robust, and less demanding of network bandwidth than state-of-the-art approaches.

Citation

Cheng, L., Kotoulas, S., Ward, T., & Theodoropoulos, G. (2014). Robust and Efficient Large-Large Table Outer Joins on Distributed Infrastructures. In F. Silva, I. Dutra, & V. S. Costa (Eds.), Euro-Par 2014 Parallel Processing: 20th International Conference, Porto, Portugal, August 25-29, 2014; proceedings (pp. 258-269). Springer Verlag. https://doi.org/10.1007/978-3-319-09873-9_22

Publication Date: Aug 1, 2014
Deposit Date: Apr 21, 2016
Publicly Available Date: May 4, 2016
Publisher: Springer Verlag
Pages: 258-269
Series Title: Lecture Notes in Computer Science
Book Title: Euro-Par 2014 Parallel Processing: 20th International Conference, Porto, Portugal, August 25-29, 2014; proceedings.
ISBN: 9783319098722
DOI: https://doi.org/10.1007/978-3-319-09873-9_22
Public URL: https://durham-repository.worktribe.com/output/1672281
Additional Information: Series: Lecture Notes in Computer Science
