Robust and Skew-resistant Parallel Joins in Shared-Nothing Systems
Cheng, L.; Kotoulas, S.; Ward, T.; Theodoropoulos, G.
Authors
L. Cheng
S. Kotoulas
T. Ward
G. Theodoropoulos
Abstract
The performance of joins in parallel database management systems is critical for data-intensive operations such as querying. Since data skew is common in many applications, poorly engineered join operations result in load imbalance and performance bottlenecks. State-of-the-art methods designed to handle this problem offer significant improvements over naive implementations. However, performance could be further improved by removing the dependency on global skew knowledge and broadcasting. In this paper, we propose PRPQ (partial redistribution & partial query), an efficient and robust join algorithm for processing large-scale joins over distributed systems. We present the detailed implementation and a quantitative evaluation of our method. The experimental results demonstrate that the proposed PRPQ algorithm is robust and scalable under a wide range of skew conditions. Specifically, compared to the state-of-the-art PRPD method, we achieve a 16% to 167% performance improvement and 24% to 54% less network communication under different join workloads.
Citation
Cheng, L., Kotoulas, S., Ward, T., & Theodoropoulos, G. (2014, November). Robust and Skew-resistant Parallel Joins in Shared-Nothing Systems. Presented at 23rd ACM International Conference on Information and Knowledge Management - CIKM '14, Shanghai, China
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 23rd ACM International Conference on Information and Knowledge Management - CIKM '14 |
Start Date | Nov 3, 2014 |
End Date | Nov 7, 2014 |
Publication Date | Nov 3, 2014 |
Deposit Date | Apr 21, 2016 |
Publicly Available Date | Apr 28, 2016 |
Pages | 1399-1408 |
Book Title | CIKM'14 : proceedings of the 23rd ACM International Conference on Information and Knowledge Management : November 3-7, 2014, Shanghai, China. |
DOI | https://doi.org/10.1145/2661829.2661888 |
Public URL | https://durham-repository.worktribe.com/output/1150460 |
Files
Accepted Conference Proceeding
(409 Kb)
PDF
Copyright Statement
© 2014 ACM. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Long Cheng, Spyros Kotoulas, Tomas E. Ward, and Georgios Theodoropoulos. 2014. Robust and Skew-resistant Parallel Joins in Shared-Nothing Systems. In Proceedings of the 23rd ACM International Conference on Information and Knowledge Management (CIKM '14). ACM, New York, NY, USA, 1399-1408. DOI=http://dx.doi.org/10.1145/2661829.2661888