Research Repository

All Outputs (5)

Design and evaluation of parallel hashing over large-scale data (2014)
Conference Proceeding
Cheng, L., Kotoulas, S., Ward, T., & Theodoropoulos, G. (2014). Design and evaluation of parallel hashing over large-scale data. In 2014 21st International Conference on High Performance Computing (HiPC 2014) : Velha Goa, India, 17 - 20 December 2014 (1-10). https://doi.org/10.1109/hipc.2014.7116909

High-performance analytical data processing systems often run on servers with large amounts of memory. A common data structure used in such environments is the hash table. This paper focuses on investigating efficient parallel hash algorithms for pro...

Robust and Skew-resistant Parallel Joins in Shared-Nothing Systems (2014)
Conference Proceeding
Cheng, L., Kotoulas, S., Ward, T., & Theodoropoulos, G. (2014). Robust and Skew-resistant Parallel Joins in Shared-Nothing Systems. In CIKM'14 : proceedings of the 23rd ACM International Conference on Information and Knowledge Management : November 3-7, 2014, Shanghai, China (1399-1408). https://doi.org/10.1145/2661829.2661888

The performance of joins in parallel database management systems is critical for data-intensive operations such as querying. Since data skew is common in many applications, poorly engineered join operations result in load imbalance and performance bo...

A two-tier index architecture for fast processing large RDF data over distributed memory (2014)
Conference Proceeding
Cheng, L., Kotoulas, S., Ward, T., & Theodoropoulos, G. (2014). A two-tier index architecture for fast processing large RDF data over distributed memory. In HT'14 : proceedings of the 25th ACM Conference on Hypertext and Social Media : September 1-4, 2014, Santiago, Chile (300-302). https://doi.org/10.1145/2631775.2631789

We propose an efficient method for fast processing of large RDF data over distributed memory. Our approach adopts a two-tier index architecture on each computation node: (1) a lightweight primary index, to keep loading times low, and (2) a dynamic, mul...

Efficiently Handling Skew in Outer Joins on Distributed Systems (2014)
Conference Proceeding
Cheng, L., Kotoulas, S., Ward, T., & Theodoropoulos, G. (2014). Efficiently Handling Skew in Outer Joins on Distributed Systems. In 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2014) : Chicago, Illinois, USA, 26-29 May 2014 ; proceedings (295-304). https://doi.org/10.1109/ccgrid.2014.35

Outer joins are ubiquitous in databases and big data systems. The question of how best to execute outer joins in large parallel systems is particularly challenging, as real-world datasets are characterized by data skew, which leads to performance issues. A...

Automated Dynamic Resource Provisioning and Monitoring in Virtualized Large-Scale Datacenter (2014)
Conference Proceeding
Abar, S., Lemarinier, P., Theodoropoulos, G., & O'Hare, G. (2014). Automated Dynamic Resource Provisioning and Monitoring in Virtualized Large-Scale Datacenter. In 2014 IEEE 28th International Conference on Advanced Information Networking and Applications (AINA) : 13-16 May 2014, University of Victoria, Victoria, Canada (961-970). https://doi.org/10.1109/aina.2014.117

Infrastructure as a Service (IaaS) is a pay-as-you-go cloud provisioning model that outsources, on demand, physical servers, guest virtual machine (VM) instances, storage resources, and networking connections. This article reports the design an...