Navigating the Maze of Graph Analytics Frameworks using Massive Graph Datasets

Satish, Nadathur; Sundaram, Narayanan; Patwary, Md Mostofa Ali; Seo, Jiwon; Park, Jongsoo; Hassaan, M. Amber; Sengupta, Shubho; Yin, Zhaoming; Dubey, Pradeep

doi:10.1145/2588555.2610518

Issue Date: Jun-2014

Publisher: ACM

Citation: SIGMOD - International Conference on Management of Data, pp.979 - 990

Indexed: SCIE, SCOPUS

Journal Title: SIGMOD - International Conference on Management of Data

Start Page: 979

End Page: 990

URI: https://scholarworks.bwise.kr/hanyang/handle/2021.sw.hanyang/159708

Abstract: Graph algorithms are becoming increasingly important for analyzing large datasets in many fields. Real-world graph data follows a pattern of sparsity that is not uniform but highly skewed towards a few items. Implementing graph traversal, statistics and machine learning algorithms on such data in a scalable manner is quite challenging. As a result, several graph analytics frameworks (GraphLab, CombBLAS, Giraph, SociaLite and Galois, among others) have been developed, each offering a solution with a different programming model and targeted at different users. Unfortunately, the "Ninja performance gap" between optimized code and most of these frameworks is very large (2-30X for most frameworks and up to 560X for Giraph) for common graph algorithms, and moreover varies widely across algorithms. This makes the end-users' choice of graph framework dependent not only on ease of use but also on performance. In this work, we offer a quantitative roadmap for improving the performance of all these frameworks and bridging the "ninja gap". We first present hand-optimized baselines that achieve performance close to hardware limits and higher than any published performance figure for these graph algorithms. We characterize the performance of both this native implementation and popular graph frameworks on a variety of algorithms. This study helps end-users distinguish bottlenecks arising from the algorithms themselves from those arising from programming model abstractions and from the framework implementations. Further, by analyzing the system-level behavior of these frameworks, we identify bottlenecks that are agnostic to specific algorithms. We recommend changes to alleviate these bottlenecks (and implement some of them) and reduce the performance gap with respect to native code. These changes will enable end-users to choose frameworks based mostly on ease of use.

Appears in Collections: Seoul College of Engineering > Seoul Division of Computer Software > 1. Journal Articles

Related Researcher

Seo, Ji won: COLLEGE OF ENGINEERING (SCHOOL OF COMPUTER SCIENCE)
