
GraphX: Unifying Data-Parallel and Graph-Parallel Analytics [article]

Reynold S. Xin, Daniel Crankshaw, Ankur Dave, Joseph E. Gonzalez, Michael J. Franklin, Ion Stoica
2014 arXiv   pre-print
To address these challenges we introduce GraphX, a distributed graph computation framework that unifies graph-parallel and data-parallel computation.  ...  As a consequence, existing graph analytics pipelines compose graph-parallel and data-parallel systems using external storage systems, leading to extensive data movement and complicated programming model  ...  By unifying graphs and collections as first class composable objects, the GraphX data model is capable of spanning the entire graph analytics pipeline.  ... 
arXiv:1402.2394v1 fatcat:xxx2uvx6arbgdnjqiw7igqztnm

From graphs to tables: the design of scalable systems for graph analytics

Joseph E. Gonzalez
2014 Proceedings of the 23rd International Conference on World Wide Web - WWW '14 Companion  
To fill the need for a holistic approach to graph-analytics we introduce GraphX, which unifies graph-parallel and data-parallel computation under a single API and system.  ...  GraphX recasts advances in graph-processing in the context of relational algebra and distributed join optimization enabling more general data-parallel systems to process graphs efficiently.  ...  Motivated by these challenges we introduce GraphX and describe how it unifies graph-parallel and data-parallel computation.  ... 
doi:10.1145/2567948.2580059 dblp:conf/www/Gonzalez14 fatcat:bkn2izmri5gb3fiugiskpvrnfa

MatrixMap: Programming abstraction and implementation of matrix computation for big data analytics

Jiannong Cao, Guanqing Liang, Yaguang Huangfu
2017 Big Data & Information Analytics  
And we introduce Key-CSR data format and frequently used graph operations for graph algorithms.  ...  In this paper, we present MatrixMap, a unified and efficient data-parallel programming framework for general matrix computations.  ...  GraphX is Apache Spark's API for graphs and graph-parallel computation. Because GraphX is faster than Graphlab, we compare MatrixMap with GraphX for graph algorithms.  ... 
doi:10.3934/bdia.2016015 fatcat:fzrgcwpolfhczhuce55pknplqe

Big data analytics on Apache Spark

Salman Salloum, Ruslan Dautov, Xiaojun Chen, Patrick Xiaogang Peng, Joshua Zhexue Huang
2016 International Journal of Data Science and Analytics  
Apache Spark has emerged as the de facto framework for big data analytics with its advanced in-memory programming model and upper-level libraries for scalable machine learning, graph analysis, streaming  ...  In addition, we highlight some research and development directions on Apache Spark for big data analytics.  ...  GraphX: key features GraphX combines the advantages of both previous graph-parallel systems and current Spark's data-parallel framework to provide a library for large-scale graph analytics [83].  ... 
doi:10.1007/s41060-016-0027-9 dblp:journals/ijdsa/SalloumD0PH16 fatcat:gtzw3aqupnhxvcjbefovrnfhne

GraphFrames: an integrated API for mixing graph and relational queries

Ankur Dave, Alekh Jindal, Li Erran Li, Reynold Xin, Joseph Gonzalez, Matei Zaharia
2016 Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems - GRADES '16  
GraphFrames generalize the ideas in previous graph-on-RDBMS systems, such as GraphX and Vertexica, by letting the system materialize multiple views of the graph (not just the specific triplet views in  ...  We implement GraphFrames over Spark SQL, enabling parallel execution on Spark and integration with custom code.  ...  GraphFrames expose a concise language-integrated API that unifies graph analytics and relational queries.  ... 
doi:10.1145/2960414.2960416 dblp:conf/grades/DaveJLXGZ16 fatcat:i3jegtzpn5hwvc4iofmndfwvku

PowerLyra: differentiated graph computation and partitioning on skewed graphs

Rong Chen, Jiaxin Shi, Yanzhe Chen, Haibo Chen
2015 Proceedings of the Tenth European Conference on Computer Systems - EuroSys '15  
A detailed evaluation on two clusters using graph-analytics and MLDM (machine learning and data mining) applications show that PowerLyra outperforms PowerGraph by up to 5.53X (from 1.24X) and 3.26X (from  ...  1.49X) for real-world and synthetic graphs accordingly, and is much faster than other systems like GraphX and Giraph, yet with much less memory consumption.  ...  Acknowledgments We thank our shepherd Amitabha Roy and the anonymous reviewers for their insightful comments, Kaiyuan Zhang for evaluating graph-parallel systems on single machine platform and Di Xiao  ... 
doi:10.1145/2741948.2741970 dblp:conf/eurosys/ChenSCC15 fatcat:3s6anhlbjrhflhhe5qn6rtixsm

Social Networks for Threat Perception and Analysis

Pragati Dnyaneshwar Bharsakle
2021 International Journal for Research in Applied Science and Engineering Technology  
In this paper we present an alternative data analytic solution by using pattern matching solution.  ...  In the current era of massive knowledge, high volumes of valuable knowledge is simply collected and generated. Social networks square measure samples of generating sources of those huge knowledge.  ...  Franklin, Ion Stoica [11] have presented GraphX, an interactive graph computation engine that combines the advantages of graph-parallel systems and data-parallel systems.  ... 
doi:10.22214/ijraset.2021.35911 fatcat:ibklet2tebhv5ngo4g73wbvw6q

Time-evolving graph processing at scale

Anand Padmanabha Iyer, Li Erran Li, Tathagata Das, Ion Stoica
2016 Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems - GRADES '16  
GraphTau also unifies data streaming and graph streaming processing. Our preliminary evaluations on two representative datasets show promising results.  ...  Second, analytics tasks typically often involve  ...  Unifying Data & Graph Streams GraphTau unifies data and graph stream processing.  ... 
doi:10.1145/2960414.2960419 dblp:conf/grades/IyerLDS16 fatcat:tks4gkhimzhtzocriqu3vxwrle

UDA-GIST: an in-database framework to unify data-parallel and state-parallel analytics

Kun Li, Daisy Zhe Wang, Alin Dobra, Christopher Dudley
2015 Proceedings of the VLDB Endowment  
We argue that the combination of UDA and GIST (UDA-GIST) unifies data-parallel and state-parallel processing in a single system, thus significantly extending the analytical capabilities of DBMSes.  ...  Most major DBMSes offer User-Defined Aggregate (UDA), a data-driven operator, to implement many of the analytical techniques in parallel.  ...  To support such advanced data analytics applications, the UDA-GIST framework developed in this work unifies data-parallel and state-parallel processing by extending existing database frameworks.  ... 
doi:10.14778/2735479.2735488 fatcat:bw7rdfn6izcjrjwyj4sllufewy

Big Data Analytics with Datalog Queries on Spark

Alexander Shkapsky, Mohan Yang, Matteo Interlandi, Hsuan Chiu, Tyson Condie, Carlo Zaniolo
2016 Proceedings of the 2016 International Conference on Management of Data - SIGMOD '16  
Among these platforms, Apache Spark is growing in popularity for machine learning and graph analytics.  ...  Developing efficient complex analytics in Spark requires deep understanding of both the algorithm at hand and the Spark API or subsystem APIs (e.g., Spark SQL, GraphX).  ...  IBM Research, Symantec and Intel.  ... 
doi:10.1145/2882903.2915229 pmid:28626296 pmcid:PMC5470845 dblp:conf/sigmod/ShkapskyYICCZ16 fatcat:fw2fje66wfaipfvhax5bi4mim4

Towards a Distributed Infrastructure for Evolving Graph Analytics

Vera Zaychik Moffitt, Julia Stoyanovich
2016 Proceedings of the 25th International Conference Companion on World Wide Web - WWW '16 Companion  
Portal streamlines exploratory analysis of evolving graphs, making it efficient and usable, and providing critical tools to computational and data scientists.  ...  We present different physical representations of TGraphs and show results of a preliminary experimental evaluation of these physical representations for an important class of evolving graph analytics.  ...  Despite much recent interest and activity on the topic, and despite increased variety and availability of evolving graph data, systematic support for scalable querying and analytics over evolving graphs  ... 
doi:10.1145/2872518.2889290 dblp:conf/www/MoffittS16 fatcat:cdqqs3dzpfgqvewfgqsfmeogiu

Apache Spark

Matei Zaharia, Michael J. Franklin, Ali Ghodsi, Joseph Gonzalez, Scott Shenker, Ion Stoica, Reynold S. Xin, Patrick Wendell, Tathagata Das, Michael Armbrust, Ankur Dave, Xiangrui Meng (+2 others)
2016 Communications of the ACM  
range from finance to scientific data processing and combine libraries for SQL, machine learning, and graphs. In six years, Apache Spark has grown to 1,000 contributors and thousands of deployments.  ...  The growth of data volumes in industry and research poses tremendous opportunities, as well as tremendous computational challenges.  ...  GraphX [6] provides a graph computation interface similar to Pregel and GraphLab [10, 11], implementing the same placement optimizations as these systems (such as vertex partitioning schemes) through its  ... 
doi:10.1145/2934664 fatcat:zqffhrnl4rhk5ayrdyv25aiyeq

Graphalytics: a big data benchmark for graph-processing platforms

Mihai Capotă, Tim Hegeman, Alexandru Iosup, Arnau Prat-Pérez, Orri Erling, Peter Boncz
2015 Proceedings of the GRADES'15 on - GRADES'15  
Graphs are increasingly used in industry, governance, and science. This has stimulated the appearance of many and diverse graph-processing platforms.  ...  We have already benchmarked with Graphalytics a variety of popular platforms, such as Giraph, GraphX, and Neo4j.  ...  Consequently, many competing graph-processing platforms, such as Giraph and GraphX, have recently emerged.  ... 
doi:10.1145/2764947.2764954 dblp:conf/sigmod/CapotaHIPEB14 fatcat:gcj3bb7cznc3lj3vqe6vfkn2by

A functional framework based on big data analytics for smart farming

Loubna Rabhi, Noureddine Falih, Lekbir Afraites, Belaid Bouikhalene
2021 Indonesian Journal of Electrical Engineering and Computer Science  
Big data in agriculture is defined as massive volumes of data with a wide variety of sources and types which can be captured using internet of things sensors (soil and crops sensors, drones, and  ...  Big data outputs can be exploited by the future connected agriculture in order to reduce cost and time production, improve yield, develop new products, offer optimization and smart decision-making.  ...  Spark is considered as a more comprehensive unified framework for big data analytics by allowing a rich programming APIs like SQL, machine learning, graph processing and streaming, interactive and batch  ... 
doi:10.11591/ijeecs.v24.i3.pp1772-1779 fatcat:ln5n5zc3ozblvnhafzszqabe7u

Graph BI & Analytics: Current State and Future Challenges [chapter]

Amine Ghrab, Oscar Romero, Salim Jouili, Sabri Skhiri
2018 Lecture Notes in Computer Science  
Then we conclude by discussing future research directions and positioning them within a unified architecture of a graph BI and analytics framework.  ...  This paper presents the current status and open challenges of graph BI and analytics, and motivates the need for new warehousing frameworks aware of the topological nature of graphs.  ...  -Hybrid Systems: These frameworks enable a mixed workload of graph-parallel and data-parallel processing. GraphX [35] is a component of Apache Spark [36] developed for graph processing.  ... 
doi:10.1007/978-3-319-98539-8_1 fatcat:56fgfclobveifcbkwvbyejt5pi