24,082 Hits in 6.3 sec

Graph Convolutional Neural Networks for Web-Scale Recommender Systems

Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L. Hamilton, Jure Leskovec
2018 Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining - KDD '18  
We deploy PinSage at Pinterest and train it on 7.5 billion examples on a graph with 3 billion nodes representing pins and boards, and 18 billion edges.  ...  both graph structure as well as node feature information.  ...  Following this, we give an overview of several techniques we developed that lead to the computation efficiency and fast convergence rate of PinSage, allowing us to train on billion node graphs and billions  ... 
doi:10.1145/3219819.3219890 dblp:conf/kdd/YingHCEHL18 fatcat:xp5aezpyjbcvfdjjmke3ivjgm4

Thematic issue on data management for graphs

Sihem Amer-Yahia, Lei Chen, Renée J. Miller
2019 The VLDB journal  
Graphs have always been an important fundamental data structure in computer science.  ...  , it is not clear what statistics should be maintained on the data to help improve query processing).  ...  In structural graph clustering, vertices of a graph are grouped into the same cluster if they are similar enough (where similarity is based on the adjacent vertices).  ... 
doi:10.1007/s00778-019-00543-2 fatcat:seebkjizqbb6nc66rp5pdg63bm

MMap: Fast billion-scale graph computation on a PC via memory mapping

Zhiyuan Lin, Minsuk Kahng, Kaeser Md. Sabrin, Duen Horng Polo Chau, Ho Lee, U Kang
2014 2014 IEEE International Conference on Big Data (Big Data)  
Graph computation approaches such as GraphChi and TurboGraph recently demonstrated that a single PC can perform efficient computation on billion-node graphs.  ...  graph algorithms for billion-scale graphs with little code, thanks to memory mapping; (3) extensive experiments on real graphs, including the 6.6 billion edge YahooWeb graph, and show that this new approach  ...  GraphChi [14] is one of the first works that demonstrated how graph computation can be performed on massive graphs with billions of nodes and edges on a commodity Mac mini computer, with the speed matching  ... 
doi:10.1109/bigdata.2014.7004226 pmid:25866846 pmcid:PMC4389765 dblp:conf/bigdataconf/LinKSCLK14 fatcat:b3apfyrlc5cpxfrq2odr5o6r4q

Benchmarking parallel eigen decomposition for residuals analysis of very large graphs

Edward M. Rutledge, Benjamin A. Miller, Michelle S. Beard
2012 2012 IEEE Conference on High Performance Extreme Computing  
The computational driver for one important class of graph analysis algorithms is the computation of leading eigenvectors of matrix representations of a graph.  ...  software, for graphs with 1 million to 1 billion vertices, and 8 million to 8 billion edges.  ...  Thus, our Matlab implementation is computationally similar to our SLEPc implementation.  ... 
doi:10.1109/hpec.2012.6408677 dblp:conf/hpec/RutledgeMB12 fatcat:3f7cv7twv5d3phf67viww6qwtm

Towards Training Billion Parameter Graph Neural Networks for Atomic Simulations [article]

Anuroop Sriram, Abhishek Das, Brandon M. Wood, Siddharth Goyal, C. Lawrence Zitnick
2022 arXiv   pre-print
On the large-scale Open Catalyst 2020 (OC20) dataset, these graph-parallelized models lead to relative improvements of 1) 15% on the force MAE metric for the S2EF task and 2) 21% on the AFbT metric for  ...  In this paper, we introduce Graph Parallelism, a method to distribute input graphs across multiple GPUs, enabling us to train very large GNNs with hundreds of millions or billions of parameters.  ...  An alternate line of work, that is more similar to ours, keeps the entire graph in memory by efficiently partitioning the graph among multiple nodes (Jia et al., 2020; Ma et al., 2019; Tripathy et al.  ... 
arXiv:2203.09697v1 fatcat:p3c7royaund2tj7ctz24tby45a

Querying Web-Scale Information Networks Through Bounding Matching Scores

Jiahui Jin, Samamon Khemmarat, Lixin Gao, Junzhou Luo
2015 Proceedings of the 24th International Conference on World Wide Web - WWW '15  
The bounding technique can be implemented in a distributed environment, allowing our approach to efficiently answer the queries on web-scale information networks.  ...  Web-scale information networks containing billions of entities are common nowadays. Querying these networks can be modeled as a subgraph matching problem.  ...  In order to scale the algorithm to billions of nodes, we propose an index-free algorithm that computes matching scores online.  ... 
doi:10.1145/2736277.2741131 dblp:conf/www/JinKGL15 fatcat:w5ksmrgffzhwrecgaeg3b2hmti

TGL: A General Framework for Temporal GNN Training on Billion-Scale Graphs [article]

Hongkuan Zhou, Da Zheng, Israt Nisa, Vasileios Ioannidis, Xiang Song, George Karypis
2022 arXiv   pre-print
Temporal Graph Neural Networks capture temporal information as well as structural and contextual information in the generated dynamic node embeddings.  ...  To address the limitations of current TGNNs only being evaluated on small-scale datasets, we introduce two large-scale real-world datasets with 0.2 and 1.3 billion temporal edges.  ...  TGL: A General Framework for Temporal GNN Training on Billion-Scale Graphs.  ... 
arXiv:2203.14883v2 fatcat:t2xd2nmrezejdamcdspt23q2bi

Large Scale Graph Matching(LSGM): Techniques, Tools, Applications and Challenges

Azka Mahmood, Hina Farooq, Javed Ferzund
2017 International Journal of Advanced Computer Science and Applications  
rather than focusing on structural details of graphs.  ...  Large Scale Graph Matching (LSGM) is one of the fundamental problems in Graph theory and it has applications in many areas such as Computer Vision, Machine Learning, Pattern Recognition and Big Data Analytics  ...  Other examples are the Twitter graph which is one of the largest graphs that have 1.5 Billion edges and graph for Yahoo (The Altavista graph) contains 6.6 Billion edges [6], [7].  ... 
doi:10.14569/ijacsa.2017.080465 fatcat:i4wunwwhu5d43pfj7roloojwmm

Spectral Analysis for Billion-Scale Graphs: Discoveries and Implementation [chapter]

U Kang, Brendan Meeder, Christos Faloutsos
2011 Lecture Notes in Computer Science  
., convergence) for large sparse matrices, let alone for billion-scale ones.  ...  of the largest publicly available graphs (120Gb, 1.4 billion nodes, 6.6 billion edges). 1 YahooWeb, LinkedIn: released under NDA.  ...  Introduction Graphs with billions of edges, or billion-scale graphs, are becoming common; Facebook boasts about 0.5 billion active users, who-calls-whom networks can reach similar sizes in large countries  ... 
doi:10.1007/978-3-642-20847-8_2 fatcat:owpg3prm6ramdfjll74dlhmzxy

HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory

Jie Ren, Minjia Zhang, Dong Li
2020 Neural Information Processing Systems  
In this work, we present a novel graph-based similarity search algorithm called HM-ANN, which takes both memory and data heterogeneity into consideration and enables billion-scale similarity search on  ...  a single node without using compression.  ...  Exhaustive search is infeasible at billion-point scales, because it is extremely computational demanding.  ... 
dblp:conf/nips/0015ZL20 fatcat:fg42ojrwmvhuxpsrwkjg2g54rq

Building Graphs at a Large Scale: Union Find Shuffle [article]

Saigopal Thota, Mridul Jain, Nishad Kamat, Saikiran Malikireddy, Pruthvi Raj Eranti, Albin Kuruvilla
2021 arXiv   pre-print
Large scale graph processing using distributed computing frameworks is becoming pervasive and efficient in the industry.  ...  nodes and 60 Billions linkages (and growing).  ...  ., efficient data structures, processing algorithms, and storage paradigms become crucial for managing data.  ... 
arXiv:2012.05430v2 fatcat:bsrbxvhzg5akjjhzvt26uo4jle

A Survey on Graph Database Management Techniques for Huge Unstructured Data

Patil N. S., Kiran P, Kiran N. P., Naresh Patel K. M.
2018 International Journal of Electrical and Computer Engineering (IJECE)  
Use of graph database is expected to be beneficial in business, and social networking sites that generate huge unstructured data as that Big Data requires proper and efficient computational techniques  ...  This paper reviews the existing graph data computational techniques and the research work, to offer the future research line up in graph database management.  ...  It is a large graph processing machine. It provides fast graph exploration and parallel computing for larger datasets. It also provides high throughput on large graphs which have a billion nodes.  ... 
doi:10.11591/ijece.v8i2.pp1140-1149 fatcat:gves26azifhrtc5mzkrajlxk34

ZOOMER: Boosting Retrieval on Web-scale Graphs by Regions of Interest [article]

Yuezihan Jiang, Yu Cheng, Hanyu Zhao, Wentao Zhang, Xupeng Miao, Yu He, Liang Wang, Zhi Yang, Bin Cui
2022 arXiv   pre-print
Deployed as a large-scale distributed system, ZOOMER supports graphs with billions of nodes for training and thousands of requests per second for serving.  ...  ZOOMER is designed for tackling two challenges presented by the massive user data at Taobao: low training/serving efficiency due to the huge scale of the graphs, and low recommendation quality due to the  ...  Therefore, ZOOMER is able to support web-scale Taobao graphs with billion-scale nodes and tens of billion-scale edges at an acceptable cost.  ... 
arXiv:2203.12596v1 fatcat:azfdiue4evh3lmocvvrjq45g2i

HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

Ariful Azad, Georgios A Pavlopoulos, Christos A Ouzounis, Nikos C Kyrpides, Aydin Buluç
2018 Nucleic Acids Research  
We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ∼70 million nodes with ∼68 billion edges in ∼2.4 h.  ...  Characteristic examples are gene expression networks or protein-protein interaction networks, which hold information about functional affinities or structural similarities.  ...  These graphs represent structural similarities or functional affinities, e.g. sequence homology or expression, respectively (2).  ... 
doi:10.1093/nar/gkx1313 pmid:29315405 pmcid:PMC5888241 fatcat:fwlt3x7tfjes3ikfbtvp6gzv3q

DSSLP: A Distributed Framework for Semi-supervised Link Prediction [article]

Dalong Zhang, Xianzheng Song, Ziqi Liu, Zhiqiang Zhang, Xin Huang, Lin Wang, Jun Zhou
2020 arXiv   pre-print
However, it's a great challenge to train and deploy a link prediction model on industrial-scale graphs with billions of nodes and edges.  ...  Instead of training model on the whole graph, DSSLP is proposed to train on the k-hops neighborhood of nodes in a mini-batch setting, which helps reduce the scale of the input graph and distribute the  ...  One kind of popular approaches is to evaluate node similarity based on structural information of the graph.  ... 
arXiv:2002.12056v2 fatcat:gw7ghj5ycjhn7ahscjuncb6bxq
Showing results 1–15 out of 24,082 results