
Multi-Level Local SGD for Heterogeneous Hierarchical Networks [article]

Timothy Castiglia, Anirban Das, Stacy Patterson
2022 arXiv   pre-print
We propose Multi-Level Local SGD, a distributed gradient method for learning a smooth, non-convex objective in a heterogeneous multi-level network.  ...  We first provide a unified mathematical framework that describes the Multi-Level Local SGD algorithm.  ...  This code simulates a multi-level network with heterogeneous workers, and trains a model using MLL-SGD.  ... 
arXiv:2007.13819v3 fatcat:fnp7rq4nprdsxkazae6s7kd3sq
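
To make the hit above concrete: the snippet describes workers running local SGD between increasingly infrequent rounds of averaging up a multi-level hierarchy. Below is a minimal toy sketch of that schedule on a quadratic objective; the two-level topology, step counts, and all names are illustrative assumptions, not the paper's algorithm or notation.

```python
import numpy as np

# Toy sketch of multi-level local SGD on a quadratic objective.
# The topology and hyperparameters are illustrative assumptions.
rng = np.random.default_rng(0)
dim, n_groups, workers_per_group = 5, 2, 3
targets = rng.normal(size=(n_groups, workers_per_group, dim))  # heterogeneous data

models = np.zeros((n_groups, workers_per_group, dim))  # one model per worker
lr, local_steps, group_rounds, global_rounds = 0.1, 5, 4, 10

for _ in range(global_rounds):
    for _ in range(group_rounds):
        # Level 0: each worker runs local SGD on its own objective.
        for g in range(n_groups):
            for w in range(workers_per_group):
                for _ in range(local_steps):
                    grad = models[g, w] - targets[g, w]  # grad of 0.5*||x - t||^2
                    models[g, w] -= lr * grad
        # Level 1: each group's server averages its workers' models.
        for g in range(n_groups):
            models[g, :] = models[g].mean(axis=0)
    # Level 2: the top-level server averages across groups (least frequent).
    models[:, :] = models.mean(axis=(0, 1))

print("final model:", models[0, 0])
```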

Edge-assisted Democratized Learning Towards Federated Analytics [article]

Shashi Raj Pandey, Minh N.H. Nguyen, Tri Nguyen Dang, Nguyen H. Tran, Kyi Thar, Zhu Han, Choong Seon Hong
2021 arXiv   pre-print
Moreover, a hierarchical FL structure with distributed computing platforms exhibits inconsistent model performance across aggregation levels.  ...  handle new/unseen data, for real-world applications.  ...  System Model: We consider a multi-connectivity scenario [35] for multi-path wireless network links and adopt a 2-layer heterogeneous network (HetNet) topology with a single macrocell eNodeB (MBS) at layer  ... 
arXiv:2012.00425v2 fatcat:r5jsfcsy5ve33cnp7yido5xbxy

Demystifying Why Local Aggregation Helps: Convergence Analysis of Hierarchical SGD [article]

Jiayi Wang, Shiqiang Wang, Rong-Rong Chen, Mingyue Ji
2022 arXiv   pre-print
Hierarchical SGD (H-SGD) has emerged as a new distributed SGD algorithm for multi-level communication networks.  ...  In H-SGD, before each global aggregation, workers send their updated local models to local servers for aggregation.  ... 
arXiv:2010.12998v3 fatcat:c6adw7i5nrdg3gef2wghshb2ve
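
The core operation the snippet names, aggregation at local servers before each global aggregation, reduces to (weighted) model averaging. A minimal sketch with PyTorch state_dicts follows; the helper name and uniform default weights are my assumptions, not the paper's formulation.

```python
import torch

def average_state_dicts(state_dicts, weights=None):
    """Weighted average of model state_dicts (one per worker)."""
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n
    avg = {}
    for key in state_dicts[0]:
        avg[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return avg

# Illustrative schedule: every I local steps, each local server averages its
# own workers' models; every G local rounds, the global server averages the
# local servers' models with the same helper.
workers = [{"w": torch.tensor([1.0, 2.0])}, {"w": torch.tensor([3.0, 4.0])}]
print(average_state_dicts(workers))  # {'w': tensor([2., 3.])}
```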

Dynamic Gradient Aggregation for Federated Domain Adaptation [article]

Dimitrios Dimitriadis, Kenichi Kumatani, Robert Gmyr, Yashesh Gaur, Sefik Emre Eskimez
2021 arXiv   pre-print
In this paper, a new learning algorithm for Federated Learning (FL) is introduced.  ...  Dynamic Gradient Aggregation: Training with heterogeneous local data poses additional challenges, especially for the aggregation step (Equation 4).  ...  Different strategies for model aggregation were investigated, such as model averaging (the "FedAvg" row) or hierarchical optimization using optimizers such as Adam, SGD, etc.  ... 
arXiv:2106.07578v1 fatcat:xgbkwqfxsvas5epyzvrm2q7jrq
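
The snippet contrasts uniform FedAvg-style averaging with data-driven aggregation weights, but does not give the weighting rule (its Equation 4). The sketch below uses a softmax over negative client losses purely as an illustrative stand-in for such a rule.

```python
import numpy as np

def aggregation_weights(client_losses, temperature=1.0):
    """Down-weight clients whose updates look unreliable (high loss).
    Illustrative heuristic, not the paper's Equation 4."""
    scores = -np.asarray(client_losses) / temperature
    exp = np.exp(scores - scores.max())      # numerically stable softmax
    return exp / exp.sum()

def aggregate(client_updates, weights):
    """Weighted average of client gradient/model updates."""
    return sum(w * u for w, u in zip(weights, client_updates))

updates = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([5.0, 5.0])]
w = aggregation_weights([0.3, 0.4, 2.0])     # noisy third client gets low weight
print(w, aggregate(updates, w))
```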

Federated Transfer Learning with Dynamic Gradient Aggregation [article]

Dimitrios Dimitriadis, Kenichi Kumatani, Robert Gmyr, Yashesh Gaur, Sefik Emre Eskimez
2020 arXiv   pre-print
The hierarchical optimization offers additional flexibility in the training pipeline beyond the enhanced convergence speed.  ...  On top of the hierarchical optimization, a dynamic gradient aggregation algorithm is proposed, based on data-driven weight inference.  ... 
arXiv:2008.02452v1 fatcat:k6t56opr55hftplhiwp7kwigtm
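
The "hierarchical optimization" mentioned here pairs client-side training with a server-side optimizer, in the spirit of FedOpt-style methods: the averaged client update is treated as a pseudo-gradient for Adam or SGD at the server. A hedged PyTorch sketch, with all variable names assumed:

```python
import torch

# Sketch: the server treats the negative averaged client delta as a
# pseudo-gradient for its own optimizer (Adam here). Names are illustrative.
server_model = torch.nn.Linear(4, 2)
server_opt = torch.optim.Adam(server_model.parameters(), lr=1e-2)

def server_step(avg_client_delta):
    """avg_client_delta: dict of parameter name -> averaged (client - server)."""
    server_opt.zero_grad()
    for name, p in server_model.named_parameters():
        p.grad = -avg_client_delta[name]   # descend toward the client average
    server_opt.step()

# One round with a fake delta of ones, just to show the call pattern:
delta = {n: torch.ones_like(p) for n, p in server_model.named_parameters()}
server_step(delta)
```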

Link Prediction Algorithms for Social Networks Based on Machine Learning and HARP

Hao Shao, Lunwen Wang, Yufan Ji
2019 IEEE Access  
In Node2Vec, the global structure of the network is neglected, and the stochastic gradient descent (SGD) method tends to fall into local optima.  ...  Based on this algorithm, an improved link prediction algorithm combining machine learning and hierarchical representation learning for networks (HARP) is proposed.  ...  Ozcan proposed a new multi-variable algorithm for link prediction in evolving heterogeneous networks based on nonlinear autoregressive neural networks with exogenous inputs, with experiments on various data  ... 
doi:10.1109/access.2019.2938202 fatcat:rav4cqchifbzpm4mois5bgduke
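
For context, the generic embedding-based link prediction recipe this line of work builds on: node embeddings (from Node2Vec, HARP, or similar) are combined per edge and fed to a classifier. The Hadamard combiner, toy graph, and sklearn classifier below are illustrative choices, not the paper's pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_nodes, dim = 100, 16
emb = rng.normal(size=(n_nodes, dim))          # stand-in for learned embeddings

def edge_features(u, v):
    return emb[u] * emb[v]                      # Hadamard product per edge

pos = rng.integers(0, n_nodes, size=(200, 2))   # toy "existing" edges
neg = rng.integers(0, n_nodes, size=(200, 2))   # toy "non-edges"
X = np.vstack([edge_features(u, v) for u, v in np.vstack([pos, neg])])
y = np.array([1] * len(pos) + [0] * len(neg))

clf = LogisticRegression(max_iter=1000).fit(X, y)
print("train accuracy:", clf.score(X, y))
```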

Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis [article]

Tal Ben-Nun, Torsten Hoefler
2018 arXiv   pre-print
Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications.  ...  Accelerating their training is a major challenge, and techniques range from distributed algorithms to low-level circuit design.  ...  Additional GA architecture search methods include the use of multi-level hierarchical representations of DNNs [160] (Fig. 24b), which implement an asynchronous distributed tournament selection (centralized  ... 
arXiv:1802.09941v2 fatcat:ne2wiplln5eavjvjwf5to7nwsu

Grapy-ML: Graph Pyramid Mutual Learning for Cross-Dataset Human Parsing

Haoyu He, Jing Zhang, Qiming Zhang, Dacheng Tao
2020 Proceedings of the AAAI Conference on Artificial Intelligence  
Specifically, the network weights of the first two levels are shared to exchange the learned coarse-granularity information across different datasets.  ...  Then, it adopts a top-down mechanism to progressively refine the hierarchical features through all the levels. GPM also enables efficient mutual learning.  ...  Thus the finest-level network in our model can focus on the subtle differences between similar items, and the whole parsing task is not as hard as in single-pass methods without hierarchical multi-granularity  ... 
doi:10.1609/aaai.v34i07.6728 fatcat:juvtu6haingcxgoaabsydtbfaa
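
The weight-sharing described in the snippet (coarse levels shared across datasets, finest level dataset-specific) can be sketched as below. Layer shapes, the conv stack, and class counts are placeholders, not Grapy-ML's actual architecture.

```python
import torch
import torch.nn as nn

class SharedPyramid(nn.Module):
    """Two shared coarse levels plus one dataset-specific fine-level head."""
    def __init__(self, num_classes_per_dataset):
        super().__init__()
        self.level1 = nn.Conv2d(3, 32, 3, padding=1)    # shared coarse level
        self.level2 = nn.Conv2d(32, 64, 3, padding=1)   # shared coarse level
        self.heads = nn.ModuleList(                     # dataset-specific fine level
            [nn.Conv2d(64, c, 1) for c in num_classes_per_dataset]
        )

    def forward(self, x, dataset_id):
        h = torch.relu(self.level1(x))
        h = torch.relu(self.level2(h))
        return self.heads[dataset_id](h)

model = SharedPyramid([20, 18, 7])                      # e.g., three parsing datasets
out = model(torch.randn(1, 3, 64, 64), dataset_id=1)
print(out.shape)                                        # torch.Size([1, 18, 64, 64])
```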

RLlib: Abstractions for Distributed Reinforcement Learning [article]

Eric Liang, Richard Liaw, Philipp Moritz, Robert Nishihara, Roy Fox, Ken Goldberg, Joseph E. Gonzalez, Michael I. Jordan, Ion Stoica
2018 arXiv   pre-print
We argue for distributing RL components in a composable way by adapting algorithms for top-down hierarchical control, thereby encapsulating parallelism and resource requirements within short-running compute  ...  We demonstrate the benefits of this principle through RLlib: a library that provides scalable software primitives for RL.  ...  This experiment was done for PPO with 64 Evaluator processes. The PPO batch size was 320k, the SGD batch size was 32k, and we used 20 SGD passes per PPO batch.  ... 
arXiv:1712.09381v4 fatcat:ihhwdewi4bfndags5x5c65mfaa
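
The PPO settings quoted in the snippet imply a simple inner loop: a 320k-sample training batch is split into 32k-sample SGD minibatches and reused for 20 epochs, i.e. 200 SGD updates per PPO batch. The loop below shows only that iteration structure (variable names echo RLlib's config keys); the actual PPO update is elided.

```python
import numpy as np

train_batch_size, sgd_minibatch_size, num_sgd_iter = 320_000, 32_000, 20
batch = np.arange(train_batch_size)                # stand-in for collected samples

updates = 0
for _ in range(num_sgd_iter):                      # 20 passes over the batch
    perm = np.random.permutation(train_batch_size)
    for start in range(0, train_batch_size, sgd_minibatch_size):
        minibatch = batch[perm[start:start + sgd_minibatch_size]]
        updates += 1                               # ppo_update(minibatch) goes here

print(updates)  # 20 passes * 10 minibatches = 200 SGD updates per PPO batch
```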

Multi-Modality Cascaded Fusion Technology for Autonomous Driving [article]

Hongwu Kuang, Xiaodong Liu, Jingwei Zhang, Zicheng Fang
2020 arXiv   pre-print
In this paper, we propose a general multi-modality cascaded fusion framework, exploiting the advantages of decision-level and feature-level fusion and utilizing target position, size, velocity, appearance  ...  Multi-modality fusion is key to the stability of autonomous driving systems.  ...  Multi-modality fusion methods can be divided into decision-level [7], feature-level [9], and data-level [24] approaches.  ... 
arXiv:2002.03138v1 fatcat:szse6ak5ffemvmmc6mrwc6ucly

HCNet: Hierarchical Context Network for Semantic Segmentation [article]

Yanwen Chong, Congchong Nie, Yulong Tao, Xiaoshu Chen, Shaoming Pan
2020 arXiv   pre-print
In order to solve the above problem, we propose a hierarchical context network to differentially model homogeneous pixels with strong correlations and heterogeneous pixels with weak correlations.  ...  Through aggregating fine-grained pixel context features and coarse-grained region context features, our proposed network can not only hierarchically model global context information but also harvest multi-granularity  ...  After integrating pixel-level and region-level context, the performance of our hierarchical context network improves to 79.86% as expected.  ... 
arXiv:2010.04962v2 fatcat:4dzvxnircvfv3h3egs4jv2k3fy

Multi-modal learning for predicting the genotype of glioma [article]

Yiran Wei, Xi Chen, Lei Zhu, Lipei Zhang, Carola-Bibiane Schönlieb, Stephen J. Price, Chao Li
2022 arXiv   pre-print
Moreover, to extract tumor-related features from the brain network, we design a hierarchical attention module for the brain network encoder.  ...  Further, we design a bi-level multi-modal contrastive loss to align the multi-modal features and tackle the domain gap at the focal-tumor and global-brain levels.  ...  For multi-modal learning, we adopt the SGD optimizer to optimize the network with a weight decay of 0.0005 and a batch size of 20.  ... 
arXiv:2203.10852v1 fatcat:uxdssxmszjbvxmpffqit4r5wky
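
The optimizer settings quoted in the snippet, written out in PyTorch. Only SGD, weight decay 0.0005, and batch size 20 come from the snippet; the model, learning rate, momentum, and data are placeholders.

```python
import torch

model = torch.nn.Linear(128, 2)                    # placeholder encoder head
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.01,                                       # assumed; not given in snippet
    momentum=0.9,                                  # assumed; common default
    weight_decay=0.0005,                           # from the snippet
)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(100, 128),
                                   torch.randint(0, 2, (100,))),
    batch_size=20,                                 # from the snippet
)
for x, y in loader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
```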

Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing [article]

Haoyu He, Jing Zhang, Qiming Zhang, Dacheng Tao
2019 arXiv   pre-print
Specifically, the network weights of the first two levels are shared to exchange the learned coarse-granularity information across different datasets.  ...  Then, it adopts a top-down mechanism to progressively refine the hierarchical features through all the levels. GPM also enables efficient mutual learning.  ...  Thus the finest-level network in our model can focus on the subtle differences between similar items, and the whole parsing task is not as hard as in single-pass methods without hierarchical multi-granularity  ... 
arXiv:1911.12053v1 fatcat:ugfuo5dp35hr7i5lpkrnrhfc6m

Distributed Hybrid CPU and GPU training for Graph Neural Networks on Billion-Scale Graphs [article]

Da Zheng, Xiang Song, Chengru Yang, Dominique LaSalle, George Karypis
2022 arXiv   pre-print
To ensure data locality and load balancing, DistDGLv2 partitions heterogeneous graphs by using a multi-level partitioning algorithm with min-edge cut and multiple balancing constraints.  ...  To ensure model accuracy, DistDGLv2 follows a synchronous training approach and allows ego-networks forming mini-batches to include non-local vertices.  ...  This is insufficient for a heterogeneous graph. We formulate this load balancing problem as a multi-constraint partitioning problem [14] .  ... 
arXiv:2112.15345v3 fatcat:d6xecdcoxnc6bpn2xsan3cugim
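
The multi-constraint balancing the snippet formulates can be checked as follows: each vertex carries one weight per constraint (e.g., per vertex type in a heterogeneous graph), and a partitioning is scored by its worst per-constraint load imbalance. The random assignment below stands in for a real min-edge-cut partitioner such as METIS.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vertices, n_constraints, n_parts = 1000, 3, 4
vwgt = rng.integers(1, 5, size=(n_vertices, n_constraints))  # per-type weights
part = rng.integers(0, n_parts, size=n_vertices)             # toy assignment

def max_imbalance(part, vwgt, n_parts):
    """Worst ratio of a partition's load to the ideal load, per constraint."""
    loads = np.zeros((n_parts, vwgt.shape[1]))
    np.add.at(loads, part, vwgt)                 # sum vertex weights per partition
    ideal = vwgt.sum(axis=0) / n_parts
    return (loads / ideal).max(axis=0)           # one imbalance per constraint

print(max_imbalance(part, vwgt, n_parts))        # values near 1.0 = balanced
```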

Multi-encoder multi-resolution framework for end-to-end speech recognition [article]

Ruizhi Li, Xiaofei Wang, Sri Harish Mallidi, Takaaki Hori, Shinji Watanabe, Hynek Hermansky
2018 arXiv   pre-print
A hierarchical attention mechanism is then used to combine the encoder-level information.  ...  Two heterogeneous encoders with different architectures, temporal resolutions and separate CTC networks work in parallel to extract complementary acoustic information.  ...  We adapt the Hierarchical Attention Network (HAN) of [16] for information fusion.  ... 
arXiv:1811.04897v1 fatcat:5kpfcdtarbfcrch6dbzoq4lfyu
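
A minimal sketch of encoder-level (hierarchical) attention as the snippet describes it: the frame-level outputs of two heterogeneous encoders are summarized, a learned softmax weights the two streams, and the fused stream is returned. Dimensions are illustrative, equal time lengths are assumed, and this is not the exact HAN formulation of [16].

```python
import torch
import torch.nn as nn

class EncoderLevelAttention(nn.Module):
    """Softmax-weighted fusion of two encoder output streams."""
    def __init__(self, d_model=256):
        super().__init__()
        self.score = nn.Linear(d_model, 1)

    def forward(self, enc_a, enc_b):
        # enc_a, enc_b: (batch, time, d_model) from the two encoders
        summaries = torch.stack([enc_a.mean(dim=1), enc_b.mean(dim=1)], dim=1)
        alpha = torch.softmax(self.score(summaries), dim=1)   # (batch, 2, 1)
        streams = torch.stack([enc_a, enc_b], dim=1)          # (batch, 2, T, d)
        return (alpha.unsqueeze(-1) * streams).sum(dim=1)     # fused (batch, T, d)

fuse = EncoderLevelAttention()
out = fuse(torch.randn(2, 50, 256), torch.randn(2, 50, 256))
print(out.shape)  # torch.Size([2, 50, 256])
```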