
swCaffe: a Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight [article]

Jiarui Fang, Liandeng Li, Haohuan Fu, Jinlei Jiang, Wenlai Zhao, Conghui He, Xin You, Guangwen Yang
2019 arXiv pre-print
This paper reports our efforts on swCaffe, a highly efficient parallel framework for accelerating deep neural networks (DNNs) training on Sunway TaihuLight, the current fastest supercomputer in the world ... Finally, we present the scalability of swCaffe for the training of ResNet-50 and AlexNet on the scale of 1024 nodes. ... INTRODUCTION Deep Learning [1] has already proven its usability in a variety of applications [2]. ...
arXiv:1903.06934v1 fatcat:m5dbajx3urhyzoy6t5yf4dwwg4

Distributed deep learning system for cancerous region detection on Sunway TaihuLight

GuoFeng Lv, MingFan Li, Hong An, Han Lin, Junshi Chen, Wenting Han, Qian Xiao, Fei Wang, Rongfen Lin
2020 CCF Transactions on High Performance Computing  
With a benchmark from deep learning-based cancerous region detection algorithm, the average parallel efficiency obtains over 80% for at most 1024 processors.  ...  To explore the potential of distributed training on deep neural networks, we implement several distributed algorithms with the basis of swFlow on the world-leading supercomputer, Sunway TaihuLight.  ...  The previous work, swFLOW, a TensorFlow-based dataflow deep learning framework on Sunway TaihuLight.  ... 
doi:10.1007/s42514-020-00046-5 fatcat:4353jt2cprab7ij5v4rgnal6t4

swTVM: Exploring the Automated Compilation for Deep Learning on Sunway Architecture [article]

Changxi Liu, Hailong Yang, Rujun Sun, Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian
2019 arXiv pre-print
order to generate efficient code for deep learning application on Sunway.  ...  In the meanwhile, the Sunway many-core processor renders itself as a competitive candidate for its attractive computational power in both scientific and deep learning applications.  ...  Deep Learning Applications on Sunway TaihuLight.  ... 
arXiv:1904.07404v2 fatcat:vmsxvvxjvbd5tdvyirb6urmfji

swHPFM: Refactoring and Optimizing the Structured Grid Fluid Mechanical Algorithm on the Sunway TaihuLight Supercomputer

Jingbo Li, Xingjun Zhang, Jianfeng Zhou, Xiaoshe Dong, Chuhua Zhang, Zeyu Ji
2019 Applied Sciences  
Using the proposed framework and algorithm, engineers can exploit the parallelism of the existing fluid mechanical algorithm and achieve a satisfactory performance on the Sunway TaihuLight. ... The Sunway TaihuLight supercomputer, which uses the SW26010 processor as its computing node, provides a powerful computing performance for this purpose. ... However, deep learning (swDNN) and customized Caffe (swCaffe) by architecture-oriented optimization methods are proposed, have been optimized [20]. ...
doi:10.3390/app10010072 fatcat:sxp7m2ausvdgxeg6y6xtpnzdha

An Efficient Method for Training Deep Learning Networks Distributed

Chenxu WANG, Yutong LU, Zhiguang CHEN, Junnan LI
2020 IEICE transactions on information and systems  
Training deep learning (DL) is a computationally intensive process; as a result, training time can become so long that it impedes the development of DL.  ...  Third, we optimize the parallel I/O by making each reader read data as continuously as possible to avoid the high overhead of discontinuous data reading.  ...  peak. swDNN [18] optimizes convolution kernels on SW26010 processors; these are a kind of manycore processor that provide computing power for Sunway TaihuLight.  ... 
doi:10.1587/transinf.2020pap0007 fatcat:e6lfiexwffds5fly3wgshgtmku