Enabling Fast and Flexible Distributed Deep Learning with Programmable Switches [article]

Heng Pan, Penglai Cui, Zhenyu Li, Ru Jia, Penghao Zhang, Leilei Zhang, Ye Yang, Jiahao Wu, Jianbo Dong, Zheng Cao, Qiang Li, Hongqiang Harry Liu (+2 others)
2022 arXiv   pre-print
To address this challenge, this paper designs and implements Libra, a network aggregator, that utilizes in-network computation to optimize the communication for distributed DL training in two aspects:  ...  With the ever-increasing model size and training data volume, distributed deep learning emerges which utilizes a cluster to train a model in parallel.  ...  To address the above gap, we design and implement Libra to enable in-network gradient aggregation for distributed sparse DL training.  ... 
arXiv:2205.05243v2 fatcat:2ui7enlpkrdvpbninwrqxdyo3q

Application of machine learning methods to detect and classify Core images using GAN and texture recognition [article]

Daniyar Nurseitov, Kairat Bostanbekov, Galymzhan Abdimanap, Abdelrahman Abdallah, Anel Alimova, Darkhan Kurmangaliyev
2022 arXiv   pre-print
for missing contents in images.  ...  The second problem is filling the hole in the core image by applying the Generative Adversarial Network (GAN) technique and using Contextual Residual Aggregation (CRA) which creates high frequency residual  ...  As a result, the Deep Encoding Pooling Network is trained from start to finish with stochastic gradient descent and back-propagation.  ... 
arXiv:2204.14224v1 fatcat:xxfbpsrmlrenpkutnm6yhmvvj4

Deep Learning Based Electric Pylon Detection in Remote Sensing Images

Sijia Qiao, Yu Sun, Haopeng Zhang
2020 Remote Sensing  
Considering the low efficiency of manual detection, we propose to utilize deep learning methods for electric pylon detection in high-resolution remote sensing images in this paper.  ...  The comparative analysis can provide reference for the selection of specific deep learning model in actual electric pylon detection task.  ...  The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.  ... 
doi:10.3390/rs12111857 fatcat:ynqrppvky5gtfcu4regh5kfawm

An Empirical Study of Adder Neural Networks for Object Detection [article]

Xinghao Chen, Chang Xu, Minjing Dong, Chunjing Xu, Yunhe Wang
2021 arXiv   pre-print
Moreover, we insert more shortcut connections in the neck part and design a new feature fusion architecture for avoiding the sparse features of adder layers.  ...  In this paper, we present an empirical study of AdderNets for object detection.  ...  Path aggregation network for instance segmentation.  ... 
arXiv:2112.13608v1 fatcat:vxk6iw2l4fey5piyszil37ww4m

DistGNN: Scalable Distributed Training for Large-Scale Graph Neural Networks [article]

Vasimuddin Md, Sanchit Misra, Guixiang Ma, Ramanarayan Mohanty, Evangelos Georganas, Alexander Heinecke, Dhiraj Kalamkar, Nesreen K. Ahmed, Sasikanth Avancha
2021 arXiv   pre-print
In this paper, we present DistGNN that optimizes the well-known Deep Graph Library (DGL) for full-batch training on CPU clusters via an efficient shared memory implementation, communication reduction using  ...  Our results on four common GNN benchmark datasets: Reddit, OGB-Products, OGB-Papers and Proteins, show up to 3.7x speed-up using a single CPU socket and up to 97x speed-up using 128 CPU sockets, respectively  ...  In the fast-emerging domain of geometric deep learning [6], a specific field called Graph Neural Networks (GNN) has recently shown impressive results across a spectrum of graph and network representation  ... 
arXiv:2104.06700v3 fatcat:he2ciftnpbgwnj4rp4yzgc3nmm

Deep Learning for SAR Ship Detection: Past, Present and Future

Jianwei Li, Congan Xu, Hang Su, Long Gao, Taoyang Wang
2022 Remote Sensing  
After the revival of deep learning in computer vision in 2012, SAR ship detection comes into the deep learning era too.  ...  The advantages and disadvantages of speed and accuracy are also analyzed. In the future part, we list the problem and direction of this field.  ...  Such as integral image features, HoG (histogram of oriented gradients), SURF (speeded up robust features), and LBP (local binary pattern).  ... 
doi:10.3390/rs14112712 fatcat:dbd6a4ugwjc65pook3wpcuj52a

Lite-YOLOv5: A Lightweight Deep Learning Detector for On-Board Ship Detection in Large-Scene Sentinel-1 SAR Images

Xiaowo Xu, Xiaoling Zhang, Tianwen Zhang
2022 Remote Sensing  
First, in order to obtain a lightweight network, we design a lightweight cross stage partial (L-CSP) module to reduce the amount of calculation and we apply network pruning for a more compact detector.  ...  Current SAR ship detection methods based on deep learning (DL) are difficult to deploy on satellites, because these methods usually have complex models and huge calculations.  ...  We will explore a reasonable hardware acceleration scheme for on-board SAR ship detection. 4.  ... 
doi:10.3390/rs14041018 fatcat:jnoisc2b5ngg5ft7544jiduj5u

CF-YOLO: Cross Fusion YOLO for Object Detection in Adverse Weather with a High-quality Real Snow Dataset [article]

Qiqi Ding, Peng Li, Xuefeng Yan, Ding Shi, Luming Liang, Weiming Wang, Haoran Xie, Jonathan Li, Mingqiang Wei
2022 arXiv   pre-print
CF is a plug-and-play feature aggregation module, which integrates the advantages of Feature Pyramid Network and Path Aggregation Network in a simpler yet more flexible form.  ...  Currently, not only there is a lack of snowy OD datasets to train cutting-edge detectors, but also these detectors have difficulties learning latent information beneficial for detection in snow.  ...  In a deep network, the receptive field of deep layers is relatively large, therefore, deeper layers are likely to take more meaningless features into account.  ... 
arXiv:2206.01381v1 fatcat:36txzvbgbbhjxgbb7iy47sl36y

BNAS v2: Learning Architectures for Binary Networks with Empirical Improvements [article]

Dahyun Kim, Kunal Pratap Singh, Jonghyun Choi
2021 arXiv   pre-print
We show that our method searches architectures with stable training curves despite the quantization error inherent in binary networks.  ...  Questioning that the architectures designed for FP networks might not be the best for binary networks, we propose to search architectures for binary networks (BNAS) by defining a new search space for binary  ...  Note that the memory savings and inference speed-up differ for different networks, as described in [7]. The FLOPs, memory savings and inference speed-up for Bi-Real Net is from Table 3 in [11].  ... 
arXiv:2110.08562v1 fatcat:6ru32qv7wfd2pkdm3aftn5wy5y

Automatic Fabric Defect Detection Using Cascaded Mixed Feature Pyramid with Guided Localization

Wu, Zhang, Fang
2020 Sensors  
Stacked feature pyramid networks are set up to aggregate cross-scale defect patterns on interpolating mixed depth-wise block in stage one.  ...  After balanced sampling, the proposals are down-sampled by position-sensitive pooling for region of interest, in order to characterize interactions among fabric defect images in stage two.  ...  Thanks go to the Alibaba Group for the dataset of fabric defect images. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/s20030871 pmid:32041348 pmcid:PMC7039386 fatcat:bw6ncpas5vcubjgejal2sfnera

YOLOv4: Optimal Speed and Accuracy of Object Detection [article]

Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao
2020 arXiv   pre-print
the MS COCO dataset at a realtime speed of ~65 FPS on Tesla V100.  ...  There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy.  ...  In the research of deep learning, some people put their focus on searching for good activation function.  ... 
arXiv:2004.10934v1 fatcat:77r4dezjbne6ro75wq3pfmko7a

Instances as Queries [article]

Yuxin Fang, Shusheng Yang, Xinggang Wang, Yu Li, Chen Fang, Ying Shan, Bin Feng, Wenyu Liu
2021 arXiv   pre-print
For video instance segmentation, QueryInst achieves the best performance among all online VIS approaches and strikes a decent speed-accuracy trade-off. Code is available at .  ...  This approach eliminates the explicit multi-stage mask head connection and the proposal distribution inconsistency issues inherent in non-query based multi-stage instance segmentation methods.  ...  Wanli Ouyang, and Dahua Lin. Libra R-CNN: Towards balanced learning for object detection.  ...  Aggregated residual transformations for deep neural networks. In CVPR, 2017.  ... 
arXiv:2105.01928v3 fatcat:wxoq6lnsffcjhgatmcrguakk6y

A comprehensive review of Binary Neural Network [article]

Chunyu Yuan, Sos S. Agaian
2022 arXiv   pre-print
It is natural to study game-changing technologies such as Binary Neural Networks (BNN) to increase deep learning capabilities.  ...  This article focuses exclusively on 1-bit activations and weights 1-bit convolution networks, contrary to previous surveys in which low-bit works are mixed in.  ...  Open problem 3: How to effectively speed up BNN training time?  ... 
arXiv:2110.06804v3 fatcat:b2w6atz27fbgdacq5aiov32bpi

MG-GCN: Scalable Multi-GPU GCN Training Framework [article]

Muhammed Fatih Balın, Kaan Sancak, Ümit V. Çatalyürek
2021 arXiv   pre-print
Full batch training of Graph Convolutional Network (GCN) models is not feasible on a single GPU for large graphs containing tens of millions of vertices or more.  ...  Thus, we propose MG-GCN, a multi-GPU GCN training framework taking advantage of the high-speed communication links between the GPUs present in multi-GPU systems.  ...  Polo Chau for providing us access to their DGX-A100 for our experiments. This work was partially supported by the NSF grant CCF-1919021.  ... 
arXiv:2110.08688v1 fatcat:2tcu7s3gijh4fop3npe3nk3lyy

Deep learning in computer vision: A critical review of emerging techniques and application scenarios

Junyi Chai, Hao Zeng, Anming Li, Eric W.T. Ngai
2021 Machine Learning with Applications  
By comparing CNN, RBM, Autoencoder, and Sparse Coding, they finally concluded that CNN was the most suitable architecture for CV.  ...  and output in general CNN networks (e.g., AlexNet, VGG).  ...  The detection speed is up to 59 FPS when the input size is 300x300.  ... 
doi:10.1016/j.mlwa.2021.100134 fatcat:xwrvp237jnhqzoa3hdkj7qhanu