
Joint Multi-Dimension Pruning via Numerical Gradient Update [article]

Zechun Liu and Xiangyu Zhang and Zhiqiang Shen and Zhe Li and Yichen Wei and Kwang-Ting Cheng and Jian Sun
2021 arXiv   pre-print
Then we optimize the pruning vector with gradient updates and model joint pruning as a numerical gradient optimization process.  ...  We present joint multi-dimension pruning (abbreviated as JointPruning), an effective method of pruning a network along three crucial dimensions simultaneously: spatial, depth, and channel.  ...  Fig. 1: The overall framework of the proposed joint multi-dimensional pruning.  ... 
arXiv:2005.08931v2 fatcat:yvnze4n7kzh43hnpj7emq5ye6u

UMEC: Unified model and embedding compression for efficient recommendation systems

Jiayi Shen, Haotao Wang, Shupeng Gui, Jianchao Tan, Zhangyang Wang, Ji Liu
2021 International Conference on Learning Representations  
…‖s^(l)‖_2 + z (R_Flops(s) − R_budget), and we can adopt the gradient ascent method as the update rule. Update s: the optimization on s relies on both the sparsity and resource losses.  ...  For a pruned layer l, the input and output dimensions are restricted by the number of pruned neurons, annotated as s^(l) and s^(l+1).  ... 
dblp:conf/iclr/ShenWGTWL21 fatcat:bjhf7ynftnahlfoj6qrcqfkpla
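The update scheme this snippet describes, gradient descent on width variables s against a sparsity loss plus the multiplier term z (R_Flops(s) − R_budget), and gradient ascent on the multiplier z, can be sketched in a few lines. The shapes, the linear FLOPs proxy, and the learning rates below are illustrative assumptions, not UMEC's actual formulation:

```python
import numpy as np

# Illustrative stand-ins (not UMEC's actual model): per-layer width
# variables s^(l), a linear FLOPs proxy, and a FLOPs budget.
s = np.array([64.0, 128.0, 64.0])        # relaxed layer widths s^(l)
flops_per_unit = np.array([2.0, 4.0, 2.0])
R_budget = 600.0
z = 0.0                                   # Lagrange multiplier
lr_s, lr_z = 0.05, 0.01

def R_flops(s):
    return flops_per_unit @ s             # toy resource model R_Flops(s)

for _ in range(200):
    # Each "group" is a scalar here, so sum_l ||s^(l)||_2 has gradient sign(s).
    grad_s = np.sign(s) + z * flops_per_unit          # d/ds of the full loss
    s = np.clip(s - lr_s * grad_s, 0.0, None)         # gradient descent on s
    z = max(0.0, z + lr_z * (R_flops(s) - R_budget))  # gradient ascent on z
```

The ascent step drives z up while the budget is violated, which in turn strengthens the resource pressure in the descent step on s.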

Unified Visual Transformer Compression [article]

Shixing Yu, Tianlong Chen, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Liu, Zhangyang Wang
2022 arXiv   pre-print
However, the computational overhead of ViTs remains prohibitive, due to stacked multi-head self-attention modules among other components.  ...  This paper proposes a unified ViT compression framework that seamlessly assembles three effective techniques: pruning, layer skipping, and knowledge distillation.  ...  to the updating policy of gt, the gradient terms w.r.t. s and r are ∇_s z (R_Flops(s, r, gt) − R_budget) and ∇_r z (R_Flops(s, r, gt) − R_budget), respectively.  ... 
arXiv:2203.08243v1 fatcat:5rrj5vn53zdahejoxtfaoda6me

Only Train Once: A One-Shot Neural Network Training And Pruning Framework [article]

Tianyi Chen, Bo Ji, Tianyu Ding, Biyi Fang, Guanyi Wang, Zhihui Zhu, Luming Liang, Yixin Shi, Sheng Yi, Xiao Tu
2021 arXiv   pre-print
a structured-sparsity optimization problem and propose a novel optimization algorithm, Half-Space Stochastic Projected Gradient (HSPG), to solve it, which outperforms standard proximal methods on  ...  Structured pruning is a commonly used technique for deploying deep neural networks (DNNs) onto resource-constrained devices.  ...  the tensors fed into the fully connected layer, and project them onto a 2-dimensional space via PCA [40].  ... 
arXiv:2107.07467v2 fatcat:cbsetynjo5cu3ojulf7azddlz4

A Fast Harmonic Mean Linear Discriminant Analysis for Dimensionality Reduction

2022 International Journal of Intelligent Engineering and Systems  
In addition, a first-order approximation of the inverse eigenvector matrix and the complete matrix of eigenvectors are updated at every iteration.  ...  Dimensionality reduction is the most prominent process in artificial intelligence and data science because massive amounts of high-dimensional information are in use.  ...  The gradient of Eq. (29) is: determine the Stiefel manifold gradient using Eq. (11); update 𝒢 using Eq. (12); retract 𝒢 back to the manifold using joint diagonalization; execute Algorithm 2;  ... 
doi:10.22266/ijies2022.0831.20 fatcat:tmoo7mw5ibblre6xazi3v556im

Towards Structured Dynamic Sparse Pre-Training of BERT [article]

Anastasia Dietrich and Frithjof Gressmann and Douglas Orr and Ivan Chelombiev and Daniel Justus and Carlo Luschi
2021 arXiv   pre-print
In this work, we develop and study a straightforward, dynamic always-sparse pre-training approach for the BERT language modeling task, which leverages periodic compression steps based on magnitude pruning  ...  The dark horizontal blocks in the RigL updates indicate a collapse due to outliers along the input dimension, suggesting that the effect arises from the activation part of the dense gradient update.  ...  In Figure 11, we show that for gradient-based re-allocation, the dense gradient is dominated by outliers in the activations, e.g., along the input dimension of each layer, which imposes a strong bias  ... 
arXiv:2108.06277v1 fatcat:sa7tnrabcrbmjhl2ewm25rqgny
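The periodic magnitude-pruning compression step this snippet mentions reduces, in its simplest dense form, to zeroing the smallest-magnitude fraction of weights. A minimal sketch (the function name and tie handling are our own, not the paper's):

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out the smallest-magnitude fraction of weights (ties at the
    threshold are also zeroed in this simple sketch)."""
    k = int(round(sparsity * w.size))
    if k == 0:
        return w.copy()
    # k-th smallest absolute value becomes the pruning threshold
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    out = w.copy()
    out[np.abs(out) <= thresh] = 0.0
    return out

w = np.array([0.5, -0.1, 0.02, 0.9, -0.3])
pruned = magnitude_prune(w, 0.4)   # zeros the 2 smallest-magnitude weights
```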

Coarse-to-Fine Searching for Efficient Generative Adversarial Networks [article]

Jiahao Wang, Han Shu, Weihao Xia, Yujiu Yang, Yunhe Wang
2021 arXiv   pre-print
In addition, a fair supernet training approach is utilized to ensure that all sub-networks can be updated fairly and stably.  ...  We first discover an intact search space of generator networks covering three dimensions, i.e., path, operator, and channel, to fully exploit network performance.  ...  parameters and weights via gradient descent.  ... 
arXiv:2104.09223v1 fatcat:u3t62uvt7nhopipjeo2vfu24ty

Differentiable Neural Input Search for Recommender Systems [article]

Weiyu Cheng, Yanyan Shen, Linpeng Huang
2020 arXiv   pre-print
Out of efficiency concerns, these methods typically choose embedding dimensions from a restricted set of candidate dimensions.  ...  Existing works have proposed heuristic or reinforcement learning-based methods to search for mixed feature embedding dimensions.  ...  (b) The joint distribution plot of feature embedding dimensions and feature frequencies after dimension pruning. (c) Comparison of DNIS and network pruning performance over different pruning rates.  ... 
arXiv:2006.04466v2 fatcat:7kieg735j5enjlxtv255mdcyfu

Efficient Multi-objective Reinforcement Learning via Multiple-gradient Descent with Iteratively Discovered Weight-Vector Sets

Yongcan Cao, Huixin Zhan
2021 The Journal of Artificial Intelligence Research  
via finding a minimum-norm point in the convex hull of the set of multiple policy gradients when the impact of one objective on others is unknown a priori.  ...  In particular, we first propose a new PAOLS algorithm that integrates pruning and the approximate optimistic linear support algorithm to efficiently discover the weight-vector sets of multiple gradients that  ...  Formally, the model's parameters θ will be updated via the single-objective gradient update as θ_i = θ − α ∇_θ L_{T_i}(f_θ), where α is the step size.  ... 
doi:10.1613/jair.1.12270 fatcat:jrnf3b5ujbevnbysaln2jo626u
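The minimum-norm point in the convex hull of multiple policy gradients has a simple closed form in the two-gradient case, which is the core of MGDA-style updates; the helper name below is ours, not the paper's:

```python
import numpy as np

def min_norm_two(g1, g2):
    """Minimum-norm point in the convex hull of two gradients:
    argmin over a in [0, 1] of ||a*g1 + (1-a)*g2||."""
    diff = g1 - g2
    denom = diff @ diff
    if denom == 0.0:
        return g1.copy()            # identical gradients: hull is a point
    a = np.clip((g2 - g1) @ g2 / denom, 0.0, 1.0)
    return a * g1 + (1.0 - a) * g2

g1 = np.array([1.0, 0.0])
g2 = np.array([0.0, 1.0])
d = min_norm_two(g1, g2)            # -> [0.5, 0.5]
```

Descending along d decreases both objectives whenever such a common descent direction exists; if the hull contains the origin, d vanishes and the point is Pareto-stationary.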

Motion Planning for a Humanoid Mobile Manipulator System [article]

Yan Wei, Wei Jiang, Ahmed Rahmani, Qiang Zhan
2018 arXiv   pre-print
Fourthly, an EEs' via-point-based multi-objective genetic algorithm is proposed to design the "human-like" via-poses by optimizing four objective functions.  ...  In detail, an efficient direct-connect bidirectional RRT and gradient descent algorithm is proposed to greatly reduce the number of sampled nodes, and a geometric optimization method is proposed for path pruning  ...  Objective functions: due to the high redundancy, there exist numerous joint combinations for the EEs' desired positions-orientations X_EE, and there is always a preference.  ... 
arXiv:1806.07349v1 fatcat:rr73lp3ymnbs5ksyemxqq4zld4

Codebook Training for Trellis-Based Hierarchical Grassmannian Classification

Stefan Schwarz, Theodoros Tsiftsis
2021 IEEE Wireless Communications Letters  
Exploiting the similarity of the proposed trellis classifier with a neural network, we propose stochastic gradient-based training techniques.  ...  We consider classification of points on a complex-valued Grassmann manifold of m-dimensional subspaces within the n-dimensional complex Euclidean space.  ...  To train layer r, we update the corresponding codebook entry Q^(r)_{j*_r} so as to increase the quantization metric ‖U^H Û‖² via a stochastic gradient step.  ... 
doi:10.1109/lwc.2021.3139166 fatcat:pddkobmdy5hxde4ifbxd6vhjzu

Machine Learning for Microcontroller-Class Hardware – A Review [article]

Swapnil Sayan Saha, Sandeep Singh Sandha, Mani Srivastava
2022 arXiv   pre-print
We present both qualitative and numerical insights into different stages of model development by showcasing several use cases.  ...  Gradient norm for sample selection via uncertainty and diversity.  ...  Models operating on the intrinsic dimensions of the data are computationally tractable and mitigate the curse of dimensionality.  ... 
arXiv:2205.14550v3 fatcat:y272riitirhwfgfiotlwv5i7nu

Communication-Efficient Edge AI: Algorithms and Systems [article]

Yuanming Shi, Kai Yang, Tao Jiang, Jun Zhang, Khaled B. Letaief
2020 arXiv   pre-print
Based on over-the-air computation, Amiri and Gunduz [85] proposed a gradient sparsification and random linear projection method to reduce the dimension of gradients due to limited channel bandwidth.  ...  methods for d-dimensional convex optimization problems.  ... 
arXiv:2002.09668v1 fatcat:nhasdzb7t5dt5brs2r7ocdzrnm

Effectively Subsampled Quadratures For Least Squares Polynomial Approximations [article]

Pranay Seshadri, Akil Narayan, Sankaran Mahadevan
2017 arXiv   pre-print
We conclude with numerical experiments on an analytical function and a model piston problem that show the efficacy of our approach compared with randomized subsampling.  ...  For polynomial approximation, we use a column pruning heuristic that removes columns based on the highest total orders and then solve the tall least squares problem.  ...  Further pruning of the polynomial subspace is performed via heuristics.  ... 
arXiv:1601.05470v4 fatcat:ggr52udoizfhll5tq4s4f23wja
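The prune-columns-then-solve step can be illustrated on a toy monomial basis; the degree threshold and data below are invented for the sketch and are not the paper's setup:

```python
import numpy as np

# 1-D toy: monomial design matrix with columns ordered by degree.
x = np.linspace(-1.0, 1.0, 50)
orders = np.arange(8)                     # degrees 0..7
V = np.vander(x, N=8, increasing=True)    # column j is x**j

# Column pruning heuristic: drop the highest total orders.
keep = orders <= 4

# Tall least-squares fit on the pruned basis.
y = 1.0 + 2.0 * x + 0.5 * x**2
coef, *_ = np.linalg.lstsq(V[:, keep], y, rcond=None)
```

Since y lies in the span of the retained columns, the fit recovers the coefficients (1, 2, 0.5, 0, 0) up to round-off.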

Energy-Aware Neural Architecture Optimization with Fast Splitting Steepest Descent [article]

Dilin Wang, Meng Li, Lemeng Wu, Vikas Chandra, Qiang Liu
2020 arXiv   pre-print
Our fast algorithm allows us to reduce the computational cost of splitting to the same level as typical back-propagation updates and enables efficient implementation on GPU.  ...  network architectures; 2) we substantially speed up the splitting process of Liu et al. (2019), which requires expensive eigen-decomposition, by proposing a highly scalable Rayleigh-quotient stochastic gradient  ...  min_v R_S(v) := (vᵀSv)/(vᵀv), v_min ∝ argmin_v R_S(v), (7) which can be solved using gradient descent or other numerical methods.  ... 
arXiv:1910.03103v3 fatcat:qlo7iewv7zc43apspvy2opd5by
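Solving the Rayleigh-quotient minimization in Eq. (7) by plain gradient descent, the kind of numerical alternative to eigen-decomposition the snippet alludes to, looks roughly like this (the matrix, step size, and iteration count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
S = Q @ np.diag([1.0, 2.0, 3.0, 4.0, 10.0]) @ Q.T   # known spectrum, λ_min = 1

def rayleigh(v):
    return (v @ S @ v) / (v @ v)

v = rng.standard_normal(5)
for _ in range(2000):
    # Gradient of R_S(v) = (v'Sv)/(v'v) is 2/(v'v) * (Sv - R_S(v) v).
    g = (2.0 / (v @ v)) * (S @ v - rayleigh(v) * v)
    v = v - 0.05 * g
    v = v / np.linalg.norm(v)   # R_S is scale-invariant; keep ||v|| = 1

# v converges (up to sign) to the smallest eigenvector, so R_S(v) -> λ_min = 1
```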
Showing results 1–15 of 2,336