Vision Transformer Slimming: Multi-Dimension Searching in Continuous Optimization Space [article]

Arnav Chavan and Zhiqiang Shen and Zhuang Liu and Zechun Liu and Kwang-Ting Cheng and Eric Xing
2022 arXiv   pre-print
This paper explores the feasibility of finding an optimal sub-model from a vision transformer and introduces a pure vision transformer slimming (ViT-Slim) framework.  ...  Our method is based on a learnable and unified ℓ_1 sparsity constraint with pre-defined factors to reflect the global importance in the continuous searching space of different dimensions.  ...  The last advantage is that we can search for a finer-grained architecture such as the different dimensionalities in different self-attention heads, as our search space is continuous in them.  ... 
arXiv:2201.00814v2 fatcat:2ia77hhmfndb7blaqry4sdjejm

Super Vision Transformer [article]

Mingbao Lin, Mengzhao Chen, Yuxin Zhang, Ke Li, Yunhang Shen, Chunhua Shen, Rongrong Ji
2022 arXiv   pre-print
We attempt to reduce the computational costs in vision transformers (ViTs), which increase quadratically in the token number.  ...  Also, our SuperViT significantly outperforms existing studies on efficient vision transformers.  ...  ViT-Slim [5] searches for a sub-transformer network across three dimensions of input tokens, MHSA and MLP modules with an ℓ_1-regularized soft mask to indicate the global importance of dimensions, just  ... 
arXiv:2205.11397v2 fatcat:icjeohab7ncijahldaw6vvpsym

A Survey on Vision Transformer [article]

Kai Han, Yunhe Wang, Hanting Chen, Xinghao Chen, Jianyuan Guo, Zhenhua Liu, Yehui Tang, An Xiao, Chunjing Xu, Yixing Xu, Zhaohui Yang, Yiman Zhang (+1 others)
2021 arXiv   pre-print
Furthermore, we also take a brief look at the self-attention mechanism in computer vision, as it is the base component in transformer.  ...  In this paper, we review these vision transformer models by categorizing them in different tasks and analyzing their advantages and disadvantages.  ...  [290] extended the network slimming approach [147] to vision transformers for reducing the dimensions of linear projections in both FFN and attention modules.  ... 
arXiv:2012.12556v4 fatcat:ldtbdgy6tbdttfqzhzml7n577m

TinyViT: Fast Pretraining Distillation for Small Vision Transformers [article]

Kan Wu, Jinnian Zhang, Houwen Peng, Mengchen Liu, Bin Xiao, Jianlong Fu, Lu Yuan
2022 arXiv   pre-print
Vision transformer (ViT) recently has drawn great attention in computer vision due to its remarkable model capability.  ...  To alleviate this issue, we propose TinyViT, a new family of tiny and efficient small vision transformers pretrained on large-scale datasets with our proposed fast distillation framework.  ...  It may help both the manual design and the search space design for efficient small vision transformers. 1) For small vision transformers, it improves the accuracy when replacing the transformer block in  ... 
arXiv:2207.10666v1 fatcat:3xkyjmockvhmbltkj3jhsur2sa

Chasing Sparsity in Vision Transformers: An End-to-End Exploration [article]

Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang
2021 arXiv   pre-print
Vision transformers (ViTs) have recently received explosive popularity, but their enormous model sizes and training costs remain daunting.  ...  Our approach jointly optimizes model parameters and explores connectivity throughout training, ending up with one sparse network as the final output.  ...  Acknowledgment Z.W. is in part supported by an NSF RTML project (#2053279).  ... 
arXiv:2106.04533v3 fatcat:5fua73defbfgfeln7ifd6g56ce

EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers [article]

Junting Pan, Adrian Bulat, Fuwen Tan, Xiatian Zhu, Lukasz Dudziak, Hongsheng Li, Georgios Tzimiropoulos, Brais Martinez
2022 arXiv   pre-print
Self-attention based models such as vision transformers (ViTs) have emerged as a very competitive architecture alternative to convolutional neural networks (CNNs) in computer vision.  ...  Specifically, we show that our models are Pareto-optimal when both accuracy-latency and accuracy-energy trade-offs are considered, achieving strict dominance over other ViTs in almost all cases and competing  ...  Vision transformers. ViTs [10] quickly popularize transformer-based architectures for computer vision.  ... 
arXiv:2205.03436v2 fatcat:dybmfnlw45hepb5hv2ilxeiuam

Non-terrestrial Communications Assisted by Reconfigurable Intelligent Surfaces [article]

Jia Ye, Jingping Qiao, Abla Kammoun, Mohamed-Slim Alouini
2021 arXiv   pre-print
Then we overview the literature related to RANTNs from the perspectives of performance analysis and optimization, followed by the widely used methodologies.  ...  In combination with next-generation communication technologies, the advanced technologies in RANTNs are discussed.  ...  To handle the case of continuous state and action spaces, policy gradient methods have been proposed.  ... 
arXiv:2109.00876v1 fatcat:gukzusg4rzeevcfzdl2ff7yxdm

Artificial Intelligence for UAV-Enabled Wireless Networks: A Survey

Mohamed-Amine Lahmeri, Mustafa A. Kishk, Mohamed-Slim Alouini
2021 IEEE Open Journal of the Communications Society  
In this article, we provide a comprehensive overview of some potential applications of AI in UAV-based networks.  ...  As a result, a significant part of the research community has started to integrate intelligence at the core of UAVs networks by applying AI algorithms in solving several problems in relation to drones.  ...  transforming data from a high-dimensional space representation to a lower-dimensional space.  ... 
doi:10.1109/ojcoms.2021.3075201 fatcat:4q6cl2sz7nha5mijm6fmomv3fi

Acoustic Features for Environmental Sound Analysis [chapter]

Romain Serizel, Victor Bisot, Slim Essid, Gaël Richard
2017 Computational Analysis of Sound Scenes and Events  
Therefore feature selection is generally solved in a sub-optimal manner, usually by introducing two main simplifications: • Brute-force search is avoided by recurring to a near-optimal search strategy.  ...  Dimensionality reduction A common approach to cope with the potentially large dimensionality of the feature space is to use transformation techniques such as PCA, linear discriminant analysis (LDA), or  ... 
doi:10.1007/978-3-319-63450-0_4 fatcat:mudkectf6nbp7hdab7u75kv7ei

A Journey from Improper Gaussian Signaling to Asymmetric Signaling

Sidrah Javed, Osama Amin, Basem Shihada, Mohamed-Slim Alouini
2020 IEEE Communications Surveys and Tutorials  
As such, the theory of impropriety has vast applications in medicine, geology, acoustics, optics, image and pattern recognition, computer vision, and other numerous research fields with our main focus  ...  The deviation of continuous and discrete complex random variables from the traditional proper and symmetric assumption to a generalized improper and asymmetric characterization (accounting correlation  ...  For minimal dimensions, line search or even exhaustive search can give promising results whereas other algorithms are needed for NP-hard problems.  ... 
doi:10.1109/comst.2020.2989626 fatcat:zyno7ku6n5eqnp6rrcopczb4qu

Next Generation Terahertz Communications: A Rendezvous of Sensing, Imaging, and Localization

Hadi Sarieddeen, Nasir Saeed, Tareq Y. Al-Naffouri, Mohamed-Slim Alouini
2020 IEEE Communications Magazine  
In this paper, we present a progressive vision of how the traditional "THz gap" will transform into a "THz rush" over the next few years.  ...  Then, we illustrate how their coalescence results in enhanced environment-aware system performance in beyond-5G use cases.  ...  Based on the pairwise estimated distances, the MDS algorithm tries to locate the sensor nodes in a given dimensional space.  ... 
doi:10.1109/mcom.001.1900698 fatcat:dhllcybc6fcpbkowo7j6dtz6ey

Next Generation Terahertz Communications: A Rendezvous of Sensing, Imaging, and Localization [article]

Hadi Sarieddeen, Nasir Saeed, Tareq Y. Al-Naffouri, Mohamed-Slim Alouini
2020 arXiv   pre-print
In this paper, we present a progressive vision of how the traditional "THz gap" will transform into a "THz rush" over the next few years.  ...  Then, we illustrate how their coalescence results in enhanced environment-aware system performance in beyond-5G use cases.  ...  Based on the pairwise estimated distances, the MDS algorithm tries to locate the sensor nodes in a given dimensional space.  ... 
arXiv:1909.10462v2 fatcat:aub5wqfibndtnhz54meg45c7ya

Smart Radio Environments Empowered by Reconfigurable Intelligent Surfaces: How it Works, State of Research, and Road Ahead [article]

Marco Di Renzo, Alessio Zappone, Merouane Debbah, Mohamed-Slim Alouini, Chau Yuen, Julien de Rosny, Sergei Tretyakov
2020 arXiv   pre-print
What are the most suitable uses and applications of reconfigurable intelligent surfaces in wireless networks? What are the most promising smart radio environments for wireless applications?  ...  Maxwell's mathematical theories of electromagnetism, and reporting pragmatic guidelines and recipes for employing appropriate physics-based models of metasurfaces in wireless communications.  ...  The general case study that encompasses two-dimensional metasurfaces in a three-dimensional space can be found in [43].  ... 
arXiv:2004.09352v1 fatcat:ncsxqispbrcdrcpyrqvja3ie3m

Multiview Approaches to Event Detection and Scene Analysis [chapter]

Slim Essid, Sanjeel Parekh, Ngoc Q. K. Duong, Romain Serizel, Alexey Ozerov, Fabio Antonacci, Augusto Sarti
2017 Computational Analysis of Sound Scenes and Events  
Joint subspace learning Feature-space transformation A number of techniques has been suggested to map the observed feature vectors from two modalities to a low dimensional space where a measure of "dependency  ...  the same space, dimensionality difference between views is eliminated and direct comparison across views is made possible.  ... 
doi:10.1007/978-3-319-63450-0_9 fatcat:3s3dchsicbg4pkuqrsnrvtkrs4

A Key 6G Challenge and Opportunity – Connecting the Remaining 4 Billions: A Survey on Rural Connectivity [article]

Elias Yaacoub, Mohamed-Slim Alouini
2019 arXiv   pre-print
In addition, energy and cost efficiency of the studied technologies are analyzed. Typical application scenarios in rural areas are discussed, and several country-specific use cases are surveyed.  ...  Providing connectivity to around half of the World population living in rural or underprivileged areas is a tremendous challenge, but also a unique opportunity.  ...  Several optimization algorithms (Hill Climbing, Virtual Force, Time-Efficient Local Search, and Random) were implemented for different rural settlement models (Dispersed, Linear, Nucleated, and Isolated  ... 
arXiv:1906.11541v1 fatcat:kb6tpkvskjggbp25bt7it5iibm