A Model Compression Method with Matrix Product Operators for Speech Enhancement [article]

Xingwei Sun, Ze-Feng Gao, Zhong-Yi Lu, Junfeng Li, Yonghong Yan
2020 pre-print
In this paper, we propose a model compression method based on matrix product operators (MPO) to substantially reduce the number of parameters in DNN models for speech enhancement.  ...  Our proposal provides an effective model compression method for speech enhancement, especially in cloud-free application.  ...  CONCLUSION AND DISCUSSION In this paper, we propose a novel model compression method based on matrix product operators to reduce model size and apply it into the DNN-based monaural speech enhancement task  ... 
doi:10.1109/taslp.2020.3030495 arXiv:2010.04950v1 fatcat:ekgoocbjczam3fh76xtferk4im

Compressing LSTM Networks by Matrix Product Operators [article]

Ze-Feng Gao, Xingwei Sun, Lan Gao, Junfeng Li, Zhong-Yi Lu
2022 arXiv   pre-print
We compare the MPO-LSTM model-based compression model with the traditional LSTM model with pruning methods on sequence classification, sequence prediction, and speech enhancement tasks in our experiments  ...  In this paper, we propose a matrix product operator (MPO) based neural network architecture to replace the LSTM model.  ...  In terms of compressed models, the speech enhancement performance decreases with the increase of compression rate for both compression methods.  ... 
arXiv:2012.11943v3 fatcat:zptqadvhufeq7gyzydcxe7q74u

A two-stage full-band speech enhancement model with effective spectral compression mapping [article]

Zhongshu Hou, Qinwen Hu, Kai Chen, Jing Lu
2022 arXiv   pre-print
Instead of suppressing noise in a single network structure, we first estimate a spectral magnitude mask, converting the speech to a high signal-to-noise ratio (SNR) state, and then utilize a subsequent model  ...  In this paper, we propose a learnable spectral compression mapping (SCM) to effectively compress the high frequency components so that they can be processed in a more efficient manner.  ...  We propose a two-stage full band speech enhancement model with SCM in the T-F domain, named MHA-DPCRN.  ... 
arXiv:2206.13136v1 fatcat:oxazvrlh35hxjis4xyc7gw6qla

On Psychoacoustically Weighted Cost Functions Towards Resource-Efficient Deep Neural Networks for Speech Denoising

Aswin Sivaraman, Kai Zhen, Minje Kim, Jongmo Sung
2017 Figshare  
The experimental results showcase our method as a valid approach for infusing perceptual significance to deep neural network operations.  ...  We present a psychoacoustically enhanced cost function to balance network complexity and perceptual performance of deep neural networks for speech denoising.  ...  X ≈ S + N, S = M ⊙ X. Note: ⊙ indicates the Hadamard product. For the (i+1)-th hidden layer in a network with L total layers, the feedforward process is defined as follows.  ... 
doi:10.6084/m9.figshare.5588809.v1 fatcat:66ina2gt75hl3hlqnmzix64xla

Learning a Neural Diff for Speech Models [article]

Jonathan Macoskey, Grant P. Strimel, Ariya Rastrow
2021 arXiv   pre-print
We present neural update approaches for release of subsequent speech model generations abiding by a data budget.  ...  , our budgeted updates outperform comparable model compression baselines by significant margins.  ...  bulk matrix operations at high speed.  ... 
arXiv:2108.01561v2 fatcat:vtngciz3ivalpovpdoefw5ll6m

Speech Enhancement based on Wiener Filter and Compressive Sensing

Amart Sulong, Teddy Surya Gunawan, Mira Kartiwi
2016 Indonesian Journal of Electrical Engineering and Computer Science  
In other words, a compressive sensing method with a randomized measurement matrix is combined with the Wiener filter to analyse the noisy speech signal while introducing less noise and producing high signal  ...  results for the speech system.  ...  Statistical-Model-Based Methods and Wiener Filtering: It is a new speech enhancement method known as speech boosting.  ... 
doi:10.11591/ijeecs.v2.i2.pp367-379 fatcat:wjr4uv3sbbbtlaz5oo76ihvpym

A block-based compressed sensing method for underdetermined blind speech separation incorporating binary mask

Tao Xu, Wenwu Wang
2010 2010 IEEE International Conference on Acoustics, Speech and Signal Processing  
A block-based compressed sensing approach coupled with binary time-frequency masking is presented for the underdetermined speech separation problem. The proposed algorithm consists of multiple steps.  ...  Using the estimated mixing matrix, the sources are recovered by a compressed sensing approach.  ...  In the separating step, with the estimated mixing matrix A, we formulate the signal recovery problem as a compressed sensing [5] model.  ... 
doi:10.1109/icassp.2010.5494935 dblp:conf/icassp/XuW10 fatcat:u4dfevwfefau5a5pimqiurptbm

LPCSE: Neural Speech Enhancement through Linear Predictive Coding [article]

Yang Liu, Na Tang, Xiaoli Chu, Yang Yang, Jun Wang
2022 arXiv   pre-print
from the existing expert-rule based models of speech pronunciation and distortion, such as the classic Linear Predictive Coding (LPC) speech model because it is difficult to integrate the models with  ...  In this paper, to improve the efficiency of neural speech enhancement, we introduce an LPC-based speech enhancement (LPCSE) architecture, which leverages the strong inductive biases in the LPC speech model  ...  I is the identity tensor with the same shape as W. To better understand the LP2Wav block, an example illustrates the compression operation for the sparse matrix W, as shown in Fig. 5.  ... 
arXiv:2206.06908v2 fatcat:w7r5j6ngafeldcwfino75gkcwq

Weight, Block or Unit? Exploring Sparsity Tradeoffs for Speech Enhancement on Tiny Neural Accelerators [article]

Marko Stamenovic, Nils L. Westhausen, Li-Chia Yang, Carl Jensen, Alex Pawlicki
2021 arXiv   pre-print
We explore network sparsification strategies with the aim of compressing neural speech enhancement (SE) down to an optimal configuration for a new generation of low power microcontroller based neural accelerators  ...  Although efficient speech enhancement is an active area of research, our work is the first to apply block pruning to SE and the first to address SE model compression in the context of microNPUs.  ...  compression [20, 21, 47] to studio production [11, 16].  ... 
arXiv:2111.02351v2 fatcat:3sfua4wchbbfxkdxklwu4akzfi

Sparse Modeling with Applications to Speech Processing: A Survey

Ahmed Omara, Alaa Hefnawy, Abdelhalim Zekry
2016 Indonesian Journal of Electrical Engineering and Computer Science  
This article introduces a literature review of sparse coding applications in the field of speech processing.  ...  Applications that use sparse representation are many and include compression, source separation, enhancement, and regularization in inverse problems, feature extraction, and more.  ...  In [46], the author enhanced the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm for improving the speech recognition rates.  ... 
doi:10.11591/ijeecs.v2.i1.pp161-167 fatcat:ijlcng3ehjamlpwukxdbkpy3xu

Structured Weight Matrices-Based Hardware Accelerators in Deep Neural Networks: FPGAs and ASICs [article]

Caiwen Ding, Ao Ren, Geng Yuan, Xiaolong Ma, Jiayu Li, Ning Liu, Bo Yuan, Yanzhi Wang
2018 arXiv   pre-print
For FPGA implementations on long short term memory (LSTM) networks, the proposed SWM-based LSTM can achieve up to 21X enhancement in performance and 33.5X gains in energy efficiency compared with the baseline  ...  In the algorithm part, the SWM-based framework adopts block-circulant matrices to achieve a fine-grained tradeoff between accuracy and compression ratio.  ...  Quantization and Weight Reduction: Data quantization on weights and neurons is a commonly used method for model compression.  ... 
arXiv:1804.11239v1 fatcat:xzrhegowvvem3ausfk3bj6r52i

TinyLSTMs: Efficient Neural Speech Enhancement for Hearing Aids [article]

Igor Fedorov, Marko Stamenovic, Carl Jensen, Li-Chia Yang, Ari Mandell, Yiming Gan, Matthew Mattina, Paul N. Whatmough
2020 arXiv   pre-print
Results show a reduction in model size and operations of 11.9× and 2.9×, respectively, over the baseline for compressed models, without a statistical difference in listening preference and only exhibiting  ...  Although model compression techniques are an active area of research, we are the first to demonstrate their efficacy for RNN speech enhancement, using pruning and integer quantization of weights/activations  ...  Conclusions Neural speech enhancement is a key technology for future HA products.  ... 
arXiv:2005.11138v1 fatcat:tjodbvbz3jf2rhie326dgaxqru

TinyLSTMs: Efficient Neural Speech Enhancement for Hearing Aids

Igor Fedorov, Marko Stamenovic, Carl Jensen, Li-Chia Yang, Ari Mandell, Yiming Gan, Matthew Mattina, Paul N. Whatmough
2020 Interspeech 2020  
Results show a reduction in model size and operations of 11.9× and 2.9×, respectively, over the baseline for compressed models, without a statistical difference in listening preference and only exhibiting  ...  Although model compression techniques are an active area of research, we are the first to demonstrate their efficacy for RNN speech enhancement, using pruning and integer quantization of weights/activations  ...  Conclusions Neural speech enhancement is a key technology for future HA products.  ... 
doi:10.21437/interspeech.2020-1864 dblp:conf/interspeech/FedorovSJYMGMW20 fatcat:7lco46asrzc7vk55op4u5w3fsq

A High-Efficiency Fatigued Speech Feature Selection Method for Air Traffic Controllers Based on Improved Compressed Sensing

Yonggang Yan, Yi Mao, Zhiyuan Shen, Yitao Wei, Guozhuang Pan, Jinfu Zhu, Qiu-Hua Lin
2021 Journal of Healthcare Engineering  
For adapting a method to the specific field of fatigued speech, we propose an improved compressed sensing construction algorithm to decrease the reconstruction error and achieve superior sparse coding.  ...  This paper addresses these problems by proposing a high-efficiency fatigued speech selection method based on improved compressed sensing.  ...  For example, Haneche et al. proposed a novel speech enhancement approach based on the CS framework in 2019 [22], while Langari et al. extracted the best subset of features for speech emotion recognition  ... 
doi:10.1155/2021/2292710 pmid:34616528 pmcid:PMC8487830 fatcat:qd5wojpbajez3ersmbdphjzl4i

A dereverberation algorithm for spherical microphone arrays using compressed sensing techniques

Ping Kun Tony Wu, Nicolas Epain, Craig Jin
2012 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
In this paper, we present a novel multichannel dereverberation algorithm that enhances a target signal in a reverberant environment.  ...  The proposed algorithm is designed for a spherical microphone array and formulated in the spherical harmonic domain.  ...  The PESQ score and SegSRR for various speech enhancement algorithms in Room 3 (Method: PESQ, SegSRR): Raw Microphone Signal: 2.56, 5.46; ICA approach: 2.80, 5.90; MUSIC+MINT: 2.90, 9.29; SR dereverberation  ... 
doi:10.1109/icassp.2012.6288808 dblp:conf/icassp/WuEJ12 fatcat:dq5vsygrkfafnc7ir3memhshx4