Image declipping with deep networks [article]

Shachar Honig, Michael Werman
2018 arXiv   pre-print
We present a deep network to recover pixel values lost to clipping. The clipped area of the image is typically a uniform area of minimum or maximum brightness, losing image detail and color fidelity.  ...  Using neural networks and their ability to model natural images allows our neural network, DeclipNet, to reconstruct data in clipped regions producing state of the art results.  ...  Network Architecture We explored various deep architectures to produce an unclipped image from a clipped image.  ... 
arXiv:1811.06277v1 fatcat:p4adpoh6pbf2phmhz5kvy7rnwu

Image to Image Translation based on Convolutional Neural Network Approach for Speech Declipping [article]

Hamidreza Baradaran Kashani, Ata Jodeiri, Mohammad Mohsen Goodarzi, Shabnam Gholamdokht Firooz
2019 arXiv   pre-print
Motivated by the idea of image-to-image translation, we propose a declipping approach, namely U-Net declipper in which the magnitude spectrum images of clipped signals are translated to the corresponding  ...  In this paper, we focus on enhancement of clipped speech by using a fully convolutional neural network as U-Net.  ...  Such a powerful and comprehensive mapping brings Deep Neural Network (DNN) on the table.  ... 
arXiv:1910.12116v1 fatcat:dleyklsl3jf4beqn7uzqbflfiq

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm [article]

Yangguang Li, Feng Liang, Lichen Zhao, Yufeng Cui, Wanli Ouyang, Jing Shao, Fengwei Yu, Junjie Yan
2022 arXiv   pre-print
This work proposes a novel training paradigm, Data efficient CLIP (DeCLIP), to alleviate this limitation.  ...  However, CLIP is quite data-hungry and requires 400M image-text pairs for pre-training, thereby restricting its adoption.  ...  Then we use the query list to crawl images from the Internet, after filtering data with smaller images, filtering data with damaged images, filtering data without the caption, and filtering data with Chinese  ... 
arXiv:2110.05208v2 fatcat:tnbg3ibfdngephxhssdl6sqyj4

Democratizing Contrastive Language-Image Pre-training: A CLIP Benchmark of Data, Model, and Supervision [article]

Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
2022 arXiv   pre-print
Moreover, we further combine DeCLIP with FILIP, bringing us the strongest variant DeFILIP. The CLIP-benchmark would be released at: https://github.com/Sense-GVT/DeCLIP for future CLIP research.  ...  Contrastive Language-Image Pretraining (CLIP) has emerged as a novel paradigm to learn visual models from language supervision.  ...  More works about improving image captioning with CLIP, e.g. CLIPCap [14], CLIP4Caption [21].  ... 
arXiv:2203.05796v1 fatcat:dlreojzy45chnghb7mttwfochm

VoiceFixer: Toward General Speech Restoration with Neural Vocoder [article]

Haohe Liu, Qiuqiang Kong, Qiao Tian, Yan Zhao, DeLiang Wang, Chuanzeng Huang, Yuxuan Wang
2021 arXiv   pre-print
Prior methods mainly focus on single-task speech restoration (SSR), such as speech denoising or speech declipping.  ...  We evaluate VoiceFixer with additive noise, room reverberation, low-resolution, and clipping distortions.  ...  Recently, several deep learning based one-stage methods have been proposed to model f (·) such as fully connected neural networks, recurrent neural networks, and convolutional neural networks.  ... 
arXiv:2109.13731v3 fatcat:m5yuo44k6bad7iyotzu2vtqkyq

2021 Index IEEE Journal of Selected Topics in Signal Processing Vol. 15

2021 IEEE Journal on Selected Topics in Signal Processing  
., +, JSTSP Feb. 2021 264-278 Neural networks Accurate and Lightweight Image Super-Resolution With Model-Guided Deep Unfolding Network.  ...  ., +, JSTSP June 2021 954-967 Image denoising Accurate and Lightweight Image Super-Resolution With Model-Guided Deep Unfolding Network.  ... 
doi:10.1109/jstsp.2021.3135675 fatcat:pofbfingjbc7dhn5i7mtqbebuy

A survey and an extensive evaluation of popular audio declipping methods [article]

Pavel Záviška, Pavel Rajmic, Alexey Ozerov, Lucas Rencker
2020 arXiv   pre-print
The article is accompanied with the repository containing the evaluated methods.  ...  In this paper, we provide an extensive review of audio declipping algorithms proposed in the literature.  ...  Průša for helping with the accompanying HTML page. Thanks to S. Kitić and N. Bertin for discussing SPADE algorithms and projections with tight frames. The work of P. Záviška and P.  ... 
arXiv:2007.07663v1 fatcat:7xdsmuxu3nhsvczztaq42bly7m

INDIGO: Intrinsic Multimodality for Domain Generalization [article]

Puneet Mangla, Shivam Chandhok, Milan Aggarwal, Vineeth N Balasubramanian, Balaji Krishnamurthy
2022 arXiv   pre-print
with the visual modality to enhance generalization to unseen domains at test-time.  ...  However, when multiple source domains are involved, the cost of curating textual annotations for every image in the dataset can blow up several times, depending on their number.  ...  DeCLIP [43] employs additional self-, multi-view, nearest-neighbor supervision along with image-text contrastive supervision to match the performance of CLIP but with 7.1x lesser data.  ... 
arXiv:2206.05912v1 fatcat:ff4f25xzt5hfjizs53qftpwkpe

Speech Enhancement via Deep Spectrum Image Translation Network [article]

Hamidreza Baradaran Kashani, Ata Jodeiri, Mohammad Mohsen Goodarzi, Iman Sarraf Rezaei
2019 arXiv   pre-print
To this end, we suggest a new architecture, called VGG19-UNet, where a deep fully convolutional network known as VGG19 is embedded at the encoder part of an image-to-image translation network, i.e.  ...  Motivated to address this problem, we propose a novel speech enhancement approach using a deep spectrum image translation network.  ...  In fact, in the proposed image translation network, VGG19 plays the role of a very deep feature extractor of the noisy spectrum images.  ... 
arXiv:1911.01902v1 fatcat:spa5co4mcrexvhsm3ecsekgcni

Introducing SPAIN (SParse Audio INpainter) [article]

Ondřej Mokrý, Pavel Záviška, Pavel Rajmic, Vítězslav Veselý
2019 arXiv   pre-print
., originally developed for audio declipping, to the task of audio inpainting. The new SPAIN (SParse Audio INpainter) comes in synthesis and analysis variants.  ...  Moreover, A-SPAIN performs on a par with the state-of-the-art method based on linear prediction in terms of the SNR, and, for larger gaps, SPAIN is even slightly better in terms of the PEMO-Q psychoacoustic  ...  One of the latest methods employs a deep neural network for the task of audio inpainting [10].  ... 
arXiv:1810.13137v4 fatcat:ca5hjthdmjaofb4w5gxugrmmgi

Data-consistent neural networks for solving nonlinear inverse problems

Yoeri E. Boink, Markus Haltmeier, Sean Holman, Johannes Schwab
2022 Inverse Problems and Imaging  
We propose data-consistent neural networks that can be combined with classical regularization methods.  ...  Numerical simulations show that compared to standard two-step deep learning methods, our approach provides better stability with respect to out of distribution examples in the test set, while performing  ...  Werman, Image declipping with deep networks, 2018 25th IEEE International Conference on Image Processing (ICIP), (2018), 3923–3927. [19] M. V. de Hoop, M. Lassas and C. A.  ... 
doi:10.3934/ipi.2022037 fatcat:fljynxnaynajvgdw562vbzymtq

ARMAS: Active Reconstruction of Missing Audio Segments [article]

Zohra Cheddad, Abbas Cheddad
2022 arXiv   pre-print
The results (including comparing the SPAIN, Autoregressive, deep learning-based, graph-based, and other methods) are evaluated with three different metrics.  ...  This work may trigger interest in optimising this approach and/or transferring it to different domains (i.e., image reconstruction).  ...  ) as a continuous tone image.  ... 
arXiv:2111.10891v3 fatcat:7qvzsrzmozgzll3tqsm4pkdhri

Audio inpainting of music by means of neural networks [article]

Andrés Marafioti, Nicki Holighaus, Piotr Majdak, Nathanaël Perraudin
2022 arXiv   pre-print
We studied the ability of deep neural networks (DNNs) to restore missing audio content based on its context, a process usually referred to as audio inpainting.  ...  The proposed DNN structure was trained on audio signals containing music and musical instruments, separately, with 64-ms long gaps.  ...  Our network, inspired by the context encoder for image inpainting [8], is an encoder-decoder pipeline fed with TF coefficients of the context information, S_b and S_a (Fig. 1b).  ... 
arXiv:1810.12138v3 fatcat:ixn4uxfy35haxbkumpjunmgcha

Table of Contents

2021 IEEE/ACM Transactions on Audio Speech and Language Processing  
Guanason Detection and Classification of Acoustic Scenes and Events Receptive Field Regularization Techniques for Audio Classification and Tagging With Deep Convolutional Neural Networks  ...  Gu Deep Selective Memory Network With Selective Attention and Inter-Aspect Modeling for Aspect Level Sentiment Classification  ... 
doi:10.1109/taslp.2021.3137066 fatcat:ocit27xwlbagtjdyc652yws4xa

Table of Contents

2021 IEEE/ACM Transactions on Audio Speech and Language Processing  
Hansen A Deep Adaptation Network for Speech Enhancement: Combining a Relativistic Discriminator With Multi-Kernel Maximum Mean Discrepancy J. Cheng, R.  ...  Wang Deep Selective Memory Network With Selective Attention and Inter-Aspect Modeling for Aspect Level Sentiment Classification  ... 
doi:10.1109/taslp.2021.3137064 fatcat:rpka3f2bhjh37c7pkhiowyndhm