
Untangling in Invariant Speech Recognition [article]

Cory Stephenson, Jenelle Feather, Suchismita Padhy, Oguz Elibol, Hanlin Tang, Josh McDermott, SueYeon Chung
2020 arXiv   pre-print
Taken together, these findings shed light on how deep auditory models process time dependent input signals to achieve invariant speech recognition, and show how different concepts emerge through the layers  ...  Higher level concepts such as parts-of-speech and context dependence also emerge in the later layers of the network.  ...  Untangling in Invariant Speech Recognition - Supplemental Material. 5 Details on measuring empirical and theoretical manifold capacity. Empirical Manifold Capacity: Here we provide a detailed description  ... 
arXiv:2003.01787v1 fatcat:6iuxurdn4zek3h3vzk2bjxatoy

Neural population geometry: An approach for understanding biological and artificial neural networks [article]

SueYeon Chung, L. F. Abbott
2021 arXiv   pre-print
capacity, disentanglement and abstraction in cognitive systems, topological representations underlying cognitive maps, dynamic untangling in motor systems, and a dynamical approach to cognition.  ...  We review examples of geometrical approaches providing insight into the function of biological and artificial neural networks: representation untangling in perception, a geometric theory of classification  ...  This theory has been used to show how categorical information emerges across layer hierarchy as a result of geometrical changes in ANNs implementing visual object recognition [6], speech recognition  ... 
arXiv:2104.07059v2 fatcat:i3huichzs5eehcb7ij6rthkl2e

Low-latency Speaker-independent Continuous Speech Separation

Takuya Yoshioka, Zhuo Chen, Changliang Liu, Xiong Xiao, Hakan Erdogan, Dimitrios Dimitriadis
2019 ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
The output signals can be simply sent to a speech recognition engine because they do not include speech overlaps.  ...  The previous SI-CSS method uses a neural network trained with permutation invariant training and a data-driven beamformer and thus requires much processing latency.  ...  Speech separation, whose goal is to untangle a mixture of co-occurring speech signals, could potentially solve the overlapping speech problem in far-field conversation transcription.  ... 
doi:10.1109/icassp.2019.8682274 dblp:conf/icassp/YoshiokaCLXED19 fatcat:nvx2cfpbkrfr3jm2fv4f3v6ec4

Low-Latency Speaker-Independent Continuous Speech Separation [article]

Takuya Yoshioka, Zhuo Chen, Changliang Liu, Xiong Xiao, Hakan Erdogan, Dimitrios Dimitriadis
2019 arXiv   pre-print
The output signals can be simply sent to a speech recognition engine because they do not include speech overlaps.  ...  The previous SI-CSS method uses a neural network trained with permutation invariant training and a data-driven beamformer and thus requires much processing latency.  ...  Speech separation, whose goal is to untangle a mixture of co-occurring speech signals, could potentially solve the overlapping speech problem in far-field conversation transcription.  ... 
arXiv:1904.06478v1 fatcat:wjozo4mthvel7bpwuuoiw6y4ny

Dual Encoder-Decoder based Generative Adversarial Networks for Disentangled Facial Representation Learning

Cong Hu, Zhen-Hua Feng, Xiao-Jun Wu, Josef Kittler
2020 IEEE Access  
The proposed network is evaluated on the tasks of pose-invariant face recognition (PIFR) and face synthesis across poses.  ...  In the proposed method, both the generator and discriminator are designed with deep encoder-decoder architectures as their backbones.  ...  The proposed DED-GAN method is also quantitatively evaluated for pose invariant face recognition. B.  ... 
doi:10.1109/access.2020.3009512 fatcat:km463jwtzfe6hd4oi373npvmve

Combining Adversarial Training and Disentangled Speech Representation for Robust Zero-Resource Subword Modeling

Siyuan Feng, Tan Lee, Zhiyuan Peng
2019 Interspeech 2019  
This study addresses the problem of unsupervised subword unit discovery from untranscribed speech.  ...  It forms the basis of the ultimate goal of ZeroSpeech 2019, building text-to-speech systems without text labels.  ...  Deep neural network (DNN) acoustic models (AMs) for the tasks of automatic speech recognition (ASR) and speech synthesis have shown impressive performance for major languages such as English and Mandarin  ... 
doi:10.21437/interspeech.2019-1337 dblp:conf/interspeech/FengLP19 fatcat:y3lftkiqunb3vfbze7dw5jygqq

Combining Adversarial Training and Disentangled Speech Representation for Robust Zero-Resource Subword Modeling [article]

Siyuan Feng, Tan Lee, Zhiyuan Peng
2019 arXiv   pre-print
This study addresses the problem of unsupervised subword unit discovery from untranscribed speech.  ...  It forms the basis of the ultimate goal of ZeroSpeech 2019, building text-to-speech systems without text labels.  ...  Deep neural network (DNN) acoustic models (AMs) for the tasks of automatic speech recognition (ASR) and speech synthesis have shown impressive performance for major languages such as English and Mandarin  ... 
arXiv:1906.07234v3 fatcat:ql2jnvscmndt7f7sz2c4ej7f74

Beyond biometrics

Egon L. van den Broek
2010 Procedia Computer Science  
In this article, a new class of biometrics is proposed that is founded on processing biosignals, as opposed to images.  ...  Next, biosignals' use is illustrated by two biosignal-based biometrics: voice identification and handwriting recognition. Additionally, the concept of a digital human model is introduced.  ...  Speech interface Recently, it has been shown that speech recognition can be achieved even without sound, without processing the speech signal itself [16] .  ... 
doi:10.1016/j.procs.2010.04.284 fatcat:iqc6zyt4vvedbnz53cyxtxxxuy

Latest Advances in Human Brain Dynamics

Stavros I. Dimitriadis
2021 Brain Sciences  
It is paramount for every neuroscientist to understand the nature of emerging technologies and approaches in investigating functional brain dynamics [...]  ...  This novel study untangled the importance of invariant representation of computer vision and the deeper conception of the representation invariant mechanism of the human visual information processing.  ...  Thus, representation invariance plays a typical role in CNN and human visual processing information under complicated image-based tasks.  ... 
doi:10.3390/brainsci11111476 pmid:34827475 pmcid:PMC8615593 fatcat:6nfzglsau5he3nhg7gji3yw2hy

Training neural networks to recognize speech increased their correspondence to the human auditory pathway but did not yield a shared hierarchy of acoustic features [article]

Jessica AF Thompson, Yoshua Bengio, Elia Formisano, Marc Schönwiesner
2021 bioRxiv   pre-print
Here, we compared the representations of CNNs trained to recognize speech (triphone recognition) to 7-Tesla fMRI activity collected throughout the human auditory pathway, including subcortical and cortical  ...  regions, while participants listened to speech.  ...  Cox, Untangling invariant object recognition, Trends in Cognitive Sciences 11 (2007) 333-341. doi:10.1016/j.tics.2007.06.010. Y. Bengio, A. Courville, P.  ... 
doi:10.1101/2021.01.26.428323 fatcat:mvngxdcfqbe3ndxirojhutu7hi

Measuring and modeling the motor system with machine learning

Sebastien B. Hausmann, Alessandro Marin Vargas, Alexander Mathis, Mackenzie W. Mathis
2021 Current Opinion in Neurobiology  
, kinematic analyses, dimensionality reduction, and closed-loop feedback, to its use in understanding neural correlates and untangling sensorimotor systems.  ...  The utility of machine learning in understanding the motor system is promising a revolution in how to collect, measure, and analyze data.  ...  Funding was provided, in part by the SNSF Grant #201057 to MWM.  ... 
doi:10.1016/j.conb.2021.04.004 pmid:34116423 fatcat:biacdnyth5dvrd7udappz6ylpy

Learning to perceive and recognize a second language: the L2LP model revised

Jan-Willem van Leussen, Paola Escudero
2015 Frontiers in Psychology  
We present a test of a revised version of the Second Language Linguistic Perception (L2LP) model, a computational model of the acquisition of second language (L2) speech perception and recognition.  ...  In sum, the proposed revision to the L2LP model contributes to our understanding of L2 acquisition, with implications for speech processing in general.  ...  Thanks also go out to Klara Weiand, Paul Boersma and the audience at OCP9 in Berlin for earlier comments on this work.  ... 
doi:10.3389/fpsyg.2015.01000 pmid:26300792 pmcid:PMC4523759 fatcat:7rlijdawubg6fagiptrsx7wuvq

SOURCES AND DEVELOPMENT OF ORAL SELF EXPRESSION IN A FOREIGN LANGUAGE

Colley F. Sparkman
1934 The Modern Language Journal  
speech.  ...  This item of conscious changing of the variable elements in a given phrase may a bit later also include a change of the invariable element for another fixed element.  ... 
doi:10.1111/j.1540-4781.1934.tb00103.x fatcat:433jvj2qq5eeraid2u52tjqhau

A Systematic Review on Affective Computing: Emotion Models, Databases, and Recent Advances [article]

Yan Wang, Wei Song, Wei Tao, Antonio Liotta, Dawei Yang, Xinlei Li, Shuyong Gao, Yixuan Sun, Weifeng Ge, Wei Zhang, Wenqiang Zhang
2022 arXiv   pre-print
Major breakthroughs have been made recently in the areas of affective computing (i.e., emotion recognition and sentiment analysis).  ...  Next, we survey and taxonomize state-of-the-art unimodal affect recognition and multimodal affective analysis in terms of their detailed architectures and performances.  ...  For identity-invariant FER, Ali and Hughes [291] proposed a novel disentangled expression learning GAN (DE-GAN) by untangling the facial expression representation from identity information. Yu et al  ... 
arXiv:2203.06935v3 fatcat:h4t3omkzjvcejn2kpvxns7n2qe

Understanding rostral–caudal auditory cortex contributions to auditory perception

Kyle Jasmin, César F. Lima, Sophie K. Scott
2019 Nature Reviews Neuroscience  
There are functional and anatomical distinctions between the neural systems involved in the recognition of sounds in the environment and those involved in the sensorimotor guidance of sound production  ...  Evidence for the separation of these processes has historically come from disparate literatures on the perception and production of speech, music and other sounds.  ...  Rostral auditory processing. Recognition processes. Human speech is a perfect example of a spectrally complex sound.  ... 
doi:10.1038/s41583-019-0160-2 pmid:30918365 pmcid:PMC6589138 fatcat:ezxf4kyr35efjnakc6b2lev6ou