
Contrastive prediction strategies for unsupervised segmentation and categorization of phonemes and words [article]

Santiago Cuervo, Maciej Grabias, Jan Chorowski, Grzegorz Ciesielski, Adrian Łańcucki, Paweł Rychlikowski, Ricard Marxer
2022 arXiv   pre-print
We investigate the performance on phoneme categorization and phoneme and word segmentation of several self-supervised learning (SSL) methods based on Contrastive Predictive Coding (CPC).  ...  Our multi-level ACPC (mACPC) improves in all categorization metrics and achieves state-of-the-art performance in word segmentation.  ...  The authors thank the Polish National Science Center for funding under the OPUS-18 2019/35/B/ST6/04379 grant and the PlGrid consortium for computational resources.  ... 
arXiv:2110.15909v2 fatcat:sbr3gfyqfva73lyqnn57lumugu
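The methods in this entry build on Contrastive Predictive Coding, which optimizes an InfoNCE objective: the latent for a true future frame is scored against negatives drawn from other positions or utterances. As background only, here is a minimal NumPy sketch of that objective; the array names are hypothetical and this is not the authors' mACPC implementation.

```python
# Minimal sketch of the InfoNCE objective behind CPC-style models:
# the true future latent must be scored above negative samples.
import numpy as np

def info_nce(context, future, negatives):
    """context: (d,) summary vector c_t; future: (d,) true latent z_{t+k};
    negatives: (n, d) latents drawn from other positions/utterances."""
    pos = context @ future                       # positive score
    neg = negatives @ context                    # (n,) negative scores
    scores = np.concatenate(([pos], neg))
    # cross-entropy of the positive against all candidates
    return -(pos - np.log(np.exp(scores).sum()))

rng = np.random.default_rng(0)
c, z = rng.normal(size=64), rng.normal(size=64)
print(info_nce(c, z, rng.normal(size=(10, 64))))
```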

A Rudimentary Lexicon and Semantics Help Bootstrap Phoneme Acquisition

Abdellah Fourtassi, Emmanuel Dupoux
2014 Proceedings of the Eighteenth Conference on Computational Natural Language Learning  
We derive, in an unsupervised way, an approximate lexicon and a rudimentary semantic representation.  ...  Infants spontaneously discover the relevant phonemes of their language without any direct supervision.  ...  Research Council (ERC-2011-AdG-295810 BOOTPHON), the Agence Nationale pour la Recherche (ANR-10-LABX-0087 IEC, ANR-10-IDEX-0001-02 PSL*), the Fondation de France, the Ecole de Neurosciences de Paris, and  ... 
doi:10.3115/v1/w14-1620 dblp:conf/conll/FourtassiD14 fatcat:t7yp6vw5nbffzewr3m3befhxxa

A summary of the 2012 JHU CLSP workshop on zero resource speech technologies and models of early language acquisition

Aren Jansen, Emmanuel Dupoux, Sharon Goldwater, Mark Johnson, Sanjeev Khudanpur, Kenneth Church, Naomi Feldman, Hynek Hermansky, Florian Metze, Richard Rose, Mike Seltzer, Pascal Clark (+15 others)
2013 2013 IEEE International Conference on Acoustics, Speech and Signal Processing  
the application of Bayesian word segmentation algorithms to automatic subword unit tokenizations.  ...  Finally, we present two strategies for integrating zero resource techniques into supervised settings, demonstrating the potential of unsupervised methods to improve mainstream technologies.  ...  segmentation token F-scores (%) for Bernstein-Ratner/Brent phonemes and various Switchboard tokenizations.  ... 
doi:10.1109/icassp.2013.6639245 dblp:conf/icassp/JansenDGJKCFHMRSCMVBBCDFHLLNPRST13 fatcat:4lrcendhhjgz5nmr2fsovmzgae

Unsupervised Speech Recognition via Segmental Empirical Output Distribution Matching [article]

Chih-Kuan Yeh, Jianshu Chen, Chengzhu Yu, Dong Yu
2018 arXiv   pre-print
We propose a fully unsupervised learning algorithm that alternates between solving two sub-problems: (i) learn a phoneme classifier for a given set of phoneme segmentation boundaries, and (ii) refining  ...  Experimental results on TIMIT dataset demonstrate the success of this fully unsupervised phoneme recognition system, which achieves a phone error rate (PER) of 41.6%.  ...  The key ideas of our Segmental Empirical-ODM are: (i) the distribution of the predicted outputs across consecutive segments shall match the phoneme language model and (ii) the predicted outputs within  ... 
arXiv:1812.09323v1 fatcat:dyho7xwgmrc2rkmscepzo6vrwu

Comparing Models of Phonotactics for Word Segmentation

Natalie Schrimpf, Gaja Jarosz
2014 Proceedings of the 2014 Joint Meeting of SIGMORPHON and SIGFSM  
We also introduce a novel estimation method, and compare it to other strategies for estimating the parameters of the phonotactic models from unsegmented data.  ...  The syllable-based transitional probability model achieves a word token f-score of nearly 80%, the highest reported performance for a phonotactic segmentation model with no lexicon.  ...  As Yang discusses, the fatal flaw for this approach is that it categorically fails to segment monosyllabic words, which account for an overwhelming majority of words in child-directed speech.  ...
doi:10.3115/v1/w14-2803 dblp:conf/sigmorphon/SchrimpfJ14 fatcat:kk42kicp6bavhh6qrqnnm3rsvm
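The syllable-based transitional probability (TP) strategy mentioned above places word boundaries where the probability of the next syllable given the current one dips to a local minimum. The toy sketch below illustrates that idea on a made-up syllable stream; it is not the paper's estimation method, which fits the model from unsegmented data more carefully.

```python
# Toy sketch of transitional-probability (TP) segmentation: place a word
# boundary where P(next syllable | current syllable) is a local minimum.
from collections import Counter

def transitional_probs(syllables):
    bigrams = Counter(zip(syllables, syllables[1:]))
    unigrams = Counter(syllables)
    return {bg: c / unigrams[bg[0]] for bg, c in bigrams.items()}

def segment(syllables, tp):
    probs = [tp[(a, b)] for a, b in zip(syllables, syllables[1:])]
    boundaries = []
    for i in range(1, len(probs) - 1):
        if probs[i] < probs[i - 1] and probs[i] < probs[i + 1]:
            boundaries.append(i + 1)      # boundary before syllable i+1
    return boundaries

# "pretty baby / pretty doggy / pretty baby": TP is high inside words,
# lower across word breaks.
corpus = "pre tty ba by pre tty do ggy pre tty ba by".split()
tp = transitional_probs(corpus)
print(segment(corpus, tp))  # [2, 6, 10]; some breaks are missed in so tiny a corpus
```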

Symbol Emergence in Robotics: A Survey [article]

Tadahiro Taniguchi, Takayuki Nagai, Tomoaki Nakamura, Naoto Iwahashi, Tetsuya Ogata, Hideki Asoh
2015 arXiv   pre-print
Specifically, we describe some state-of-the-art research topics concerning SER, e.g., multimodal categorization, word discovery, and a double articulation analysis, that enable a robot to obtain words and  ...  their embodied meanings from raw sensory-motor information, including visual information, haptic information, auditory information, and acoustic speech signals, in a totally unsupervised manner.  ...  The DAA explicitly assumes double articulation, and infers the latent letters, i.e., the segments or phonemes, and the latent words, i.e., the words or segments, in an unsupervised manner.  ...
arXiv:1509.08973v1 fatcat:yg6bscvy2fdpdhapltyonvhs2a

Speech segmentation with a neural encoder model of working memory

Micha Elsner, Cory Shain
2017 Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing  
We present the first unsupervised LSTM speech segmenter as a cognitive model of the acquisition of words from unsegmented input.  ...  To our knowledge, ours is the first fully unsupervised system to be able to segment both symbolic and acoustic representations of speech.  ...  Acknowledgments We thank members of the Clippers discussion group (especially Joo-Kyung Kim, Eric Fosler-Lussier and Wei Xu) and three anonymous reviewers.  ... 
doi:10.18653/v1/d17-1112 dblp:conf/emnlp/ElsnerS17 fatcat:5dohwiweezcqdghztq7xyvnkvq

The Utility of Cognitive Plausibility in Language Acquisition Modeling: Evidence From Word Segmentation

Lawrence Phillips, Lisa Pearl
2015 Cognitive Science  
We discuss these cognitive plausibility checkpoints generally and then apply them to a case study in word segmentation, investigating a promising Bayesian segmentation strategy.  ...  Our more cognitively plausible model shows a beneficial effect of cognitive constraints on segmentation performance.  ...  This contrasts with other segmentation strategies that require the unit to be the syllable (Lignos 2012) or the phoneme (TP minima).  ...
doi:10.1111/cogs.12217 pmid:25656757 fatcat:tehrrftu5zayzfxjv2g6pifolm

Bootstrapping Word Boundaries: A Bottom-up Corpus-Based Approach to Speech Segmentation

Paul Cairns, Richard Shillcock, Nick Chater, Joe Levy
1997 Cognitive Psychology  
Speech is continuous, and isolating meaningful chunks for lexical access is a nontrivial problem.  ...  In particular, we confirm the utility of the Metrical Segmentation Strategy (Cutler & Norris, 1988) and demonstrate a route by which this utility might be recognized by the infant, without requiring the  ...  So, if an infant were constantly trying to predict input, dissonance between predictions and reality would provide the necessary impulse for our segmentation strategy to emerge.  ... 
doi:10.1006/cogp.1997.0649 pmid:9245468 fatcat:3jwmfsr2dfhfnm636yntomnev4

Learning Phonemes With a Proto-Lexicon

Andrew Martin, Sharon Peperkamp, Emmanuel Dupoux
2012 Cognitive Science  
We show that a third type of information source, the occurrence of pairs of minimally differing word forms in speech heard by the infant, is also useful for learning phonemic categories and is in fact  ...  Peperkamp, Le Calvez, Nadal, and Dupoux (2006) present an algorithm that can discover phonemes using the distributions of allophones as well as the phonetic properties of the allophones and their contexts  ...  Agence Nationale pour la Recherche and ERC-2011-AdG-295810 from the European Research Council.  ... 
doi:10.1111/j.1551-6709.2012.01267.x pmid:22985465 fatcat:gv2r6wkfnraptgjqhdvpbqkq2e
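The "pairs of minimally differing word forms" cue described above can be counted directly from a proto-lexicon: for a candidate sound pair, look for word forms that become another attested form when one sound is swapped for the other. A toy sketch follows (the lexicon and sound pair are invented; how such counts feed into the allophone-versus-phoneme decision is the paper's contribution and is not shown here):

```python
# Toy sketch: collect word-form pairs in a proto-lexicon that differ only by
# swapping one candidate sound for another.
def minimal_pairs(lexicon, x, y):
    lex = set(lexicon)
    pairs = set()
    for w in lexicon:
        for i, s in enumerate(w):
            if s == x:
                alt = w[:i] + (y,) + w[i + 1:]   # substitute y for x at position i
                if alt in lex:
                    pairs.add((w, alt))
    return pairs

# word forms represented as tuples of segments
lexicon = [("r", "a", "t"), ("r", "a", "d"), ("t", "a", "d"), ("d", "a", "d")]
print(minimal_pairs(lexicon, "t", "d"))
```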

Learning to Discover, Ground and Use Words with Segmental Neural Language Models [article]

Kazuya Kawakami, Chris Dyer, Phil Blunsom
2019 arXiv   pre-print
In contrast to previous segmentation models that treat word segmentation as an isolated task, our model unifies word discovery, learning how words fit together to form sentences, and, by conditioning the  ...  Experiments show that the unconditional model learns predictive distributions better than character LSTM models, discovers words competitively with nonparametric Bayesian word segmentation models, and  ...  For example, given a reference segmentation "do you see a boy" and a prediction "doyou see a boy", 4 words are discovered in the prediction where the reference has  ...
arXiv:1811.09353v2 fatcat:gdbodihmkbgr3ccua4hh753g6a
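The reference/prediction example in this entry is scored with word token precision, recall and F-score: a predicted word counts as correct only if both of its boundaries match the reference. A small self-contained sketch of that metric, reproducing the example:

```python
# Word token F-score: compare character spans of predicted vs. reference words.
def spans(words):
    out, start = set(), 0
    for w in words:
        out.add((start, start + len(w)))
        start += len(w)
    return out

def token_f1(reference, prediction):
    ref, pred = spans(reference), spans(prediction)
    correct = len(ref & pred)
    precision = correct / len(pred)
    recall = correct / len(ref)
    return 2 * precision * recall / (precision + recall)

ref = "do you see a boy".split()
pred = "doyou see a boy".split()
# "see", "a", "boy" are correct: precision 3/4, recall 3/5, F ~ 0.667
print(round(token_f1(ref, pred), 3))
```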

A Computational Model of Early Language Acquisition from Audiovisual Experiences of Young Infants

Okko Räsänen, Khazar Khorrami
2019 Interspeech 2019  
the name of the object, and using random visual labels for utterances during absence of attention.  ...  This paper presents a step towards a more realistic test of the multimodal bootstrapping hypothesis by describing a neural network model that can learn word segments and their meanings from referentially  ...  On the other hand, first segmenting words and then trying to find their referents risks suboptimal segmentation strategies, as segmentation in isolation is not a well-defined task.  ... 
doi:10.21437/interspeech.2019-1523 dblp:conf/interspeech/RasanenK19 fatcat:extdlnk365gdzm2mktbn7nsyjy

An Empirical Evaluation of Zero Resource Acoustic Unit Discovery [article]

Chunxi Liu, Jinyi Yang, Ming Sun, Santosh Kesiraju, Alena Rott, Lucas Ondel, Pegah Ghahremani, Najim Dehak, Lukas Burget, Sanjeev Khudanpur
2017 arXiv   pre-print
AUD provides an important avenue for unsupervised acoustic model training in a zero resource setting where expert-provided linguistic knowledge and transcribed speech are unavailable.  ...  Acoustic unit discovery (AUD) is a process of automatically identifying a categorical acoustic unit inventory from speech and producing corresponding acoustic unit tokenizations.  ...  the unsupervised AUD accuracies, such that we can still evaluate AUD efficacy in the zero-resource condition where no orthographic phoneme transcripts are available for the NMI measure, only word pairs.  ...
arXiv:1702.01360v1 fatcat:3ydcq25fdvc2hpiimtepwfrheq
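The NMI measure mentioned above compares frame-level discovered-unit labels against reference phoneme labels. A minimal sketch follows; the labels are made up, and the normalization here uses the geometric mean of the two label entropies, whereas the paper may use a different normalization convention.

```python
# Normalized mutual information between discovered acoustic-unit labels and
# reference phoneme labels (one label per frame).
from collections import Counter
from math import log2, sqrt

def nmi(units, phones):
    n = len(units)
    pu, pp = Counter(units), Counter(phones)
    joint = Counter(zip(units, phones))
    mi = sum(c / n * log2((c / n) / ((pu[u] / n) * (pp[p] / n)))
             for (u, p), c in joint.items())
    hu = -sum(c / n * log2(c / n) for c in pu.values())
    hp = -sum(c / n * log2(c / n) for c in pp.values())
    return mi / sqrt(hu * hp)

units  = ["u1", "u1", "u2", "u2", "u3", "u3", "u3", "u1"]
phones = ["a",  "a",  "t",  "t",  "s",  "s",  "t",  "a"]
print(round(nmi(units, phones), 3))
```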

Latent perceptual mapping with data-driven variable-length acoustic units for template-based speech recognition

Shiva Sundaram, Jerome R. Bellegarda
2012 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
Our initial work adopted a bag-of-frames strategy to represent relevant acoustic information within speech segments.  ...  The outcome can be viewed as a generalization of both conventional template-based approaches and recently proposed sparse representation solutions.  ...  A nearest-neighbor rule is then adopted to predict the phoneme label of the unknown test segment.  ...
doi:10.1109/icassp.2012.6288826 dblp:conf/icassp/SundaramB12 fatcat:ragphlsztzbnhfj76ulyiayqzu
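A simplified sketch of the nearest-neighbor labelling step described above: each segment is collapsed to a single vector (a plain frame average, standing in for the bag-of-frames representation) and the test segment takes the label of the closest template. The data, labels, and single-template-per-label setup are illustrative simplifications, not the paper's system.

```python
# 1-nearest-neighbor phoneme labelling over averaged frame features.
import numpy as np

def segment_vector(frames):
    return np.asarray(frames).mean(axis=0)      # frames: (n_frames, n_dims)

def nearest_neighbor_label(test_frames, templates):
    v = segment_vector(test_frames)
    dists = {label: np.linalg.norm(v - segment_vector(f))
             for label, f in templates.items()}
    return min(dists, key=dists.get)

rng = np.random.default_rng(1)
templates = {"aa": rng.normal(0.0, 1.0, (12, 13)),
             "iy": rng.normal(3.0, 1.0, (15, 13))}
test = rng.normal(3.0, 1.0, (10, 13))
print(nearest_neighbor_label(test, templates))  # expected: "iy"
```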

Towards Zero-Shot Learning for Automatic Phonemic Transcription

Xinjian Li, Siddharth Dalmia, David Mortensen, Juncheng Li, Alan Black, Florian Metze
2020 Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
Instead of predicting phonemes directly, we first predict distributions over articulatory attributes, and then compute phoneme distributions with a customized acoustic model.  ...  Automatic phonemic transcription tools are useful for low-resource language documentation.  ...  Acknowledgements This project was sponsored by the Defense Advanced Research Projects Agency (DARPA) Information Innovation Office (I2O), program: Low Resource Languages for Emergent Incidents (LORELEI  ... 
doi:10.1609/aaai.v34i05.6341 fatcat:ax4mqalpnjaw7jonbgxsednxrm
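One way to read "predict distributions over articulatory attributes, then compute phoneme distributions" is to score each phoneme by how well its attribute signature matches the predicted attribute probabilities and renormalize. The sketch below shows that composition under an independence assumption; the attribute inventory, signatures, and probabilities are invented, not the paper's customized acoustic model.

```python
# Compose a phoneme distribution from per-attribute probabilities and a
# binary phoneme-attribute signature table (independence assumed).
def phoneme_distribution(attr_probs, signatures):
    scores = {}
    for phone, sig in signatures.items():
        p = 1.0
        for attr, value in sig.items():
            p *= attr_probs[attr] if value else (1.0 - attr_probs[attr])
        scores[phone] = p
    total = sum(scores.values())
    return {ph: s / total for ph, s in scores.items()}

# P(attribute is "on") for one frame, as an attribute predictor might output
attr_probs = {"voiced": 0.9, "nasal": 0.1, "bilabial": 0.8}
signatures = {"m": {"voiced": True,  "nasal": True,  "bilabial": True},
              "b": {"voiced": True,  "nasal": False, "bilabial": True},
              "p": {"voiced": False, "nasal": False, "bilabial": True}}
print(phoneme_distribution(attr_probs, signatures))  # "b" dominates here
```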
Showing results 1–15 of 651