81 Hits in 8.8 sec

EARSHOT: A Minimal Neural Network Model of Incremental Human Speech Recognition

James S. Magnuson, Heejo You, Sahil Luthra, Monica Li, Hosung Nam, Monty Escabí, Kevin Brown, Paul D. Allopenna, Rachel M. Theodore, Nicholas Monto, Jay G. Rueckl
2020 Cognitive Science  
Most models of human speech recognition (HSR) have side-stepped this problem, working with abstract, idealized inputs and deferring the challenge of working with real speech.  ...  This allows the model to learn to map real speech from multiple talkers to semantic targets with high accuracy, with human-like timecourse of lexical access and phonological competition.  ...  We thank Eddie Chang and Nima Mesgarani for supplying us with data from Mesgarani et al. (2014) used to compare EARSHOT and human STG responses.  ... 
doi:10.1111/cogs.12823 pmid:32274861 fatcat:fwhc7ud7xza5xdmwbw55nj7akq

Detection of Sentence Boundaries in Polish Based on Acoustic Cues

Magdalena Igras, Bartosz Ziółko
2016 Archives of Acoustics  
Classification is performed with parameters describing phone duration and energy, speaking rate, fundamental frequency contours and frequency bands.  ...  The efficiency of the algorithm is 52% precision and 98% recall. Another significant outcome of the research is statistical models of acoustic cues correlated with punctuation in spoken Polish.  ...  Grzybowska and S. Kacprzak for their comments and suggestions, which helped us prepare this paper.  ... 
doi:10.1515/aoa-2016-0023 fatcat:kii4lw6otncmbpbio3v5i34az4
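The 52% precision / 98% recall operating point reported above follows the standard definitions over true-positive, false-positive, and false-negative counts. A minimal sketch (the counts below are hypothetical, chosen only to reproduce the reported figures, and are not from the paper):

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple:
    """Standard precision and recall from detection counts."""
    precision = tp / (tp + fp)  # fraction of detected boundaries that are real
    recall = tp / (tp + fn)     # fraction of real boundaries that are detected
    return precision, recall

# Hypothetical sentence-boundary counts yielding ~52% precision, 98% recall.
p, r = precision_recall(tp=98, fp=90, fn=2)
```

A high-recall, moderate-precision point like this is typical when the detector is tuned to miss as few boundaries as possible at the cost of false alarms.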

Shortlist B: A Bayesian model of continuous speech recognition

Dennis Norris, James M. McQueen
2008 Psychological Review  
, and evidence on lexical involvement in phonemic decision-making.  ...  abstract prelexical and lexical representations, a feedforward architecture with no online feedback, and a lexical segmentation algorithm based on the viability of chunks of the input as possible words  ...  We also thank Roel Smits for his help with this project, in particular for his work on the model's input, and Matthias Sjerps for his assistance with the figures.  ... 
doi:10.1037/0033-295x.115.2.357 pmid:18426294 fatcat:bkzshcyyarduvmeyqxdvzcmqby

Detecting word-level stress in continuous speech: A case study of Brazilian Portuguese

Simone Harmath-de Lemos
2021 Journal of Portuguese Linguistics  
These results, along with metrics obtained for vowels in pre- and post-tonic positions, indicate (i) that stress in BP is captured fairly well across speakers and genders by representations of the speech  ...  in a given language, and across languages, such as vowel duration, intensity, pitch (F0), and spectral features, and in part because measuring the acoustic correlates of stress is a task that can be  ...  The present work came to life during the 2020 pandemic and would not have seen the light of day without the invaluable support of the editors of this special issue.  ... 
doi:10.5334/jpl.238 fatcat:iazas7pxkrdyhgftpark42n2qu

From Here to Utility [chapter]

Steven Greenberg
2005 Text, Speech and Language Technology  
Keywords: Speech technology, automatic speech recognition, prosody, phonetics, spontaneous speech, syllable structure  ...  Department of Defense and the National Science Foundation.  ...  ACKNOWLEDGEMENTS The author wishes to thank Hannah Carvey, Shuangyu Chang, Jeff Good, Leah Hitchcock and Rosaria Silipo for important contributions to the research described.  ... 
doi:10.1007/1-4020-2637-4_7 fatcat:x4cqkaoqqndexbukrpvkk4x4ay

TOWARD AN UNDERSTANDING OF THE ROLE OF SPEECH RECOGNITION IN NONNATIVE SPEECH ASSESSMENT

Klaus Zechner, Isaac I. Bejar, Ramin Hemat
2007 ETS Research Report Series  
Since its inception in 1963, the TOEFL has evolved from a paper-based test to a computer-based test and, in 2005, to an Internet-based test, TOEFL iBT.  ...  The increasing availability and performance of computer-based testing have prompted more research on the automatic assessment of language and speaking proficiency.  ...  Therefore, an important metric in comparing ASR systems is their word error rate (see Jurafsky & Martin, 2000, p. 271). The architecture of an ASR system has become fairly standardized.  ... 
doi:10.1002/j.2333-8504.2007.tb02044.x fatcat:tsnlv5zc3bhr5bxf3ghz5uitrm
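Word error rate, the ASR comparison metric named in the snippet above, is the word-level Levenshtein distance between hypothesis and reference transcripts, normalised by reference length. A minimal sketch (the example strings in the test are invented, not from the report):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + insertions + deletions) / reference
    word count, via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution / match
    return dp[len(ref)][len(hyp)] / len(ref)
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions relative to a short reference.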

Artificial Intelligence, speech and language processing approaches to monitoring Alzheimer's Disease: a systematic review [article]

Sofia de la Fuente Garcia, Craig Ritchie, Saturnino Luz
2020 arXiv pre-print
Language is a valuable source of clinical information in Alzheimer's Disease, as it declines concurrently with neurodegeneration.  ...  Consequently, speech and language data have been extensively studied in connection with its diagnosis.  ...  Acknowledgements We thank Marshall Dozier, the Academic Support Librarian, for her help with the search queries and the PROSPERO protocol.  ... 
arXiv:2010.06047v1 fatcat:gowcdpj6pfddfns3gh7amtqpze

Abstracts of the 8th International Conference on Speech Motor Control Groningen, August 2022

Redactie SSTP
2022 Stem- Spraak- en Taalpathologie  
The five years since the previous conference in 2017 have yielded not only further insights in genetic, neural, physiological and developmental aspects of speech production, stuttering and other speech  ...  turning point in which large data sets are analyzed with powerful artificial intelligence and machine learning algorithms.  ...  We look forward to a stimulating and productive conference, Ben Maassen Groningen,  ... 
doi:10.21827/32.8310/2022-115 fatcat:7i3ekwzxijaobirgmxzmojt4a4

Recognizing speech in a novel accent: the motor theory of speech perception reframed

Clément Moulin-Frier, Michael A. Arbib
2013 Biological cybernetics  
The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sound produced by the speaker to phonemes in the native  ...  Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and for viewing extensions of that mirror  ...  The mirror neurons for manual actions in the macaque include a subset, the audiovisual mirror neurons, whose firing is correlated not only with performance and visual observation of a certain type of manual  ... 
doi:10.1007/s00422-013-0557-3 pmid:23754133 fatcat:m5qgdnu4xjfldluovibostcqj4

Computational modeling of phonetic and lexical learning in early language acquisition: Existing models and future directions

Okko Räsänen
2012 Speech Communication  
Specifically, the focus is on how phonetic categories and word-like units can be acquired purely on the basis of the statistical structure of speech signals, possibly aided by some articulatory or visual  ...  background knowledge, a situation faced by human infants when they first attempt to learn their native language.  ...  Moore, and the two anonymous reviewers for their invaluable comments on the manuscript.  ... 
doi:10.1016/j.specom.2012.05.001 fatcat:lxqfltnne5gt5mibfrqj2vzmhi

The analysis of speech in different temporal integration windows: cerebral lateralization as 'asymmetric sampling in time'

David Poeppel
2003 Speech Communication  
commensurate with syllabicity and intonation contours), and (2) that speech perception is mediated by both left and right auditory cortices, AST suggests a time-based perspective that maintains anatomic  ...  The AST model is motivated by observations from psychophysics and cognitive neuroscience that speak to the fractionation of auditory processing, in general, and speech perception, in particular.  ...  (3) Sound-based representations interface in task-dependent ways with other systems.  ... 
doi:10.1016/s0167-6393(02)00107-3 fatcat:gi4pjgyxzndqhbywv5j76rntk4

Silent Speech Interfaces for Speech Restoration: A Review [article]

Jose A. Gonzalez-Lopez, Alejandro Gomez-Alanis, Juan M. Martín-Doñas, José L. Pérez-Córdoba, Angel M. Gomez
2020 arXiv pre-print
In this review, we focus on the first case and present latest SSI research aimed at providing new alternative and augmentative communication methods for persons with severe speech disorders.  ...  SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication whenever normal verbal communication is not possible or not desirable.  ...  Later attempts focused on phoneme [207] and syllable [208] recognition but with a very limited dataset (3 phonemes and 6 syllables, respectively).  ... 
arXiv:2009.02110v2 fatcat:i2o4zxqko5anhn2eqivtnsd2di

Silent Speech Interfaces for Speech Restoration: A Review

Jose A. Gonzalez-Lopez, Alejandro Gomez-Alanis, Juan M. Martín-Doñas, José L. Pérez-Córdoba, Angel M. Gomez
2020 IEEE Access  
In this review, we focus on the first case and present latest SSI research aimed at providing new alternative and augmentative communication methods for persons with severe speech disorders.  ...  SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication whenever normal verbal communication is not possible or not desirable.  ...  Later attempts focused on phoneme [207] and syllable [208] recognition but with a very limited dataset (3 phonemes and 6 syllables, respectively).  ... 
doi:10.1109/access.2020.3026579 fatcat:yvvaebeavfdfrav73sfs62a5dm

An Efficient Technique to Implement Similarity Measures in Text Document Clustering using Artificial Neural Networks Algorithm

K. Selvi, R.M. Suresh
2014 Research Journal of Applied Sciences, Engineering and Technology  
All the phonemes, syllables, letters, words, or base pairs correspond according to the application.  ...  The model of similarity evaluation is the central element in developing an understanding of the variables and perceptions that encourage behavior and mediate concern.  ...  The main part of this study extends the proposed n-gram algorithm, based on Artificial Neural Networks, and matches all the phonemes, syllables, letters, words, or base pairs as per the application.  ... 
doi:10.19026/rjaset.8.1235 fatcat:pvr3bkjtmvde3bzqhqq7vemapa
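The n-gram matching over phonemes, syllables, letters, words, or base pairs described in this entry can be illustrated with a generic set-based sketch (Jaccard similarity over n-gram sets; this is a common baseline, not a reproduction of the paper's ANN-based measure):

```python
def ngrams(seq, n: int) -> set:
    """Set of contiguous n-grams from any sequence (string, word list, ...)."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def ngram_jaccard(a, b, n: int = 2) -> float:
    """Jaccard similarity between the n-gram sets of two sequences.
    Works identically for characters, phonemes, words, or base pairs."""
    set_a, set_b = ngrams(a, n), ngrams(b, n)
    if not set_a and not set_b:
        return 1.0  # two empty sequences are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)
```

Because `ngrams` accepts any indexable sequence, the same function compares letter strings and tokenised word lists without modification, mirroring the "as per the application" framing in the abstract.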

Multimodal Language Acquisition Based on Motor Learning and Interaction [chapter]

Jonas Hörnstein, Lisa Gustavsson, José Santos-Victor, Francisco Lacerda
2010 Studies in Computational Intelligence  
Acknowledgements This work was partially supported by EU Project CONTACT and by the Fundação para a Ciência e a Tecnologia (ISR/IST pluriannual funding) through the POS Conhecimento Program that includes  ...  phonemes do not need to be pre-programmed but can emerge as a result of the interaction and can be represented in the form of vocal tract target positions.  ...  There is a strong correlation between how the robot and a human articulate the vowels.  ... 
doi:10.1007/978-3-642-05181-4_20 fatcat:fhrosfupojhxxchfl64o3a72ne
Showing results 1 — 15 out of 81 results