Modeling human activities as speech
2011
CVPR 2011
While the essence of the speech signal is the variation of air pressure in time, our method models activities as the likelihood time series of action associated local interest patterns. ...
Human activity recognition and speech recognition appear to be two loosely related research areas. ...
We are motivated to model human activities as speech due to the analogies between their production mechanisms. ...
doi:10.1109/cvpr.2011.5995555
dblp:conf/cvpr/ChenA11
fatcat:cfakbtxj4rbrjdn2gbw633urhi
Communication culture and speech etiquette
2022
Ренессанс в парадигме новаций образования и технологий в XXI веке (Renaissance in the Paradigm of Innovations in Education and Technology in the 21st Century)
In turn, Yakubinsky noted human speech activity as a diverse phenomenon, determined by all the complex variety of factors and functions [8, 17-58]. ...
Therefore, the socialization of the personality takes place, during which the child's thinking and models of his behavior are formed, therefore the social function of the language as a means of communication ...
doi:10.47689/innovations-in-edu-vol-iss1-pp39-40
fatcat:mjcdlaqzzna6vo7b22kjtwt3a4
Optimizing Speech Recognition Using a Computational Model of Human Hearing: Effect of Noise Type and Efferent Time Constants
2020
IEEE Access
In this study, an auditory model with efferent-inspired processing provided the front-end to an automatic-speech-recognition system (ASR), used as a tool to evaluate speech recognition with changes in ...
The model improves our understanding of the complex interactions involved in speech recognition in noise, and could be used to simulate the difficulties of speech perception in noise as a consequence of ...
In general, the speech recognition accuracy obtained is lower than that observable for a human listener (the human-machine speech gap), as seen in a study of human listeners' performance on the same speech ...
doi:10.1109/access.2020.2981885
fatcat:vly5mjpde5exjejlp2xerldxgy
Visualizing Phoneme Category Adaptation in Deep Neural Networks
2018
Interspeech 2018
ability to serve as a model of human perceptual learning. ...
The aim of this paper is two-fold: investigate whether a deep neural network-based (DNN) ASR system can adapt to only a few examples of ambiguous speech as humans have been found to do; investigate a DNN's ...
, and show that DNNs can be used as a way to investigate human speech processing. ...
doi:10.21437/interspeech.2018-1707
dblp:conf/interspeech/ScharenborgTHD18
fatcat:mr47gspfjvat7i7m7y62g2ugx4
Bridging automatic speech recognition and psycholinguistics: Extending Shortlist to an end-to-end model of human speech recognition (L)
2003
Journal of the Acoustical Society of America
Experiments based on "real-life" speech highlight critical limitations posed by some of the simplifying assumptions made in models of human speech recognition. ...
This letter evaluates potential benefits of combining human speech recognition (HSR) and automatic speech recognition by building a joint model of an automatic phone recognizer (APR) and a computational ...
referred to as "joint model") that can be regarded as an end-to-end model of human speech recognition. ...
doi:10.1121/1.1624065
pmid:14714783
fatcat:qkohvlql3jdpxku23u64bxjjiu
AVA-Speech: A Densely Labeled Dataset of Speech Activity in Movies
[article]
2018
arXiv
pre-print
Speech activity detection (or endpointing) is an important processing step for applications such as speech recognition, language identification and speaker diarization. ...
The labels in the dataset annotate three different speech activity conditions: clean speech, speech co-occurring with music, and speech co-occurring with noise, which enable analysis of model performance ...
instant and keep the max score as the model prediction for Speech-Active. ...
arXiv:1808.00606v2
fatcat:ipttxwsxrnchjkbmv3conxc2hq
Deep Residual Local Feature Learning for Speech Emotion Recognition
[chapter]
2020
Lecture Notes in Computer Science
Speech Emotion Recognition (SER) is becoming a key role in global business today to improve service efficiency, like call center services. Recent SERs were based on a deep learning approach. ...
detail in deeper layers using residual learning for solving vanishing gradient and reducing overfitting; and MLP is adopted to find the relationship of learning and discover probability for predicted speech ...
Here, we briefly describe three important components of speech signals: glottal flow, prosody, and human hearing. Glottal flow can be viewed as a source of speech signals [25]. ...
doi:10.1007/978-3-030-63830-6_21
fatcat:26tbej4bmfe5zer5kuav2c7oky
From Birdsong to Human Speech Recognition: Bayesian Inference on a Hierarchy of Nonlinear Dynamical Systems
2013
PLoS Computational Biology
level and translated a birdsong model into a novel human sound learning and recognition model with an emphasis on speech. ...
We show that the resulting Bayesian model with a hierarchy of nonlinear dynamical systems can learn speech samples such as words rapidly and recognize them robustly, even in adverse conditions. ...
Conceptual overview: A generative model of human speech. As a model, we employ a novel Bayesian recognition method of dynamical sensory input such as birdsong and speech. ...
doi:10.1371/journal.pcbi.1003219
pmid:24068902
pmcid:PMC3772045
fatcat:wriv3xx3rverrjjzrcgxvt6mcy
EARSHOT: A Minimal Neural Network Model of Incremental Human Speech Recognition
2020
Cognitive Science
Most models of human speech recognition (HSR) have side-stepped this problem, working with abstract, idealized inputs and deferring the challenge of working with real speech. ...
This allows the model to learn to map real speech from multiple talkers to semantic targets with high accuracy, with human-like timecourse of lexical access and phonological competition. ...
We thank Eddie Chang and Nima Mesgarani for supplying us with data from Mesgarani et al. (2014) used to compare EARSHOT and human STG responses. ...
doi:10.1111/cogs.12823
pmid:32274861
fatcat:fwhc7ud7xza5xdmwbw55nj7akq
AVA-Speech: A Densely Labeled Dataset of Speech Activity in Movies
2018
Interspeech 2018
Speech activity detection (or endpointing) is an important processing step for applications such as speech recognition, language identification and speaker diarization. ...
The labels in the dataset annotate three different speech activity conditions: clean speech, speech co-occurring with music, and speech co-occurring with noise, which enable analysis of model performance ...
instant and keep the max score as the model prediction for Speech-Active. ...
doi:10.21437/interspeech.2018-2028
dblp:conf/interspeech/ChaudhuriREGKMP18
fatcat:tkldyaebb5cj5peipc5s4xxwqy
Exploring the Dependencies between Behavioral and Neuro-physiological Time-series Extracted from Conversations between Humans and Artificial Agents
2020
Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods
The second step consists in applying machine learning models to predict brain activity on the basis of various aspects of behavior given knowledge about the functional role of the areas under scrutiny. ...
Here, we use a unique corpus including fMRI and behavior recorded when participants discussed with a human or a conversational robot. ...
These recordings include speech produced by the two interlocutors, as well as eyetracking signals of the participant while viewing videos of the human or artificial interlocutor. ...
doi:10.5220/0008989503530360
dblp:conf/icpram/HmamoucheO0C20
fatcat:ykrea3wqgffc5mji63f7lkae5y
A neural theory of speech acquisition and production
2012
Journal of Neurolinguistics
The DIVA model thus provides a well-defined framework for guiding the interpretation of experimental results related to the putative human speech mirror system. ...
As the DIVA model is defined both computationally and anatomically, it is ideal for generating precise predictions concerning speech-related brain activation patterns observed during functional imaging ...
Simulated fMRI activations from the DIVA model when performing the same speech task as the subjects in the fMRI experiment. ...
doi:10.1016/j.jneuroling.2009.08.006
pmid:22711978
pmcid:PMC3375605
fatcat:axonezm2n5bytkfr6kmf4g2qsm
Modeling human word recognition with sequences of artificial neurons
[chapter]
1996
Lecture Notes in Computer Science
A new psycholinguistically motivated and neural network based model of human word recognition is presented. In contrast to earlier models it uses real speech as input. ...
In experiments with a small lexicon which includes groups of very similar word forms, the model meets high standards with respect to word recognition and simulates a number of well-known psycholinguistical ...
Therefore, the RAW-model (Real-speech model for Auditory Word recognition) was designed to serve as a starting point for a simulation lab which combines the use of real speech and the implementation of ...
doi:10.1007/3-540-61510-5_61
fatcat:ym4gtt4w3rf6tfwymzap75ain4
Repetition enhancement to voice identities in the dog brain
2020
Scientific Reports
In the human speech signal, cues of speech sounds and voice identities are conflated, but they are processed separately in the human brain. ...
The processing of speech sounds and voice identities is typically performed by non-primary auditory regions in humans and non-human primates. ...
The dog auditory cortex is therefore not as tuned to human vocalizations as the human auditory cortex is. ...
doi:10.1038/s41598-020-60395-7
pmid:32132562
pmcid:PMC7055288
fatcat:3vs6yyogujeqfkkf6qxya2wtzm
Open challenges in understanding development and evolution of speech forms: The roles of embodied self-organization, motivation and active exploration
2015
Journal of Phonetics
In particular, we emphasize the importance of embodied self-organization, as well as the role of mechanisms of motivation and active curiosity-driven exploration in speech formation. ...
Based on the analysis of mathematical models of the origins of speech forms, with a focus on their assumptions, we study the fundamental question of how speech can be formed out of non-speech, at both ...
Thus, the model relies on a pre-existing set of linguistic abilities, as well as abstracts away from many non-linguistic processes, such as sensorimotor development outside speech, non-linguistic activities ...
doi:10.1016/j.wocn.2015.09.001
fatcat:zwjgd3tbnvfz7fodancabckfru