

Milan Sigmund, Petr Zelinka
2011 Information Technology and Control  
Experimental results show that analysis of glottal excitation appears to be a useful approach to provide evidence of alcohol intoxication of over 1‰.  ...  For use in our experiments, a new collection of Czech alcoholized speech consisting of phonetically identical speech data spoken in both sober and intoxicated state was created.  ...  These changes include both the content of speech and its acoustic form expressed physically by parameters of speech signal.  ... 
doi:10.5755/j01.itc.40.2.429 fatcat:eei2waswrngx7dhidbv22bi53a

Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN

2008 IEICE transactions on information and systems  
In a distant-talking environment, the length of channel impulse response is longer than the short-term spectral analysis window.  ...  We assume that a static speech segment (such as a vowel, for example) affected by reverberation, can be modeled by a long-term cepstral analysis.  ...  Acknowledgments This work was supported by The Global COE Program "Frontiers of Intelligent Human Sensing", from the ministry of Education, Culture, Sports, Science and Technology.  ... 
doi:10.1093/ietisy/e91-d.3.457 fatcat:btggonsdi5dermxvgmompyfsru

Estimating Age in Short Utterances Based on Multi-Class Classification Approach

Ameer A. Badr, Alia K. Abdul-Hassan
2021 Computers Materials & Continua  
The results show a clear relative improvement in terms of MAE up to 28% and 10% for female and male speakers, respectively, in comparison to related works that utilized the TIMIT dataset.  ...  In this study, an automatic system is proposed to estimate age in short speech utterances without depending on the text as well as the speaker.  ...  Funding Statement: The authors received no specific funding for this study. Conflicts of Interest: The authors declare that they have no conflict of interest to report regarding the present study.  ... 
doi:10.32604/cmc.2021.016732 fatcat:qycmsoyvuzdxvadjxbtpv4q2su

Statistical Analysis Of Glottal Pulses In Speech Under Psychological Stress

Zdenek Brabec, Ales Prokes, Milan Sigmund
2008 Zenodo  
Publication in the conference proceedings of EUSIPCO, Lausanne, Switzerland, 2008  ...  Such a speech cue would allow an analysis without the physical presence of the speaker. In this paper, we focus on actual effects of stress on speech signal.  ...  The broad sense reflects the underlying long-term stress and the narrow sense refers to the short-term excitation of the mind that prompts people to act.  ... 
doi:10.5281/zenodo.41137 fatcat:fmkgsaggnbc45hcgm2bazlqf34

Influence of Speaker-Specific Parameters on Speech Separation Systems

David Ditter, Timo Gerkmann
2019 Interspeech 2019  
Furthermore we conclude that current systems separate (short-term) speaking styles rather than (long-term) speaker characteristics.  ...  Our analysis allows us to do performance predictions for given speakers based on measurements of their fundamental frequency.  ...  To detect the voiced speech frames and formants we again use Praat [14, 15] where the formant analysis is based on a short-term Linear Predictive Coding (LPC) analysis [17].  ... 
doi:10.21437/interspeech.2019-2459 dblp:conf/interspeech/DitterG19 fatcat:uez3x2ys6zhenm577qhffjv6bm

Creation of a Nigerian Voice Corpus for Indigenous Speaker Recognition

Adekunle A. Akinrinmade, Emmanuel Adetiba, Joke A. Badejo, Aderemi A. Atayero
2019 Journal of Physics, Conference Series  
Silent frames were excluded using short-term spectral energy threshold for Voice Activity Detection (VAD).  ...  The creation of such an indigenous database (or corpus) can open doors to Nigerian automatic speaker recognition as well as for indigenous language, ethnicity, gender, age group and emotion classification  ... 
doi:10.1088/1742-6596/1378/3/032011 fatcat:qnnysdwr5ba4ra7rpeaw7qxgv4

Speaker Discrimination Using Long-Term Spectrum of Speech

Milan Sigmund
2019 Information Technology and Control  
In this article, we investigate a specific long-term speech spectrum with respect to its use for speaker recognition.  ...  The long-term effect was satisfied by averaging short-term autocorrelation coefficients over the whole utterance.  ...  For research, infrastructure of the SIX Center was used.  ... 
doi:10.5755/j01.itc.48.3.21248 fatcat:ofwj4f7yiraahbg7vh4m2ii7eq

Accommodating sample size effect on similarity measures in speaker clustering

Alexander Haubold, John R. Kender
2008 2008 IEEE International Conference on Multimedia and Expo  
Speaker data is represented as Mel Frequency Cepstral Coefficient (MFCC) vectors, and features are compared using the KL2 metric to form clusters of speech segments for each speaker.  ...  We investigate the symmetric Kullback-Leibler (KL2) distance in speaker clustering and its unreported effects for differently-sized feature matrices.  ...  In the domain of lecture videos some work is available for discussion scene analysis [4] . Speaker clustering is based on the comparison of features extracted from a speaker segmented audio stream.  ... 
doi:10.1109/icme.2008.4607737 dblp:conf/icmcs/HauboldK08 fatcat:q4u2zy7tbncupdl5vxoqco4sjy

Voice Pleasantness of Female Voices and the Assessment of Physical Characteristics [chapter]

Vivien Zuta
2009 Lecture Notes in Computer Science  
This study completes the collected voice parameters, which count so far as indicators of physical attributes and as factors for voice attractiveness.  ...  It has been demonstrated that there seem to be non-linguistic patterns from which listeners refer to the appearance of a speaker.  ...  For instance the statistical analyses for Acoustic Parameters and Speakers' Characteristics are currently carried out and of course other examinations about the physical parameters that lead listeners  ... 
doi:10.1007/978-3-642-03320-9_12 fatcat:mafuuiwnzjetvoxn6u7jbl57wa


2001 Pamukkale University Journal of Engineering Sciences  
In this paper, we present a general view of speech features and well known classifiers originally developed for text-independent speaker recognition systems.  ...  Extensive research in the past has been directed towards finding effective speech characteristics for speaker recognition.  ...  By modeling the underlying acoustic classes, the speaker model is better able to represent the short-term variations of a person's voice, allowing high identification performance for short utterances.  ... 
doaj:a57ff6fae7794c0f8b7295d39988c79c fatcat:urdkkjqwavg4fk52zlgvpmloty

An Efficient Speaker Diarization using Privacy Preserving Audio Features Based of Speech/Non Speech Detection

S Sathyapriya, A Indhumathi
2014 International Journal of Computer Trends and Technology  
Privacy-sensitive audio features for speaker diarization in multiparty conversations: i.e., a set of audio features having low linguistic information for speaker diarization in a single and multiple distant  ...  In addition a comprehensive analysis of these features has been provided for the two tasks in a variety of conditions, such as indoor (predominantly) and outdoor audio.  ...  Here a dynamic conversation analysis is carried out using nonverbal prompts based on short-term autocorrelation and relative spectral entropy.  ... 
doi:10.14445/22312803/ijctt-v9p136 fatcat:3tgyj4ju3ncwth3n4tg5qcd4h4

Development of an Easy-to-Use Spanish Health Literacy Test

Shoou-Yih D. Lee, Deborah E. Bender, Rafael E. Ruiz, Young Ik Cho
2006 Health Services Research  
The study was intended to develop and validate a health literacy test, termed the Short Assessment of Health Literacy for Spanish-speaking Adults (SAHLSA), for the Spanish-speaking population.  ...  The design of SAHLSA was based on the Rapid Estimate of Adult Literacy in Medicine (REALM), known as the most easily administered tool for assessing health literacy in English.  ...  Thus, the relative fit of the two models and the parameters would be estimated using the MULTILOG program, suitable for dichotomous and polytomous item analysis (Thissen 1991).  ... 
doi:10.1111/j.1475-6773.2006.00532.x pmid:16899014 pmcid:PMC1797080 fatcat:zdczohnednffzgajsgj4cil2pa

Text-independent speaker recognition from a large linguistically unconstrained time-spaced data base

J. Markel, S. Davis
1979 IEEE Transactions on Acoustics Speech and Signal Processing  
the effectiveness of long-term average features for speaker recognition.  ...  For L v corresponding to approximately thirty-nine seconds of speech, text-independent results (no linguistic constraints embedded into the data base) of 98.05% for speaker identification and 4.25% for  ...  Beatrice Oshika for organization of the data base, and Rob Arnott and Ted Applebaum for programming assistance with the real-time system and the non-real-time batch processing programs.  ... 
doi:10.1109/tassp.1979.1163201 fatcat:w2o3lzh7y5e2la4sp6sgbm5gq4

Effect of Feature Extraction Techniques on the Performance of Speaker Identification

M. Elkholy, N. Korany
2013 International Journal of Signal Processing Systems  
Terms-speaker recognition, speaker identification, vector quantization, relative spectral technique-perceptual linear predictive (RASTA-PLP), perceptual linear prediction (PLP), mel frequency cepstral  ...  One word per speaker is used within the train phase and the identification rate is calculated for each feature extraction technique.  ...  LPC analysis is an effective method to estimate the main parameters of speech signals. The LPC coefficients are obtained for each frame independently.  ... 
doi:10.12720/ijsps.1.1.93-97 fatcat:idljt43in5bjtk6sdkljr4nfdy

Model-based face and lip animation for interactive virtual reality applications

Michel D. Bondy, Nicolas D. Georganas, Emil M. Petriu, Dorina C. Petriu, Marius D. Cordea, Thomas E. Whalen
2001 Proceedings of the ninth ACM international conference on Multimedia - MULTIMEDIA '01  
In this paper, we describe an experimental performance-driven animation system for an avatar face using model-based video coding and audio-track driven lip animation.  ...  Categories and Subject Descriptors [Multimedia tools, end-systems and applications]: multi-modal interaction and integration, design and applications of virtual environments.  ...  This model provides a simplified version of the physically based animation using only facial muscles for the activation of the neighboring nodes in a facial mesh.  ... 
doi:10.1145/500213.500242 fatcat:o6jbsw667jaullwsbb3grcsaiy