Showing results 1–15 of 6,385

Audio-driven Emotional Speech Animation [article]

Constantinos Charalambous, Zerrin Yumak, A. Frank Van Der Stappen
2018 Eurographics State of the Art Reports  
We propose a procedural audio-driven speech animation method that takes into account emotional variations in speech.  ...  The expressive speech model matches the pitch and intensity variations in audio to individual visemes.  ...  [TT14], we also added phoneme substitution and deletion rules. We define a phoneme duration threshold below which several consonants (e.g., h, t) are dropped from speech.  ... 
doi:10.2312/egp.20181019 fatcat:s6lktjrlczhlxjqe5dyeb5aig4

Non-Native Pronunciation Variation Modeling for Automatic Speech Recognition [chapter]

Hong Kook Kim, Mina Kim, Yoo Rhee Oh
2010 Advances in Speech Recognition  
In order to explore the benefits of speech communication in devices, much research has been carried out over the past several decades.  ...  Third, pronunciation modeling approaches derive pronunciation variant rules from non-native speech and apply the derived rules to pronunciation models for non-native speech (Amdal et al.  ...  Third, if more than half of the neighboring phonemes of X in Eq. (7) differ from the neighboring phonemes of the target phoneme Y, this rule pattern is removed from the rule pattern set.  ... 
doi:10.5772/10112 fatcat:omyy23c35fc3dhau3vcmvjlscu

Hybrid statistical pronunciation models designed to be trained by a medium-size corpus

Bahram Vazirnezhad, Farshad Almasganj, Seyed Mohammad Ahadi
2009 Computer Speech and Language  
In addition, in continuous speech, all sorts of interactions may take place between words, resulting in various phonological processes. However, for isolated speech the situation is different.  ...  Nowadays, we face a range of ASR system applications from carefully read speech to ordinary conversational speech; and it is clear that pronunciation variation happens to a greater extent in natural speech  ...  Acknowledgements The authors would like to express their sincere thanks to Iran Telecommunication Research Center (ITRC) and Research Center of Intelligent Signal Processing (RCISP) for their continual  ... 
doi:10.1016/j.csl.2008.02.001 fatcat:kcmwbbjz6relhksnxddinqiszy

A Review on Different Approaches for Speech Recognition System

Suman K. Saksamudre, P. P. Shrishrimal, R. R. Deshmukh
2015 International Journal of Computer Applications  
This paper presents the basic idea of speech recognition, the proposed types of speech recognition, issues in speech recognition, and different useful approaches for feature extraction of the speech signal with  ...  Nowadays, research in speech recognition is motivated by ASR systems with large vocabularies that support speaker-independent operation and continuous speech in different languages.  ...  The expert knowledge about variation in speech is hand-coded into a system.  ... 
doi:10.5120/20284-2839 fatcat:tt4snelqkzhf7hzoceyekxji74

Speaker adapted dynamic lexicons containing phonetic deviations of words

Bahram Vazirnezhad, Farshad Almasganj, Seyed Mohammad Ahadi, Ari Chanen
2009 Journal of Zhejiang University: Science A  
Speaker variability is an important source of speech variation, which makes continuous speech recognition a difficult task.  ...  Employing the set of speaker-adapted dynamic lexicons in a Farsi (Persian) continuous speech recognition task results in word error rate reductions of as much as 10.1% in a speaker-dependent scenario and  ...  ACKNOWLEDGEMENTS We would like to express our sincere thanks to the Iran Telecommunication Research Center (ITRC) and the Research Center of Intelligent Signal Processing (RCISP) for their continual support  ... 
doi:10.1631/jzus.a0820761 fatcat:mflnayovardqldxn6qgpvg7tie

Visyllable Based Speech Animation

Sumedha Kshirsagar, Nadia Magnenat-Thalmann
2003 Computer graphics forum (Print)  
Traditionally, the speech animation of 3D synthetic faces involves extraction of visemes from input speech followed by the application of co-articulation rules to generate realistic animation.  ...  Visemes are the visual counterparts of phonemes.  ...  Both of these approaches are based on the classification of phoneme groups and their observed interaction during speech pronunciation.  ... 
doi:10.1111/1467-8659.t01-2-00711 fatcat:ofhobddvf5dn7mi6ny6qjpzi3e

Design of an Automatic English Pronunciation Error Correction System Based on Radio Magnetic Pronunciation Recording Devices

Zhang Shufang, Gengxin Sun
2021 Journal of Sensors  
of learner speech and analyze the mapping relationship between the resulting mispronunciation and the corresponding standard pronunciation to automatically generate additional phoneme confusion rules.  ...  In this paper, a system for automatic detection and correction of mispronunciation by native Chinese learners of English using speech recognition technology is designed with the help of radiomagnetic pronunciation  ...  The error rules in the learner's pronunciation are exploited and integrated into the speech recognizer to detect and diagnose the possible error categories in the learner's phoneme pronunciation  ... 
doi:10.1155/2021/5946228 fatcat:ikrqnf3pbvdhxebayzmymvipka

Speech Recognition of Isolated Words using a New Speech Database in Sylheti

2019 International journal of recent technology and engineering  
Although many interactive speech applications in "well-resourced" major languages are being developed, the use of these applications is still limited due to the language barrier.  ...  With the advancements in the field of artificial intelligence, speech recognition-based applications have become more and more popular in recent years.  ...  to the three rules-of-thumb mentioned in Section 4(C).  ... 
doi:10.35940/ijrte.c5874.098319 fatcat:5w33jqvuhfh4zm5uowwgnvyape

Design and Development of a Prosody Generator for Arabic TTS Systems

Zied Mnasri, Fatouma Boukadida, Noureddine Ellouze
2010 International Journal of Computer Applications  
With the aim of promoting Arabic TTS synthesis, an Integrated Model of Arabic Prosody for Speech Synthesis (IMAPSS) tool has been designed to integrate our developed models for text analysis, NN-based  ...  phonemic duration prediction and a Fujisaki-inspired F0 contour.  ...  In fact, any language needs to be processed on its own, in order to extract its characteristics, meet its requirements, and especially model the dynamics of its prosodic feature variations.  ... 
doi:10.5120/1641-2206 fatcat:3nhi6j322bg67kwl7nnfgr5auu

Design and Development of a Prosody Generator for Arabic TTS Systems

Zied Mnasri, Fatouma Boukadida, Noureddine Ellouze
2011 International Journal of Applied Information Systems  
With the aim of promoting Arabic TTS synthesis, an Integrated Model of Arabic Prosody for Speech Synthesis (IMAPSS) tool has been designed to integrate our developed models for text analysis, NN-based  ...  phonemic duration prediction and a Fujisaki-inspired F0 contour.  ...  In fact, any language needs to be processed on its own, in order to extract its characteristics, meet its requirements, and especially model the dynamics of its prosodic feature variations.  ... 
doi:10.5120/ijais-3650 fatcat:257jem4hfbfq3gudpsy4pkivrm

ANFIS for Tamil Phoneme Classification

2019 International Journal of Engineering and Advanced Technology  
In spite of a wide range of research in this field, here we examine the power of ANFIS for the little-explored Tamil phoneme recognition problem.  ...  Most research in this area revolves around trying to model the patterns of features observed in the speech spectra using Hidden Markov Models (HMMs) and various types of neural networks such as deep recurrent  ...  This involves several steps, including feature extraction from raw data, segmentation of continuous speech into smaller units like syllables or phonemes, classifying the phonemes and identifying the  ... 
doi:10.35940/ijeat.f8804.088619 fatcat:rgefl6a26jcgjmqu2u4cdlcyh4

Chart-driven Connectionist Categorial Parsing of Spoken Korean [article]

WonIl Lee, Geunbae Lee, Jong-Hyeok Lee
1995 arXiv   pre-print
In this paper, we developed a phoneme-level integration model of speech and linguistic processing through general morphological analysis for agglutinative languages and an efficient parsing scheme for  ...  While most of the speech and natural language systems developed for English and other Indo-European languages neglect morphological processing and integrate speech and natural language at  ...  On the speech recognition side, recognition must be at the phoneme level for large-vocabulary continuous speech, and the speech recognition module must provide the right level of outputs to the natural language  ... 
arXiv:cmp-lg/9511005v1 fatcat:6cacgofktrc6xbihlh5cb4xl4q

Backend Tools for Speech Synthesis in Speech Processing

K. M. Shiva Prasad, G. N. Kodanda Ramaiah, M. B. Manjunatha
2017 Indian Journal of Science and Technology  
So far, research has not succeeded in generating a speech signal directly; usually researchers extract the speech parameters from recorded speech and synthesize the original signal from them.  ...  A synthesizer can be viewed as a mathematical model of the vocal tract, extracting acoustic/vocal features to produce artificially generated speech output.  ...  Phoneme-based synthesis: this type of synthesizer synthesizes continuous speech phonetically with an unlimited vocabulary.  ... 
doi:10.17485/ijst/2017/v10i1/109410 fatcat:6nmxbugzrvcvnhtgkmngxi6iia

Can we Generate Emotional Pronunciations for Expressive Speech Synthesis?

Marie Tahon, Gwenole Lecorve, Damien Lolive
2018 IEEE Transactions on Affective Computing  
In the field of expressive speech synthesis, a lot of work has been conducted on suprasegmental prosodic features, while little has been done on pronunciation variants.  ...  However, prosody is highly related to the sequence of phonemes to be expressed. This article raises two issues in the generation of emotional pronunciations for TTS systems.  ...  In order to take into consideration other aspects of expressive voice, such as social cues, intention, or interactive cues, the complex nature of affect in speech can be described with continuous dimensions  ... 
doi:10.1109/taffc.2018.2828429 fatcat:lkw5eb5bqzauldfai3obsehkpm

Automatic Speech Recognition (ASR) Systems for Learning Arabic Language and Al-Quran Recitation: A Review

Nazik O'mar Balula, Mohsen Rashwan, Shrief Abdou
2021 International journal of computer science and mobile computing  
of natural speech and recognition.  ...  ASR systems developed for the Arabic language help Arabs and non-Arabs learn Arabic, and thus Al-Quran recitation and memorization, in the proper way according to recitation rules (Tajweed).  ...  Instead of a predefined dictionary, the study predicted allophonic variations of speech by using a set of language-dependent grapheme-to-allophone rules.  ... 
doi:10.47760/ijcsmc.2021.v10i07.013 fatcat:kdj6bsxhhfhujapvfofzlvddq4