Improving Word Recognition in Speech Transcriptions by Decision-Level Fusion of Stemming and Two-Way Phoneme Pruning [chapter]

Sunakshi Mehra, Seba Susan
2021 Communications in Computer and Information Science  
After obtaining results of stemming and two-way phoneme pruning, we applied decision-level fusion and that led to an improvement of word recognition rate up to 32.96%.  ...  We introduce an unsupervised approach for correcting highly imperfect speech transcriptions based on a decision-level fusion of stemming and two-way phoneme pruning.  ...  II of two-way phoneme pruning are combined using a decision level score fusion, by which scheme, if any of the stages (stem, vowel + plosives, vowel + fricatives) identify a given word in the transcript  ... 
doi:10.1007/978-981-16-0401-0_19 fatcat:3z2yu6rolnfzrdeggxnhsalamy

Emotion recognition from speech: Putting ASR in the loop

Bjorn Schuller, Anton Batliner, Stefan Steidl, Dino Seppi
2009 2009 IEEE International Conference on Acoustics, Speech and Signal Processing  
This paper investigates the automatic recognition of emotion from spoken words by vector space modeling vs. string kernels which have not been investigated in this respect, yet.  ...  Apart from the spoken content directly, we integrate Part-of-Speech and higher semantic tagging in our analyses.  ...  This initiative was taken in the European Network of Excellence HUMAINE under the name CEICES (Combining Efforts for Improving automatic Classification of Emotional user States).  ... 
doi:10.1109/icassp.2009.4960651 dblp:conf/icassp/SchullerBSS09 fatcat:47l3qolvnnhedjsyoykemsdzvq

Advances In Large Vocabulary Continuous Speech Recognition In Greek: Modeling And Nonlinear Features

Petros Maragos, Gerasimos Potamianos, Isidoros Rodomagoulakis
2013 Zenodo  
Publication in the conference proceedings of EUSIPCO, Marrakech, Morocco, 2013  ...  Svaizer, and L. Cristoforetti of Fondazione Bruno Kessler Italy, for providing the simulated data for distant speech recognition and Figure 2 .  ...  Stem-based approaches [9] have been used to cope with the large vocabulary issues but the reported WER improvement in recognition was minor.  ... 
doi:10.5281/zenodo.43691 fatcat:z26k5fidh5cwbfji522y352qpy

Automatic Summarization

Martha Larson
2012 Foundations and Trends in Information Retrieval  
This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues.  ...  Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings.  ...  The advantages of using word-level transcripts are twofold. First, an LVCSR system uses a word-level language model spanning a context much larger than that covered by phoneme-based recognizers.  ... 
doi:10.1561/1500000020 fatcat:o424mjxnp5abbexhjsobtom2ry

Searching spontaneous conversational speech

Franciska de Jong, Douglas W. Oard, Roeland Ordelman, Stephan Raaijmakers
2007 SIGIR Forum  
. • The redundancy present in human language meant that search effectiveness held up well over a reasonable range of transcription accuracy. • Sufficiently accurate Large-Vocabulary Continuous Speech Recognition  ...  (LVCSR) systems could be built for the planned speech of news announcers.  ...  This work was funded in part by NSF IIS award 0122466 (MALACH).  ... 
doi:10.1145/1328964.1328982 fatcat:wwpzqq7ndrfedh4imhoznvccl4

Learning from Past Mistakes: Improving Automatic Speech Recognition Output via Noisy-Clean Phrase Context Modeling [article]

Prashanth Gurunath Shivakumar, Haoqi Li, Kevin Knight, Panayiotis Georgiou
2019 arXiv   pre-print
Automatic speech recognition (ASR) systems often make unrecoverable errors due to subsystem pruning (acoustic, language and pronunciation models); for example pruning words due to acoustics using short-term  ...  Finally, we present an extensive analysis of the type of errors corrected by our system.  ...  INTRODUCTION Due to the complexity of human language and quality of speech signals, improving performance of automatic speech recognition (ASR) is still a challenging task.  ... 
arXiv:1802.02607v2 fatcat:g54avqptmfa3xl3gr73mqkicwa

Learning from past mistakes: improving automatic speech recognition output via noisy-clean phrase context modeling

Prashanth Gurunath Shivakumar, Haoqi Li, Kevin Knight, Panayiotis Georgiou
2019 APSIPA Transactions on Signal and Information Processing  
Automatic speech recognition (ASR) systems often make unrecoverable errors due to subsystem pruning (acoustic, language and pronunciation models); for example, pruning words due to acoustics using short-term  ...  Finally, we present an extensive analysis of the type of errors corrected by our system.  ...  In all of these the errors of the ASR may stem from realistic constraints of the decoding system and pruning structure, while the proposed system could exploit very long context to improve the ASR output  ... 
doi:10.1017/atsip.2018.31 fatcat:pjrqvdkvszgi7nshq6h4bpu43y

The Automatic Recognition of Emotions in Speech [chapter]

Anton Batliner, Björn Schuller, Dino Seppi, Stefan Steidl, Laurence Devillers, Laurence Vidrascu, Thurid Vogt, Vered Aharonson, Noam Amir
2010 Cognitive Technologies  
In this chapter, we focus on the automatic recognition of emotional states using acoustic and linguistic parameters as features, and classifiers as tools to predict the 'correct' emotional states.  ...  We first sketch history and state of the art in this field; then we describe the process of 'corpus engineering', i.e. the design and recording of databases, the annotation of emotional states, and further  ...  Stemming stands for clustering of morphological variants, i.e. flexions (e.g. by declination or conjugation), of a word by its stem in a lexeme.  ... 
doi:10.1007/978-3-642-15184-2_6 fatcat:2n7inmzlafathg2nrrk3rzmwzy

Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge

Björn Schuller, Anton Batliner, Stefan Steidl, Dino Seppi
2011 Speech Communication  
This is a lesson to be learnt by the community, the same way as the Automatic Speech Recognition (ASR) community had to learn that there is a multiplicity of registers and varieties, if it comes to non-read  ...  More than a decade has passed since research on automatic recognition of emotion from speech has become a new field of research in line with its 'big brothers' speech and speaker recognition.  ...  The former choice is called Stemming, i.e., the clustering of morphological variants of a word by its stem into a lexeme.  ... 
doi:10.1016/j.specom.2011.01.011 fatcat:x5jedtnwojdprkybbkxnah2dai

Distant speech recognition for home automation: Preliminary experimental results in a smart home

Benjamin Lecouteux, Michel Vacher, Francois Portet
2011 2011 6th Conference on Speech Technology and Human-Computer Dialogue (SpeD)  
The study focused on two tasks: distant speech recognition and sentence spotting (e.g., recognition of domotic orders).  ...  Fusion of ASR outputs by consensus and with a triggered language model (using a priori knowledge) were investigated.  ...  Bonnefond and S. Pons for their support during the experiment.  ... 
doi:10.1109/sped.2011.5940728 dblp:conf/sped/LecouteuxVP11 fatcat:2i5w5eihwne3zovrphna636p5u

Subword-based approaches for spoken document retrieval

Kenney Ng, Victor W. Zue
2000 Speech Communication  
We investigate the use of subword unit representations for SDR as an alternative to words generated by either keyword spotting or continuous speech recognition.  ...  This is accomplished by developing new recognizer and retrieval models where the interface between the two  ...  In addition, the constraints on the combinations of phonemes within words can also be expressed by using structural units intermediate between phonemes and words, i.e., syllables (Chomsky and Halle 1968  ... 
doi:10.1016/s0167-6393(00)00008-x fatcat:4jig4v5w25gqpjmbej6k2x2byq

Cross-language Information Retrieval [article]

Petra Galuščáková, Douglas W. Oard, Suraj Nair
2021 arXiv   pre-print
Two key assumptions shape the usual view of ranked retrieval: (1) that the searcher can choose words for their query that might appear in the documents that they wish to see, and (2) that ranking retrieved  ...  When the documents to be searched are in a language not known by the searcher, neither assumption is true. In such cases, Cross-Language Information Retrieval (CLIR) is needed.  ...  Acknowledgements This work has been supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via contract FA8650-17-C-9117  ... 
arXiv:2111.05988v1 fatcat:fgnaux4lcbe5jlpczhbxka5cqq

BASIL: Effective Near-Duplicate Image Detection Using Gene Sequence Alignment [chapter]

Hung-sik Kim, Hau-Wen Chang, Jeongkyu Lee, Dongwon Lee
2010 Lecture Notes in Computer Science  
We propose two unsupervised feature selection methods based on the notions of Non-Localness and Geometric-Localness to prune noisy data in the content.  ...  In the dominance of social networks era, vast information is created and shared across the world each day.  ...  of sub-word (phonemes, syllables, words) on perfect transcripts and ASR-generated transcript.  ... 
doi:10.1007/978-3-642-12275-0_22 fatcat:ou4wo4a6efdabkipzbkaxd5cyi

Message from the general chair

Benjamin C. Lee
2015 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)  
Our end system outperforms the state-of-the-art baseline by 2 B³ F1 points on non-transcript portion of the ACE 2004 dataset.  ...  We explore ways of using the resulting grounding to boost the performance of a state-of-the-art co-reference resolution system.  ...  in an integrated speech transcription and translation system.  ... 
doi:10.1109/ispass.2015.7095776 dblp:conf/ispass/Lee15 fatcat:ehbed6nl6barfgs6pzwcvwxria

The EVALITA Dependency Parsing Task: From 2007 to 2011 [chapter]

Cristina Bosco, Alessandro Mazzei
2013 Lecture Notes in Computer Science  
The event has been supported by the NLP Special Interest Group of the Italian Association for Artificial Intelligence (AI*IA) and by the Italian Association of Speech Science (AISV).  ...  Established in 2007, EVALITA (http://www.evalita.it) is the evaluation campaign of Natural Language Processing and Speech Technologies for the Italian language, organized around shared tasks focusing on  ...  This research is part of the HDOMO 2.0 project founded by the National Research Centre on Aging (INRCA) in partnership with the Government of the Marche region under the action "Smart Home for Active and  ... 
doi:10.1007/978-3-642-35828-9_1 fatcat:p6dyjaxm4zbitfajtciwclwipu