Keyword Detection for Spontaneous Speech
2009
2009 2nd International Congress on Image and Signal Processing
This paper presents a system for keyword detection in spontaneous speech. Keywords are predefined through a set of acoustic examples provided by the users. ...
Keyword detection proceeds in two steps: keyword searching and verification. ...
Guillermo Aradilla for the helpful discussions. ...
doi:10.1109/cisp.2009.5303824
fatcat:jn6xe7gflrdr5h3wrdsxmujovu
Detailed author index
2009
2009 IEEE Workshop on Automatic Speech Recognition & Understanding
Schuller, Björn
349 Robust Vocabulary Independent Keyword Spotting with Graphical Models
376 From Speech to Letters - Using a Novel Neural Network Architecture for Grapheme Based ASR
552 Acoustic ...
doi:10.1109/asru.2009.5373491
fatcat:tgoktcyotncatbvo3l3cslfapa
ProfLifeLog: Environmental analysis and keyword recognition for naturalistic daily audio streams
2012
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
The ProfLifeLog corpus contains speech data in a variety of noise backgrounds which is challenging for keyword recognition. ...
This study presents keyword recognition evaluation on a new corpus named ProfLifeLog. ProfLifeLog is a collection of data captured on a portable audio recording device called the LENA unit. ...
In summary, the proposed front-end technique acts as a filter, and retains only relatively clean speech (high SNR) for keyword spotting. ...
doi:10.1109/icassp.2012.6289028
dblp:conf/icassp/SangwanZH12
fatcat:zvd6kjrwa5ekvn57hxz3fo2hqu
Utterance verification using prosodic information for Mandarin telephone speech keyword spotting
1999
1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258)
For keyword recognition, 59 context-independent subsyllables, i.e., 22 INITIAL's and 37 FINAL's in Mandarin speech, and one background/silence model, are used as the basic recognition units. ...
In this paper, the prosodic information, a very special and important feature in Mandarin speech, is used for Mandarin telephone speech utterance verification. ...
INTRODUCTION Recently, research into algorithms that are able to spot keywords has been focused on constructing hidden Markov model (HMM) based speaker independent keyword spotting systems using either ...
doi:10.1109/icassp.1999.759762
dblp:conf/icassp/ChenWY99
fatcat:zwnnm36it5dfritpavak5qapxu
Fast Keyword Spotting in Telephone Speech
2009
Radioengineering
In the paper, we present a system designed for detecting keywords in telephone speech. We focus not only on achieving high accuracy but also on very short processing time. ...
Its performance is evaluated on recordings of Czech spontaneous telephone speech using rather large and complex keyword lists. ...
Introduction Keyword spotting (KWS) has become an important branch of speech technology. ...
doaj:14b2ae1d747f452dbf3a404aa848e053
fatcat:prskw3c5hzam7a24cbnpxynhqi
Integration of phonetic and prosodic information for robust utterance verification
2000
IEE Proceedings - Vision Image and Signal Processing
Driven by this property, phonetic and prosodic information are integrated and used for Mandarin telephone speech keyword spotting. ...
For keyword recognition, 132 subsyllable models, two general acoustic filler models and one background/silence model are separately trained and used as the basic recognition units. ...
As the technology improves, a user-friendly speech recognition system is equipped with a keyword spotting capability that allows users the flexibility to give a wide range of response and behaviour. ...
doi:10.1049/ip-vis:20000099
fatcat:qggelngzjjfvvoh2feai4o6mve
An Application of Recurrent Neural Networks to Discriminative Keyword Spotting
[chapter]
2007
Lecture Notes in Computer Science
In a keyword spotting task in a large database of unconstrained speech where an HMM-based speech recogniser achieves a word accuracy of only 65 %, the system achieved a keyword spotting accuracy of 84.5 ...
Keyword spotting is a detection task consisting in discovering the presence of specific spoken words in unconstrained speech. ...
We have opted to tackle a realistic keyword spotting task in a large database of spontaneous and unconstrained speech where an HMM-based speech recogniser achieves a word accuracy of only 65 %. ...
doi:10.1007/978-3-540-74695-9_23
fatcat:dofqjetmorh5blmzrrhpdos6jq
Tandem decoding of children's speech for keyword detection in a child-robot interaction scenario
2011
ACM Transactions on Speech and Language Processing
We use the FAU Aibo Emotion Corpus which contains emotionally colored spontaneous children's speech recorded in a child-robot interaction scenario and investigate various recent keyword spotting techniques ...
In this article, we focus on keyword detection in children's speech as it is needed in voice command systems. ...
The disadvantage of the method proposed by Fernandez et al. [2007a] is that it is not vocabulary independent, as it has a separate output unit for each keyword. ...
doi:10.1145/1998384.1998386
dblp:journals/tslp/WollmerSBSS11
fatcat:vrgrjzvptvdmjm7b7pfuuznrri
Confidence measure improvement using useful predictor features and support vector machines
2012
20th Iranian Conference on Electrical Engineering (ICEE2012)
In this paper a discriminative and probabilistic computation of CM based upon some useful predictor features and support vector machines (SVM) is presented for Persian conversational telephone speech KWS ...
In traditional keyword spotting (KWS) systems, confidence measure (CM) of each keyword is computed from normalized acoustic likelihoods. ...
Speech Recognition Systems A conventional HTK recognizer is used for the implementation of a speaker independent continuous speech recognizer with the following points: -Initial monophone models are first ...
doi:10.1109/iraniancee.2012.6292531
fatcat:dfnhd4ltjrd5bh45uknwigxsbu
A Russian Keyword Spotting System Based on Large Vocabulary Continuous Speech Recognition and Linguistic Knowledge
2016
Journal of Electrical and Computer Engineering
The paper describes the key concepts of a word spotting system for Russian based on large vocabulary continuous speech recognition. ...
The system is based on CMU Sphinx open-source speech recognition platform and on the linguistic models and algorithms developed by Speech Drive LLC. ...
Acknowledgments The authors would like to thank SpRecord LLC authorities for providing real-world telephone-quality data used in training and testing of the keyword spotting system described in this paper ...
doi:10.1155/2016/4062786
fatcat:7jhohy6kerbuln7drrwcqfizcq
Intelligent Call Manager Based On The Integration Of Computer Telephony, Internet And Speech Processing
1998
International Conference on Consumer Electronics
The keyword spotting subsystem was evaluated in a test set of 2400 conversational speech utterances from 20 speakers (12 males and 8 females). ...
As the technology improves, a user friendly speech recognition system is equipped with a keyword spotting capability which allows users the flexibility to give a wide range of response and behavior [1, ...
Acknowledgment The authors would like to thank the National Science Council, the Republic of China, for financial support of this work under contract No. NSC86-2622-E-006-003. ...
doi:10.1109/icce.1998.678264
fatcat:ta5pvcuq4jdgvibqeogw6nhqte
A more efficient and optimal LLR for decoding and verification
1999
1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258)
Since the traditional log likelihood ratio (LLR) is borrowed from speaker verification technique, it may not be appropriate for decoding because we do not have a good modelling and definition of LLR for ...
We propose a new confidence score for decoding and verification. ...
INTRODUCTION Nowadays, keyword spotting plays an important role in the speech recognition because it is useful for dealing with spontaneous speech. ...
doi:10.1109/icassp.1999.759760
dblp:conf/icassp/LamF99
fatcat:3w2njh5itnagpd4nus54a6xtou
Retrieving spoken documents by combining multiple index sources
1996
Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '96
Both a continuous-speech large vocabulary recognition system, and a phone-lattice word spotter, are used to locate index units within an experimental corpus of voice messages. ...
Different ways of combining them are investigated, and it is shown that the best of these can increase retrieval average precision for a speaker-independent retrieval system to 85% of that achieved for ...
for the speaker-dependent monophones used in this work. ...
doi:10.1145/243199.243208
dblp:conf/sigir/JonesFJY96
fatcat:wp7tiy52wfbb7agrz7rpcgi43u
Multi-stream LSTM-HMM decoding and histogram equalization for noise robust keyword spotting
2011
Cognitive Neurodynamics
Highly spontaneous, conversational, and potentially emotional and noisy speech is known to be a challenge for today's automatic speech recognition (ASR) systems, which highlights the need for advanced ...
In this article, we propose to combine histogram equalization and multi-condition training for robust keyword detection in noisy speech. ...
In this article, we focus on keyword spotting as needed for spontaneous human-machine interaction and investigate the recognition performance when adapting both features and acoustic models to noisy conditions ...
doi:10.1007/s11571-011-9166-9
pmid:22942915
pmcid:PMC3179540
fatcat:lfqo5jvmavgfdasr3my2j4xjvi
A multi-stream ASR framework for BLSTM modeling of conversational speech
2011
2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
We propose a novel multi-stream framework for continuous conversational speech recognition which employs bidirectional Long Short-Term Memory (BLSTM) networks for phoneme prediction. ...
In this paper, we extend the principle of joint BLSTM and triphone modeling to a multi-stream system which uses MFCC features and BLSTM predictions as observations originating from two independent data ...
], keyword spotting [12], and emotion recognition from speech [17]. ...
doi:10.1109/icassp.2011.5947444
dblp:conf/icassp/WollmerESR11
fatcat:lk62prmowjfedg6smfqo2osaay