Speech and Hand Transcribed Retrieval [chapter]

Mark Sanderson, Xiao Mang Shou
2002 Lecture Notes in Computer Science  
This paper describes the issues and preliminary work involved in the creation of an information retrieval system that will manage the retrieval from collections composed of both speech recognised and ordinary  ...  Initial ideas and some preliminary results are presented. General Terms Measurement, Experimentation.  ...  From the table, it shows a magnitude difference in the percentage of punctuation in speech recognised and hand transcribed collections.  ... 
doi:10.1007/3-540-45637-6_7 fatcat:fae6t7tgcjcyndej2vet25c2ki

Voice Retrieval of Mandarin Broadcast News Speech

Berlin Chen
2006 International journal of pattern recognition and artificial intelligence  
This paper presents an improved framework for voice retrieval of Mandarin broadcast news speech.  ...  Finally, we used the PDA as the platform and broadcast radio programs collected in Taiwan as the document collection to establish a speech-based multimedia information retrieval prototype system.  ...  different amounts of automatically transcribed speech data with the original four-hour manually transcribed speech data.  ... 
doi:10.1142/s0218001406004521 fatcat:uhwa5rufhrcebm6635s7h7bfyq

Extra Large Vocabulary Continuous Speech Recognition Algorithm Based on Information Retrieval [chapter]

Valery Pylypenko
2010 Advances in Speech Recognition  
Future extension: The importance of information retrieval for speech recognition should be underlined.  ...  Phoneme recognizer: The phonetic transcribing algorithm (Vintsiuk, 2000; Vintsiuk, 2001) builds a phonetic sequence for the speech signal regardless of the dictionary.  ... 
doi:10.5772/10185 fatcat:w6fv7ilgcbdxzmglbd44kkjuhe

A method for open-vocabulary speech-driven text retrieval

Atsushi Fujii, Katunobu Itou, Tetsuya Ishikawa
2002 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - EMNLP '02  
Aiming at retrieving information with spoken queries, we fill the gap between speech recognition and text retrieval in terms of the vocabulary size.  ...  Given a spoken query, we generate a transcription and detect OOV words through speech recognition.  ...  To sum up, the OOV problem is inherent in speech-driven retrieval, and we need to fill the gap between speech recognition and text retrieval in terms of the vocabulary size.  ... 
doi:10.3115/1118693.1118718 dblp:conf/emnlp/FujiiII02 fatcat:hmpwp2pr75grnm2opnophptxfy

A Method for Open-Vocabulary Speech-Driven Text Retrieval [article]

Atsushi Fujii, Katunobu Itou, Tetsuya Ishikawa
2002 arXiv   pre-print
Aiming at retrieving information with spoken queries, we fill the gap between speech recognition and text retrieval in terms of the vocabulary size.  ...  Given a spoken query, we generate a transcription and detect OOV words through speech recognition.  ...  To sum up, the OOV problem is inherent in speech-driven retrieval, and we need to fill the gap between speech recognition and text retrieval in terms of the vocabulary size.  ... 
arXiv:cs/0206014v1 fatcat:7qqri7gldjaphibsf36eb5jehe

LODEM: A system for on-demand video lectures

Atsushi Fujii, Katunobu Itou, Tetsuya Ishikawa
2006 Speech Communication  
Experimental results showed that by adapting speech recognition to the lecture topic, the recognition accuracy increased and the retrieval accuracy was comparable with that obtained by human transcriptions  ...  Our system extracts the audio track from a target lecture video, generates a transcription by large vocabulary continuous speech recognition, and produces a textual index.  ...  On the other hand, speech contents are fundamentally used for sequential-access purposes.  ... 
doi:10.1016/j.specom.2005.08.006 fatcat:37xznmoyr5fgxga77pmj4xfdxy

A Cross-media Retrieval System for Lecture Videos [article]

Atsushi Fujii, Katunobu Itou, Tomoyosi Akiba, Tetsuya Ishikawa
2003 arXiv   pre-print
Experimental results showed that by adapting speech recognition to the topic of the lecture, the recognition accuracy increased and the retrieval accuracy was comparable with that obtained by human transcription  ...  Our system extracts the audio track from a target lecture video, generates a transcription by large vocabulary continuous speech recognition, and produces a text index.  ...  On the other hand, speech is used mainly for sequential-access purposes.  ... 
arXiv:cs/0309021v1 fatcat:irac5lgqvrfgnfntpcblho33qy

Using words and phonetic strings for efficient information retrieval from imperfectly transcribed spoken documents

Michael J. Witbrock, Alexander G. Hauptmann
1997 Proceedings of the second ACM international conference on Digital libraries - DL '97  
This paper reports on some initial experiments which compared retrieval effectiveness of spoken documents transcribed manually and by speech recognition.  ...  For documents transcribed using speech recognition, a substantial number of retrieval errors are due to query terms that occur in the spoken document, but are not transcribed because they are not within  ... 
doi:10.1145/263690.263779 dblp:conf/dl/WitbrockH97 fatcat:b6tzop4ejfhyba64emnzgeuz7i

Speech-Driven Text Retrieval: Using Target IR Collections for Statistical Language Model Adaptation in Speech Recognition [chapter]

Atsushi Fujii, Katunobu Itou, Tetsuya Ishikawa
2002 Lecture Notes in Computer Science  
Aiming at speech-driven text retrieval, which facilitates retrieving information with spoken queries, we propose a method to integrate speech recognition and retrieval methods.  ...  Since users speak contents related to a target collection, we adapt statistical language models used for speech recognition based on the target collection, so as to improve both the recognition and retrieval  ...  Text Retrieval: The text retrieval module is based on an existing probabilistic retrieval method [13], which computes the relevance score between the transcribed query and each document in the collection  ... 
doi:10.1007/3-540-45637-6_9 fatcat:622ouy5dhvd2zexa27dtkrzciq

Speech-Driven Text Retrieval: Using Target IR Collections for Statistical Language Model Adaptation in Speech Recognition [article]

Atsushi Fujii, Katunobu Itou, Tetsuya Ishikawa
2002 arXiv   pre-print
Aiming at speech-driven text retrieval, which facilitates retrieving information with spoken queries, we propose a method to integrate speech recognition and retrieval methods.  ...  Since users speak contents related to a target collection, we adapt statistical language models used for speech recognition based on the target collection, so as to improve both the recognition and retrieval  ...  Text Retrieval: The text retrieval module is based on an existing probabilistic retrieval method [13], which computes the relevance score between the transcribed query and each document in the collection  ... 
arXiv:cs/0206037v1 fatcat:6cifnoknyffdjhzns2r5lu45km

Automatic Detection, Indexing, and Retrieval of Multiple Attributes from Cross-Lingual Multimedia Data [chapter]

Qian Hu, Fred J. Goodman, Stanley M. Boykin, Randall K. Fish, Warren R. Greiff, Stephen R. Jones, Stephen R. Moore
2012 Multimedia Information Extraction  
The non-lexical audio cues include both non-speech attributes and background noise.  ...  Non-speech attributes include speech rate, vocal effort (e.g. shouting and whispering), which are indicative of the speaker's emotional state, especially when combined with adjacent keywords.  ...  These systems primarily rely on the automatic speech recognition (ASR) transcribed text for retrieval purposes and return whole documents or stories [4, 12, 17].  ... 
doi:10.1002/9781118219546.ch14 fatcat:iz2khooigrgszf5sbwvwqpj74i

Content-based language models for spoken document retrieval

Hsin-min Wang, Berlin Chen
2000 Proceedings of the fifth international workshop on Information retrieval with Asian languages - IRAL '00  
Keywords: spoken document retrieval (SDR); content-based language models; speech recognition.  ...  $(w_{i-1}, w_i)$ and the word $w_{i-1}$ in the text corpora on which the baseline language models were  ...  models using automatic transcriptions of spoken documents were used to create more accurate recognition results and indexing terms from both spoken documents and speech queries.  ...  When a user enters a speech query into the retrieval system, the speech recognition module first transcribes the speech query to a word (or subword) string based on the acoustic models for speech queries  ... 
doi:10.1145/355214.355236 dblp:conf/iral/WangC00 fatcat:skg6e6mx6fgtppwfeenxfwvhgu

Content-based Language Models for Spoken Document Retrieval

Hsin-min Wang, Berlin Chen
2001 International Journal of Computer Processing Of Languages  
Keywords: spoken document retrieval (SDR); content-based language models; speech recognition.  ...  $(w_{i-1}, w_i)$ and the word $w_{i-1}$ in the text corpora on which the baseline language models were  ...  models using automatic transcriptions of spoken documents were used to create more accurate recognition results and indexing terms from both spoken documents and speech queries.  ...  When a user enters a speech query into the retrieval system, the speech recognition module first transcribes the speech query to a word (or subword) string based on the acoustic models for speech queries  ... 
doi:10.1142/s0219427901000333 fatcat:zvb5fbwd6zaubbb64lrnd2pngm

Language Modeling for Multi-Domain Speech-Driven Text Retrieval [article]

Katunobu Itou, Atsushi Fujii, Tetsuya Ishikawa
2002 arXiv   pre-print
Since users speak contents related to a target collection, we produce language models used for speech recognition based on the target collection, so as to improve both the recognition and retrieval accuracy  ...  We report experimental results associated with speech-driven text retrieval, which facilitates retrieving information in multiple domains with spoken queries.  ...  Acknowledgments The authors would like to thank the National Institute of Informatics for their support with the NTCIR collection and the IREX committee for their support with the IREX collection.  ... 
arXiv:cs/0206036v1 fatcat:qmxaymln4vdpvephkre2mv3a3q

How do users respond to voice input errors?

Jiepu Jiang, Wei Jeng, Daqing He
2013 Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '13  
However, due to voice input errors (such as speech recognition errors and improper system interruptions), users need to frequently reformulate queries to handle the incorrectly recognized queries.  ...  In this paper, we first characterize and analyze typical voice input errors in voice search and users' corresponding reformulation strategies.  ...  As shown in Table 2, the average Jaccard similarity was only 0.118, indicating very low overlap between those retrieved by the transcribed queries and those that should have been retrieved by the voice  ... 
doi:10.1145/2484028.2484092 dblp:conf/sigir/JiangJH13 fatcat:ptkvx2x7xvbldgxo2znuofikbu