8,697 Hits in 6.6 sec

Towards Better Understanding of Spontaneous Conversations: Overcoming Automatic Speech Recognition Errors With Intent Recognition [article]

Piotr Żelasko, Jan Mizgajski, Mikołaj Morzy, Adrian Szymczak, Piotr Szymański, Łukasz Augustyniak, Yishay Carmiel
2019 arXiv   pre-print
conversations, exacerbated by speech recognition errors and scarcity of domain-specific labeled data.  ...  In this paper, we present a method for correcting automatic speech recognition (ASR) errors using a finite state transducer (FST) intent recognition framework.  ...  by the automatic speech recognition (ASR) system (Ward, 1991).  ...
arXiv:1908.07888v1 fatcat:hee6rhssjjh2rn7dnhzgozfbve
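The entry above describes recovering user intents despite ASR errors. As a rough illustration of that general idea only (not the paper's FST framework), the sketch below maps commonly confused ASR outputs back to canonical words before keyword-based intent matching; the confusion map and intent patterns are entirely hypothetical.

```python
# Illustrative sketch: tolerate ASR substitution errors during intent matching.
# ASR_CONFUSIONS and INTENTS are hypothetical examples, not from the paper.

ASR_CONFUSIONS = {   # common misrecognition -> intended word
    "male": "mail",
    "cent": "send",
}

INTENTS = {          # intent -> keywords that must all appear
    "check_mail": {"check", "mail"},
    "send_mail": {"send", "mail"},
}

def recognize_intent(asr_tokens):
    """Normalize possibly-misrecognized tokens, then match intents by keywords."""
    words = {ASR_CONFUSIONS.get(t, t) for t in asr_tokens}
    for intent, keywords in INTENTS.items():
        if keywords <= words:   # all keywords present after correction
            return intent
    return None

print(recognize_intent(["please", "check", "my", "male"]))  # → check_mail
```

A real FST-based system would compose a weighted lattice of ASR hypotheses with an intent-grammar transducer rather than use a flat lookup table, but the error-tolerant normalization step is the same in spirit.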

Interactive translation of conversational speech

A. Waibel
1996 Computer  
JANUS-II now accepts spontaneous conversational speech in a limited domain in English, German or Spanish and produces output in German, English, Spanish, Japanese and Korean.  ...  During translation, JANUS-II produces paraphrases that are used for interactive correction of translation errors.  ...  Thanks are also due to the partners and affiliates in C-STAR, who have helped define speech translation today.  ... 
doi:10.1109/2.511967 fatcat:eynsxz5oarhfrcy6lhb2vwav5a

Interactive Translation of Conversational Speech [chapter]

Alex Waibel
1999 Computational Models of Speech Pattern Processing  
JANUS-II now accepts spontaneous conversational speech in a limited domain in English, German or Spanish and produces output in German, English, Spanish, Japanese and Korean.  ...  During translation, JANUS-II produces paraphrases that are used for interactive correction of translation errors.  ...  Thanks are also due to the partners and affiliates in C-STAR, who have helped define speech translation today.  ... 
doi:10.1007/978-3-642-60087-6_33 fatcat:n5bh5t72ujewhmaco5cahxbgkq

Revisiting the Boundary between ASR and NLU in the Age of Conversational Dialog Systems [article]

Manaal Faruqui, Dilek Hakkani-Tür
2021 arXiv   pre-print
in automatic speech recognition (ASR) and natural language understanding (NLU).  ...  As more users across the world are interacting with dialog agents in their daily life, there is a need for better speech understanding that calls for renewed attention to the dynamics between research  ...  We thank Shachi Paul, Shyam Upadhyay, Amarnag Subramanya, Johan Schalkwyk, and Dave Orr for their comments on the initial draft of the paper.  ... 
arXiv:2112.05842v1 fatcat:b6woecq7tjbgzlgpee2ximlpha

Revisiting the Boundary between ASR and NLU in the Age of Conversational Dialog Systems

Manaal Faruqui, Dilek Hakkani-Tür
2021 Computational Linguistics  
in automatic speech recognition (ASR) and natural language understanding (NLU).  ...  As more users across the world are interacting with dialog agents in their daily life, there is a need for better speech understanding that calls for renewed attention to the dynamics between research  ...  errors associated with speech recognition as core part of the language understanding problem.  ... 
doi:10.1162/coli_a_00430 fatcat:6yfhq4vv3fhhlmigf5hrwzt3ui

Spoken dialogue technology: enabling the conversational user interface

Michael F. McTear
2002 ACM Computing Surveys  
This article describes the main components of the technology – speech recognition, language understanding, dialogue management, communication with an external source such as a database, language generation  ...  The origins of spoken dialogue systems can be traced back to Artificial Intelligence research in the 1950s concerned with developing conversational interfaces.  ...  Ronnie Smith, David James, and Ian O'Neill, and from the anonymous reviewers of the paper.  ...
doi:10.1145/505282.505285 fatcat:56666shnuja5xiy3kju3v2kgbq

Developing a Production System for Purpose of Call Detection in Business Phone Conversations [article]

Elena Khasanova, Pooja Hiranandani, Shayna Gardiner, Cheng Chen, Xue-Yong Fu, Simon Corston-Oliver
2022 arXiv   pre-print
We present a detailed analysis of types of Purpose of Call statements and language patterns related to them, discuss an approach to collect rich training data by bootstrapping from a set of rules to a neural model, and describe a hybrid model which consists of a transformer-based classifier and a set of rules by leveraging insights from the analysis of call transcripts.  ...  Towards better understanding of spontaneous conversations: Overcoming automatic speech recognition errors with intent recognition.  ...
arXiv:2205.06904v1 fatcat:syqv6d4bgbe4rpz2vygayxge54

How Was Your Day? Evaluating a Conversational Companion

David Benyon, Bjorn Gamback, Preben Hansen, Oli Mival, Nick Webb
2013 IEEE Transactions on Affective Computing  
We show correlation between, for example, automatic speech recognition performance and overall system performance (as is expected in systems of this type), but beyond this, we show where individual utterances  ...  Here, we describe a paradigm and methodology for evaluating the main aspects of such functionality in conjunction with overall system behavior, with respect to three parameters: functional ability (i.e  ...  Webb was at the State University of New York, Albany, and Dr. Hansen and Prof. Gambäck were active at the Swedish Institute of Computer Science AB, Kista.  ... 
doi:10.1109/t-affc.2013.15 fatcat:djmpqxhdo5bnjirel4gaptejfy

Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-Initiative Conversations [article]

Ashwin Paranjape, Abigail See, Kathleen Kenealy, Haojun Li, Amelia Hardy, Peng Qi, Kaushik Ram Sadagopan, Nguyet Minh Phu, Dilara Soylu, Christopher D. Manning
2020 arXiv   pre-print
At the end of the competition, Chirpy Cardinal progressed to the finals with an average rating of 3.6/5.0, a median conversation duration of 2 minutes 16 seconds, and a 90th percentile duration of over  ...  Neural generation plays a key role in achieving these goals, providing the backbone for our conversational and emotional tone.  ...  We thank Amazon.com, Inc. for a grant partially supporting the work of the rest of the team.  ... 
arXiv:2008.12348v2 fatcat:2vjg4zfrgffjzdpq7y764tf3v4

The challenge of spoken language systems: Research directions for the nineties

R. Cole, L. Hirschman, L. Atlas, M. Beckman, A. Biermann, M. Bush, M. Clements, L. Cohen, O. Garcia, B. Hanson, H. Hermansky, S. Levinson (+12 others)
1995 IEEE Transactions on Speech and Audio Processing  
We examine eight key areas in which basic research is needed to produce spoken language systems: 1) robust speech recognition; 2) automatic training and adaptation; 3) spontaneous speech; 4) dialogue models  ...  A spoken language system combines speech recognition, natural language processing and human interface technology.  ...  Weatherill of the Center for Spoken Language Understanding at the Oregon Graduate Institute for producing and mailing several drafts of the report, and for integrating the many contributions by different  ...
doi:10.1109/89.365385 fatcat:ogivf5rdovajrhcmez4c6hynne

Joint Syntactic and Semantic Analysis with a Multitask Deep Learning Framework for Spoken Language Understanding

Jeremie Tafforeau, Frederic Bechet, Thierry Artiere, Benoit Favre
2016 Interspeech 2016  
Spoken Language Understanding (SLU) models have to deal with Automatic Speech Recognition outputs which are prone to contain errors.  ...  Most SLU models overcome this issue by directly predicting semantic labels from words without any deep linguistic analysis.  ...  This phenomenon is very critical when processing spontaneous speech in spoken conversations because of the high word error rate of ASR systems on such data.  ...
doi:10.21437/interspeech.2016-851 dblp:conf/interspeech/TafforeauBAF16 fatcat:6ldzlknomvgffein3vvqpsqugy

Embodied Conversational Agents for Education in Autism [chapter]

Marissa Milne, Martin Luerssen, Trent Lewis, Richard Leibbrandt, David Powers
2011 A Comprehensive Book on Autism Spectrum Disorders  
Spontaneous maps can be challenging to automatically assess, as students are free to use any terms and interconnections they wish; however, the richness of assessment is immense, with map hierarchy indicating  ...  To meet such diverse needs, providing multiple input and output options, such as letting the user choose between speech recognition and keyboard input, can be beneficial.  ...  Different people with autism can have very different symptoms. Autism is considered to be a "spectrum" disorder, a group of disorders with similar features.  ...
doi:10.5772/18688 fatcat:b2c4hsze35cm7jdwhlqrre5yxi

A User Perception--Based Approach to Create Smiling Embodied Conversational Agents

Magalie Ochs, Catherine Pelachaud, Gary Mckeown
2017 ACM transactions on interactive intelligent systems (TiiS)  
In order to improve the social capabilities of embodied conversational agents, we propose a computational model to enable agents to automatically select and display appropriate smiling behavior during  ...  As a second step, we propose a probabilistic model to automatically compute the user's potential perception of the embodied conversational agent's social stance depending on its smiling behavior and on  ...  smiles are better classified (18% of error with a confidence interval of ±1.8%) than the polite (34% of error with a confidence interval of ±1.7%) and the embarrassed smiles (31% of error with a confidence  ... 
doi:10.1145/2925993 fatcat:332pgu43zzaqxn6vy5d37duu6y

Automatic recognition and understanding of spoken language - a first step toward natural human-machine communication

Bing-Hwang Juang, S. Furui
2000 Proceedings of the IEEE  
Automatic recognition and understanding of spoken language is the first and probably the most important step toward natural human-machine interaction.  ...  Statistical methods are designed to allow the machine to learn, directly from data, structural regularities in the speech signal for the purpose of automatic speech recognition and understanding.  ...  RESEARCH ISSUES IN AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING The progress toward automatic speech recognition and understanding achieved in the past two decades is quite remarkable.  ...
doi:10.1109/5.880077 fatcat:6ca4ebtwcbg4tl6bgcvgtr2gry

Content-based access to spoken audio

K. Koumpis, S. Renals
2005 IEEE Signal Processing Magazine  
Since speech recognition systems can label automatic transcriptions with exact time stamps, their output can be viewed as an annotation with which the other tasks can synchronize.  ...  Recognition Speech recognition, the task of converting the input speech signal into word sequences, is most often associated with systems for command and control, or for dialogs in limited domains.  ...  Koumpis and an MSc and a PhD from the University of Edinburgh. His research is in the areas of speech recognition, information access from spoken audio and models for multimodal data.  ... 
doi:10.1109/msp.2005.1511824 fatcat:a7p7ay3lmfen5brbmvqwoi6ete