Disfluency Insertion for Spontaneous TTS: Formalization and Proof of Concept [chapter]

Raheel Qader, Gwénolé Lecorvé, Damien Lolive, Pascale Sébillot
2018 Lecture Notes in Computer Science  
This paper presents an exploratory work to automatically insert disfluencies in text-to-speech (TTS) systems. The objective is to make TTS more spontaneous and expressive.  ...  The objective and perceptual evaluation conducted on an English corpus of spontaneous speech show that our proposition is effective to generate disfluencies, and highlights perspectives for future improvements  ...  This tends to validate the proof-of-concept implementation and the underlying proposed formalization.  ... 
doi:10.1007/978-3-030-00810-9_4 fatcat:pcko3mgs3rgn3lxc7td4ocjvve

Creating conversational interfaces for children

S. Narayanan, A. Potamianos
2002 IEEE Transactions on Speech and Audio Processing  
Results of using these data in developing novel language and dialog models as well as in a unified maximum likelihood framework for acoustic decoding in ASR and semantic classification for spoken language  ...  Details of the architecture and application details are described. Informal evaluation by children was found positive especially for the animated agent and the speech interface.  ...  Einbinder, and R. Barkan for organizing the "Carmen Sandiego" Wizard of Oz experiments and for collecting and analyzing the user experience data; to S. Chu and H.  ... 
doi:10.1109/89.985544 fatcat:kwb6q3u3fvaaxmvol3jywvaofu

Hesitations in Spoken Dialogue Systems

Simon Betz
This ensures a higher voice quality compared to unmodified Mary TTS output, to control for quality judgments of disfluency synthesis really reflecting disfluency quality and not synthesis quality in general  ...  , and of the underlying concepts.  ...  Stimuli for the baseline condition are the same, except without lengthenings and pauses. Introduction  ... 
doi:10.4119/unibi/2942254 fatcat:yy4bl2jjxbfx3hxxhl4x5gegyi

Spoken dialogue technology: enabling the conversational user interface

Michael F. McTear
2002 ACM Computing Surveys  
As a result many major telecommunications and software companies have become aware of the potential for spoken dialogue technology to provide solutions in newly developing areas such as computer-telephony  ...  , reviews some currently available dialogue development toolkits, and outlines prospects for future development.  ...  Ronnie Smith, David James, and Ian O'Neill, and from the anonymous reviewers of the paper.  ... 
doi:10.1145/505282.505285 fatcat:56666shnuja5xiy3kju3v2kgbq

Analysis and synthesis of intonation using the Tilt model

Paul Taylor
2000 Journal of the Acoustical Society of America  
The features and parameters of the event detector are discussed and performance figures are shown on a variety of read and spontaneous speaker independent conversational speech databases.  ...  in that it has the right number of degrees of freedom to be able to describe and synthesize intonation accurately.  ...  The maps are designed to be confusing, with the aim of eliciting interesting dialogue structures from the participants. The speech is fully spontaneous and contains many disfluencies.  ... 
doi:10.1121/1.428453 pmid:10738822 fatcat:joo3ufw7pjckrikq44epyylsga

Disfluency in Swedish human–human and human–machine travel booking dialogues [article]

Robert Eklund
2015 unpublished
Chapter 5 presents the analysis and results for all different categories of disfluencies.  ...  Disfluency in Swedish human-human and human-machine travel booking dialogues Abstract This thesis studies disfluency in spontaneous Swedish speech, i.e., the occurrence of hesitation phenomena like eh,  ...  "Bernstein Ratner", and it is not always clear which of the names is the look-up name.  ... 
doi:10.13140/rg.2.1.3015.0882 fatcat:dfuq2qdz7fgf7ehkeg2mu5zgtq

Integrating laughter into spoken dialogue systems: preliminary analysis and suggested programme

Vladislav Maraev, Chiara Mazzocconi, Christine Howes, Jonathan Ginzburg
2018 FAIM/ISCA Workshop on Artificial Intelligence for Multimodal Human Robot Interaction   unpublished
We present the results of a preliminary study and sketch an updated questionnaire on laughables types and laughter functions aimed to be used for Amazon Mechanical Turk experiments.  ...  Furthermore we present preliminary programme for integrating laughter into spoken dialogue systems.  ...  Acknowledgements This research was supported by a grant from the Swedish Research Council for the establishment of the Centre for Linguistic Theory and Studies in Probability (CLASP) at the University  ... 
doi:10.21437/ai-mhri.2018-3 fatcat:4mrhvzbkgzg7xkg7u35m23kssi

Vol. 2, No. 2, June 2009

Editor ELT
2009 English Language Teaching  
With the globalization of the economy, in the usual English teaching, the students' ability of critical reading and English discourse analysis should be strengthened, their sensitivity for the ideology  ...  The analysis of two expresses indicated that the form embodied the meaning, and to express the meaning of ideology, the language system offered various measures, and the form was always selected, and CDA  ...  TT is reasonable.  ... 
doi:10.5539/elt.v2n2p0 fatcat:x4yt2apsvbgk5oe7roeqgsizby


Acknowledgment I am grateful to Forlì colleague Giuseppe Nocella, without whose statistical expertise the data would have been far less amenable to interpretation and comment.  ...  Even if there is no formal correspondence, some occurrences in TT are caused by the need to wait for new items, delayed because of a speaker's pause and/or interruption.  ...  of you who are married need no further proof.  ... 

Automatic Speech Recognition (ASR) and NMT for Interlingual and Intralingual Communication: Speech to Text Technology for Live Subtitling and Accessibility

Alessandro Gregori
in multilingual communications and for the purposes of accessibility has become an important element in the production of translation and interpreting services (Zetzsche, 2019).  ...  Considered the increasing demand for institutional translation and the multilingualism of population in public space across Italy and Europe, the application of Artificial Intelligence (AI) technologies  ...  Also in formal contexts, speakers may use spontaneous speech features, or read aloud prepared texts, or use a mixture of both.  ... 
doi:10.48676/unibo/amsdottorato/9931 fatcat:f74jsoe5mzdrrdmlejg5gcbwre

The Attention-Hesitation Model. A Non-Intrusive Intervention Strategy for Incremental Smart Home Dialogue Management

Birte Richter
Two main concepts for dialogue modeling are identified: (1) the use of interaction patterns with system task descriptions for generalizability and (2) the concept of the IU model to deal with the incremental  ...  Smart homes are one of the most emergent research fields and provide fundamentally new means of interaction.  ...  Bortfeld et al. analyzed different factors for disfluencies such as gender, age, and the difficulty of the topic in a corpus analysis of 40 hours of spontaneous speech [Bor+01].  ... 
doi:10.4119/unibi/2959410 fatcat:j7i7zms6abaxnn4athsxyv7vo4

A Study of Accomodation of Prosodic and Temporal Features in Spoken Dialogues in View of Speech Technology Applications

Spyridon Kousidis
The study was proposed as a proof-of-concept for the proposed methodology, which is a time series approach to measuring convergence continuously.  ...  There are many unpredictable topic changes, and there is a fair amount of spontaneous dialogue acts (interruptions, laughter, disfluencies, repairs), which would classify this speech as spontaneous.  ...  %check for NaN if (isfinite(matrix(k,columnnumber))==1) %add to weighted sum wsum = wsum + matrix(k,columnnumber)*duration; %add to duration sum sdur = sdur + duration; end end end k = k + 1; if k > n  ... 
doi:10.21427/d7vc8s fatcat:nxyrhdhtbvh6tar6hj52bhej5m

A System for Simultaneous Translation of Lectures and Speeches

Christian Fügen
The focus of this system is the automatic translation of (technical oriented) lectures and speeches from English to Spanish, but the different aspects described in this thesis will also be helpful for  ...  developing simultaneous translation systems for other domains or languages.  ...  Often, the estimation of the probabilities for inserting disfluencies is treated separately from statistical n-gram language modeling.  ... 
doi:10.5445/ir/1000013594 fatcat:mg4lineq3jgereb6dpb2iebssy

Intonational pitch accent distribution in Egyptian Arabic

Samantha Jane Hellmuth
A survey of EA prosodic phrasing and of the relative accentuation of function words and content words shows that the correct generalisation for EA is that there is a pitch accent on every Prosodic Word  ...  In a corpus of read and (semi-)spontaneous EA speech a pitch accent was found on (almost) every content word, and in the overwhelming majority of cases the same pitch accent type is observed on every word  ...  Thus T -»Ft stands for T -»p(Ft), and PW d->T stands for p(PW d)-TT, and so forth.  ... 
doi:10.25501/soas.00028680 fatcat:2cholhfcwnc5djmvuywwor3gsu


Mark Brady, David Katan, Cristina Palazzi, David Snelling, Christopher Taylor, Laila Wadia
Ray Moody of the University of Hawaii for his invaluable assistance with the statistical analysis and insights regarding the study's findings.  ...  One click produced an overview of the number of words for each TT and a total for a group of TTs (grouped according to text and mode). This operation took about 10 seconds.  ...  , the proof of the pudding…").  ... 