Large vocabulary Russian speech recognition using syntactico-statistical language modeling

Alexey Karpov, Konstantin Markov, Irina Kipyatkova, Daria Vazhenina, Andrey Ronzhin
2014 Speech Communication  
In this paper, we describe our efforts to build an automatic speech recognition (ASR) system for the Russian language with a large vocabulary.  ...  Speech is the most natural way of human communication and in order to achieve convenient and efficient human-computer interaction implementation of state-of-the-art spoken language technology is necessary  ...  Acknowledgements This research is supported by the Ministry of Education and Science of Russia (contract No. 07.514.11.4139), by the grant of the President of Russia (project No.  ... 
doi:10.1016/j.specom.2013.07.004 fatcat:hq2vkvwdlzgqlhyi44duyh44hq

Phonemic Transcription of Low-Resource Languages: To What Extent can Preprocessing be Automated?

Guillaume Wisniewski, Severine Guillaume, Alexis Michaud
2020 Workshop on Spoken Language Technologies for Under-resourced Languages  
Automatic Speech Recognition for low-resource languages has been an active field of research for more than a decade.  ...  : assessing to what extent it is possible to bypass the involvement of language experts for menial tasks of data preparation for Natural Language Processing (NLP) purposes.  ...  to the three reviewers for detailed comments, and to Jesse Gates for proofreading.  ... 
dblp:conf/sltu/WisniewskiGM20 fatcat:amnda4ywh5dond2kgax67noihe

Connected Speech in Neurodegenerative Language Disorders: A Review

Veronica Boschi, Eleonora Catricalà, Monica Consonni, Cristiano Chesi, Andrea Moro, Stefano F. Cappa
2017 Frontiers in Psychology  
The analysis of extended speech production is a precious source of information encompassing the phonetic, phonological, lexico-semantic, morpho-syntactic, and pragmatic levels of language organization.  ...  speech analysis for both research and clinical purposes.  ...  These features have been used mainly for speech recognition, and measure the power spectrum of a speech signal: for example, the peakedness of the signal (kurtosis) or the lack of symmetry (skewness)  ... 
doi:10.3389/fpsyg.2017.00269 pmid:28321196 pmcid:PMC5337522 fatcat:opuohd3uo5d4ljr3mgbc2yor3y

Clearing the Transcription Hurdle in Dialect Corpus Building: The Corpus of Southern Dutch Dialects as Case Study

Anne-Sophie Ghyselen, Anne Breitbarth, Melissa Farasyn, Jacques Van Keymeulen, Arjan van Hessen
2020 Frontiers in Artificial Intelligence  
These questions are tackled using the Southern Dutch dialects (SDDs) as case study, for which the usefulness of automatic speech recognition (ASR), respeaking, and forced alignment is considered.  ...  of statistical dialect-specific models, the degree of linguistic differentiation between the dialects and the standard language, and the goals the transcripts have to serve.  ...  ACKNOWLEDGMENTS We kindly thank (i) our student-transcribers for their diligent transcription work, (ii) all volunteers correcting the student transcriptions, (iii) Lien Hellebaut, for her conscientious  ... 
doi:10.3389/frai.2020.00010 pmid:33733130 pmcid:PMC7861295 fatcat:nzsria4sdjbfxjbyreuvysghce

What we learn about language from Spoken Corpus Linguistics?

Miriam Voghera
2020 Caplletra: Revista Internacional de Filologia  
Think, for example, of the relations between prosody, syntax, information structure and pragmatics or between segmental phonetic realization and linguistic perception.  ...  Linguistic perception is not in fact a process of recognition, but rather a process of reconstruction and interpretation (Lindblom 2004; Holt & Lotto 2010).  ... 
doi:10.7203/caplletra.69.17267 fatcat:bflabsa3lveidl5uerxfumfsdm

Comparative Analysis of Automatic Sign Language Generation Systems

Rakesh Kumar, Vishal Goyal, Lalit Goyal
2021 Journal of scientific research  
The development of automatic sign language generation systems, which are very rarely available, is the need of the hour to solve their communication problem.  ...  It is evident from our study that the lack of proper grammar rules of sign language and the non-availability of large bilingual corpora are the main hurdles in developing sign language generation systems  ...  The proposed system used phrase-based SMT with the addition of pre- and post-processing steps taking into account morpho-syntax analysis of German language.  ... 
doi:10.37398/jsr.2021.650528 fatcat:l5vyyyjcbjcjfmxamqf2kahbwi

Research in the supporting sciences

1983 Language Teaching  
languages; as a result, they argue, only a grammar generating context-free language can serve as a model for natural language processing.  ...  Gazdar and others have claimed that the set of context-free languages includes the class of natural languages, and also that context-free languages are more efficiently parsable than context-sensitive  ...  practise the recognition and acquisition of speech acts. 83-241 Jefferson, Gail (U. of Manchester) and Lee, John R.  ... 
doi:10.1017/s0261444800010016 fatcat:sdlrjknqpbeg3pfgwnk73vl3zi

Better data for more researchers – using the audio features of BNCweb

Sebastian Hoffmann, Sabine Arndt-Lappe
2021 ICAME Journal  
of language usage, are usually restricted to orthographic transcriptions of spoken language.  ...  In spite of the wide agreement among linguists as to the significance of spoken language data, actual speech data have not formed the basis of empirical work on English as much as one would think.  ...  The numbers for , involved some manual cleaning, e.g. the removal of interjections (e.g. Yaaaaaa) and mismatches between tokenisation and word type (e.g. ca > can't)  ... 
doi:10.2478/icame-2021-0004 fatcat:d2yejvpfanhwpoux2beuulg7da

RoboCup@Home Spoken Corpus: Using Robotic Competitions for Gathering Datasets [chapter]

Emanuele Bastianelli, Luca Iocchi, Daniele Nardi, Giuseppe Castellucci, Danilo Croce, Roberto Basili
2015 Lecture Notes in Computer Science  
We regard the construction of the dataset as a first step towards a full benchmarking methodology for spoken language interaction in service robotics.  ...  The annotated data set is publicly available for developing, testing and comparing speech understanding functionalities of domestic and service robots, not only for teams involved in RoboCup@Home or in  ...  Authors are thankful to Cristina Giannone for her indispensable support in the development of the DAP system.  ... 
doi:10.1007/978-3-319-18615-3_2 fatcat:n6o5vcvucbgdvjdzxa6s55la6m

Non-traditional prosodic features for automated phrase break prediction

C. Brierley, E. Atwell
2011 Literary and Linguistic Computing  
The candidate confirms that the work submitted is her own and that appropriate credit has been given where reference has been made to the work of others.  ...  Also, for doctoral degrees:- This copy has been supplied on the understanding that it is copyright material and that no quotation from the thesis may be published without proper acknowledgement.  ...  the main English language corpora used in speech and language processing.  ... 
doi:10.1093/llc/fqr023 fatcat:je7ou3vkz5eaznt4zzqhjm5hfu

Challenges of releasing audio material for spoken data: The case of the London-Lund Corpus 2

Nele Põldvere, Johan Frid, Victoria Johansson, Carita Paradis
2021 Research in Corpus Linguistics  
This article aims to describe key challenges of preparing and releasing audio material for spoken data and to propose solutions to these challenges.  ...  First, audio-to-text alignment was solved through the insertion of timestamps in front of speaker turns in the transcription stage, which, as we show in the article, may later be used as a valuable complement  ...  Forced alignment is the process of automatic alignment of an audio recording to a given transcript.  ... 
doi:10.32714/ricl.09.01.04 fatcat:fr5fmmerujdwrhg3xon3kgkshi

Language Technology 2020: The Meta-Net Priority Research Themes [chapter]

Georg Rehm, Hans Uszkoreit
2013 META-NET Strategic Research Agenda for Multilingual Europe 2020  
voices • Robust dialogue systems • From speech recognition to speech understanding • Develop methods for the support of incremental conversational speech • Context-aware semantic and pragmatic  ...  learning toolkits, speech recognition and speech synthesis engines, and integrated architectures such as GATE and UIMA.  ... 
doi:10.1007/978-3-642-36349-8_6 fatcat:jezmk52phre4lguizqtyh7gule

Preface [chapter]

Amina Mettouchi, Martine Vanhove, Dominique Caubet
2015 Studies in Corpus Linguistics  
Still, some inner-unit prosodic features can be used for the recognition and segmenting of a speech stretch into intonation units.  ...  We therefore use Praat for that purpose (see below). Speech is naturally segmented into prosodic units.  ...  Coded text: yes Select the text and convert it into a table Table, Convert to table, 5 columns Delete the empty column if necessary Sort the table by morphemes Select the table Table,  ... 
doi:10.1075/scl.68.00pre fatcat:dfhshc352vd25eikggdza6vcf4

Prepositions and Results in Italian and English: An Analysis from Event Decomposition [chapter]

Raffaella Folli, Gillian Ramchand
2005 Perspectives on Aspect  
Among other features, the special nature of legal language is also reflected in the etymological background of its lexicon.  ...  This is the last moment in which in two separate kingdoms of Scotland and England two different, though related, languages are on their way towards standardisation (Devitt 1989, Bugaj 2004).  ...  The discussion here uses some evidence from Bošković and Franks (2000) and single cycle syntax.  ... 
doi:10.1007/1-4020-3232-3_5 fatcat:kwnuln4njbbgxpxsbkbl5c7mhe


1982 Solid State Nuclear Track Detectors  
doi:10.1016/b978-0-08-026509-4.50005-2 fatcat:6entfandzfbs5ot33as5cva64m