
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes [article]

Bo Li, Yu Zhang, Tara Sainath, Yonghui Wu, William Chan
2018 arXiv   pre-print
We present two end-to-end models: Audio-to-Byte (A2B) and Byte-to-Audio (B2A), for multilingual speech recognition and synthesis.  ...  We show that bytes are superior to grapheme characters over a wide variety of languages in monolingual end-to-end speech recognition.  ...  Table 1: Speech recognition performance of monolingual and multilingual with Audio-to-Byte (A2B) or Audio-to-Char (A2C) models.  ... 
arXiv:1811.09021v1 fatcat:axsm5xwqrva3bgiasy3ttnn5rq

Speech-translation: from domain-limited to domain-unlimited translation tasks

Stephan Vogel
2007 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU)  
have to be (mentally) undone o Speech synthesis o Rather monotone and unstructured o Overlay with original speech o Real-time processing by user is hardly possible Next Grand SLT Challenge: Multilingual  ...  How to build a translation system which can be used when you are "On the Run" Speech translator requires real-time translations o CPUs on hand-held devices are usually slow (e.g. 600 MHz) and o There are  ... 
doi:10.1109/asru.2007.4430141 dblp:conf/asru/Vogel07 fatcat:3ith3heda5dm3nrpou555aji44

Design of Chinese-English Wireless Simultaneous Interpretation System Based on Speech Recognition Technology

Fengzhen Liu, Jin He
2021 International Journal of Antennas and Propagation  
Speech recognition technology is used by the system software to create a speech recognition process that properly produces speech-related semantics.  ...  A Chinese-English wireless simultaneous interpretation system based on speech recognition technology is suggested to solve the problems of low translation accuracy and a high number of ambiguous terms  ...  It can be integrated with other natural language processing technologies such as spoken language recognition, speech synthesis, and machine translation to create more complicated and intelligent applications  ... 
doi:10.1155/2021/7346984 fatcat:rtyz5iysx5cvdg77tfxkmnzbnu

Development of Multi-lingual Spoken Corpora of Indian Languages [chapter]

K. Samudravijaya
2006 Lecture Notes in Computer Science  
An account of care taken to collect speech data that is as close to real world as possible is given. The current status of the programme and the set of actions planned to achieve the goal is given.  ...  This paper describes a recently initiated effort for collection and transcription of read as well as spontaneous speech data in four Indian languages.  ...  So, developers of real-life speech applications would need to pay attention not only to speech recognition but also to other aspects of spoken dialogue.  ... 
doi:10.1007/11939993_79 fatcat:4bhyoimb7nd7lcxord75373jkq

D4.1 Report on Multimodal Machine Translation

Stig-Arne Grönroos, Umut Sulubacak, Jörg Tiedemann
2018 Zenodo  
Both of these tasks are championed by evaluation campaigns, acting as competitions to stimulate research and to serve as a regulated platform investigating evaluation methodologies.  ...  In this deliverable, we present a survey of the state of the art in machine translation with an emphasis on multimodal tasks and systems.  ...  We would also like to acknowledge the support by NVIDIA and their GPU grant.  ... 
doi:10.5281/zenodo.3690761 fatcat:n3b34ooubfayxphgyf6bli6bya

Neural Polysynthetic Language Modelling [article]

Lane Schwartz, Francis Tyers, Lori Levin, Christo Kirov, Patrick Littell, Chi-kiu Lo, Emily Prud'hommeaux, Hyunji Hayley Park, Kenneth Steimel, Rebecca Knowles, Jeffrey Micher, Lonny Strunk (+9 others)
2020 arXiv   pre-print
Yet, when considering all of the world's languages, Finnish and Turkish are closer to the average case.  ...  Research in natural language processing commonly assumes that approaches that work well for English and other widely-used languages are "language agnostic".  ...  In both cases, all of the available data was verse-aligned data drawn from the Bible. For St. Lawrence Island Yupik, we had access to New Testament data only.  ... 
arXiv:2005.05477v2 fatcat:nzw5w2ueznhpbfocqvlmbalkyi

All Together Now: The Living Audio Dataset

David A. Braude, Matthew P. Aylett, Caoimhín Laoide-Kemp, Simone Ashby, Kristen M. Scott, Brian Ó Raghallaigh, Anna Braudo, Alex Brouwer, Adriana Stan
2019 Interspeech 2019  
The aim is to provide audio data that is in the public domain, multilingual, and expandable by communities.  ...  We discuss the role of linguistic resources, given the success of systems such as Tacotron which use direct text-to-speech mappings, and consider how data provenance could be built into such resources.  ...  Recently, powerful machine learning approaches to speech synthesis and speech recognition have called into question the value of linguistic resources such as pronunciation lexicons.  ... 
doi:10.21437/interspeech.2019-2448 dblp:conf/interspeech/BraudeALASRBBS19 fatcat:sbgcpeiwubgpfcjnoxloihejvq

CITISEN: A Deep Learning-Based Speech Signal-Processing Mobile Application [article]

Yu-Wen Chen, Kuo-Hsuan Hung, You-Jin Li, Alexander Chao-Fu Kang, Ya-Hsin Lai, Kai-Chun Liu, Sze-Wei Fu, Syu-Siang Wang, Yu Tsao
2022 arXiv   pre-print
A few audio samples recording on a noisy environment are uploaded and used to adapt the pretrained SE model on the server.  ...  Therefore, the proposed BNC can effectively convert the background noise of a speech signal and be a data augmentation method when clean speech signals are unavailable.  ...  Specifically, in online testing, the input speech needs to be transferred from byte to float before enhancing and has to be transferred back to the byte before playing by the mobile devices; in offline  ... 
arXiv:2008.09264v4 fatcat:urtd5veorzeq5hwfyf5o2kfjku

Automatic Language Identification in Texts: A Survey [article]

Tommi Jauhiainen, Marco Lui, Marcos Zampieri, Timothy Baldwin, Krister Lindén
2018 arXiv   pre-print
Finally, we identify open issues, survey the work to date on each issue, and propose future directions for research in LI.  ...  We discuss evaluation methods, applications of LI, as well as off-the-shelf LI systems that do not require training by the end user.  ...  We would like to thank Kimmo Koskenniemi for many valuable discussions and comments concerning the early phases of the features and the methods sections.  ... 
arXiv:1804.08186v2 fatcat:4rmixp4i5fb55itb7ze5avkgqy

Automatic Language Identification in Texts: A Survey

Tommi Jauhiainen, Marco Lui, Marcos Zampieri, Timothy Baldwin, Krister Lindén
2019 The Journal of Artificial Intelligence Research  
We describe the features and methods using a unified notation, to make the relationships between methods clearer.  ...  Finally, we identify open issues, survey the work to date on each issue, and propose future directions for research in LI.  ...  We would like to thank Kimmo Koskenniemi for many valuable discussions and comments concerning the early phases of the features and the methods sections.  ... 
doi:10.1613/jair.1.11675 fatcat:axugpuogyne3nptvamgd3zwgty

The Spoken Wikipedia Corpus collection: Harvesting, alignment and an application to hyperlistening

Timo Baumann, Arne Köhn, Felix Hennig
2018 Language Resources and Evaluation  
Spoken corpora are important for speech research, but are expensive to create and do not necessarily reflect (read or spontaneous) speech 'in the wild'.  ...  We turn these semi-structured collections into structured and time-aligned corpora, keeping the exact correspondence with the original hypertext as well as all available metadata.  ...  Acknowledgments We would like to thank all Wikipedia authors and speakers for creating this tremendous amount of data.  ... 
doi:10.1007/s10579-017-9410-y fatcat:2u4wfkcqknfdxcwwx3wc764tu4

TELEMORPH: BANDWIDTH-DETERMINED MOBILE MULTIMODAL PRESENTATION

ANTHONY SOLON, PAUL McKEVITT, KEVIN CURRAN
2004 Information Technology & Tourism  
TeleMorph aims to dynamically generate multimedia presentations using output modalities that are determined by the bandwidth available on a mobile device's wireless connection.  ...  This article does not focus on the multimodal content composition but rather concentrates on the motivation for and issues surrounding such intelligent tourist systems.  ...  He has written over 90 academic research papers on areas such as distributed computing, emerging trends within wireless ad-hoc networks, dynamic protocol stacks, and mobile systems.  ... 
doi:10.3727/109830504784531903 fatcat:nw7hesjve5ebzoql7sbwdv6pmq

Automatic Summarization [chapter]

Lamia Hadrich Belguith, Mariem Ellouze, Mohamed Hedi Maaloul, Maher Jaoua, Fatma Kallel Jaoua, Philippe Blache
2014 Natural Language Processing of Semitic Languages  
A critical summary of the Gettysburg Address might be: "The Gettysburg Address, though short, is one of the greatest of all American speeches, with its ending words being especially powerful-'that government  ...  The collection may range from gigabytes to bytes, so different methods may be needed for different sizes.  ...  He was involved in the development of a large-scale German-language text summarization system (Topic) and has written or coauthored four books and more than 140 journal articles, contributions to collected  ... 
doi:10.1007/978-3-642-45358-8_12 dblp:series/tanlp/BelguithEMJJB14 fatcat:zxihwcr2azdjlnio53cgrqk6hy

Automatic Summarization

Ani Nenkova
2011 Foundations and Trends in Information Retrieval  
A critical summary of the Gettysburg Address might be: "The Gettysburg Address, though short, is one of the greatest of all American speeches, with its ending words being especially powerful-'that government  ...  The collection may range from gigabytes to bytes, so different methods may be needed for different sizes.  ...  He was involved in the development of a large-scale German-language text summarization system (Topic) and has written or coauthored four books and more than 140 journal articles, contributions to collected  ... 
doi:10.1561/1500000015 fatcat:gfli2ecy55a2dkwleu5b522au4

The challenges of automatic summarization

U. Hahn, I. Mani
2000 Computer  
A critical summary of the Gettysburg Address might be: "The Gettysburg Address, though short, is one of the greatest of all American speeches, with its ending words being especially powerful-'that government  ...  The collection may range from gigabytes to bytes, so different methods may be needed for different sizes.  ...  He was involved in the development of a large-scale German-language text summarization system (Topic) and has written or coauthored four books and more than 140 journal articles, contributions to collected  ... 
doi:10.1109/2.881692 fatcat:tddpln5jsrfhdplkbpf64yqvhu