347 Hits in 6.0 sec

MARS: A Statistical Semantic Parsing and Generation-Based Multilingual Automatic tRanslation System

Yuqing Gao, Bowen Zhou, Zijian Diao, Jeffrey Sorensen, Michael Picheny
2002 Machine Translation  
We present MARS (Multilingual Automatic tRanslation System), a research prototype speech-to-speech translation system.  ...  Many new features and innovations have been incorporated into MARS: the translation is based on understanding the meaning of the sentence; the semantic parser uses a statistical model and is trained from  ...  The authors also thank the Machine Translation Special Issue editors and two reviewers for their careful review and useful suggestions.  ... 
doi:10.1023/b:coat.0000010802.38267.29 fatcat:6wjmgrzi7famlojqzy3vuogncm

High-quality speech-to-speech translation for computer-aided language learning

Chao Wang, Stephanie Seneff
2006 ACM Transactions on Speech and Language Processing  
We were able to utilize a large corpus of English weather-domain queries to explore and compare a variety of translation strategies: formal, example-based, and statistical.  ...  The best speech translation performance (89.9% correct, 6.1% incorrect, and 4.0% rejected), is achieved by a system which combines the formal and example-based methods, using parsability by a domain-specific  ...  Philipp Koehn for his generous help with the Pharaoh decoder for phrase-based statistical machine translation models.  ... 
doi:10.1145/1149290.1149291 dblp:journals/tslp/WangS06 fatcat:iue2g5sjafedfnwrudpctkyel4

Page 1483 of Linguistics and Language Behavior Abstracts: LLBA Vol. 25, Issue 3 [page]

1991 Linguistics and Language Behavior Abstracts: LLBA  
Yokoyama’s Discourse and Word Order evaluation; 9106739 semi-automatic multilingual translation, Distributed Language a. project (Utrecht, Netherlands) progress report; Serbo-Croatian fragment generation  ...  subject index ill-formedness resolution, pragmatic context knowledge importance; 9106724 Bro ‘sesaaites degrees of truth, logical formalization; interleaved semantic parsing update, integrated parsing  ... 

A Semantic Analyzer for the Comprehension of the Spontaneous Arabic Speech [article]

Mourad Mars, Mounir Zrigui, Mohamed Belgacem, Anis Zouaghi
2016 arXiv   pre-print
This work is part of a large research project entitled "Oréodule" aimed at developing tools for automatic speech recognition, translation, and synthesis for the Arabic language.  ...  Our attention has mainly been focused on an attempt to improve the probabilistic model on which our semantic decoder is based.  ...  Mars M., Zrigui M., Belgacem M. and Zouaghi A.  ... 
arXiv:1610.02493v1 fatcat:6qrq5p7nkfalvjcqjtfya4wbv4

Message from the general chair

Benjamin C. Lee
2015 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)  
To maximize the utility of the injected knowledge, we deploy a learning-based multi-sieve approach and develop novel entity-based features.  ...  Compared with the best system from CoNLL-2011, which employs a rule-based method, our system shows competitive performance.  ...  Whitney and A. Crosslingual Induction of Semantic Roles I. Titov and A. Exploiting Multiple Treebanks for Parsing with Quasi-synchronous Grammars Z. Li, T.  ... 
doi:10.1109/ispass.2015.7095776 dblp:conf/ispass/Lee15 fatcat:ehbed6nl6barfgs6pzwcvwxria

Parallel Text Processing: Alignment and Use of Translation Corpora Jean Véronis (editor) (Université de Provence) Dordrecht: Kluwer Academic Publishers (Text, speech and language technology series, edited by Nancy Ide and Jean Véronis, volume 13), 2000, xxiii+402 pp; hardbound, ISBN 0-7923-6546-1, $160.00, £99.00, Dfl 300.00

Philip Resnik
2001 Computational Linguistics  
He has developed STRAND, a system for automatically finding parallel texts on the Web, and is working on linguistically informed statistical methods for machine translation.  ...  general length correlation and a general matching predicate that can exploit dictionary- or cognate-based lexical anchors as available.  ... 
doi:10.1162/coli.2000.27.4.592 fatcat:mzhyua6fzrbx5dcqowy6lodd6y

RuBQ: A Russian Dataset for Question Answering over Wikidata [chapter]

Vladislav Korablinov, Pavel Braslavski
2020 Lecture Notes in Computer Science  
The freely available dataset will be of interest for a wide community of researchers and practitioners in the areas of Semantic Web, NLP, and IR, especially for those working on multilingual question answering  ...  The data underwent automatic filtering, crowd-assisted entity linking, automatic generation of SPARQL queries, and their subsequent in-house verification.  ...  We thank Mikhail Galkin, Svitlana Vakulenko, Vladimir Kovalenko, Yaroslav Golubev, and Rishiraj Saha Roy for their valuable comments and fruitful discussion on the paper draft.  ... 
doi:10.1007/978-3-030-62466-8_7 fatcat:bo2c5mp7unhhhbdxkuzfv5ujpy

RuBQ: A Russian Dataset for Question Answering over Wikidata [article]

Vladislav Korablinov, Pavel Braslavski
2020 arXiv   pre-print
The data underwent automatic filtering, crowd-assisted entity linking, automatic generation of SPARQL queries, and their subsequent in-house verification.  ...  The high-quality dataset consists of 1,500 Russian questions of varying complexity, their English machine translations, SPARQL queries to Wikidata, reference answers, as well as a Wikidata sample of triples  ...  We thank Mikhail Galkin, Svitlana Vakulenko, Vladimir Kovalenko, Yaroslav Golubev, and Rishiraj Saha Roy for their valuable comments and fruitful discussion on the paper draft.  ... 
arXiv:2005.10659v1 fatcat:4fyptlackrafngtbehb4mlmkhm

On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation [article]

Wei Zhao, Goran Glavaš, Maxime Peyrard, Yang Gao, Robert West, Steffen Eger
2020 arXiv   pre-print
We find that they perform poorly as semantic encoders for reference-free MT evaluation and identify their two key limitations, namely, (a) a semantic mismatch between representations of mutual translations  ...  We systematically investigate a range of metrics based on state-of-the-art cross-lingual semantic representations obtained with pretrained M-BERT and LASER.  ...  Acknowledgments We thank the anonymous reviewers for their insightful comments and suggestions, which greatly improved the final version of the paper.  ... 
arXiv:2005.01196v3 fatcat:w4qe2vwes5gpbdn2ipmkzr5mki

Pay Attention when you Pay the Bills. A Multilingual Corpus with Dependency-based and Semantic Annotation of Collocations

Marcos Garcia, Marcos García Salido, Susana Sotelo, Estela Mosqueira, Margarita Alonso-Ramos
2019 Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics  
This paper presents a multilingual corpus with semantic annotation of collocations in English, Portuguese, and Spanish.  ...  Each collocation was annotated by three linguists and the final resource was revised by a team of experts.  ...  Marcos Garcia has been funded by a Juan de la Cierva-incorporación grant (IJCI-2016-29598), and Marcos García-Salido by a post-doctoral grant from Xunta de Galicia (ED481D-2017-009).  ... 
doi:10.18653/v1/p19-1392 dblp:conf/acl/GarciaGSSR19 fatcat:vfkturchojhjjocdpbeqzvb2ru

MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages [article]

Zhiruo Wang, Grace Cuenca, Shuyan Zhou, Frank F. Xu, Graham Neubig
2022 arXiv   pre-print
We present a quantitative evaluation of performance on the MCoNaLa dataset by testing with state-of-the-art code generation systems.  ...  While there has been a recent burgeoning of applications at the intersection of natural and programming languages, such as code generation and code summarization, these applications are usually English-centric  ...  TAE is a generic transformer-based seq2seq model for code synthesis.  ... 
arXiv:2203.08388v1 fatcat:z3tuikmmbfc7ffac65flzy3jhe

Language Embeddings for Typology and Cross-lingual Transfer Learning [article]

Dian Yu, Taiqi He, Kenji Sagae
2021 arXiv   pre-print
We generate dense embeddings for 29 languages using a denoising autoencoder, and evaluate the embeddings using the World Atlas of Language Structures (WALS) and two extrinsic tasks in a zero-shot setting  ...  : cross-lingual dependency parsing and cross-lingual natural language inference.  ...  Google's multilingual neural machine translation system: Enabling zero-shot translation. Transactions of the Association for Computational Linguistics, 5:339-351.  ... 
arXiv:2106.02082v1 fatcat:l4sdumpujvg2dbipbq4fz6pj5a

Collection and Annotation of the Romanian Legal Corpus

Dan Tufis, Maria Mitrofan, Vasile Florian Pais, Radu Ion, Andrei Coman
2020 International Conference on Language Resources and Evaluation  
We present the Romanian legislative corpus which is a valuable linguistic asset for the development of machine translation systems, especially for under-resourced languages.  ...  This corpus is processed and annotated at different levels: linguistically (tokenized, lemmatized and POS-tagged), dependency parsed, chunked, named entities identified and labeled with IATE terms and  ...  INEA/CEF/ICT/A2017/1565710 for the Action no. 2017-EU-IA-0136 entitled "Multilingual Resources for CEF.AT in the legal domain" (MARCELL).  ... 
dblp:conf/lrec/TufisMPIC20 fatcat:fecbhton6rgf3gff3hkh5c7ou4

indic-punct: An automatic punctuation restoration and inverse text normalization framework for Indic languages [article]

Anirudh Gupta, Neeraj Chhimwal, Ankur Dhuriya, Rishabh Gaur, Priyanshi Shah, Harveen Singh Chadha, Vivek Raghavan
2022 arXiv   pre-print
Automatic Speech Recognition (ASR) generates text which is most of the time devoid of any punctuation. Absence of punctuation in text can affect readability.  ...  Also, downstream NLP tasks such as sentiment analysis and machine translation greatly benefit from having punctuation and sentence boundary information.  ...  All authors gratefully acknowledge Ekstep Foundation for supporting this project financially and providing infrastructure. A special thanks to Dr.  ... 
arXiv:2203.16825v1 fatcat:ud7xmgra3rar3mqshftyobdlxu

CLiFF Notes: Research in the Language, Information and Computation Laboratory of the University of Pennsylvania [article]

Editors: Matthew Stone, Libby Levison
1995 arXiv   pre-print
Short abstracts by computational linguistics researchers at the University of Pennsylvania describing ongoing individual and joint projects.  ...  The STAG Machine Translation Project is developing a prototype system for machine translation between English and Korean.  ...  This ranking is determined using a combination of the weights obtained prior to parsing and weightings based on the derivation trees of the generated parses (e.g. for right- vs. left-branching structures  ... 
arXiv:cmp-lg/9506008v1 fatcat:aidgv3pdzvfatc7yjygdvthcjq