Evaluation of Methods for Sentence and Lexical Alignment of Brazilian Portuguese and English Parallel Texts [chapter]

Helena de Medeiros Caseli, Aline Maria da Paz Silva, Maria das Graças Volpe Nunes
2004 Lecture Notes in Computer Science  
In this paper we describe some experiments that have been carried out with Brazilian Portuguese and English parallel texts using well-known alignment methods: five methods for sentence alignment  ...  and two methods for lexical alignment.  ...  Acknowledgments We would like to thank FAPESP, CAPES and CNPq for financial support.  ... 
doi:10.1007/978-3-540-28645-5_19 fatcat:46biiz2klfb77d6p4wjzfb2z3e

BP2EP - Adaptation of Brazilian Portuguese texts to European Portuguese

Luís Marujo, Nuno Grazina, Tiago Luís, Wang Ling, Luísa Coheur, Isabel Trancoso
2011 European Association for Machine Translation Conferences/Workshops  
This paper describes a method to efficiently leverage Brazilian Portuguese resources as European Portuguese resources.  ...  Brazilian Portuguese and European Portuguese are two very close varieties of Portuguese that are usually mutually intelligible, but with several known differences, which are studied in this work.  ...  Acknowledgements The authors would like to thank Nuno Mamede, Amália Mendes, and the anonymous reviewers for many helpful comments.  ... 
dblp:conf/eamt/MarujoGLLCT11 fatcat:sqwbnqqwrvaaxiyrb6leg5vt2e

Children's literature parallel corpora: a hybrid experimental model to evaluate transfers of language complexity via linguistic transcoding

Adja Balbino de Amorim Barbieri Durão, Paulo Roberto Kloeppel
2018 Ilha do Desterro  
The article aims at proposing a hybrid model to evaluate the language complexity of source and target texts written in both English and Portuguese so that one can analyse to what extent language complexity  ...  In it, the hybrid model points to parallel approaches to lexical repetition, lexical diversity and lexical density, readability and word unusualness with the help of some Corpus Linguistics tools.  ...  complexity of source text (ST) and target texts (TT); (ii) presents an experimental corpus-assisted method for evaluating language complexity, cross-culturally; (iii) for the sake of being more pedagogical  ... 
doi:10.5007/2175-8026.2018v71n1p27 fatcat:eqaldndnm5ggrde6yhfsb4gsf4

Back-Translation as Strategy to Tackle the Lack of Corpus in Natural Language Generation from Semantic Representations

Marco Antonio Sobrevilla Cabezudo, Simon Mille, Thiago Pardo
2019 Proceedings of the 2nd Workshop on Multilingual Surface Realisation (MSR 2019)  
This paper presents an exploratory study that aims to evaluate the usefulness of back-translation in Natural Language Generation (NLG) from semantic representations for non-English languages.  ...  Specifically, Abstract Meaning Representation and Brazilian Portuguese (BP) are chosen as semantic representation and language, respectively.  ...  Acknowledgments The authors are grateful to CAPES and USP Research Office for supporting this work, and would like to thank NVIDIA for donating the GPU.  ... 
doi:10.18653/v1/d19-6313 dblp:conf/emnlp/CabezudoMP19 fatcat:debflcnad5dk5gsazhocaypjk4

Fostering Digital Inclusion and Accessibility: The PorSimples project for Simplification of Portuguese Texts

Sandra M. Aluísio, Caroline Gasperin
2010 North American Chapter of the Association for Computational Linguistics  
In this paper we present the PorSimples project, whose aim is to develop text adaptation tools for Brazilian Portuguese.  ...  Here we describe the tools and resources developed over two years of this project and point directions for future work and collaboration.  ...  Acknowledgments We thank FAPESP and Microsoft Research for supporting the PorSimples project.  ... 
dblp:conf/naacl/AluisioG10 fatcat:4lik2y6svjbahnujapsgdjj4bm

An Experiment in Spanish-Portuguese Statistical Machine Translation [chapter]

Wilker Ferreira Aziz, Thiago Alexandre Salgueiro Pardo, Ivandré Paraboni
2008 Lecture Notes in Computer Science  
Statistical approaches to machine translation have long been successfully applied to a number of 'distant' language pairs such as English-Arabic and English-Chinese.  ...  In this work we describe an experiment in statistical machine translation between two 'related' languages: European Spanish and Brazilian Portuguese.  ...  use for Brazilian Portuguese speakers.  ... 
doi:10.1007/978-3-540-88190-2_30 fatcat:pjzk3oa6rrcqtaqtb7f3xnhlzq

Extending the Galician Wordnet Using a Multilingual Bible Through Lexical Alignment and Semantic Annotation

Alberto Simões, Xavier Gómez Guinovart, Michael Wagner
2018 Symposium on Languages, Applications and Technologies  
For this experiment we used the Galician, Portuguese, Spanish, Catalan and English versions of the Bible. They were annotated with part-of-speech and WordNet sense using FreeLing.  ...  In this paper we describe the methodology and evaluation of the expansion of Galnet (the Galician wordnet) using a multilingual Bible through lexical alignment and semantic annotation.  ...  Our contribution is another method to obtain candidate variants, applied to the Galician wordnet (Galnet), using other languages' wordnets (English, Catalan, Spanish and Portuguese) together with a sentence-aligned  ... 
doi:10.4230/oasics.slate.2018.14 dblp:conf/slate/SimoesG18 fatcat:fb2exkqmgvbtbkoogzvgkmzddq

Initial Approaches on Cross-Lingual Information Retrieval Using SMT on User-Queries

Marta R. Costa-jussà, Christian Paz-Trillo, Renata Wassermann
2012 Joint Seminar on Ontology Research in Brazil / International Workshop on Metamodels, Ontologies and Semantic Technologies  
In particular, the language pair we work with in this paper is English and Portuguese. In order to perform query translation we use a statistical machine translation approach.  ...  Our experiments show that the multilingual system is capable of achieving almost the same quality as that obtained by the monolingual system.  ...  Acknowledgements This work has been supported by FAPESP through the OnAir project (2010/19111-9) and the visiting researcher program (2012/02131-2), and by the Spanish Ministry of Economy and Competitiveness  ... 
dblp:conf/ontobras/Costa-JussaPW12 fatcat:d27o7eoeibg3ljiwbjoncodmla

A Survey of Automated Text Simplification

Matthew Shardlow
2014 International Journal of Advanced Computer Science and Applications  
Text simplification modifies syntax and lexicon to improve the understandability of language for an end user.  ...  There are many approaches to the simplification task, including: lexical, syntactic, statistical machine translation and hybrid techniques.  ...  For example, in a parallel corpus of simplified English and regular English, the former will be called simple and the latter complex.  ... 
doi:10.14569/specialissue.2014.040109 fatcat:fbskuhircjgo3nykcfbnir7gwi

Exploring content selection strategies for Multilingual Multi-Document Summarization based on the Universal Network Language (UNL)

Matheus Rigobelo Chaud, Ariani Di Felippo
2017 Revista de Estudos da Linguagem  
We used a bilingual corpus (Brazilian Portuguese-English) encoded in UNL (Universal Network Language) with source and summary sentences aligned based on content overlap.  ...  Multilingual Multi-Document Summarization aims at ranking the sentences of a cluster with (at least) 2 news texts (1 in the user's language and 1 in a foreign language), and selecting the top-ranked sentences  ...  Acknowledgements The authors thank Coordination for the Improvement of Higher Education Personnel CAPES, CNPq, and State of São Paulo Research Foundation (FAPESP) for the financial support.  ... 
doi:10.17851/2237-2083.26.1.45-71 fatcat:zdfws25qbjg3hhqbslru45c7ki

On the Automatic Learning of Bilingual Resources: Some Relevant Factors for Machine Translation [chapter]

Helena de M. Caseli, Maria das Graças V. Nunes, Mikel L. Forcada
2008 Lecture Notes in Computer Science  
The experiments were carried out with Brazilian Portuguese (pt), English (en) and Spanish (es) texts in two parallel corpora: pt-en and pt-es.  ...  In this paper we present experiments concerned with automatically learning bilingual resources for machine translation: bilingual dictionaries and transfer rules.  ...  Acknowledgements We thank the financial support of the Brazilian agencies FAPESP, CAPES and CNPq, and of the Spanish Ministry of Education and Science.  ... 
doi:10.1007/978-3-540-88190-2_31 fatcat:esrasxv7arbevfwmh7vyce43ee

Tree-Based Statistical Machine Translation: Experiments with the English and Brazilian Portuguese Pair

Daniel Beck, Helena Caseli
2013 Learning and Nonlinear Models  
We perform experiments with English and Brazilian Portuguese, providing the first known results in syntax-based Statistical Machine Translation for this language pair.  ...  In previous work, [5] performed experiments in PB-SMT between Brazilian Portuguese and both English and Spanish languages.  ...  We used the corpus version 1, composed of 646 articles and 17,397 sentence pairs in English and Brazilian Portuguese.  ... 
doi:10.21528/lnlm-vol11-no1-art2 fatcat:3lrwss65bzhyfpjhhx5jqbva64

Improving Statistical Machine Translation for a Resource-Poor Language Using Related Resource-Rich Languages

P. Nakov, H. T. Ng
2012 The Journal of Artificial Intelligence Research  
The evaluation for Indonesian -> English using Malay and for Spanish -> English using Portuguese and pretending Spanish is resource-poor shows an absolute gain of up to 1.35 and 3.37 BLEU points, respectively  ...  More precisely, we improve the translation from a resource-poor source language X_1 into a resource-rich language Y given a bi-text containing a limited number of parallel sentences for X_1-Y and a larger  ...  Acknowledgments We would like to thank the anonymous reviewers for their constructive comments and suggestions, which have helped us improve the quality of the manuscript.  ... 
doi:10.1613/jair.3540 fatcat:sfyq5gxyzrbtljgq3tpnbl4uma

Segmentation Strategies to Face Morphology Challenges in Brazilian-Portuguese/English Statistical Machine Translation and Its Integration in Cross-Language Information Retrieval

Marta Ruiz Costa-jussà
2015 Journal of Computacion y Sistemas  
Experiments show significant improvements from the enhanced system over the baseline system on the Brazilian-Portuguese/English language pair.  ...  The use of morphology is particularly interesting in the context of statistical machine translation in order to reduce data sparseness and compensate for a lack of training corpus.  ...  Tagnin for providing the out-of-domain corpus and Fabiano Luz for his dedication to parallelize this corpus.  ... 
doi:10.13053/cys-19-2-1550 fatcat:oa3dn2anyjcavjebgwu7xyopom

Constructing a Family Tree of Ten Indo-European Languages with Delexicalized Cross-linguistic Transfer Patterns [article]

Yuanyuan Zhao, Weiwei Sun, Xiaojun Wan
2020 arXiv pre-print
This allows us to quantitatively probe cross-linguistic transfer and extend inquiries of SLA.  ...  It is reasonable to hypothesize that the divergence patterns formulated by historical linguists and typologists reflect constraints on human languages, and are thus consistent with Second Language Acquisition  ...  Based on reliable syntactic analysis for aligned parallel data, we can generate such patterns with grammar induction technologies.  ... 
arXiv:2007.09076v1 fatcat:bpzg5ww4bzgfdjpq3rmjqfueyi