Portuguese variety identification on broadcast news

Jean-Luc Rouas, Isabel Trancoso, Ceu Viana, Monica Abreu
2008 Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing  
The system is designed to be used as a pre-processing module for the Portuguese Automatic Speech Recognition system developed at INESC-ID.  ...  In terms of variety identification, the overall rate of correct identification is 69.0% if all 7 varieties are considered, and the best results are obtained for Brazilian Portuguese, also the variety that  ...  In fact, whereas the word error rate (WER) of an ASR trained for EP is around 24% for this variety, for African Portuguese (AP) it can go from 30% to 38%, and for Brazilian Portuguese (BP), it may exceed  ... 
doi:10.1109/icassp.2008.4518588 dblp:conf/icassp/RouasTVA08 fatcat:3qls6373nnblxgp55skkpzv2le

RDF2PT: Generating Brazilian Portuguese Texts from RDF Data [article]

Diego Moussallem, Thiago Castro Ferreira, Marcos Zampieri, Maria Claudia Cavalcanti, Geraldo Xexéo, Mariana Neves, Axel-Cyrille Ngonga Ngomo
2018 arXiv   pre-print
A number of these approaches generate natural language in languages other than English, however, no work has been proposed to generate Brazilian Portuguese texts out of RDF.  ...  We address this research gap by presenting RDF2PT, an approach that verbalizes RDF data to Brazilian Portuguese language.  ...  Acknowledgments This work has been supported by the H2020 project HOBBIT (GA no. 688227) and supported by the Brazilian National Council for Scientific and Technological Development (CNPq) (no. 206971  ... 
arXiv:1802.08150v1 fatcat:4n7ppcb7rfbgzpjj5itqdcvfwm

Transformers and Transfer Learning for Improving Portuguese Semantic Role Labeling [article]

Sofia Oliveira and Daniel Loureiro and Alípio Jorge
2021 arXiv   pre-print
Semantic Role Labeling (SRL) is a core Natural Language Processing task.  ...  However, for low resource languages, and in particular for Portuguese, currently available SRL models are hindered by scarce training data.  ...  funding agency, FCT - Fundação para a Ciência e a Tecnologia within project PTDC/CCI-COM/31857/2017 (NORTE-01-0145-FEDER-03185).  ... 
arXiv:2101.01213v2 fatcat:7ys3wuao2nfajndywwao6y5tsm

Recognizing Textual Entailment: Challenges in the Portuguese Language

Gil Rocha, Henrique Lopes Cardoso
2018 Information  
In addition, we conclude that semantic-based approaches are promising in this task, and that combining data from European and Brazilian Portuguese is less straightforward than it may initially seem.  ...  As in many NLP tasks, textual entailment corpora for English abound, while the same is not true for more resource-scarce languages such as Portuguese.  ...  Henrique Lopes Cardoso has supervised the work, namely for defining the methods proposed, designing the experimental settings and analyzing results.  ... 
doi:10.3390/info9040076 fatcat:e7n6olpwvjbbjpo3wjss5wkoxy

Benchmarking Natural Language Inference and Semantic Textual Similarity for Portuguese

Pedro Fialho, Luísa Coheur, Paulo Quaresma
2020 Information  
We developed several models for natural language inference and semantic textual similarity for the Portuguese language.  ...  Besides obtaining state-of-the-art results, this is, to the best of our knowledge, the most all-inclusive study about natural language inference and semantic textual similarity for the Portuguese language  ...  A benchmark for systems aimed at performing RTE was initially developed in the PASCAL challenge series [6], where RTE was defined as the task of labelling two sentences as entailed or not entailed.  ... 
doi:10.3390/info11100484 fatcat:dpf5ax2wf5as7hxwnmwos7z7wi

Learning parts-of-speech through distributional analysis. Further results from Brazilian Portuguese

Pablo Faria
2019 Diacrítica  
s (1998) model, but the input data used comes from publicly available corpora of both child-directed speech and speech between adults in Brazilian Portuguese.  ...  A model of part-of-speech (or syntactic category) learning through distributional analysis – as a task in the language acquisition process – is presented here. It is based on Redington et al.'  ...  s (1998) 2 model, applying it to Brazilian Portuguese (BP) data.  ... 
doi:10.21814/diacritica.415 fatcat:3o4by7q2rjb7dmgrarbnf34ot4

Towards a General Abstract Meaning Representation Corpus for Brazilian Portuguese

Marco Antonio Sobrevilla Cabezudo, Thiago Pardo
2019 Proceedings of the 13th Linguistic Annotation Workshop  
In this context, this paper presents an effort to build a general purpose AMR-annotated corpus for Brazilian Portuguese by translating and adapting AMR English guidelines.  ...  Meaning Representation (AMR) is a recent and prominent semantic representation with good acceptance and several applications in the Natural Language Processing area.  ...  Acknowledgments The authors are grateful to CAPES and USP Research Office for supporting this work and to the several corpus annotators that have collaborated with this research.  ... 
doi:10.18653/v1/w19-4028 dblp:conf/acllaw/CabezudoP19 fatcat:3jfbpl4u7zadnjtg7xkbdx4sem

Robust Complaint Processing in Portuguese

Henrique Lopes-Cardoso, Tomás Freitas Osório, Luís Vilar Barbosa, Gil Rocha, Luís Paulo Reis, João Pedro Machado, Ana Maria Oliveira
2021 Information  
This paper exposes a set of challenges encountered when dealing with a real-world complex NLP problem, based on user-generated complaint data in Portuguese.  ...  This case study meets the needs of a country-wide governmental institution responsible for food safety and economic surveillance, and its responsibilities in handling a high number of citizen complaints  ...  Despite the fact that there exist BERT models for (Brazilian) Portuguese [39] , we leave for future work evaluating if using such models brings significant improvements over the multilingual variant we  ... 
doi:10.3390/info12120525 fatcat:q4qf5lotvfecfek5k6woar7q5i

Distributional and Knowledge-Based Approaches for Computing Portuguese Word Similarity

Hugo Gonçalo Oliveira
2018 Information  
In fact, it can also be seen as a survey of resources (semantic models and benchmarks) currently available for this purpose.  ...  The remainder of this paper starts with a brief overview on semantic similarity, variants, common approaches, and a focus on this topic for Portuguese.  ...  played a positive role.  ... 
doi:10.3390/info9020035 fatcat:2dwrx5synvdvpnesoe7jbidyym

A Design Proposal of an Online Corpus-Driven Dictionary of Portuguese for University Students

Tanara Zingano Kuhn
2019 Journal of Portuguese Linguistics  
benchmark.  ...  One of the challenges of working on a pioneering Portuguese lexicographic project such as DOPU is the lack of a benchmark against which parameters for resources creation can be measured and judged.  ... 
doi:10.5334/jpl.209 fatcat:ofqxdeezmnav3mzbujrgmzt3j4

A review on Relation Extraction with an eye on Portuguese

Sandra Collovini de Abreu, Tiago Luis Bonamigo, Renata Vieira
2013 Journal of the Brazilian Computer Society  
We present a review of the state-of-the-art for Relation Extraction in free texts, addressing the progress and difficulties of the area, and situating Portuguese in that frame.  ...  We also give special attention to the literature for Portuguese tools, which need further progress.  ...  We thank the Brazilian funding agency FAPERGS/CAPES for the scholarship granted.  ... 
doi:10.1007/s13173-013-0116-8 fatcat:rxrt7yieija3hndl7qzhazdtqq

Language and variety verification on broadcast news for Portuguese

Jean-Luc Rouas, Isabel Trancoso, Céu Viana, Mónica Abreu
2008 Speech Communication  
This paper describes a language/accent verification system for Portuguese, that explores different type of properties: acoustic, phonotactic and prosodic.  ...  The two-stage system is designed to be used as a pre-processing module for the Portuguese Automatic Speech Recognition (ASR) system developed at INESC-ID.  ...  The authors would like to thank our colleagues Hugo Meinedo and Ernesto de Andrade for helpful comments.  ... 
doi:10.1016/j.specom.2008.05.006 fatcat:227n3l7sgndhnkiweewkiycdii

Pirá: A Bilingual Portuguese-English Dataset for Question-Answering about the Ocean [article]

André F. A. Paschoal, Paulo Pirozelli, Valdinei Freire, Karina V. Delgado, Sarajane M. Peres, Marcos M. José, Flávio Nakasato, André S. Oliveira, Anarosa A. F. Brandão, Anna H. R. Costa, Fabio G. Cozman
2022 arXiv   pre-print
This paper presents the Pirá dataset, a large set of questions and answers about the ocean and the Brazilian coast both in Portuguese and English.  ...  Pirá is, to the best of our knowledge, the first QA dataset with supporting texts in Portuguese, and, perhaps more importantly, the first bilingual QA dataset that includes this language.  ...  ACKNOWLEDGMENTS The authors are grateful for the collaboration of Dr. Eduardo Aoun Tannuri (Escola Politécnica - Universidade de São Paulo) for his valuable contributions regarding domain knowledge.  ... 
arXiv:2202.02398v1 fatcat:wkjdhoo5pfbl5mk5anjdjzjbnq

#PraCegoVer: A Large Dataset for Image Captioning in Portuguese [article]

Gabriel Oliveira dos Santos and Esther Luna Colombini and Sandra Avila
2021 arXiv   pre-print
It is the first large dataset for image captioning in Portuguese with freely annotated images.  ...  Thus, inspired by this movement, we have proposed the #PraCegoVer, a multi-modal dataset with Portuguese captions based on posts from Instagram.  ...  Is there a label or target associated with each instance?  ... 
arXiv:2103.11474v2 fatcat:liph7sbgnfdyzoz6zo7mxpcrrq

#PraCegoVer: A Large Dataset for Image Captioning in Portuguese

Gabriel Oliveira dos Santos, Esther Luna Colombini, Sandra Avila
2022 Data  
We introduce the #PraCegoVer, a multi-modal dataset with Portuguese captions based on posts from Instagram. It is the first large dataset for image captioning in Portuguese.  ...  We hope that #PraCegoVer dataset encourages more works addressing the automatic generation of descriptions in Portuguese.  ...  Acknowledgments: We would like to thank Artificial Intelligence (Recod.ai) lab. of the Institute of Computing at University of Campinas (Unicamp) for permitting us to use resources to carry out our research  ... 
doi:10.3390/data7020013 fatcat:cr3mahuy7fdyzj5vphydospm4e