Project PIAF: Building a Native French Question-Answering Dataset [article]

Rachel Keraron, Guillaume Lancrenon, Mathilde Bras, Frédéric Allary, Gilles Moyse, Thomas Scialom, Edmundo-Pavel Soriano-Morales, Jacopo Staiano
2020 arXiv   pre-print
Motivated by the lack of data for non-English languages, in particular for the evaluation of downstream tasks such as Question Answering, we present a participatory effort to collect a native French Question  ...  Furthermore, we describe and publicly release the annotation tool developed for our collection effort, along with the data obtained and preliminary baselines.  ...  Conclusion Motivated by the scarcity of non-English data, we described our ongoing effort towards gathering native QA samples for the French language, using a participatory approach.  ... 
arXiv:2007.00968v1 fatcat:ksqezgkdpbcxxmj4jhlp5bgswq

Overview of the SPMRL 2013 Shared Task: A Cross-Framework Evaluation of Parsing Morphologically Rich Languages

Djamé Seddah, Reut Tsarfaty, Sandra Kübler, Marie Candito, Jinho D. Choi, Richárd Farkas, Jennifer Foster, Iakes Goenaga, Koldo Gojenola Galletebeitia, Yoav Goldberg, Spence Green, Nizar Habash (+11 others)
2013 Workshop on Statistical Parsing of Morphologically Rich Languages  
We report on the preparation of the data sets, on the proposed parsing scenarios, and on the evaluation metrics for parsing MRLs given different representation types.  ...  This paper reports on the first shared task on statistical parsing of morphologically rich languages (MRLs).  ...  We thank Alon Itai and MILA, the knowledge center for processing Hebrew, for kindly making the Hebrew treebank and morphological analyzer available for us, Anne Abeillé for allowing us to use the French  ... 
dblp:conf/acl-spmrl/SeddahTKCCFFGGG13 fatcat:5ngltl7jr5hu5o72goudyy4tuq

Phrase-Based Document Categorization [chapter]

Cornelis H. A. Koster, Jean G. Beney, Suzan Verberne, Merijn Vogel
2011 Current Challenges in Patent Information Retrieval  
The CLEF-IP corpora The Intellectual Property Evaluation Campaign (CLEF-IP) 4 is an ongoing benchmarking activity on evaluating retrieval techniques in the patent domain.  ...  The French parser FR4IR The French documents were parsed with the French parser FR4IR developed by Jean Beney at the INSA de Lyon.  ... 
doi:10.1007/978-3-642-19231-9_13 fatcat:nezj4ilem5bcncwccxiay5i2hi

A Survey of Automated Text Simplification

Matthew Shardlow
2014 International Journal of Advanced Computer Science and Applications  
Text simplification modifies syntax and lexicon to improve the understandability of language for an end user.  ...  There are many approaches to the simplification task, including: lexical, syntactic, statistical machine translation and hybrid techniques.  ...  Syntactic Simplification Rules for French [71] A comprehensive list of rules for simplifying the French language. 2012 Sentence Splitting for Vietnamese-English Machine Translation [72] Vietnamese  ... 
doi:10.14569/specialissue.2014.040109 fatcat:fbskuhircjgo3nykcfbnir7gwi

Istex: A Database of Twenty Million Scientific Papers with a Mining Tool Which Uses Named Entities

Denis Maurel, Enza Morale, Nicolas Thouvenin, Patrice Ringot, Angel Turri
2019 Information  
The results of its evaluation showed a good Precision measure, even if the Recall was not very good.  ...  The second challenge was the implementation of Unitex to parse around twenty millions of documents. We used a dockerized application.  ...  Acknowledgments: The authors thank Julien Franck, Anubhav Gupta and Sevil Zeynali for their contributions to the project. Conflicts of Interest: The authors declare no conflict of interest  ... 
doi:10.3390/info10050178 fatcat:whthljufj5hrhmkysgvjxpkolq

EVALITA4ELG: Italian Benchmark Linguistic Resources, NLP Services and Tools for the ELG Platform

Viviana Patti, Valerio Basile, Cristina Bosco, Rossella Varvara, Michael Fell, Andrea Bolioli, Alessio Bosca
2020 Italian Journal of Computational Linguistics  
Starting from the first edition held in 2007, EVALITA is the initiative for the evaluation of Natural Language Processing tools for Italian.  ...  This paper describes the EVALITA4ELG project, whose main aim is at systematically collecting the resources released as benchmarks for this evaluation campaign, and making them easily accessible through  ...  Acknowledgments The EVALITA4ELG project was supported by the European Language Grid project through its open call for pilot projects.  ... 
doi:10.4000/ijcol.754 fatcat:qiwz4yjaf5ecfg5k4tdkfxlcii

Message from the general chair

Benjamin C. Lee
2015 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)  
We explore ways of using the resulting grounding to boost the performance of a state-of-the-art co-reference resolution system.  ...  Our end system outperforms the state-of-the-art baseline by 2 B³ F1 points on non-transcript portion of the ACE 2004 dataset.  ...  in the syntactic parse tree of a source sentence.  ... 
doi:10.1109/ispass.2015.7095776 dblp:conf/ispass/Lee15 fatcat:ehbed6nl6barfgs6pzwcvwxria

On the Linguistic Representational Power of Neural Machine Translation Models

Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass
2020 Computational Linguistics  
Notable findings include the following observations: (i) Word morphology and part-of-speech information are captured at the lower layers of the model; (ii) In contrast, lexical semantics or non-local syntactic  ...  We analyze the representations learned by neural machine translation (NMT) models at various levels of granularity and evaluate their quality through relevant extrinsic properties.  ...  Acknowledgments This work was funded by the QCRI, HBKU, as part of the collaboration with the MIT, CSAIL.  ... 
doi:10.1162/coli_a_00367 fatcat:5xux2ogrbfhozncsgkneuaz2k4

On the Linguistic Representational Power of Neural Machine Translation Models [article]

Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass
2019 arXiv   pre-print
Notable findings include: (i) Word morphology and part-of-speech information are captured at the lower layers of the model; (ii) In contrast, lexical semantics or non-local syntactic and semantic dependencies  ...  We analyze the representations learned by neural machine translation models at various levels of granularity and evaluate their quality through relevant extrinsic properties.  ...  Acknowledgements This work was funded by the QCRI, HBKU, as part of the collaboration with the MIT, CSAIL.  ... 
arXiv:1911.00317v1 fatcat:uncw5nmpefhrpo7jtverg6c7qy

On the Integration of Linguistic Features into Statistical and Neural Machine Translation [article]

Eva Vanmassenhove
2020 arXiv   pre-print
Establishing the discrepancies between the strengths of statistical approaches to MT and the way humans translate has been the starting point of our research.  ...  We cover a series of problems related to the integration of specific linguistic features into statistical and neural MT, aiming to analyse and provide a solution to some of them.  ...  The yearly PAN evaluation campaigns have led to the development of state-of-the-art in-domain gender prediction models on Twitter data for English achieving accuracies up to 80%−85% (Alvarez-Carmona  ... 
arXiv:2003.14324v1 fatcat:hjmii6oob5cx3oxxmwlkzuqlgu

The EVALITA Dependency Parsing Task: From 2007 to 2011 [chapter]

Cristina Bosco, Alessandro Mazzei
2013 Lecture Notes in Computer Science  
Following the success of previous editions, we organized EVALITA 2014, the fourth evaluation campaign with the aim of continuing to provide a forum for the comparison and evaluation of research outcomes  ...  Established in 2007, EVALITA (http://www.evalita.it) is the evaluation campaign of Natural Language Processing and Speech Technologies for the Italian language, organized around shared tasks focusing on  ...  by the French National Research Agency (ANR).  ... 
doi:10.1007/978-3-642-35828-9_1 fatcat:p6dyjaxm4zbitfajtciwclwipu

Biomedical Natural Language Processing. Kevin Bretonnel Cohen and Dina Demner-Fushman (University of Colorado School of Medicine, and National Library of Medicine). John Benjamins Publishing (Book series on Natural Language Processing, edited by Ruslan Mitkov, volume 11), 2014, 160 pp; hardbound, ISBN 978-90-272-4997-5

Jin-Dong Kim
2017 Computational Linguistics  
Acknowledgments The work presented in this paper comes from a 3 year project (ALADIN) started in 2009 and funded by the French Agence Nationale de la Recherche (National Research Agency, ANR) in the context  ...  Acknowledgments The research work presented here is supported by grant DO 02-292 "Effective search of conceptual information with applications in medical informatics", funded by the Bulgarian National  ...  One such type of processing is at the syntactic level, i.e. syntactic parsing.  ... 
doi:10.1162/coli_r_00281 fatcat:6abwqppd3bgyvkzfc2mq6kzo6u

The Search for Genericity in Graphics Recognition Applications: Design Issues of the Qgar Software System [chapter]

Jan Rendek, Gérald Masini, Philippe Dosch, Karl Tombre
2004 Lecture Notes in Computer Science  
The paper also presents a quick tour of the various components of the Qgar environment, and concentrates on the usefulness of this kind of system for testing and evaluation purposes.  ...  This paper presents the main design and development issues of the Qgar software environment for graphics recognition applications.  ...  Such a technique implies ongoing modification of the interfaces of existing classes, whenever new methods are added to the library.  ... 
doi:10.1007/978-3-540-28640-0_35 fatcat:lr2rcalakrdupio5c3eoby3llu

Extracting Temporal and Causal Relations between Events [article]

Paramita Mirza
2016 arXiv   pre-print
We then combine the two extraction components into an integrated relation extraction system, CATENA (CAusal and Temporal relation Extraction from NAtural language texts), by utilizing the presumption  ...  Structured information resulting from temporal information processing is crucial for a variety of natural language processing tasks, for instance to generate timeline summarization of events from news  ...  campaigns TempEval is a series of evaluation campaigns, which are part of SemEval (Semantic Evaluation), an ongoing series of evaluations of computational semantic analysis systems.  ... 
arXiv:1604.08120v1 fatcat:fmd7z6hwyjhgphbrnc3mgpifde

A LINGUISTICS EVALUATION OF YORUBA ENGLISH STATISTICAL MACHINE TRANSLATION

ODOJE CLEMENT, OYE TAIWO
2016 Zenodo  
A Ph.D. Thesis submitted to the Department of Linguistics and African Languages, University of Ibadan, Ibadan, Nigeria.  ...  Consideration is also given to overall syntactic parse probability (Koehn 2010: 27).  ...  It is easy to see the benefits of such a model.  ... 
doi:10.5281/zenodo.3626787 fatcat:pyimkuhcrbbabjy535yaxo26zq