Project PIAF: Building a Native French Question-Answering Dataset
[article]
2020
arXiv
pre-print
Motivated by the lack of data for non-English languages, in particular for the evaluation of downstream tasks such as Question Answering, we present a participatory effort to collect a native French Question ...
Furthermore, we describe and publicly release the annotation tool developed for our collection effort, along with the data obtained and preliminary baselines. ...
Conclusion: Motivated by the scarcity of non-English data, we described our ongoing effort towards gathering native QA samples for the French language, using a participatory approach. ...
arXiv:2007.00968v1
fatcat:ksqezgkdpbcxxmj4jhlp5bgswq
Overview of the SPMRL 2013 Shared Task: A Cross-Framework Evaluation of Parsing Morphologically Rich Languages
2013
Workshop on Statistical Parsing of Morphologically Rich Languages
We report on the preparation of the data sets, on the proposed parsing scenarios, and on the evaluation metrics for parsing MRLs given different representation types. ...
This paper reports on the first shared task on statistical parsing of morphologically rich languages (MRLs). ...
We thank Alon Itai and MILA, the knowledge center for processing Hebrew, for kindly making the Hebrew treebank and morphological analyzer available for us, Anne Abeillé for allowing us to use the French ...
dblp:conf/acl-spmrl/SeddahTKCCFFGGG13
fatcat:5ngltl7jr5hu5o72goudyy4tuq
Phrase-Based Document Categorization
[chapter]
2011
Current Challenges in Patent Information Retrieval
The CLEF-IP corpora: The Intellectual Property Evaluation Campaign (CLEF-IP) is an ongoing benchmarking activity on evaluating retrieval techniques in the patent domain. ...
The French parser FR4IR: The French documents were parsed with the French parser FR4IR developed by Jean Beney at the INSA de Lyon. ...
doi:10.1007/978-3-642-19231-9_13
fatcat:nezj4ilem5bcncwccxiay5i2hi
A Survey of Automated Text Simplification
2014
International Journal of Advanced Computer Science and Applications
Text simplification modifies syntax and lexicon to improve the understandability of language for an end user. ...
There are many approaches to the simplification task, including: lexical, syntactic, statistical machine translation and hybrid techniques. ...
Syntactic Simplification Rules for French [71]: A comprehensive list of rules for simplifying the French language. 2012: Sentence Splitting for Vietnamese-English Machine Translation [72]: Vietnamese ...
doi:10.14569/specialissue.2014.040109
fatcat:fbskuhircjgo3nykcfbnir7gwi
Istex: A Database of Twenty Million Scientific Papers with a Mining Tool Which Uses Named Entities
2019
Information
The results of its evaluation showed a good Precision measure, even if the Recall was not very good. ...
The second challenge was the implementation of Unitex to parse around twenty million documents. We used a dockerized application. ...
Acknowledgments: The authors thank Julien Franck, Anubhav Gupta and Sevil Zeynali for their contributions to the project.
Conflicts of Interest: The authors declare no conflict of interest ...
doi:10.3390/info10050178
fatcat:whthljufj5hrhmkysgvjxpkolq
EVALITA4ELG: Italian Benchmark Linguistic Resources, NLP Services and Tools for the ELG Platform
2020
Italian Journal of Computational Linguistics
Starting from the first edition held in 2007, EVALITA is the initiative for the evaluation of Natural Language Processing tools for Italian. ...
This paper describes the EVALITA4ELG project, whose main aim is to systematically collect the resources released as benchmarks for this evaluation campaign and make them easily accessible through ...
Acknowledgments The EVALITA4ELG project was supported by the European Language Grid project through its open call for pilot projects. ...
doi:10.4000/ijcol.754
fatcat:qiwz4yjaf5ecfg5k4tdkfxlcii
Message from the general chair
2015
2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
We explore ways of using the resulting grounding to boost the performance of a state-of-the-art co-reference resolution system. ...
Our end system outperforms the state-of-the-art baseline by 2 B³ F1 points on non-transcript portion of the ACE 2004 dataset. ...
in the syntactic parse tree of a source sentence. ...
doi:10.1109/ispass.2015.7095776
dblp:conf/ispass/Lee15
fatcat:ehbed6nl6barfgs6pzwcvwxria
On the Linguistic Representational Power of Neural Machine Translation Models
2020
Computational Linguistics
Notable findings include the following observations: (i) Word morphology and part-of-speech information are captured at the lower layers of the model; (ii) In contrast, lexical semantics or non-local syntactic ...
We analyze the representations learned by neural machine translation (NMT) models at various levels of granularity and evaluate their quality through relevant extrinsic properties. ...
Acknowledgments This work was funded by the QCRI, HBKU, as part of the collaboration with the MIT, CSAIL. ...
doi:10.1162/coli_a_00367
fatcat:5xux2ogrbfhozncsgkneuaz2k4
On the Linguistic Representational Power of Neural Machine Translation Models
[article]
2019
arXiv
pre-print
Notable findings include: (i) Word morphology and part-of-speech information are captured at the lower layers of the model; (ii) In contrast, lexical semantics or non-local syntactic and semantic dependencies ...
We analyze the representations learned by neural machine translation models at various levels of granularity and evaluate their quality through relevant extrinsic properties. ...
Acknowledgements This work was funded by the QCRI, HBKU, as part of the collaboration with the MIT, CSAIL. ...
arXiv:1911.00317v1
fatcat:uncw5nmpefhrpo7jtverg6c7qy
On the Integration of Linguistic Features into Statistical and Neural Machine Translation
[article]
2020
arXiv
pre-print
Establishing the discrepancies between the strengths of statistical approaches to MT and the way humans translate has been the starting point of our research. ...
We cover a series of problems related to the integration of specific linguistic features into statistical and neural MT, aiming to analyse and provide a solution to some of them. ...
The yearly PAN evaluation campaigns have led to the development of state-of-the-art in-domain gender prediction models on Twitter data for English achieving accuracies up to 80%-85% (Alvarez-Carmona ...
arXiv:2003.14324v1
fatcat:hjmii6oob5cx3oxxmwlkzuqlgu
The EVALITA Dependency Parsing Task: From 2007 to 2011
[chapter]
2013
Lecture Notes in Computer Science
Following the success of previous editions, we organized EVALITA 2014, the fourth evaluation campaign with the aim of continuing to provide a forum for the comparison and evaluation of research outcomes ...
Established in 2007, EVALITA (http://www.evalita.it) is the evaluation campaign of Natural Language Processing and Speech Technologies for the Italian language, organized around shared tasks focusing on ...
by the French National Research Agency (ANR). ...
doi:10.1007/978-3-642-35828-9_1
fatcat:p6dyjaxm4zbitfajtciwclwipu
Biomedical Natural Language Processing. Kevin Bretonnel Cohen and Dina Demner-Fushman (University of Colorado School of Medicine, and National Library of Medicine). John Benjamins Publishing (Book series on Natural Language Processing, edited by Ruslan Mitkov, volume 11), 2014, 160 pp; hardbound, ISBN 978-90-272-4997-5
2017
Computational Linguistics
Acknowledgments: The work presented in this paper comes from a 3-year project (ALADIN) started in 2009 and funded by the French Agence Nationale de la Recherche (National Research Agency, ANR) in the context ...
Acknowledgments The research work presented here is supported by grant DO 02-292 "Effective search of conceptual information with applications in medical informatics", funded by the Bulgarian National ...
One such type of processing is at the syntactic level, i.e. syntactic parsing. ...
doi:10.1162/coli_r_00281
fatcat:6abwqppd3bgyvkzfc2mq6kzo6u
The Search for Genericity in Graphics Recognition Applications: Design Issues of the Qgar Software System
[chapter]
2004
Lecture Notes in Computer Science
The paper also presents a quick tour of the various components of the Qgar environment, and concentrates on the usefulness of this kind of system for testing and evaluation purposes. ...
This paper presents the main design and development issues of the Qgar software environment for graphics recognition applications. ...
Such a technique implies ongoing modification of the interfaces of existing classes, whenever new methods are added to the library. ...
doi:10.1007/978-3-540-28640-0_35
fatcat:lr2rcalakrdupio5c3eoby3llu
Extracting Temporal and Causal Relations between Events
[article]
2016
arXiv
pre-print
We then combine the two extraction components into an integrated relation extraction system, CATENA (CAusal and Temporal relation Extraction from NAtural language texts), by utilizing the presumption ...
Structured information resulting from temporal information processing is crucial for a variety of natural language processing tasks, for instance to generate timeline summarization of events from news ...
campaigns TempEval is a series of evaluation campaigns, which are part of SemEval (Semantic Evaluation), an ongoing series of evaluations of computational semantic analysis systems. ...
arXiv:1604.08120v1
fatcat:fmd7z6hwyjhgphbrnc3mgpifde
A LINGUISTICS EVALUATION OF YORUBA ENGLISH STATISTICAL MACHINE TRANSLATION
2016
Zenodo
A Ph.D Thesis submitted to the Department of Linguistics and African Languages, University of Ibadan, Ibadan, Nigeria. ...
Consideration is also given to overall syntactic parse probability (Koehn 2010: 27). ...
It is easy to see the benefits of such a model. ...
doi:10.5281/zenodo.3626787
fatcat:pyimkuhcrbbabjy535yaxo26zq