
Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses [article]

Simon Flachs, Ophélie Lacroix, Helen Yannakoudakis, Marek Rei, Anders Søgaard
2020 arXiv   pre-print
We demonstrate that a factor behind this is the inability of systems to rely on a strong internal language model in low error density domains.  ...  We aim to broaden the target domain of GEC and release CWEB, a new benchmark for GEC consisting of website text generated by English speakers of varying levels of proficiency.  ...  We argue that a factor behind this is the inability of systems to rely on a strong internal language model in low error density domains.  ... 
arXiv:2010.07574v1 fatcat:xpzl26blezcy7fxo5jggp5qbii

Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses

Simon Flachs, Ophélie Lacroix, Helen Yannakoudakis, Marek Rei, Anders Søgaard
2020 Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)   unpublished
We demonstrate that a factor behind this is the inability of systems to rely on a strong internal language model in low error density domains.  ...  We aim to broaden the target domain of GEC and release CWEB, a new benchmark for GEC consisting of website text generated by English speakers of varying levels of proficiency.  ...  This work highlights two major prevailing challenges of current approaches to GEC: domain adaptation and low precision in texts with low error density.  ... 
doi:10.18653/v1/2020.emnlp-main.680 fatcat:44i27oczzbfslhfftrhq4q7p54

VALUE: Understanding Dialect Disparity in NLU [article]

Caleb Ziems, Jiaao Chen, Camille Harris, Jessica Anderson, Diyi Yang
2022 arXiv   pre-print
Experiments show that these new dialectal features can lead to a drop in model performance.  ...  To understand disparities in current models and to facilitate more dialect-competent NLU systems, we introduce the VernAcular Language Understanding Evaluation (VALUE) benchmark, a challenging variant  ...  This work is funded in part by Amazon Research Award under the Alexa Fairness in AI.  ... 
arXiv:2204.03031v1 fatcat:gd4crwqdjzbktdfhhmuzfqhtvu

The Role of Age of Acquisition in Late Second Language Oral Proficiency Attainment

Kazuya Saito
2015 Studies in Second Language Acquisition  
) and was then submitted to segmental, prosodic, temporal, lexical, and grammatical analyses.  ...  The results suggest that AOA plays a key role in determining the extent to which learners can attain advanced-level L2 oral abilities via improving the phonological domain of language (e.g., correct consonant  ...  I am grateful to Pavel Trofimovich and anonymous SSLA reviewers for their helpful input and feedback on the content of this manuscript, and to Ze Shan Yao and George Smith who helped data analyses.  ... 
doi:10.1017/s0272263115000248 fatcat:wiff4vhsnvetdc737kbhvlix6e

The Role of Errors in Validating a Large-Scale Assessment of Adolescent English Writing in Austria

Samuel Hafner, Günther Sigott
2021 Colloquium - New Philologies  
There is a negative relationship between human ratings and the presence of errors; a low error density is associated with higher ratings and a high error density with lower ratings.  ...  Substance wo, cls, and x error densities play an important role in the rating in most dimensions; errors with a larger scope also have a strong effect.  ...  A low error density is associated with high ratings and a high error density with low ratings. substance wo, cls, and x error densities play an important role in the rating in most dimensions;  ... 
doi:10.23963/cnp.2021.6.2.4 fatcat:zuchl42vj5agvkuax3xknbwuke

Automatic Correction of Human Translations [article]

Jessy Lin, Geza Kovacs, Aditya Shastry, Joern Wuebker, John DeNero
2022 arXiv   pre-print
We show that human errors in TEC exhibit a more diverse range of errors and far fewer translation fluency errors than the MT errors in automatic post-editing datasets, suggesting the need for dedicated  ...  We conducted a human-in-the-loop user study with nine professional translation editors and found that the assistance of our TEC system led them to produce significantly higher quality revised translations  ...  We thank Morgan Raymond and Spence Green for their support in releasing the dataset.  ... 
arXiv:2206.08593v1 fatcat:ejozxe5w5rhf7lwq3lfl7k36zi

Students' use of academic vocabulary in comparison to that of published writers: A corpus-driven analysis

Trish Cooper
2017 Stellenbosch Papers in Linguistics  
The aim of this paper is to shed light on some of the vocabulary features of both L1 and AL student writing in relation to published writing as a benchmark.  ...  One of the patterns that emerged serves to support the assumption that L1 students have a better grasp of academic vocabulary than AL students, as there are a greater number of grammatical, semantic and  ...  While a fairly uncommon collocation (LL: 10.70), this is both grammatically and semantically correct, and reflects a good understanding of the words key and instance.  ... 
doi:10.5774/47-0-266 fatcat:rjhv2zco7nbwplvwtyydlsk7cu

Genotype representations in grammatical evolution

Jonatan Hugosson, Erik Hemberg, Anthony Brabazon, Michael O'Neill
2010 Applied Soft Computing  
Abstract Grammatical evolution (GE) is a form of grammar-based genetic programming.  ...  For the first time we analyse and compare these two representations to determine if one has a performance advantage over the other.  ...  Acknowledgements This publication has emanated from research conducted with the financial support of Science Foundation Ireland under grant No. 08/IN.1/I1868.  ... 
doi:10.1016/j.asoc.2009.05.003 fatcat:bcqcohvl3jbxnoauk52ywbmvh4

Single language corpus, multilingual background

Geoffrey Williams, Claude Sionis, Paul Boucher
2002 ASp  
This paper does not seek to answer precise questions as to NSE and NNSE usage, but rather to outline a data-driven approach to corpus analysis of genre-specific discourse.  ...  It starts out by demonstrating the danger of hasty judgements as to NSE and NNSE status in Single language corpus, multilingual background ASp, 37-38 | 2002 INDEX  ...  However this does raise questions with benchmarks as to publishable standards in science writing. 46 If we draw up a list of grammatical functions that "typify" science writing, are we reflecting a developing  ... 
doi:10.4000/asp.1479 fatcat:mcfd3fpnjrfwzonb4oq2mffw2a

ComSum: Commit Messages Summarization and Meaning Preservation [article]

Leshem Choshen, Idan Amit
2021 arXiv   pre-print
Along with its growing size, practicality and challenging language domain, the data set benefits from the living field of empirical software engineering.  ...  We present ComSum, a data set of 7 million commit messages for text summarization. When documenting commits, software code changes, both a message and its summary are posted.  ...  ComSum is not only of a large size, it provides new challenges such as summarizing in a new domain, where a lot of terms appear and constantly change.  ... 
arXiv:2108.10763v1 fatcat:vbyda6jb7rh73p6xcylgut5qqi

The Spelling Errors of French and English Children With Developmental Language Disorder at the End of Primary School

Nelly Joye, Julie E. Dockrell, Chloë R. Marshall
2020 Frontiers in Psychology  
Spelling errors were analyzed to capture areas of difficulty in each language, in the phonological, morphological, orthographic and semantic domains.  ...  The present study also provides a detailed breakdown of the spelling errors found in both languages for children with DLD and typical peers aged 5-11.  ...  school recruitment, Jennifer Donovan and Laurie Brunet for their help with data coding, and all the schools and children who took part.  ... 
doi:10.3389/fpsyg.2020.01789 pmid:32793078 pmcid:PMC7386207 fatcat:r2htmfz6y5gytmdqs7m34nz2bm

Open Stylometric System WebSty: Integrated Language Processing, Analysis and Visualisation

Maciej Piasecki, Tomasz Walkowiak, Maciej Eder
2018 Computational Methods in Science and Technology  
In conclusions, we present preliminary evaluation of WebSty on the corpus of 1000 literary works, and we report on the results of the first research applications of WebSty.  ...  WebSty does not require local installation by users, can be used via any web browser, offers rich set-up, and runs on a computing cluster.  ...  Acknowledgements Works funded by the Polish Ministry of Science and Higher Education within CLARIN-PL Research Infrastructure.  ... 
doi:10.12921/cmst.2018.0000007 fatcat:pditv66ns5emzj6xbs4fi46ssu

A Survey of Advances in Landscape Analysis for Optimisation

Katherine Mary Malan
2021 Algorithms  
With this widened scope, new types of landscapes have emerged such as multiobjective landscapes, violation landscapes, dynamic and coupled landscapes and error landscapes.  ...  The last ten years has seen the field of fitness landscape analysis develop from a largely theoretical idea in evolutionary computation to a practical tool applied in optimisation in general and more recently  ...  A number of new techniques for analysing landscapes have been developed and these are described as an extension to the original survey [1] in Section 3, followed by a summary of contributions related  ... 
doi:10.3390/a14020040 fatcat:wrbotr6q2jbq3nhso7x7dgklfm

Automatic Transformation of Natural to Unified Modeling Language: A Systematic Review [article]

Sharif Ahmed, Arif Ahmed, Nasir U. Eisty
2022 arXiv   pre-print
In addition, it creates a path forward for future research.  ...  We conducted quantitative and qualitative analyses by manually extracting information, cross-checking, and validating our findings.  ...  However, most implementations had constraints such as satisfying a specific grammatical structure, using a domain ontology, sentence length, and absence of ambiguity or anaphora.  ... 
arXiv:2204.00932v2 fatcat:jshakcyekbbopov65ehpqkbjzu

Towards Affordable Disclosure of Spoken Heritage Archives

Roeland Ordelman, Willemijn Heeren, Marijn Huijbregts, Franciska de Jong, Djoerd Hiemstra
2009 Journal of Digital Information  
This paper presents and discusses ongoing work aiming at affordable disclosure of real-world spoken heritage archives in general, and in particular of a collection of recorded interviews with Dutch survivors  ...  Given such collections, we at least want to provide search at different levels and a flexible way of presenting results.  ...  Acknowledgments This paper is based on research funded by the NWO program CATCH (http://www.nwo.nl/catch) and by bsik program MultimediaN (http://www.multimedian.nl).  ... 
dblp:journals/jodi/OrdelmanHHJH09 fatcat:etnicjnlrzdapml6encxn6v3ey