Strong Baselines for Complex Word Identification across Multiple Languages [article]

Pierre Finnimore, Elisabeth Fritzsch, Daniel King, Alison Sneyd, Aneeq Ur Rehman, Fernando Alva-Manchego, Andreas Vlachos
2019 arXiv   pre-print
Complex Word Identification (CWI) is the task of identifying which words or phrases in a sentence are difficult to understand by a target audience.  ...  We show that carefully selected features and simple learning models can achieve state-of-the-art performance, and result in strong baselines for future development in this area.  ...  Acknowledgements This work was initiated in a class project for the NLP module at the University of Sheffield.  ... 
arXiv:1904.05953v1 fatcat:w654do6n2rgepotwmd5yw7cw4q

Strong Baselines for Complex Word Identification across Multiple Languages

Pierre Finnimore, Elisabeth Fritzsch, Daniel King, Alison Sneyd, Aneeq Ur Rehman, Fernando Alva-Manchego, Andreas Vlachos
2019 Proceedings of the 2019 Conference of the North  
Complex Word Identification (CWI) is the task of identifying which words or phrases in a sentence are difficult to understand by a target audience.  ...  We show that carefully selected features and simple learning models can achieve state-of-the-art performance, and result in strong baselines for future development in this area.  ...  Acknowledgements This work was initiated in a class project for the NLP module at the University of Sheffield.  ... 
doi:10.18653/v1/n19-1102 dblp:conf/naacl/FinnimoreFKSRAV19 fatcat:zapo22cqarhdfppuj4enki2v2u

The Effect of Narrative Language Intervention on the Language Skills of Children With Hearing Loss

Stephanie M. Raymond, Trina D. Spencer
2021 Perspectives of the ASHA Special Interest Groups  
Method A multiple baseline design (for retelling) and a repeated acquisition design (for vocabulary) were used to fulfill the purpose of the study.  ...  complex sentences.  ...  Procedures The multiple baseline design was implemented across three conditions: baseline, intervention, and maintenance.  ... 
doi:10.1044/2021_persp-20-00239 fatcat:k2j5gnwmofhqhcmdxf6mh76usi

Fusion of Simple Models for Native Language Identification

Fabio Kepler, Ramón Astudillo, Alberto Abad
2017 Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications  
In this paper we describe the approaches we explored for the 2017 Native Language Identification shared task. We focused on simple word and sub-word units avoiding heavy use of hand-crafted features.  ...  After the task was closed, we carried on further experiments and relied on a late fusion strategy for combining our simple proposed approaches with modifications of the baselines provided by the task.  ...  Acknowledgements Fabio Kepler gratefully acknowledges the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for several experiments reported in this paper.  ... 
doi:10.18653/v1/w17-5048 dblp:conf/bea/KeplerAA17 fatcat:7ulrukvoifcxng7qwkwiy5xctm

On the Strength of Character Language Models for Multilingual Named Entity Recognition [article]

Xiaodong Yu, Stephen Mayhew, Mark Sammons, Dan Roth
2018 arXiv   pre-print
However, to date there has been no direct investigation of the inherent differences between name and non-name tokens in text, nor whether this property holds across multiple languages.  ...  Moreover, by adding very simple CLM-based features we can significantly improve the performance of an off-the-shelf NER system for multiple languages.  ...  Approved for Public Release, Distribution Unlimited. The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S.  ... 
arXiv:1809.05157v2 fatcat:x5fs7cw4s5hcxisf5tvm4pwv6q

On the Strength of Character Language Models for Multilingual Named Entity Recognition

Xiaodong Yu, Stephen Mayhew, Mark Sammons, Dan Roth
2018 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing  
However, to date there has been no direct investigation of the inherent differences between name and non-name tokens in text, nor whether this property holds across multiple languages.  ...  Moreover, by adding very simple CLM-based features we can significantly improve the performance of an off-the-shelf NER system for multiple languages.  ...  Approved for Public Release, Distribution Unlimited. The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S.  ... 
doi:10.18653/v1/d18-1345 dblp:conf/emnlp/YuMSR18 fatcat:ri6rzu33zzg2tghg6c3s4zmsta

ConvAI at SemEval-2019 Task 6: Offensive Language Identification and Categorization with Perspective and BERT

John Pavlopoulos, Nithum Thain, Lucas Dixon, Ion Androutsopoulos
2019 Proceedings of the 13th International Workshop on Semantic Evaluation  
The main contribution of this paper is the assessment of two strong baselines for the identification (Perspective) and the categorization (BERT) of offensive language with little or no additional training  ...  This paper presents the application of two strong baseline systems for toxicity detection and evaluates their performance in identifying and categorizing offensive language in social media.  ...  Furthermore, while new competitions and corpora are being introduced (Zampieri et al., 2019a), there is a need for strong baselines to assess the performance of more complex systems.  ... 
doi:10.18653/v1/s19-2102 dblp:conf/semeval/PavlopoulosTDA19 fatcat:ptx74u3eprdcxdemsig5hbxcf4

Moving towards accurate and early prediction of language delay with network science and machine learning approaches

Arielle Borovsky, Donna Thal, Laurence B Leonard
2021 Scientific Reports  
Grammatical and lexico-semantic measures ranked highly in predictive classification, highlighting promising avenues for early screening and delineating the roots of language disorders.  ...  Due to wide variability of typical language development, it has been historically difficult to distinguish typical and delayed trajectories of early language growth.  ...  We wish to dedicate this work to Elizabeth Bates and Jeffrey Elman, whose fundamental insights into measurement and computation of early language skills made this work possible.  ... 
doi:10.1038/s41598-021-85982-0 pmid:33854086 pmcid:PMC8047042 fatcat:f3sgkefgzbdqvluhclxfzevny4

One Size Does Not Fit All: The Case for Personalised Word Complexity Models [article]

Sian Gooding, Manuel Tragut
2022 arXiv   pre-print
Complex Word Identification (CWI) aims to detect words within a text that a reader may find difficult to understand.  ...  In this paper, we show that personal models are best when predicting word complexity for individual readers.  ...  This type of unsupervised clustering has been used across multiple natural language processing tasks including word clustering, co-reference resolution and word sense disambiguation (Chen and Ji, 2010  ... 
arXiv:2205.02564v1 fatcat:nygypwgdgbdkpci2wkdqbuxyrq

Language Identification and Analysis of Code-Switched Social Media Text

Deepthi Mave, Suraj Maharjan, Thamar Solorio
2018 Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching  
In this paper, we detail our work on comparing different word-level language identification systems for code-switched Hindi-English data and a standard Spanish-English dataset.  ...  In this regard, we build a new code-switched dataset for Hindi-English. To understand the code-switching patterns in these language pairs, we investigate different code-switching metrics.  ...  Department of Defense for the support received under grant W911NF-16-1-0422.  ... 
doi:10.18653/v1/w18-3206 dblp:conf/acl-codeswitch/MaveMS18 fatcat:bjhhzc5fgresrpz4mtk3him2au

Multilingual and Cross-Lingual Complex Word Identification

Seid Muhie Yimam, Sanja Štajner, Martin Riedl, Chris Biemann
2017 RANLP 2017 - Recent Advances in Natural Language Processing Meet Deep Learning  
Complex Word Identification (CWI) is an important task in lexical simplification and text accessibility.  ...  We collect complex words/phrases (CP) for English, German and Spanish, annotated by both native and non-native speakers, and propose language independent features that can be used to train multilingual  ...  /inst/ab/lt/resources/data/complex-word-identification-dataset.html.  ... 
doi:10.26615/978-954-452-049-6_104 dblp:conf/ranlp/YimamSRB17 fatcat:g3vf64xx3rbqjhilwvrxsyvgga

Curriculum vocabulary learning intervention for children with social, emotional and behavioural difficulties (SEBD): findings from a case study series

Judy Clegg
2013 Emotional and Behavioural Difficulties  
The present study evaluates a combined phonological and semantic approach to new word learning that is reported to be effective for other populations of children with language impairment (Parsons, Law  ...  Assessment identified lower than average language and literacy abilities although the profiles varied across the participants.  ...  Matching the target words to the control words for frequency, length and complexity was not feasible.  ... 
doi:10.1080/13632752.2013.854958 fatcat:kb5vm6ijkzgkhakf74mpoyvvfi

Fermi at

Vijayasaradhi Indurthi, Bakhtiyar Syed, Manish Shrivastava, Manish Gupta, Vasudeva Varma
2019 Proceedings of the 13th International Workshop on Semantic Evaluation  
This paper describes our system (Fermi) for Task 6: OffensEval: Identifying and Categorizing Offensive Language in Social Media of SemEval-2019.  ...  We evaluate multiple sentence embeddings in conjunction with various supervised machine learning algorithms and evaluate the performance of simple yet effective embedding-ML combination algorithms.  ...  Multiple ways of generating word embeddings exist, such as Neural Probabilistic Language Model (Bengio et al., 2003), Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014), and more recently  ... 
doi:10.18653/v1/s19-2109 dblp:conf/semeval/IndurthiS00V19 fatcat:sqvsihj6urecbj26crhdmhyrmq

Multilingual Embeddings Jointly Induced from Contexts and Concepts: Simple, Strong and Scalable [article]

Philipp Dufter, Mengjie Zhao, Hinrich Schütze
2020 arXiv   pre-print
We show that Co+Co performs well for two different application scenarios: the Parallel Bible Corpus (1000+ languages, low-resource) and EuroParl (12 languages, high-resource).  ...  From a sentence aligned corpus, concepts are extracted via sampling; words are then associated with their concept ID and sentence ID in embedding learning.  ...  Lardilleux and Lepage (2009) propose Anymalign, an algorithm originally intended for obtaining word alignments. Consider a parallel corpus V across multiple languages.  ... 
arXiv:1811.00586v2 fatcat:jskg526mbfgm3fsxdm2frnq6my

Constraint-Based Models of Lexical Borrowing

Yulia Tsvetkov, Waleed Ammar, Chris Dyer
2015 Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies  
lexical information across languages.  ...  Borrowed words are found in all languages, and, in contrast to cognate relationships, borrowing relationships may exist across unrelated languages (for example, about 40% of Swahili's vocabulary is borrowed  ...  We are grateful to Nathan Schneider, David Mortensen, Archna Bhatia, Shuly Wintner, Shay Cohen, David Bamman, Noah Smith, and Lori Levin for extensive discussions and constructive feedback.  ... 
doi:10.3115/v1/n15-1062 dblp:conf/naacl/TsvetkovAD15 fatcat:drrkgcyhqfcntfrv73fn7qht7i