Polysemy and Brevity versus Frequency in Language

Bernardino Casas, Antoni Hernández-Fernández, Neus Català, Ramon Ferrer-i-Cancho, Jaume Baixeries
2019 Computer Speech and Language  
Our correlation analysis indicates that both the meaning-frequency law and the law of abbreviation hold overall in all the analyzed languages.  ...  In the present article, we extend our study to other languages (Dutch and Spanish) and introduce two additional measures of length: syllabic length and phonemic length.  ...  This research work was supported by the grant SGR2014-890 (MACDA) and the recognition 2017SGR-856 (MACDA) from AGAUR (Generalitat de Catalunya), and also the grants TIN2014-57226-P (APCOM), TIN2017-89244  ... 
doi:10.1016/j.csl.2019.03.007 fatcat:xc74pr2hvfeddnbex4snxw5c5e

Testing the Robustness of Laws of Polysemy and Brevity Versus Frequency [chapter]

Antoni Hernández-Fernández, Bernardino Casas, Ramon Ferrer-i-Cancho, Jaume Baixeries
2016 Lecture Notes in Computer Science  
Zipf on the relationship between word frequency and other word features led to the formulation of various linguistic laws.  ...  Here we focus on a couple of them: the meaning-frequency law, i.e. the tendency of more frequent words to be more polysemous, and the law of abbreviation, i.e. the tendency of more frequent words to be  ...  Acknowledgments The authors thank Pedro Delicado and the reviewers for their helpful comments.  ... 
doi:10.1007/978-3-319-45925-7_2 fatcat:u2d5ndijiba2dd65o36gsky5oe

Is it a Fruit, an Apple or a Granny Smith? Predicting the Basic Level in a Concept Hierarchy [article]

Laura Hollink, Aysenur Bilgin, Jacco van Ossenbruggen
2019 arXiv   pre-print
The "basic level", according to experiments in cognitive psychology, is the level of abstraction in a hierarchy of concepts at which humans perform tasks quicker and with greater accuracy than at other  ...  We test the utility of three types of concept features, that were inspired by the basic level theory: lexical features, structural features and frequency features.  ...  Other frequency features could include: the frequency of occurrence of words in specific corpora such as children's books or language-learning resources, and the frequency of occurrence of concepts in  ... 
arXiv:1910.12619v1 fatcat:5qzt327iircizljl2ipf7y246m

Least effort and the origins of scaling in human language

R. F. i. Cancho, R. V. Sole
2003 Proceedings of the National Academy of Sciences of the United States of America  
These principles seem to be common to all languages. The best known is the so-called Zipf's law, which states that the frequency of a word decays as a (universal) power law of its rank.  ...  Zipf's law is found in the transition between referentially useless systems and indexical reference systems.  ...  Fig. 3. Signal normalized frequency, P(k), versus rank, k, for λ = 0.3 (A), λ = λ* = 0.41 (B), and λ = 0.5 (B and C) (averages over 30 replicas: n = m = 150 and T = 2nm).  ... 
doi:10.1073/pnas.0335980100 pmid:12540826 pmcid:PMC298679 fatcat:wfhdn5hadjf7ph7pkww3uszsya

The Role of Word Sense Disambiguation in Automated Text Categorization [chapter]

José María Gómez Hidalgo, Manuel de Buenaga Rodríguez, José Carlos Cortizo Pérez
2005 Lecture Notes in Computer Science  
However, performance of this approach is damaged by the problems derived from language variation (specially polysemy and synonymy).  ...  Being ATC different to IR in many ways, we focus on integrating lexical-semantic resources in it, and studying the role of WSD.  ...  The underlying idea is to overcome language variation problems, specially polysemy and synonymy.  ... 
doi:10.1007/11428817_27 fatcat:g2we2sp75vbs3apzmsgxwjucum

A Large-Scale Pseudoword-Based Evaluation Framework for State-of-the-Art Word Sense Disambiguation

Mohammad Taher Pilehvar, Roberto Navigli
2014 Computational Linguistics  
Word Sense Disambiguation (WSD) is a case in point, as hand-labeled datasets are particularly hard and time-consuming to create.  ...  Using this framework, we study the impact of supervision and knowledge on the two major disambiguation paradigms and perform an in-depth analysis of the factors which affect their performance.  ...  This explains the fact why the average node degree values belonging to the two highly different sizes of the training data (i.e., 80 and 800 sentences) are comparable in Figure B.2.  ... 
doi:10.1162/coli_a_00202 fatcat:4jyf4y5pu5ddnbnaeu4i2dgfja

A parallel corpus approach to investigating semantic change [chapter]

Kate Beeching
2013 Studies in Corpus Linguistics  
the core meaning + contextual side-effects approach versus the polysemy 'coded meaning' approach.  ...  It takes a quantitative parallel corpus approach, regarding the evolution of polysemies to be a question of distributional frequency.  ... 
doi:10.1075/scl.54.07bee fatcat:mzgy2hfgrbgntgoiegnnu7vmjq

Semi-supervised Learning with Induced Word Senses for State of the Art Word Sense Disambiguation

Osman Başkaya, David Jurgens
2016 The Journal of Artificial Intelligence Research  
Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context, and successful approaches are known to benefit many applications in Natural Language Processing.  ...  We anticipate that our results and released software will also benefit evaluation practices for sense induction systems and those working in low-resource languages by demonstrating how to quickly produce  ...  Acknowledgments We thank Mohammad Taher Pilehvar for many thoughtful discussions and his assistance with the pseudoword dataset. We also thank the reviewers for their comments and suggestions.  ... 
doi:10.1613/jair.4917 fatcat:w2xovb6f5jgtpnjrf77s7amogm

TRANSLATION LANGUAGE: THE MAJOR FORCE IN SHAPING MODERN LATVIAN

Andrejs Veisbergs
2017 Vertimo studijos  
We cannot speak anymore of a clear dichotomy of 'translation language' versus the real language – there is no isolation in the modern world.  ...  Yet it can hardly be affected, as language change is inevitable, and in the modern world translation functions as a major vehicle of change.  ...  Their frequency of use is very high in colloquial language.  ... 
doi:10.15388/vertstud.2009.2.10603 fatcat:uvumyxrm5bebnigqlmfqgp7vla

Speakers Fill Lexical Semantic Gaps with Context [article]

Tiago Pimentel, Rowan Hall Maudslay, Damián Blasi, Ryan Cotterell
2021 arXiv   pre-print
Lexical ambiguity is widespread in language, allowing for the reuse of economical word forms and therefore making language more efficient.  ...  If ambiguous words cannot be disambiguated from context, however, this gain in efficiency might make language less clear – resulting in frequent miscommunication.  ...  Acknowledgments Damián Blasi acknowledges funding from the framework of the HSE University Basic Research Program and is funded by the Russian Academic Excellence Project '5-100'.  ... 
arXiv:2010.02172v3 fatcat:jbbio5pqezhsth2wjgbi7xtfkm

Contrasting the form and use of reformulation markers

Maria Josep Cuenca, Carme Bach
2007 Discourse Studies  
This article deals with the form and use of reformulation markers in research papers written in English, Spanish and Catalan.  ...  Considering the form and frequency of the markers, English papers tend to prefer simple fixed markers and include fewer reformulators than Spanish and Catalan.  ...  in meaning and especially in frequency must be taken into account.  ... 
doi:10.1177/1461445607075347 fatcat:7ytxyb4d2rhm5e74bwht6rt6xq

Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity [article]

Ivan Vulić, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen
2020 arXiv   pre-print
, adjectives, adverbs), frequency ranks, similarity intervals, lexical fields, and concreteness levels.  ...  in multilingual lexical semantics and representation learning -- available via a website which will encourage community effort in further expansion of Multi-Simlex to many more languages.  ...  Acknowledgments This work is supported by the ERC Consolidator Grant LEXICAL: Lexical Acquisition Across Languages (no 648909).  ... 
arXiv:2003.04866v1 fatcat:5mp5s7ehyzdshnywt2zvqverwu

Lexical Retention in Contact Grammaticalisation: Already In Southeast Asian Englishes

Debra Ziegeler, Sarah Lee
2019 Journal of Language Contact  
Amongst the problems of contact grammaticalisation research in past studies has been, first, the problem of searching for diachronic evidence in relatively 'new' language situations, something which was  ...  The present study reveals the presence of lexical persistence in the age-graded distribution of the perfective marker already in Singaporean and Malaysian English, and demonstrates that even ordinary contact-induced  ...  , it was the use of the periphrastic future tense form versus the more conservative inflected future form in Montreal French.  ... 
doi:10.1163/19552629-01203006 fatcat:prjuzt5kjjdibnwveaqy4f5kje

Modelling loanword success – a sociolinguistic quantitative study of Māori loanwords in New Zealand English

Andreea Simona Calude, Steven Miller, Mark Pagel
2017 Corpus Linguistics and Linguistic Theory  
Following a new wave of studies which look at loans from a quantitatively more informed standpoint, modelling "success" by taking into account frequency of the counterparts available in the language adopting  ...  Loanword use has dominated the literature on language contact and its salient nature continues to draw interest from linguists and non-linguists.  ...  Acknowledgments: The authors wish to thank Paul James for his Python code, Peter Keegan for his help and expertise in Te Reo Māori, and the anonymous referees for their meticulous comments and useful suggestions  ... 
doi:10.1515/cllt-2017-0010 fatcat:3ix5ikcqsnf3db5tjiymsrl5j4

Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity

Ivan Vulić, Edoardo Maria Ponti, Ira Leviant, Olga Majewska, Matt Malone, Roi Reichart, Simon Baker, Ulla Petti, Kelly Wing, Eden Bar, Thierry Poibeau, Anna Korhonen
2020 Computational Linguistics  
, adjectives, adverbs), frequency ranks, similarity intervals, lexical fields, and concreteness levels.  ...  in multilingual lexical semantics and representation learning - available via a website which will encourage community effort in further expansion of Multi-Simlex to many more languages.  ...  Acknowledgments This work is supported by the ERC Consolidator Grant LEXICAL: Lexical Acquisition Across Languages (no 648909).  ... 
doi:10.1162/coli_a_00391 fatcat:42esnmz2gvgs7irdhigl6t7xtm