Learning Embeddings from Scientific Corpora using Lexical, Grammatical and Semantic Information

Andres Garcia-Silva, Ronald Denaux, José Manuel Gómez-Pérez
2019 Zenodo  
The current trend, based on deep learning and embeddings, uses representations at the (sub)word level that require large amounts of training data and neural architectures with millions of parameters to  ...  We learn embeddings from different linguistic annotations on the text and evaluate them through a classification task over the SciGraph taxonomy, showing that our representations outperform (sub)word-level  ...  ACKNOWLEDGMENTS This research has been supported by The European Language Grid project funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No 825627 (ELG).  ... 
doi:10.5281/zenodo.4059300 fatcat:qigywgsx2ng5dgh23pe4mvusny

On the impact of knowledge-based linguistic annotations in the quality of scientific embeddings

Andres Garcia-Silva, Ronald Denaux, Jose Manuel Gomez-Perez
2021 Future generations computer systems  
In addition to single words or word pieces, other features which result from the linguistic analysis of text, including lexical, grammatical and semantic information, can be used to improve the quality  ...  In this paper, we conduct a comprehensive study on the use of explicit linguistic annotations to generate embeddings from a scientific corpus and quantify their impact in the resulting representations.  ...  Acknowledgements We gratefully acknowledge the EU Horizon 2020 research and innovation programme under grant agreement No. 825627 (ELG).  ... 
doi:10.1016/j.future.2021.02.019 fatcat:5lxgpsauvvab7de5p7hndiwmfe

Distributional semantic modeling: a revised technique to train term/word vector space models applying the ontology-related approach [article]

Oleksandr Palagin, Vitalii Velychko, Kyrylo Malakhov, Oleksandr Shchurov
2020 arXiv pre-print
The semantic map can be represented as a graph using Vec2graph - a Python library for visualizing word embeddings (term embeddings in our case) as dynamic and interactive graphs.  ...  This gives us an opportunity to change over from distributed word representations (or word embeddings) to distributed term representations (or term embeddings).  ...  Introduction Distributional semantic modeling (word embeddings) is now arguably the most popular way to computationally handle lexical semantics.  ... 
arXiv:2003.03350v1 fatcat:l5r5dvmqpff4liomz4tezkzcru

Semantically linking molecular entities in literature through entity relationships

Sofie Van Landeghem, Jari Björne, Thomas Abeel, Bernard De Baets, Tapio Salakoski, Yves Van de Peer
2012 BMC Bioinformatics  
Conclusions: The results from this study will enable us in the near future to annotate semantic relations between molecular entities in the entire scientific literature available through PubMed.  ...  It is crucial that such tools extract information with a sufficient level of detail to be applicable in real life scenarios.  ...  Acknowledgements This article has been published as part of BMC Bioinformatics Volume 13 Supplement 11, 2012: Selected articles from BioNLP Shared Task 2011.  ... 
doi:10.1186/1471-2105-13-s11-s6 pmid:22759460 pmcid:PMC3384255 fatcat:jih7cgrb2zgjlggvskbnyruxue

Distributional semantic modeling: a revised technique to train term/word vector space models applying the ontology-related approach

O.V. Palagin, V.Yu. Velychko, K.S. Malakhov, O.S. Shchurov (all: Glushkov Institute of Cybernetics NAS of Ukraine)
2020 PROBLEMS IN PROGRAMMING  
This gives us an opportunity to change over from distributed word representations (or word embeddings) to distributed term representations (or term embeddings).  ...  We design a new technique for the distributional semantic modeling with a neural network-based approach to learn distributed term representations (or term embeddings) – term vector space models as a result  ...  Introduction Distributional semantic modeling (word embeddings) is now arguably the most popular way to computationally handle lexical semantics.  ... 
doi:10.15407/pp2020.02-03.341 fatcat:swnscokdengateyd7u37xb4yue

Linguistic Variation and Change in 250 Years of English Scientific Writing: A Data-Driven Approach

Yuri Bizzoni, Stefania Degaetano-Ortlieb, Peter Fankhauser, Elke Teich
2020 Frontiers in Artificial Intelligence  
We pursue an exploratory, data-driven approach using state-of-the-art computational language models and combine them with selected information-theoretic measures (entropy, relative entropy) for comparing  ...  , the first and longest-running English scientific journal established in 1665.  ...  AUTHOR CONTRIBUTIONS YB curated the analyses on word embeddings showing the diachronic expansion of the scientific semantic space and lexical-semantic specialisation.  ... 
doi:10.3389/frai.2020.00073 pmid:33733190 pmcid:PMC7861277 doaj:193b75c1263d438b8677a51b0c9878a1 fatcat:6u54minfsjfldbvqljmed3apau

A Study on the Application of Data-driven Learning in Vocabulary Teaching and Learning in China's EFL Class

Xiaowei Guan
2013 Journal of Language Teaching and Research  
real corpora concordances.  ...  Data-driven learning (DDL) developed from corpus linguistics plays a pioneering role in the evolution of EFL teaching, allowing the learners to identify and induce language rules by observing numerous  ...  and help them to use context to obtain the word semantics and summarize the grammatical rules.  ... 
doi:10.4304/jltr.4.1.105-112 fatcat:cz7fzettrzfdfpcvbjwqdlqrvu

LL(O)D and NLP perspectives on semantic change for humanities research

Florentina Armaselu, Elena-Simona Apostol, Anas Fahad Khan, Chaya Liebeskind, Barbara McGillivray, Ciprian-Octavian Truică, Andrius Utka, Giedrė Valūnaitė Oleškevičienė, Marieke van Erp, Philipp Cimiano
2022 Semantic Web Journal  
This paper presents an overview of the LL(O)D and NLP methods, tools and data for detecting and representing semantic change, with its main application in humanities research.  ...  The paper's aim is to provide the starting point for the construction of a workflow and set of multilingual diachronic ontologies within the humanities use case of the COST Action Nexus Linguarum, European  ...  Acknowledgement This article is based upon work from COST Action Nexus Linguarum, European network for Web-centred linguistic data science, supported by COST (European Cooperation in Science and Technology  ... 
doi:10.3233/sw-222848 fatcat:qqbmouaquzbmtecm7bmqpofbva

Beyond the Benchmarks: Toward Human-Like Lexical Representations

Suzanne Stevenson, Paola Merlo
2022 Frontiers in Artificial Intelligence  
We discuss key concepts and issues that underlie the scientific understanding of the human lexicon: its richly structured semantic representations, their ready and continual adaptability, and their grounding  ...  In this article, we concentrate on word-level semantics.  ...  Here we describe some relevant cognitively-inspired work from recent years, and Structure in Lexical Representations and Learning Word embeddings are largely founded on the notion of semantic similarity  ... 
doi:10.3389/frai.2022.796741 fatcat:n5ry6bpr6rghjldc4w4lkkrdba

Verb Argument Structure Alternations in Word and Sentence Embeddings [article]

Katharina Kann, Alex Warstadt, Adina Williams, Samuel R. Bowman
2018 arXiv pre-print
We then test whether models can distinguish acceptable English verb-frame combinations from unacceptable ones using a sentence embedding alone.  ...  Further, differences between the word- and sentence-level models show that some information present in word embeddings is not passed on to the down-stream sentence embeddings.  ...  Acknowledgments This project has benefited from financial support to SB and KK from Samsung Research, and to SB from Google. 7 To be exact, the set of classes was extended to a superset of the original  ... 
arXiv:1811.10773v1 fatcat:4cv5tczjzfc6vk22ueh4cvoc3q

Doublets in Legal Discourse: Data-Driven Insights for Enhancing the Phraseological Competence of EFL Law Students

Waheed Mohammed Altohami
2020 International Journal of Emerging Technologies in Learning (iJET)  
lexemes, and the remarkable collocates used with them.  ...  Based on the framework of data-driven learning approach (DDL), it assumes that getting EFL law students to use online available, user-friendly online corpus tools would help to enhance their phraseological  ...  Acknowledgement I take this opportunity to thank Prince Sattam Bin Abdulaziz University in Saudi Arabia alongside its Scientific Deanship, for all the technical support it has unstintingly provided towards  ... 
doi:10.3991/ijet.v15i20.13985 fatcat:cnfdjwe2ibes3hckuyimncv5ja

What can linguistic approaches bring to English for Specific Purposes?

Christopher Gledhill, Natalie Kübler
2016 ASp  
The authors of this paper are grateful to two anonymous reviewers for their valuable suggestions and comments.  ...  corpora are widely used for today, our approach also involves the detection of important lexico-grammatical information in the second learning phase mentioned above, namely, the translation process.  ...  It is only by exposure to "specific" language that learners can learn the appropriate grammatical and lexical dependencies […].  ... 
doi:10.4000/asp.4804 fatcat:gwi6tfem5zemjn45iq44qi3h2q

Computational linguistic assessment of textbook and online learning media by means of threshold concepts in business education [article]

Andy Lücking and Sebastian Brückner and Giuseppe Abrami and Tolga Uslu and Alexander Mehler
2020 arXiv pre-print
However, they also occur in informal learning environments like newspapers.  ...  Wikipedia is (one of) the largest and most widely used online resources.  ...  Starting from these considerations we arrive at the following hypothesis about the difference between formal and informal language corpora (manifesting formal and informal learning contexts) in terms of  ... 
arXiv:2008.02096v1 fatcat:qwlkza4iqjapji5rdzyezzhe54

A Computational Theory for the Emergence of Grammatical Categories in Cortical Dynamics

Dario Dematties, Silvio Rizzi, George K. Thiruvathukal, Mauricio David Pérez, Alejandro Wainselboim, B. Silvano Zanutto
2020 Frontiers in Neural Circuits  
by the sole correlation of lexical information from different sources without applying complex optimization methods.  ...  The model presented in this paper combines semantic and coarse-grained syntactic constraints for each word in a sentence context until grammatically related word function discrimination emerges spontaneously  ...  Afferent Distributional Semantic Constraints We generate Distributional Semantic (DS) constraints using Word Embedding approaches.  ... 
doi:10.3389/fncir.2020.00012 pmid:32372918 pmcid:PMC7179825 fatcat:tb2evigg7rfmnnjf26hr6vosui

Methodological Framework for the Development of an English-Lithuanian Cybersecurity Termbase

Sigita Rackevičienė, Liudmila Mockienė, Andrius Utka, Aivaras Rokas
2021 Studies About Languages  
The theoretical analysis and a pilot study allow arguing that: 1) a combination of parallel and comparable corpora enable to considerably expand the amount and variety of data sources that can be used  ...  for terminology extraction; this methodology is especially important for less-resourced languages which often lack parallel data; 2) deep learning systems trained by using manually annotated data (gold  ...  Besides, BiTE from comparable corpora, used in addition to BiTE from parallel corpora, allows extracting and comparing terminology formed and used in various settings.  ... 
doi:10.5755/j01.sal.1.39.29156 fatcat:mrmv4n4qnfbk3kswehn7b34jyq