Learning Embeddings from Scientific Corpora using Lexical, Grammatical and Semantic Information
2019
Zenodo
The current trend, based on deep learning and embeddings, uses representations at the (sub)word level that require large amounts of training data and neural architectures with millions of parameters to ...
We learn embeddings from different linguistic annotations on the text and evaluate them through a classification task over the SciGraph taxonomy, showing that our representations outperform (sub)word-level ...
ACKNOWLEDGMENTS This research has been supported by The European Language Grid project funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No 825627 (ELG). ...
doi:10.5281/zenodo.4059300
fatcat:qigywgsx2ng5dgh23pe4mvusny
On the impact of knowledge-based linguistic annotations in the quality of scientific embeddings
2021
Future generations computer systems
In addition to single words or word pieces, other features which result from the linguistic analysis of text, including lexical, grammatical and semantic information, can be used to improve the quality ...
In this paper, we conduct a comprehensive study on the use of explicit linguistic annotations to generate embeddings from a scientific corpus and quantify their impact in the resulting representations. ...
Acknowledgements We gratefully acknowledge the EU Horizon 2020 research and innovation programme under grant agreement No. 825627 (ELG). ...
doi:10.1016/j.future.2021.02.019
fatcat:5lxgpsauvvab7de5p7hndiwmfe
Distributional semantic modeling: a revised technique to train term/word vector space models applying the ontology-related approach
[article]
2020
arXiv
pre-print
The semantic map can be represented as a graph using Vec2graph - a Python library for visualizing word embeddings (term embeddings in our case) as dynamic and interactive graphs. ...
This gives us an opportunity to changeover from distributed word representations (or word embeddings) to distributed term representations (or term embeddings). ...
Introduction Distributional semantic modeling (word embeddings) are now arguably the most popular way to computationally handle lexical semantics. ...
arXiv:2003.03350v1
fatcat:l5r5dvmqpff4liomz4tezkzcru
Semantically linking molecular entities in literature through entity relationships
2012
BMC Bioinformatics
Conclusions: The results from this study will enable us in the near future to annotate semantic relations between molecular entities in the entire scientific literature available through PubMed. ...
It is crucial that such tools extract information with a sufficient level of detail to be applicable in real life scenarios. ...
Acknowledgements This article has been published as part of BMC Bioinformatics Volume 13 Supplement 11, 2012: Selected articles from BioNLP Shared Task 2011. ...
doi:10.1186/1471-2105-13-s11-s6
pmid:22759460
pmcid:PMC3384255
fatcat:jih7cgrb2zgjlggvskbnyruxue
Distributional semantic modeling: a revised technique to train term/word vector space models applying the ontology-related approach
2020
PROBLEMS IN PROGRAMMING
This gives us an opportunity to changeover from distributed word representations (or word embeddings) to distributed term representations (or term embeddings). ...
We design a new technique for the distributional semantic modeling with a neural network-based approach to learn distributed term representations (or term embeddings) – term vector space models as a result ...
Introduction Distributional semantic modeling (word embeddings) are now arguably the most popular way to computationally handle lexical semantics. ...
doi:10.15407/pp2020.02-03.341
fatcat:swnscokdengateyd7u37xb4yue
Linguistic Variation and Change in 250 Years of English Scientific Writing: A Data-Driven Approach
2020
Frontiers in Artificial Intelligence
We pursue an exploratory, data-driven approach using state-of-the-art computational language models and combine them with selected information-theoretic measures (entropy, relative entropy) for comparing ...
, the first and longest-running English scientific journal established in 1665. ...
AUTHOR CONTRIBUTIONS YB curated the analyses on word embeddings showing the diachronic expansion of the scientific semantic space and lexical-semantic specialisation. ...
doi:10.3389/frai.2020.00073
pmid:33733190
pmcid:PMC7861277
doaj:193b75c1263d438b8677a51b0c9878a1
fatcat:6u54minfsjfldbvqljmed3apau
A Study on the Application of Data-driven Learning in Vocabulary Teaching and Leaning in China's EFL Class
2013
Journal of Language Teaching and Research
real corpora concordances. ...
Data-driven learning (DDL) developed from corpus linguistics plays a pioneering role in the evolution of EFL teaching, allowing the learners to identify and induce language rules by observing numerous ...
and help them to use context to obtain the word semantics and summarize the grammatical rules. ...
doi:10.4304/jltr.4.1.105-112
fatcat:cz7fzettrzfdfpcvbjwqdlqrvu
LL(O)D and NLP perspectives on semantic change for humanities research
2022
Semantic Web Journal
This paper presents an overview of the LL(O)D and NLP methods, tools and data for detecting and representing semantic change, with its main application in humanities research. ...
The paper's aim is to provide the starting point for the construction of a workflow and set of multilingual diachronic ontologies within the humanities use case of the COST Action Nexus Linguarum, European ...
Acknowledgement This article is based upon work from COST Action Nexus Linguarum, European network for Web-centred linguistic data science, supported by COST (European Cooperation in Science and Technology ...
doi:10.3233/sw-222848
fatcat:qqbmouaquzbmtecm7bmqpofbva
Beyond the Benchmarks: Toward Human-Like Lexical Representations
2022
Frontiers in Artificial Intelligence
We discuss key concepts and issues that underlie the scientific understanding of the human lexicon: its richly structured semantic representations, their ready and continual adaptability, and their grounding ...
In this article, we concentrate on word-level semantics. ...
Here we describe some relevant cognitively-inspired work from recent years, and
Structure in Lexical Representations and Learning Word embeddings are largely founded on the notion of semantic similarity ...
doi:10.3389/frai.2022.796741
fatcat:n5ry6bpr6rghjldc4w4lkkrdba
Verb Argument Structure Alternations in Word and Sentence Embeddings
[article]
2018
arXiv
pre-print
We then test whether models can distinguish acceptable English verb-frame combinations from unacceptable ones using a sentence embedding alone. ...
Further, differences between the word- and sentence-level models show that some information present in word embeddings is not passed on to the down-stream sentence embeddings. ...
Acknowledgments This project has benefited from financial support to SB and KK from Samsung Research, and to SB from Google. ...
arXiv:1811.10773v1
fatcat:4cv5tczjzfc6vk22ueh4cvoc3q
Doublets in Legal Discourse: Data-Driven Insights for Enhancing the Phraseological Competence of EFL Law Students
2020
International Journal of Emerging Technologies in Learning (iJET)
lexemes, and the remarkable collocates used with them. ...
Based on the framework of data-driven learning approach (DDL), it assumes that getting EFL law students to use online available, user-friendly online corpus tools would help to enhance their phraseological ...
Acknowledgement I take this opportunity to thank Prince Sattam Bin Abdulaziz University in Saudi Arabia alongside its Scientific Deanship, for all the technical support it has unstintingly provided towards ...
doi:10.3991/ijet.v15i20.13985
fatcat:cnfdjwe2ibes3hckuyimncv5ja
What can linguistic approaches bring to English for Specific Purposes?
2016
ASp
The authors of this paper are grateful to two anonymous reviewers for their valuable suggestions and comments. ...
corpora are widely used for today, our approach also involves the detection of important lexico-grammatical information in the second learning phase mentioned above, namely, the translation process. ...
It is only by exposure to "specific" language that learners can learn the appropriate grammatical and lexical dependencies […] . ...
doi:10.4000/asp.4804
fatcat:gwi6tfem5zemjn45iq44qi3h2q
Computational linguistic assessment of textbook and online learning media by means of threshold concepts in business education
[article]
2020
arXiv
pre-print
However, they also occur in informal learning environments like newspapers. ...
Wikipedia is (one of) the largest and most widely used online resources. ...
Starting from these considerations we arrive at the following hypothesis about the difference between formal and informal language corpora (manifesting formal and informal learning contexts) in terms of ...
arXiv:2008.02096v1
fatcat:qwlkza4iqjapji5rdzyezzhe54
A Computational Theory for the Emergence of Grammatical Categories in Cortical Dynamics
2020
Frontiers in Neural Circuits
by the sole correlation of lexical information from different sources without applying complex optimization methods. ...
The model presented in this paper combines semantic and coarse-grained syntactic constraints for each word in a sentence context until grammatically related word function discrimination emerges spontaneously ...
Afferent Distributional Semantic Constraints We generate Distributional Semantic (DS) constraints using Word Embedding approaches. ...
doi:10.3389/fncir.2020.00012
pmid:32372918
pmcid:PMC7179825
fatcat:tb2evigg7rfmnnjf26hr6vosui
Methodological Framework for the Development of an English-Lithuanian Cybersecurity Termbase
2021
Studies About Languages
The theoretical analysis and a pilot study allow arguing that: 1) a combination of parallel and comparable corpora enable to considerably expand the amount and variety of data sources that can be used ...
for terminology extraction; this methodology is especially important for less-resourced languages which often lack parallel data; 2) deep learning systems trained by using manually annotated data (gold ...
Besides, BiTE from comparable corpora, used in addition to BiTE from parallel corpora, allows extracting and comparing terminology formed and used in various settings. ...
doi:10.5755/j01.sal.1.39.29156
fatcat:mrmv4n4qnfbk3kswehn7b34jyq