CBOW Is Not All You Need: Combining CBOW with the Compositional Matrix Space Model
[article]
2019
arXiv
pre-print
Motivated by these findings, we propose a hybrid model that combines the strengths of CBOW and CMOW. ...
However, CBOW is not capable of capturing word order. The reason is that the composition of CBOW's word embeddings is commutative, i.e., the embeddings of XYZ and ZYX are identical (see the sketch after this entry). ...
Acknowledgement: This research was supported by the Swiss National Science Foundation under the project Learning Representations of Abstraction for Opinion Summarisation (LAOS), grant number "FNS-30216" ...
arXiv:1902.06423v1
fatcat:darf7t3ruzd3nhllzwnii2u7gy
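The commutativity point quoted above is easy to verify. Below is a minimal sketch, not the authors' code, using toy random parameters: CBOW-style averaging is order-invariant, while CMOW-style matrix multiplication is not.

```python
# Toy contrast between CBOW's commutative composition and CMOW's
# order-sensitive matrix composition (random parameters, illustration only).
import numpy as np

rng = np.random.default_rng(0)
d = 4

# Hypothetical word representations: a d-dim vector per word for CBOW,
# a d x d matrix per word for CMOW.
vecs = {w: rng.normal(size=d) for w in "XYZ"}
mats = {w: rng.normal(size=(d, d)) for w in "XYZ"}

def cbow(seq):
    # CBOW composes by averaging, which is commutative.
    return np.mean([vecs[w] for w in seq], axis=0)

def cmow(seq):
    # CMOW composes by matrix multiplication, which is not commutative.
    out = np.eye(d)
    for w in seq:
        out = out @ mats[w]
    return out

print(np.allclose(cbow("XYZ"), cbow("ZYX")))  # True: word order is lost
print(np.allclose(cmow("XYZ"), cmow("ZYX")))  # False: word order matters
```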
CBOW Is Not All You Need: Combining CBOW with the Compositional Matrix Space Model
Text/Conference Paper
2019
Jahrestagung der Gesellschaft für Informatik
We summarize our contribution to the International Conference on Learning Representations, "CBOW Is Not All You Need: Combining CBOW with the Compositional Matrix Space Model", 2019. We construct a text encoder ...
Across all 16 tasks, the hybrid model achieves an average improvement of 1.2%. ...
Acknowledgement: This research was supported by the Swiss National Science Foundation under grant number "FNS-30216". ...
doi:10.18420/inf2019_47
dblp:conf/gi/GalkeMS19
fatcat:q73h64dyrjbbzpwmi2a62aiwx4
Beyond Context: A New Perspective for Word Embeddings
2019
Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)
We show via experiments that by combining feature engineering with embedding learning, our method can outperform CBOW using only 10% of the training data in both the standard word embedding evaluations ...
Indeed, standard models such as CBOW and fastText are specific choices along each of these axes. ...
Acknowledgments We would like to thank members of the Utah NLP group for several valuable discussions and the anonymous reviewers for their feedback. ...
doi:10.18653/v1/s19-1003
dblp:conf/starsem/ZhouS19
fatcat:udz2aedmdnd2nnjh5rtgatrfr4
A Generalized Idiom Usage Recognition Model Based on Semantic Compatibility
2019
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)
We propose a novel semantic compatibility model by adapting the training of a Continuous Bag-of-Words (CBOW) model for the purpose of idiom usage recognition. ...
There is no need to annotate idiom usage examples for training. ...
To represent the context, CBOW simply uses the average of all the context embeddings, so the order information is not preserved (a toy illustration follows this entry). 2) Not all words are equal: in CBOW, all words contribute equally to the ...
doi:10.1609/aaai.v33i01.33016738
fatcat:6nualsyts5b5vmxgku3hv22zhi
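Both limitations quoted above fit in a few lines of NumPy. This is a toy illustration with made-up weights, not the paper's semantic compatibility model: a uniform average treats every context word the same, while a weighted average lets informative words dominate.

```python
# Uniform vs. weighted averaging of context embeddings (toy values only).
import numpy as np

rng = np.random.default_rng(1)
context = rng.normal(size=(5, 8))        # 5 context words, 8-dim embeddings

uniform = context.mean(axis=0)           # CBOW-style: equal contributions

weights = np.array([0.05, 0.05, 0.7, 0.1, 0.1])  # hypothetical importances
weighted = weights @ context             # one informative word dominates

print(uniform.shape, weighted.shape)     # both are single 8-dim vectors
```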
Semantic Holism and Word Representations in Artificial Neural Networks
[article]
2020
arXiv
pre-print
Taking Tugendhat's formal reinterpretation of Frege's work as a starting point, we demonstrate that it is analogous to the process of training the Skip-gram model and offers a possible explanation of ...
This is usually explained by referring to the general distributional hypothesis, which states that the meaning of a word is given by the contexts in which it occurs. ...
Acknowledgements: This work has been supported by the grant 18-02196S of the Czech Science Foundation. This research was partially supported by SVV project number 260 575. ...
arXiv:2003.05522v1
fatcat:lfozzfry4zgo5inixcrsyqh5ly
Word Representation
[chapter]
2020
Representation Learning for Natural Language Processing
After that, we present two widely used evaluation tasks for measuring the quality of word embeddings. Finally, we introduce the recent extensions for word representation learning models. ...
Word representation, aiming to represent a word with a vector, plays an essential role in NLP. ...
However, TWE simply combines LDA with word embeddings and lacks statistical foundations. The LDA topic model needs numerous documents to learn semantically coherent topics. ...
doi:10.1007/978-981-15-5573-2_2
fatcat:hfq6zmdt2fbypb36hvc2e5ifmq
Understanding the Downstream Instability of Word Embeddings
[article]
2020
arXiv
pre-print
This retraining exacerbates a large challenge facing ML systems today: model training is unstable, i.e., small changes in training data can cause significant changes in the model's predictions. ...
…affects the instability of downstream NLP models. ...
Acknowledgements: We thank Charles Kuang, Shoumik Palkar, Fred Sala, Paroma Varma, and the anonymous reviewers for their valuable feedback. We gratefully acknowledge the support of DARPA under Nos. ...
arXiv:2003.04983v1
fatcat:khcop4757ja5pdbjqmok7ze2pm
Paraphrasing verbal metonymy through computational methods
[article]
2017
arXiv
pre-print
Furthermore, the Skip-gram model is found to operate with better-than-chance accuracy, and there is a strong positive relationship (phi coefficient = 0.61) between the model's classification and human judgement ...
Verbal metonymy has received relatively scarce attention in the field of computational linguistics despite the fact that a model to accurately paraphrase metonymy has applications both in academia and ...
have been generated, they are not needed again. First, the BNC Baby is scraped for sentences which contain one of the three target verbs. ...
arXiv:1709.06162v1
fatcat:mgupa6ufcjcixioss6ilcabxxu
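The phi coefficient reported in the snippet above is the Pearson correlation for two binary variables, computed from a 2x2 contingency table. As a hedged illustration with made-up counts (the paper's actual table is not shown here):

```python
# Phi coefficient for a 2x2 table of model classification vs. human
# judgement. The counts below are invented for illustration; the paper
# reports phi = 0.61.
import math

a, b, c, d = 40, 10, 9, 41   # hypothetical agree/disagree cell counts

phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
print(round(phi, 2))          # ~0.62 for these made-up counts
```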
A Survey Of Cross-lingual Word Embedding Models
[article]
2019
arXiv
pre-print
The recurring theme of the survey is that many of the models presented in the literature optimize for the same objectives, and that seemingly different models are often equivalent modulo optimization strategies ...
In this survey, we provide a comprehensive typology of cross-lingual word embedding models. We compare their data requirements and objective functions. ...
Ivan's work is supported by the ERC Consolidator Grant LEXICAL: Lexical Acquisition Across Languages (no 648909). ...
arXiv:1706.04902v3
fatcat:lts6uop77zaazhzlbygqmdsama
Syntax-Ignorant N-gram Embeddings for Sentiment Analysis of Arabic Dialects
2019
Proceedings of the Fourth Arabic Natural Language Processing Workshop
With the free word order and the varying syntax across the different Arabic dialects, a sentiment analysis system developed for one dialect might not be effective for the others. ...
Arabic sentiment analysis models have employed compositional embedding features to represent the Arabic dialectal content. ...
The resulting word embeddings were then combined using the compositional model SOWE, applied via a linear Lambda layer (a sketch follows this entry). ...
doi:10.18653/v1/w19-4604
dblp:conf/wanlp/MulkiHGB19
fatcat:sjmoaohvlzhknbnrscjd5l22tm
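The Lambda layer mentioned above suggests a Keras implementation. A minimal sketch of SOWE (sum of word embeddings) composition, with assumed vocabulary size, embedding dimension, sequence length, and class count; this is not the authors' code:

```python
# Sum-of-word-embeddings (SOWE) composition via a Keras Lambda layer.
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, embed_dim, seq_len = 10000, 100, 30   # assumed hyperparameters

inputs = layers.Input(shape=(seq_len,), dtype="int32")
embedded = layers.Embedding(vocab_size, embed_dim)(inputs)  # (batch, 30, 100)
# SOWE: sum the word/n-gram embeddings into one fixed-size vector.
sowe = layers.Lambda(lambda x: tf.reduce_sum(x, axis=1))(embedded)
outputs = layers.Dense(3, activation="softmax")(sowe)       # e.g. 3 sentiment classes

model = tf.keras.Model(inputs, outputs)
model.summary()
```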
Compositional Approaches for Representing Relations Between Words: A Comparative Study
[article]
2017
arXiv
pre-print
A popular approach to representing the relations between a pair of words is to extract the patterns in which the words co-occur from a corpus, and assign each word pair a vector of pattern frequencies ...
Despite the simplicity of this approach, it suffers from data sparseness, poor scalability, and an inability to handle linguistic creativity, as the model cannot represent previously unseen word pairs in a corpus. ...
The number of all pair-wise combinations between words grows quadratically with the number of words in the vocabulary. ...
arXiv:1709.01193v1
fatcat:wh2eguvlvjbojhrvj2ld35sbwi
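The quadratic-growth claim in the snippet above is simple arithmetic: the number of unordered word pairs is V(V-1)/2. A quick check:

```python
# Unordered word-pair count grows quadratically with vocabulary size V.
for V in (10_000, 100_000, 1_000_000):
    print(V, V * (V - 1) // 2)
# A 100k-word vocabulary already yields ~5e9 pairs, hence the sparseness.
```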
"The Sum of Its Parts": Joint Learning of Word and Phrase Representations with Autoencoders
[article]
2015
arXiv
pre-print
To embed sequences of words (i.e., phrases) of different sizes into a common semantic space, we propose to average word vector representations. ...
We introduce a novel model that jointly learns word vector representations and their summation. Word representations are learnt using the word co-occurrence statistical information. ...
Acknowledgements This work was supported by the HASLER foundation through the grant "Information and Communication Technology for a Better World 2020" (SmartWorld). ...
arXiv:1506.05703v1
fatcat:o2lzlpm47bf2fi35dqujxmfq3q
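The averaging idea above is what makes phrases of different lengths comparable: the mean of d-dimensional word vectors is itself d-dimensional regardless of phrase length. A minimal sketch with random toy vectors, not the paper's learned representations:

```python
# Averaging embeds variable-length phrases into one fixed-size space.
import numpy as np

rng = np.random.default_rng(2)
d = 50
lookup = {w: rng.normal(size=d) for w in
          ["the", "sum", "of", "its", "parts", "short", "phrase"]}

def phrase_vec(words):
    return np.mean([lookup[w] for w in words], axis=0)

v1 = phrase_vec(["short", "phrase"])                    # 2 words
v2 = phrase_vec(["the", "sum", "of", "its", "parts"])   # 5 words
print(v1.shape, v2.shape)   # both (50,): a common semantic space
```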
A Survey of Cross-lingual Word Embedding Models
2019
The Journal of Artificial Intelligence Research
The recurring theme of the survey is that many of the models presented in the literature optimize for the same objectives, and that seemingly different models are often equivalent, modulo optimization ...
In this survey, we provide a comprehensive typology of cross-lingual word embedding models. We compare their data requirements and objective functions. ...
Ivan's work is supported by the ERC Consolidator Grant LEXICAL: Lexical Acquisition Across Languages (no 648909). Sebastian is now affiliated with DeepMind. ...
doi:10.1613/jair.1.11640
fatcat:vwlgtzzmhfdlnlyaokx2whxgva
Evaluation of Unsupervised Compositional Representations
[article]
2018
arXiv
pre-print
We evaluated various compositional models, from bag-of-words representations to compositional RNN-based models, on several extrinsic supervised and unsupervised evaluation benchmarks. ...
We analyzed some of the evaluation datasets to identify the aspects of meaning they measure and the characteristics of the various models that explain their performance variance. ...
Background: Unsupervised Compositional Models
Baselines: The simplest way of representing a sentence is a binary bag-of-words representation, where each word is a feature in the vector space. ...
arXiv:1806.04713v2
fatcat:dxb47tq22zfqdhesa52dr5yz2i
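The binary bag-of-words baseline described above takes one scikit-learn call; a minimal sketch with an illustrative two-sentence corpus:

```python
# Binary bag-of-words: one feature per vocabulary word, 1 if present.
from sklearn.feature_extraction.text import CountVectorizer

sentences = ["the cat sat on the mat", "the dog sat"]
vectorizer = CountVectorizer(binary=True)
X = vectorizer.fit_transform(sentences)

print(vectorizer.get_feature_names_out())
print(X.toarray())   # repeated words still map to 1, not their count
```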
Identifying Pupylation Proteins and Sites by Incorporating Multiple Methods
2022
Frontiers in Endocrinology
To improve the pupylation protein prediction model, the KNN scoring matrix model based on functional domain GO annotation and the Word Embedding model were used to extract the features, and Random Under-sampling ...
Pupylation is an important posttranslational modification in proteins and plays a key role in the cell function of microorganisms; an accurate prediction of pupylation proteins and specified sites is of ...
GO-KNN: GO-KNN (10) uses a KNN scoring matrix over functional-domain GO annotations to extract features. In this study, we need to obtain the GO information of all proteins. ...
doi:10.3389/fendo.2022.849549
pmid:35557849
pmcid:PMC9088680
fatcat:2bgfanbkljhghbdcb7tj6i54qq
Showing results 1 — 15 out of 219 results