Siamese CBOW: Optimizing Word Embeddings for Sentence Representations
[article]
2016
arXiv
pre-print
We present the Siamese Continuous Bag of Words (Siamese CBOW) model, a neural network for efficient estimation of high-quality sentence embeddings. ...
However, word embeddings trained with the methods currently available are not optimized for the task of sentence representation, and are thus likely to be suboptimal. ...
Many thanks to Christophe Van Gysel for implementation-related help. ...
arXiv:1606.04640v1
fatcat:gebil5wpingavjchb5cuequdxa
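A minimal sketch of the averaging-plus-cosine idea this entry describes, assuming a toy vocabulary and random embeddings (the names and data are illustrative, not the paper's code): a sentence embedding is the mean of its word embeddings, and a softmax over cosine similarities scores candidate neighboring sentences.

    import numpy as np

    rng = np.random.default_rng(0)
    vocab = {"the": 0, "cat": 1, "sat": 2, "dog": 3, "ran": 4}
    W = rng.normal(size=(len(vocab), 8))  # toy word-embedding matrix

    def sentence_embedding(tokens):
        """A sentence embedding is the average of its word embeddings."""
        return W[[vocab[t] for t in tokens]].mean(axis=0)

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    # Softmax over cosine similarities between a source sentence and
    # candidate sentences; training would push probability mass onto
    # true adjacent sentences and away from sampled negatives.
    src = sentence_embedding(["the", "cat", "sat"])
    candidates = [["the", "dog", "ran"], ["cat", "sat"], ["dog", "the"]]
    sims = np.array([cosine(src, sentence_embedding(c)) for c in candidates])
    print(np.exp(sims) / np.exp(sims).sum())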
Siamese CBOW: Optimizing Word Embeddings for Sentence Representations
2016
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Siamese CBOW: Optimizing word embeddings for sentence representations. Kenter, T.; Borisov, A.; de Rijke, M. ...
Many thanks to Christophe Van Gysel for implementation-related help. ...
Acknowledgments: The authors wish to express their gratitude for the valuable advice and relevant pointers of the anonymous reviewers. ...
doi:10.18653/v1/p16-1089
dblp:conf/acl/KenterBR16
fatcat:dcedeelxvnd5pmxctilpqog6ey
A Source Code Similarity Based on Siamese Neural Network
2020
Applied Sciences
The experimental results show that our method improves performance over the single word embedding method. ...
Then, a Siamese Neural Network trained model is constructed to learn semantic vector representation of code snippets. ...
Finally, the cosine similarity scores of source code pairs are calculated based on their representations. We call this approach Word Information for Code Embedding-Siamese Neural Networks (WICE-SNN). ...
doi:10.3390/app10217519
fatcat:msik2vttabeezgb5aoncejazk4
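A toy sketch of the Siamese setup this entry describes; the key point is that one shared encoder maps both code snippets to vectors before cosine scoring. The single linear layer and random features below are stand-ins, not the WICE-SNN architecture.

    import numpy as np

    rng = np.random.default_rng(1)
    W_enc = rng.normal(size=(16, 8))  # one set of encoder weights, shared

    def encode(features):
        """Map snippet features to a semantic vector (shared weights)."""
        return np.tanh(features @ W_enc)

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    snippet_a = rng.normal(size=16)  # stand-ins for word-information features
    snippet_b = rng.normal(size=16)
    print(cosine(encode(snippet_a), encode(snippet_b)))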
Dependency-based Siamese long short-term memory network for learning sentence representations
2018
PLoS ONE
... continuous bag of words (CBOW) and skip-gram models, and they have been extensively employed in a variety of NLP tasks. ...
Because of the complex structure of longer texts, such as sentences, algorithms appropriate for learning short textual representations are not applicable for learning long textual ...
Their work highlighted that word embeddings trained with the currently available methods are not optimized for the task of sentence representation, whereas Siamese CBOW handles this problem by directly ...
doi:10.1371/journal.pone.0193919
pmid:29513748
pmcid:PMC5841810
fatcat:fwo7rdfc75fatozew4o5jwsjqi
Improving the Community Question Retrieval Performance Using Attention-Based Siamese LSTM
[chapter]
2020
Lecture Notes in Computer Science
We propose a deep learning approach based on a Siamese architecture with LSTM networks, augmented with an attention mechanism. ...
The major challenges in this crucial task are the shortness of the questions as well as the word mismatch problem as users can formulate the same query using different wording. ...
Recent works focused on representation learning for questions, relying on the Word Embedding model for learning distributed representations of words in a low-dimensional vector space. ...
doi:10.1007/978-3-030-51310-8_23
fatcat:gagfhb3p3vflrgbez65s25tr4m
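A sketch of the attention-weighted pooling this entry alludes to, under assumptions: the LSTM hidden states are random stand-ins, and scoring each time step with a single learned vector followed by a softmax is one common attention form, not necessarily the paper's exact mechanism.

    import numpy as np

    rng = np.random.default_rng(2)
    T, d = 6, 10
    H = rng.normal(size=(T, d))  # one LSTM hidden state per question word
    w = rng.normal(size=d)       # learned attention scoring vector

    scores = H @ w                                 # one score per time step
    alpha = np.exp(scores) / np.exp(scores).sum()  # softmax attention weights
    question_vec = alpha @ H                       # weighted summary vector
    print(question_vec.shape)  # (10,)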
Query-focused Scientific Paper Summarization with Localized Sentence Representation
2018
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
In our approach, we first calculate word importance scores for each target document using a word-level random walk. Next, we optimize sentence embedding vectors using a Siamese neural network. ...
Here, we utilize localized sentence representations obtained as the weighted average of word embeddings where the weights are determined by the word importance scores. ...
Secondly, we train a Siamese neural network to obtain optimal sentence embeddings. ...
dblp:conf/sigir/ShinodaA18
fatcat:3qrwsk2pdrdiddgnl3up7653oe
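A sketch of the two steps described in this entry, under assumptions: word importance comes from a PageRank-style random walk (damping 0.85) over an illustrative word-similarity graph, and a sentence vector is then the importance-weighted average of its word embeddings.

    import numpy as np

    rng = np.random.default_rng(3)
    n_words, dim = 5, 8
    E = rng.normal(size=(n_words, dim))              # word embeddings
    A = np.abs(rng.normal(size=(n_words, n_words)))  # word-graph weights
    P = A / A.sum(axis=1, keepdims=True)             # row-stochastic transitions

    # Word-level random walk: power iteration with damping.
    r = np.full(n_words, 1.0 / n_words)
    for _ in range(50):
        r = 0.15 / n_words + 0.85 * (r @ P)

    def sentence_vec(word_ids):
        """Importance-weighted average of the sentence's word embeddings."""
        w = r[word_ids] / r[word_ids].sum()
        return w @ E[word_ids]

    print(sentence_vec([0, 2, 4]).shape)  # (8,)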
Biomedical ontology alignment: an approach based on representation learning
2018
Journal of Biomedical Semantics
This embedding is derived on the basis of a novel phrase retrofitting strategy through which semantic similarity information becomes inscribed onto fields of pre-trained word vectors. ...
Conclusions: Our proposed representation learning approach leverages terminological embeddings to capture semantic similarity. ...
Siamese CBOW is a log-linear model that aims to predict a sentence from its adjacent sentences, addressing the research question of whether directly optimizing word vectors for the task of being averaged ...
doi:10.1186/s13326-018-0187-8
pmid:30111369
pmcid:PMC6094585
fatcat:x4ojf6uht5g45dke7zmty7slue
Learning Contextual Embeddings for Structural Semantic Similarity using Categorical Information
2017
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)
We study how to learn representations for the words in context such that TKs can exploit more focused information. ...
Thus, we define a new approach based on a Siamese Network, which produces word representations while learning a binary text similarity. ...
Many thanks to the anonymous reviewers for their valuable suggestions. ...
doi:10.18653/v1/k17-1027
dblp:conf/conll/NicosiaM17
fatcat:vvbdoihe2zdo3jsmt3jrycmkfi
Learning Neural Word Salience Scores
[article]
2017
arXiv
pre-print
Specifically, we learn word salience scores such that, using pre-trained word embeddings as the input, we can accurately predict the words that appear in a sentence, given the words that appear in the sentences ...
Experimental results on sentence similarity prediction show that the learnt word salience scores perform comparably to or better than some of the state-of-the-art approaches for representing sentences on ...
Siamese CBOW [26] learns word embeddings such that we can accurately compute sentence embeddings by averaging the word embeddings. ...
arXiv:1709.01186v1
fatcat:xxua2n2vgvaqpaiyxgiuxbe3vu
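A toy sketch of the salience idea in this entry: pre-trained embeddings stay fixed, learned per-word salience scores weight the neighboring sentences' words into a context vector, and the context is used to predict which words occur in the target sentence. The sigmoid scoring and the toy data are illustrative assumptions, not the paper's model.

    import numpy as np

    rng = np.random.default_rng(4)
    V, d = 10, 6
    E = rng.normal(size=(V, d))     # fixed pre-trained word embeddings
    s = np.abs(rng.normal(size=V))  # learnable word salience scores

    neighbors = [1, 3, 5]           # words in the neighboring sentences
    weights = s[neighbors] / s[neighbors].sum()
    context = weights @ E[neighbors]  # salience-weighted context vector

    def p_word(w):
        """Predicted probability that word w occurs in the target sentence."""
        return 1.0 / (1.0 + np.exp(-(E[w] @ context)))

    print([round(p_word(w), 3) for w in range(V)])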
Learning Neural Word Salience Scores
2018
Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics
Specifically, we learn word salience scores such that, using pre-trained word embeddings as the input, we can accurately predict the words that appear in a sentence, given the words that appear in the sentences ...
Experimental results on sentence similarity prediction show that the learnt word salience scores perform comparably to or better than some of the state-of-the-art approaches for representing sentences on ...
Siamese CBOW (Kenter et al., 2016) learns word embeddings such that we can accurately compute sentence embeddings by averaging the word embeddings. ...
doi:10.18653/v1/s18-2004
dblp:conf/starsem/SamardzhievGB18
fatcat:ajljcf27lrflbddapjfqb7pjce
RiskFinder: A Sentence-level Risk Detector for Financial Reports
2018
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations
... based on the latest sentence embedding techniques; 4) a visualization of financial time-series data for a corresponding company. ...
In particular, the system broadens the analyses from the word level to sentence level, which makes the system useful for practitioner communities and unprecedented among financial academics. ...
... sentence classification and yields performance on par with other deep learning classifiers, and Siamese-CBOW, a neural network architecture that obtains word embeddings directly optimized for sentence ...
doi:10.18653/v1/n18-5017
dblp:conf/naacl/LiuLWT18
fatcat:fz4jwh46xbdkvcmfut3vp4pdpy
DeepKAF: A Heterogeneous CBR & Deep Learning Approach for NLP Prototyping
2020
2020 International Conference on INnovations in Intelligent SysTems and Applications (INISTA)
... as well as find similarities during the retrieval phase. b) Word Embeddings: word embedding is a method for translating text into a numerical representation (vectorization). ...
The authors observe that embeddings are affected by the type of text; training should be based on high coverage of the available words and sentences to yield highly relevant embeddings ...
doi:10.1109/inista49547.2020.9194679
dblp:conf/inista/AminKPA020
fatcat:blvkxw7t25akvl4bhq2hl2cplq
Learning semantic similarity in a continuous space
2018
Neural Information Processing Systems
Our work naturally extends Word Mover's Distance (WMD) [1] by representing text documents as normal distributions instead of bags of embedded words. ...
Semantic similarity between pairs is then learned discriminatively as an optimal transport distance metric (Wasserstein-2) with our novel variational Siamese framework. ...
Acknowledgments We would like to thank Ecole Polytechnique for financial support and Télécom Paris-Tech for GPU resources. ...
dblp:conf/nips/Deudon18
fatcat:4wl4phdspffinomr5i5of76i7e
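The Wasserstein-2 metric this entry mentions has a closed form when documents are modeled as Gaussians; a minimal sketch for the diagonal-covariance case follows, where W2^2 = ||mu1 - mu2||^2 + sum_i (sqrt(v1_i) - sqrt(v2_i))^2 (the toy means and variances are illustrative).

    import numpy as np

    def w2_squared_diag(mu1, var1, mu2, var2):
        """Squared 2-Wasserstein distance between diagonal Gaussians."""
        return float(((mu1 - mu2) ** 2).sum()
                     + ((np.sqrt(var1) - np.sqrt(var2)) ** 2).sum())

    mu_a, var_a = np.array([0.0, 1.0]), np.array([1.0, 0.5])
    mu_b, var_b = np.array([0.5, 0.8]), np.array([0.9, 0.7])
    print(w2_squared_diag(mu_a, var_a, mu_b, var_b))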
Interpreting the Syntactic and Social Elements of the Tweet Representations via Elementary Property Prediction Tasks
[article]
2016
arXiv
pre-print
... LDA), unsupervised representation learning methods (Siamese CBOW, Tweet2Vec), as well as supervised methods (CNN, BLSTM). ...
Our work presented here constitutes the first step in opening the black-box of vector embedding for social media posts, with emphasis on tweets in particular. ...
Note that this is different from BOW because the word vectors here are optimized for sentence representation. ...
arXiv:1611.04887v1
fatcat:4t7rx5nct5e43grsqyalihutxq
Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations
[article]
2018
arXiv
pre-print
Average word embeddings are a common baseline for more sophisticated sentence embedding techniques. However, they typically fall short of the performances of more complex models such as InferSent. ...
Here, we generalize the concept of average word embeddings to power mean word embeddings. ...
Siamese CBOW: Optimizing word embeddings for sentence representations. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016). ...
arXiv:1803.01400v2
fatcat:bzhqljltunhszlqa5dlagwtuqe
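A minimal sketch of the power mean generalization this entry describes: the plain average is the p = 1 case, p = +inf is the elementwise max, p = -inf the min, and the means for several p values are concatenated into one sentence vector (the particular set of p values below is an assumption).

    import numpy as np

    def power_mean(X, p):
        """Power mean over the word axis of X with shape (n_words, dim)."""
        if p == np.inf:
            return X.max(axis=0)
        if p == -np.inf:
            return X.min(axis=0)
        m = (X ** p).mean(axis=0)
        # Signed root so odd integer p also works with negative entries.
        return np.sign(m) * np.abs(m) ** (1.0 / p)

    rng = np.random.default_rng(5)
    X = rng.normal(size=(4, 6))  # four word vectors of dimension 6

    sentence = np.concatenate([power_mean(X, p) for p in (-np.inf, 1.0, np.inf)])
    print(sentence.shape)  # (18,) three concatenated 6-dim power means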
Showing results 1–15 of 169.