
Siamese CBOW: Optimizing Word Embeddings for Sentence Representations [article]

Tom Kenter, Alexey Borisov, Maarten de Rijke
2016 arXiv   pre-print
We present the Siamese Continuous Bag of Words (Siamese CBOW) model, a neural network for efficient estimation of high-quality sentence embeddings.  ...  However, word embeddings trained with the methods currently available are not optimized for the task of sentence representation, and, thus, likely to be suboptimal.  ...  Many thanks to Christophe Van Gysel for implementation-related help.  ... 
arXiv:1606.04640v1 fatcat:gebil5wpingavjchb5cuequdxa
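
Aside: the mechanism this paper proposes is compact enough to sketch. A sentence embedding is the average of the sentence's word embeddings, and the embeddings are trained so that a softmax over cosine similarities assigns high probability to sentences adjacent to a given sentence. Below is a minimal numpy sketch of that objective, assuming a uniform target distribution over the positive (adjacent) sentences; it is an illustration, not the authors' implementation.

    import numpy as np

    def sentence_embedding(word_ids, W):
        # Siamese CBOW composes a sentence as the average of its word vectors.
        return W[word_ids].mean(axis=0)

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    def siamese_cbow_loss(W, center, positives, negatives):
        # Softmax over cosine similarities: adjacent sentences (positives)
        # should receive higher probability than randomly sampled ones.
        c = sentence_embedding(center, W)
        sims = np.array([cosine(c, sentence_embedding(s, W))
                         for s in positives + negatives])
        p = np.exp(sims - sims.max())
        p /= p.sum()
        target = np.zeros_like(p)
        target[:len(positives)] = 1.0 / len(positives)  # uniform mass on positives
        return -(target * np.log(p)).sum()              # categorical cross-entropy

    # toy usage: 10-word vocabulary, 16-dimensional embeddings
    W = np.random.default_rng(0).normal(size=(10, 16))
    print(siamese_cbow_loss(W, [0, 1, 2], positives=[[3, 4]], negatives=[[5, 6], [7, 8]]))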

Siamese CBOW: Optimizing Word Embeddings for Sentence Representations

Tom Kenter, Alexey Borisov, Maarten de Rijke
2016 Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)  
Siamese CBOW: Optimizing word embeddings for sentence representations Kenter, T.; Borisov, A.; de Rijke, M.  ...  Many thanks to Christophe Van Gysel for implementation-related help.  ...  Acknowledgments The authors wish to express their gratitude for the valuable advice and relevant pointers of the anonymous reviewers.  ... 
doi:10.18653/v1/p16-1089 dblp:conf/acl/KenterBR16 fatcat:dcedeelxvnd5pmxctilpqog6ey

A Source Code Similarity Based on Siamese Neural Network

Chunli Xie, Xia Wang, Cheng Qian, Mengqi Wang
2020 Applied Sciences  
The experimental results show that our method improves performance over the single word embedding method.  ...  Then, a trained Siamese Neural Network model is constructed to learn semantic vector representations of code snippets.  ...  Finally, the cosine similarity scores of source code pairs are calculated from their representations. We call this approach Word Information for Code Embedding-Siamese Neural Networks (WICE-SNN).  ...
doi:10.3390/app10217519 fatcat:msik2vttabeezgb5aoncejazk4
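
Aside: the pipeline this snippet describes reduces to a shared ("Siamese") encoder applied to both code snippets, followed by a cosine score. A toy sketch follows, with a mean of token embeddings standing in for the trained network; the vocabulary, dimensions, and tokens are illustrative, not from the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    vocab = {"for": 0, "i": 1, "in": 2, "range": 3, "while": 4, "n": 5}
    E = rng.normal(size=(len(vocab), 8))   # embedding table shared by both branches

    def encode(tokens):
        # Both code snippets pass through the same encoder (the "Siamese"
        # property); a mean of token embeddings stands in for the trained model.
        v = E[[vocab[t] for t in tokens]].mean(axis=0)
        return v / np.linalg.norm(v)

    a = encode(["for", "i", "in", "range", "n"])
    b = encode(["while", "i", "in", "n"])
    print(float(a @ b))   # cosine similarity score for the code pair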

Dependency-based Siamese long short-term memory network for learning sentence representations

Wenhao Zhu, Tengjun Yao, Jianyue Ni, Baogang Wei, Zhiguo Lu, Xuchu Weng
2018 PLoS ONE  
Continuous bag of words (CBOW) and skip-gram models have been extensively employed in a variety of NLP tasks.  ...  Because of the complex structure generated by longer texts, such as sentences, algorithms appropriate for learning short textual representations are not applicable for learning long textual  ...  Their work highlighted that word embeddings trained with the currently available methods are not optimized for the task of sentence representation, whereas Siamese CBOW handles this problem by directly  ...
doi:10.1371/journal.pone.0193919 pmid:29513748 pmcid:PMC5841810 fatcat:fwo7rdfc75fatozew4o5jwsjqi

Improving the Community Question Retrieval Performance Using Attention-Based Siamese LSTM [chapter]

Nouha Othman, Rim Faiz, Kamel Smaïli
2020 Lecture Notes in Computer Science  
We propose a deep learning approach based on a Siamese architecture with LSTM networks, augmented with an attention mechanism.  ...  The major challenges in this crucial task are the shortness of the questions as well as the word mismatch problem as users can formulate the same query using different wording.  ...  Recent works focused on the representation learning for questions, relying on the Word Embedding model for learning distributed representations of words in a lowdimensional vector space.  ... 
doi:10.1007/978-3-030-51310-8_23 fatcat:gagfhb3p3vflrgbez65s25tr4m
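
Aside: the attention mechanism mentioned here is typically a learned pooling over the LSTM's hidden states. The sketch below uses a generic additive-attention form; the weight shapes and the tanh scoring function are assumptions for illustration, not the paper's exact parameterization.

    import numpy as np

    def attention_pool(H, W, v):
        # Additive attention over hidden states H (T x d): score each timestep,
        # softmax the scores, and return the weighted sum of states.
        scores = np.tanh(H @ W) @ v            # one scalar per timestep, shape (T,)
        a = np.exp(scores - scores.max())
        a /= a.sum()                           # attention weights over timesteps
        return a @ H                           # pooled representation, shape (d,)

    rng = np.random.default_rng(1)
    T, d = 6, 4                                # toy sequence length / hidden size
    H = rng.normal(size=(T, d))                # stand-in for Siamese LSTM outputs
    W = rng.normal(size=(d, d))
    v = rng.normal(size=d)
    question_vec = attention_pool(H, W, v)     # one vector per question; the two
                                               # branches would share W and v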

Query-focused Scientific Paper Summarization with Localized Sentence Representation

Kazutoshi Shinoda, Akiko Aizawa
2018 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval  
In our approach, we first calculate word importance scores for each target document using a word-level random walk. Next, we optimize sentence embedding vectors using a Siamese neural network.  ...  Here, we utilize localized sentence representations obtained as the weighted average of word embeddings where the weights are determined by the word importance scores.  ...  Secondly, we train a Siamese neural network to obtain optimal sentence embeddings.  ... 
dblp:conf/sigir/ShinodaA18 fatcat:3qrwsk2pdrdiddgnl3up7653oe
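
Aside: both steps of this approach can be sketched directly: a PageRank-style random walk over a word co-occurrence graph yields importance scores, and a sentence vector is the importance-weighted average of its word embeddings. The graph construction, damping factor, and iteration count below are illustrative assumptions, not the paper's settings.

    import numpy as np

    def word_importance(A, damping=0.85, iters=50):
        # PageRank-style random walk over a word co-occurrence graph A (n x n);
        # assumes every word co-occurs with at least one other (no zero rows).
        n = A.shape[0]
        P = A / A.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
        r = np.full(n, 1.0 / n)
        for _ in range(iters):
            r = (1 - damping) / n + damping * (P.T @ r)
        return r

    def localized_sentence_embedding(word_ids, E, scores):
        # Weighted average of word embeddings, weighted by importance scores.
        w = scores[word_ids]
        return (w[:, None] * E[word_ids]).sum(axis=0) / w.sum()

    A = np.array([[0, 2, 1], [2, 0, 0], [1, 0, 0]], dtype=float)  # toy co-occurrences
    E = np.eye(3)                                                 # toy word embeddings
    print(localized_sentence_embedding([0, 2], E, word_importance(A)))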

Biomedical ontology alignment: an approach based on representation learning

Prodromos Kolyvakis, Alexandros Kalousis, Barry Smith, Dimitris Kiritsis
2018 Journal of Biomedical Semantics  
This embedding is derived on the basis of a novel phrase retrofitting strategy through which semantic similarity information becomes inscribed onto fields of pre-trained word vectors.  ...  Conclusions: Our proposed representation learning approach leverages terminological embeddings to capture semantic similarity.  ...  Siamese CBOW is a log-linear model aiming to predict a sentence from its adjacent sentences, addressing the research question of whether directly optimizing word vectors for the task of being averaged  ...
doi:10.1186/s13326-018-0187-8 pmid:30111369 pmcid:PMC6094585 fatcat:x4ojf6uht5g45dke7zmty7slue

Learning Contextual Embeddings for Structural Semantic Similarity using Categorical Information

Massimo Nicosia, Alessandro Moschitti
2017 Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)  
We study how to learn representations for the words in context such that TKs can exploit more focused information.  ...  Thus, we define a new approach based on a Siamese Network, which produces word representations while learning a binary text similarity.  ...  Many thanks to the anonymous reviewers for their valuable suggestions.  ... 
doi:10.18653/v1/k17-1027 dblp:conf/conll/NicosiaM17 fatcat:vvbdoihe2zdo3jsmt3jrycmkfi

Learning Neural Word Salience Scores [article]

Krasen Samardzhiev, Andrew Gargett, Danushka Bollegala
2017 arXiv   pre-print
Specifically, we learn word salience scores such that, using pre-trained word embeddings as the input, we can accurately predict the words that appear in a sentence, given the words that appear in the sentences  ...  Experimental results on sentence similarity prediction show that the learnt word salience scores perform comparably to or better than some of the state-of-the-art approaches for representing sentences on  ...  Siamese CBOW [26] learns word embeddings such that we can accurately compute sentence embeddings by averaging the word embeddings.  ...
arXiv:1709.01186v1 fatcat:xxua2n2vgvaqpaiyxgiuxbe3vu
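
Aside: one way to picture the objective in this snippet: keep the pre-trained embeddings fixed and learn one scalar salience per word, so that the salience-weighted average of an adjacent sentence's vectors scores highly the words that actually occur in the target sentence. The sigmoid link and all names in the sketch below are assumptions for illustration, not the authors' formulation.

    import numpy as np

    def salience_weighted_average(word_ids, E, salience):
        # Pre-trained embeddings E stay fixed; only the per-word salience
        # scores would be learned.
        w = salience[word_ids]
        return (w[:, None] * E[word_ids]).sum(axis=0) / w.sum()

    def word_given_neighbors(word_id, neighbor_ids, E, salience):
        # Score how plausibly `word_id` occurs in a sentence, given the words
        # of an adjacent sentence; the sigmoid link is an assumption here.
        s = salience_weighted_average(neighbor_ids, E, salience)
        return 1.0 / (1.0 + np.exp(-(E[word_id] @ s)))

    rng = np.random.default_rng(2)
    E = rng.normal(size=(10, 8))          # fixed pre-trained embeddings
    salience = rng.uniform(0.1, 1.0, 10)  # learnable word salience scores
    print(word_given_neighbors(3, [0, 1, 4], E, salience))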

Learning Neural Word Salience Scores

Krasen Samardzhiev, Andrew Gargett, Danushka Bollegala
2018 Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics  
Specifically, we learn word salience scores such that, using pre-trained word embeddings as the input, we can accurately predict the words that appear in a sentence, given the words that appear in the sentences  ...  Experimental results on sentence similarity prediction show that the learnt word salience scores perform comparably to or better than some of the state-of-the-art approaches for representing sentences on  ...  Siamese CBOW (Kenter et al., 2016) learns word embeddings such that we can accurately compute sentence embeddings by averaging the word embeddings.  ...
doi:10.18653/v1/s18-2004 dblp:conf/starsem/SamardzhievGB18 fatcat:ajljcf27lrflbddapjfqb7pjce

RiskFinder: A Sentence-level Risk Detector for Financial Reports

Yu-Wen Liu, Liang-Chih Liu, Chuan-Ju Wang, Ming-Feng Tsai
2018 Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations  
...based on the latest sentence embedding techniques; 4) a visualization of financial time-series data for a corresponding company.  ...  In particular, the system broadens the analyses from the word level to the sentence level, which makes the system useful for practitioner communities and unprecedented among financial academics.  ...  sentence classification and yields performance on par with other deep learning classifiers, and Siamese-CBOW, a neural network architecture that obtains word embeddings directly optimized for sentence  ...
doi:10.18653/v1/n18-5017 dblp:conf/naacl/LiuLWT18 fatcat:fz4jwh46xbdkvcmfut3vp4pdpy

DeepKAF: A Heterogeneous CBR & Deep Learning Approach for NLP Prototyping

Kareem Amin, Stelios Kapetanakis, Nikolaos Polatidis, Klaus-Dieter Althoff, Andreas Dengel
2020 2020 International Conference on INnovations in Intelligent SysTems and Applications (INISTA)  
...as well as find similarities during the retrieval phase. b) Word Embeddings: For clarification, word embedding is the method of translating text into a numerical representation (vectorization).  ...  An observation from the authors is that embeddings are affected by the type of text and by the training, which should be based on high coverage of the available words and sentences to yield highly relevant embeddings  ...
doi:10.1109/inista49547.2020.9194679 dblp:conf/inista/AminKPA020 fatcat:blvkxw7t25akvl4bhq2hl2cplq

Learning semantic similarity in a continuous space

Michel Deudon
2018 Neural Information Processing Systems  
Our work naturally extends Word Mover's Distance (WMD) [1] by representing text documents as normal distributions instead of bags of embedded words.  ...  Semantic similarity between pairs is then learned discriminatively as an optimal transport distance metric (Wasserstein 2) with our novel variational siamese framework.  ...  Acknowledgments We would like to thank Ecole Polytechnique for financial support and Télécom Paris-Tech for GPU resources.  ... 
dblp:conf/nips/Deudon18 fatcat:4wl4phdspffinomr5i5of76i7e
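
Aside: the "Wasserstein 2" metric between normal distributions has a closed form. Restricting to diagonal covariances (an assumption made here for brevity; the paper's setting may be more general), the squared distance decomposes into a mean term and a standard-deviation term, as sketched below.

    import numpy as np

    def w2_squared_diag(mu1, var1, mu2, var2):
        # Squared 2-Wasserstein distance between diagonal Gaussians:
        # W2^2 = ||mu1 - mu2||^2 + sum_i (sqrt(var1_i) - sqrt(var2_i))^2
        return float(((mu1 - mu2) ** 2).sum()
                     + ((np.sqrt(var1) - np.sqrt(var2)) ** 2).sum())

    # documents represented as normal distributions over the embedding space
    mu_a, var_a = np.array([0.1, 0.5]), np.array([1.0, 0.5])
    mu_b, var_b = np.array([0.3, 0.2]), np.array([0.8, 0.4])
    print(w2_squared_diag(mu_a, var_a, mu_b, var_b))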

Interpreting the Syntactic and Social Elements of the Tweet Representations via Elementary Property Prediction Tasks [article]

J Ganesh, Manish Gupta, Vasudeva Varma
2016 arXiv   pre-print
...(BOW, LDA), unsupervised representation learning methods (Siamese CBOW, Tweet2Vec), as well as supervised methods (CNN, BLSTM).  ...  Our work presented here constitutes the first step in opening the black-box of vector embedding for social media posts, with emphasis on tweets in particular.  ...  Note that this is different from BOW because the word vectors here are optimized for sentence representation.  ...
arXiv:1611.04887v1 fatcat:4t7rx5nct5e43grsqyalihutxq

Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations [article]

Andreas Rücklé, Steffen Eger, Maxime Peyrard, Iryna Gurevych
2018 arXiv   pre-print
Average word embeddings are a common baseline for more sophisticated sentence embedding techniques. However, they typically fall short of the performance of more complex models such as InferSent.  ...  Here, we generalize the concept of average word embeddings to power mean word embeddings.  ...  Siamese CBOW: Optimizing word embeddings for sentence representations. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016).  ...
arXiv:1803.01400v2 fatcat:bzhqljltunhszlqa5dlagwtuqe
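
Aside: power mean word embeddings generalize the plain average: for a power p, take the elementwise p-th power mean of a sentence's word vectors, then concatenate the results across several values of p. The limits p = -inf and p = +inf are the elementwise minimum and maximum. The sketch below uses a {min, mean, max} combination (the specific p values are illustrative); other finite p need care when embedding entries can be negative.

    import numpy as np

    def power_mean(X, p):
        # Elementwise p-th power mean over a sentence's word vectors X (n x d);
        # p = 1 is the ordinary average, p = +/-inf are elementwise max/min.
        if p == float("inf"):
            return X.max(axis=0)
        if p == float("-inf"):
            return X.min(axis=0)
        return np.mean(X ** p, axis=0) ** (1.0 / p)

    def concat_power_means(X, ps=(float("-inf"), 1.0, float("inf"))):
        # Concatenating k power means turns d-dimensional word vectors
        # into a (k * d)-dimensional sentence embedding.
        return np.concatenate([power_mean(X, p) for p in ps])

    X = np.random.default_rng(3).normal(size=(5, 4))  # 5 word vectors, d = 4
    print(concat_power_means(X).shape)                # (12,)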
Showing results 1–15 of 169 results.