1,233 Hits in 6.9 sec

A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings

Wei Yang, Wei Lu, Vincent Zheng
2017 Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing  
In this paper, we present a simple yet effective method for learning word embeddings based on text from different domains.  ...  Learning word embeddings has received a significant amount of attention recently. Often, word embeddings are learned in an unsupervised manner from a large collection of text.  ...  We thank the support of Human-centered Cyber-physical Systems Programme at Advanced Digital Sciences Center from Singapore's Agency for Science, Technology and Research (A*STAR).  ... 
doi:10.18653/v1/d17-1312 dblp:conf/emnlp/YangLZ17 fatcat:4mjliiamm5a6vaxefg4ptguxdm

A Hierarchical Approach for Sentiment Analysis and Categorization of Turkish Written Customer Relationship Management Data

Mehmet Seyfioğlu, Mustafa Demirezen
2017 Proceedings of the 2017 Federated Conference on Computer Science and Information Systems  
For binary sentiment analysis, i.e., determination of 'positive' and 'negative' sentiments, an extreme gradient boosting (xgboost) classifier is trained on averaged review vectors and an overall accuracy  ...  As sentiments expressed by the customers are vitally important for companies, an accurate and swift analysis is needed.  ...  Word2Vec is a shallow neural network in general, which has one input, one hidden, and one output layer. There are two Word2Vec models available: Continuous Bag of Words (CBOW) and Skip-Gram.  ... 
doi:10.15439/2017f204 dblp:conf/fedcsis/SeyfiogluD17 fatcat:sm6i5qoy25hbrl6xtroy2cokcm

A Term Weighted Neural Language Model and Stacked Bidirectional LSTM Based Framework for Sarcasm Identification

Aytug Onan, Mansur Alp Tocoglu
2021 IEEE Access  
The purpose of our research is to present an effective sarcasm identification framework on social media data by pursuing the paradigms of neural language models and deep neural networks.  ...  In the empirical analysis, three neural language models (i.e., word2vec, fastText and GloVe), two unsupervised term weighting functions (i.e., term-frequency, and TF-IDF) and eight supervised term weighting  ...  EXPERIMENTAL ANALYSIS To examine the predictive performance of the proposed deep learning based framework on sarcasm identification, an extensive empirical analysis has been performed.  ... 
doi:10.1109/access.2021.3049734 fatcat:kzobtjmgp5b2bcsufcmmuoufza

Convolutional Neural Networks for Sentiment Classification on Business Reviews [article]

Andreea Salinca
2017 arXiv pre-print
Recently Convolutional Neural Networks (CNNs) models have proven remarkable results for text classification and sentiment analysis.  ...  In this paper, we present our approach on the task of classifying business reviews using word embeddings on a large-scale dataset provided by Yelp: Yelp 2017 challenge dataset.  ...  We conduct an empirical study on effect of hyperparameters on the overall performance in the sentiment classification task.  ... 
arXiv:1710.05978v1 fatcat:jmux4xkdtbay7gncrmwezakrta

Learning Sentiment-Specific Word Embedding via Global Sentiment Representation

Peng Fu, Zheng Lin, Fengcheng Yuan, Weiping Wang, Dan Meng
2018 Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference  
We take global sentiment representation as a simple average of word embeddings in the text, and use a corruption strategy as a sentiment-dependent regularization.  ...  Recently, some sentiment embedding learning methods have been proposed, but most of them are designed to work well on sentence-level texts.  ...  The authors would like to thank the anonymous reviewers and the area chairs for their constructive comments.  ... 
doi:10.1609/aaai.v32i1.11916 fatcat:7zmqvkihhvff3ncktnnm3elsm4

Robust Gram Embeddings

Taygun Kekec, David M. J. Tax
2016 Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing  
We propose a regularized embedding formulation, called Robust Gram (RG), which penalizes overfitting by suppressing the disparity between target and context embeddings.  ...  Our experimental analysis shows that the RG model trained on small datasets generalizes better compared to alternatives, is more robust to variations in the training set, and correlates well to human similarities  ...  We also would like to thank Hamdi Dibeklioglu and Mustafa Unel for their kind support during this work.  ... 
doi:10.18653/v1/d16-1113 dblp:conf/emnlp/KekecT16 fatcat:nauo2v6g3zapdbvodwklnwjftq

Jointly Modeling Review Content and Aspect Ratings for Review Rating Prediction

Zhipeng Jin, Qiudan Li, Daniel D. Zeng, YongCheng Zhan, Ruoran Liu, Lei Wang, Hongyuan Ma
2016 Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval - SIGIR '16  
Review rating prediction is of much importance for sentiment analysis and business intelligence.  ...  The method firstly learns the latent vector representation of review content using skip-thought vectors, a state-of-the-art deep learning method, then, the missing values of aspect ratings are filled in  ...  Recently, deep learning has been a hot research topic and successfully applied to sentiment analysis and natural language processing.  ... 
doi:10.1145/2911451.2914692 dblp:conf/sigir/JinLZZLWM16 fatcat:b4oprnd7rfcwpj52cjru27af64

Data Mining for Cyberbullying and Harassment Detection in Arabic Texts

Eman Bashir (College of Computer Sciences and Information Technology, Sudan University of Science and Technology, Khartoum, Sudan), Mohamed Bouguessa
2021 International Journal of Information Technology and Computer Science  
However, the Arabic text has drawbacks for its complexity, challenges, and scarcity of its resources.  ...  This paper investigates several questions related to the content of how to protect an Arabic text from cyberbullying/harassment through the information posted on Twitter.  ...  The results of this study showed the machine learning classifiers got an accuracy of less than 70%, while LSTM approaches got an accuracy of 72%, which was one of the main aims.  ... 
doi:10.5815/ijitcs.2021.05.04 fatcat:ovx3hpqvhzcffbsizkqh4dz2ru

Not just about size - A Study on the Role of Distributed Word Representations in the Analysis of Scientific Publications [article]

Andres Garcia, Jose Manuel Gomez-Perez
2018 arXiv pre-print
On the other hand, obtaining comparable results through general corpora can also be achieved, but only in the presence of very large corpora of well-formed text.  ...  In this paper we present experimental results about the generation of word embeddings from scholarly publications for the intelligent processing of scientific texts extracted from SciGraph.  ...  We also thank Constantino Roman for his contributions to the experimental evaluation of this work.  ... 
arXiv:1804.01772v1 fatcat:3fuzesbrsfbdnaeslgbodea4n4

Multi-level stacked ensemble with sparse and dense features for hate speech detection on Twitter

Darko Tosev, Sonja Gievska
2021 Conference and Labs of the Evaluation Forum  
In this paper, we report on an experiment that examines the predictive power of a number of sparse and dense feature representations coupled with a multi-level ensemble classifier.  ...  Harnessing the power of machine learning for the purpose of detecting and mediating the spread of malicious behavior has received a heightened attention in the last decade.  ...  Lexicons For the purposes of sentiment analysis and hateful terms recognition, a couple of lexicons were used.  ... 
dblp:conf/clef/TosevG21 fatcat:uup4mr6fynellc4prnnmkuzorq

Sentiment classification of documents in Serbian: The effects of morphological normalization and word embeddings

Vuk Batanovic, Bosko Nikolic
2017 Telfor Journal  
In this paper, we assess the impact of lemmatizers and stemmers for Serbian on classifiers trained and evaluated on the Serbian Movie Review Dataset.  ...  An open issue in the sentiment classification of texts written in Serbian is the effect of different forms of morphological normalization and the usefulness of leveraging large amounts of unlabeled texts  ...  An additional drawback of lemmatizers for Serbian, in the context of sentiment analysis, is that they reduce all degrees of comparison of adverbs and adjectives into one.  ... 
doi:10.5937/telfor1702104b fatcat:nwpvpnff5jepjp53clzswh5hyi

A Survey of Neural Network Techniques for Feature Extraction from Text [article]

Vineet John
2017 arXiv pre-print
The research questions discussed in the paper focus on the state-of-the-art neural network techniques that have proven to be useful tools for language processing, language generation, text classification  ...  and other computational linguistics tasks.  ...  Feature extraction of text can be used for a multitude of applications including, but not limited to, unsupervised semantic similarity detection, article classification and sentiment analysis.  ... 
arXiv:1704.08531v1 fatcat:ennpaa3j4rc5jm22zusoak7ke4

Task-oriented Word Embedding for Text Classification

Qian Liu, Heyan Huang, Yang Gao, Xiaochi Wei, Yuxin Tian, Luyang Liu
2018 International Conference on Computational Linguistics  
The rational word embeddings should have the ability to capture both the semantic features and task-specific features of words.  ...  With the function-aware component, our method regularizes the distribution of words to enable the embedding space to have a clear classification boundary.  ...  Experiments show qualitative improvements of our method over context-based Skip-gram method on word neighbors for classification.  ... 
dblp:conf/coling/LiuHGWTL18 fatcat:kmz3gwz3pfhtxopyagfilsgzhu

Concept-Based Embeddings for Natural Language Processing [article]

Yukun Ma, Erik Cambria
2018 arXiv pre-print
, and targeted sentiment analysis.  ...  In a broad context of opinion understanding system, we investigate the use of the fused embedding for several core NLP tasks: named entity detection and classification, automatic speech recognition reranking  ...  I also would like to thank San Linn, the manager of our research project, for his effective coordination and kind support throughout the whole process. Special thanks go to Dr. Benjamin Bigot, Dr.  ... 
arXiv:1807.05519v1 fatcat:t4l263zvl5e6dizmoauad55nxq

Syntax-Ignorant N-gram Embeddings for Sentiment Analysis of Arabic Dialects

Hala Mulki, Hatem Haddad, Mourad Gridach, Ismail Babaoğlu
2019 Proceedings of the Fourth Arabic Natural Language Processing Workshop  
With the free word order and the varying syntax nature across the different Arabic dialects, a sentiment analysis system developed for one dialect might not be efficient for the others.  ...  Here we present syntax-ignorant n-gram embeddings to be used in sentiment analysis of several Arabic dialects.  ...  The model was trained with word embeddings learned from a corpus of 3.4 billion Arabic words using CBOW and Skip-Gram (SG).  ... 
doi:10.18653/v1/w19-4604 dblp:conf/wanlp/MulkiHGB19 fatcat:sjmoaohvlzhknbnrscjd5l22tm