Generating and mixing feature sets from language models for sentiment classification

Yoonjae Jeong, Youngho Kim, Seongchan Kim, Sung-Hyon Myaeng, Hyo-Jung Oh
2009 2009 International Conference on Natural Language Processing and Knowledge Engineering  
This paper presents methods for mixing feature sets in sentence-level sentiment analysis where a sentence is classified into one of three classes: positive, negative, and neutral.  ...  The enumeration of feature types arising from the LMs for the Logistic Regression classifier allowed us to show that domain specific models can be smoothed with a general model and that attaching a syntactic  ...  Equation (1) (2) Clue-based Sentiment Classification Sentiment clues are basically selected from the features generated from the domain-specific sentiment LMs and reinforced with manually selected  ... 
doi:10.1109/nlpke.2009.5313746 dblp:conf/nlpke/JeongKKMO09 fatcat:fzajrft5k5gwbi44pdubjmxnqu

CIA_NITT@Dravidian-CodeMix-FIRE2020: Malayalam-English Code Mixed Sentiment Analysis Using Sentence BERT And Sentiment Features

Yandrapati Prakash Babu, Rajagopal Eswari, K. Nimmi
2020 Forum for Information Retrieval Evaluation  
The classification model used for this challenging task is sentence-level BERT.  ...  Code mixing is the mixing of language while writing text. The biggest problem in Malayalam-English code-mixing is that people switch between languages (e.g.  ...  = number of features, one vector is generated for every comment with the length of 252 having 0 and 1 (if Manglish sentiment word is appeared in the comment 1 is appended to the vector otherwise 0 is appended  ... 
dblp:conf/fire/BabuEN20 fatcat:xd4qdrbflvbw7msszt5aqc64cm

ULD@NUIG at SemEval-2020 Task 9: Generative Morphemes with an Attention Model for Sentiment Analysis in Code-Mixed Text

Koustava Goswami, Priya Rani, Bharathi Raja Chakravarthi, Theodorus Fransen, John P. McCrae
2020 Zenodo  
Code mixing is a common phenomena in multilingual societies where people switch from one language to another for various reasons.  ...  In this paper, we present the Generative Morphemes with Attention (GenMA) Model sentiment analysis system contributed to SemEval 2020 Task 9 SentiMix.  ...  Besides the sentiment labels the data-set also includes word-level language tags, which are en (English), hi (Hindi), mixed, and univ (symbols, @ mentions, hashtags).  ... 
doi:10.5281/zenodo.4320704 fatcat:e4qd4oyiezenpj2ohcb3x4reju

ULD@NUIG at SemEval-2020 Task 9: Generative Morphemes with an Attention Model for Sentiment Analysis in Code-Mixed Text [article]

Koustava Goswami, Priya Rani, Bharathi Raja Chakravarthi, Theodorus Fransen, John P. McCrae
2020 arXiv   pre-print
Code mixing is a common phenomena in multilingual societies where people switch from one language to another for various reasons.  ...  In this paper, we present the Generative Morphemes with Attention (GenMA) Model sentiment analysis system contributed to SemEval 2020 Task 9 SentiMix.  ...  Besides the sentiment labels the data-set also includes word-level language tags, which are en (English), hi (Hindi), mixed, and univ (symbols, @ mentions, hashtags).  ... 
arXiv:2008.01545v1 fatcat:iinewbkpjngurdijwdm4w7uryq

SENTIMIX (HINDI- ENGLISH)

Aastha Awasthi, Akanksha Singh, Istuti Agarwal, Arushi Sanjay
2021 International Journal of Engineering Applied Sciences and Technology  
We introduce a Hi-En code-mixing set for sensitive information and satisfying performance. A comparative study of the feasibility and implementation of SA methods in social media.  ...  We examine this problem by using sets of lexicon, emotion, and form metadata to construct a classification that can vary between "positive", "negative" and "neutral" feelings.  ...  The team used word and character level n-grams as features and SVM for sentiment classification.  ... 
doi:10.33564/ijeast.2021.v06i02.036 fatcat:drkrsnsouvfdlew6mh62y3vucm

Sentiment Analysis of Arabic Tweets: Feature Engineering and A Hybrid Approach [article]

Nora Al-Twairesh, Hend Al-Khalifa, AbdulMalik Alsalman, Yousef Al-Ohali
2018 arXiv   pre-print
Then a hybrid method that combines a corpus-based and lexicon-based method was developed for several classification models (two-way, three-way, four-way).  ...  Sentiment Analysis in Arabic is a challenging task due to the rich morphology of the language.  ...  The feature sets that were present in all the best feature sets of all three classification models are the ones extracted from the AraSenTi lexicon.  ... 
arXiv:1805.08533v1 fatcat:n4wwdkqu6fh2zors52nakagtqq

HRS-TECHIE@Dravidian-CodeMix and HASOC-FIRE2020: Sentiment Analysis and Hate Speech Identification using Machine Learning Deep Learning and Ensemble Models

Sridhar Swaminathan, Hari Krishnan Ganesan, Radhakrishnan Pandiyarajan
2020 Forum for Information Retrieval Evaluation  
Dravidian-CodeMix (Sentiment analysis for Dravidian Languages in Code-Mixed Text) at FIRE 2020 is a challenge for classification of sentiments of YouTube comments posted in mix of Tamil-English (Task 1  ...  Classification of sentiments from social media posts and comments is essential in this modern digital era.  ...  Sentiment Analysis Veena P V, et al., [19] presented the techniques for language identification for code-mixed data i.e. Tamil-English, collected from Facebook.  ... 
dblp:conf/fire/SwaminathanGP20 fatcat:bwprwbaflzcibjwbaidix5osym

An Ensemble Model for Sentiment Analysis of Hindi-English Code-Mixed Data [article]

Madan Gopal Jhanwar, Arpita Das
2018 arXiv   pre-print
The ensemble model combines the strengths of rich sequential patterns from the LSTM model and polarity of keywords from the probabilistic ngram model to identify sentiments in sparse and inconsistent code-mixed  ...  In this paper, we propose an ensemble of character-trigrams based LSTM model and word-ngrams based Multinomial Naive Bayes (MNB) model to identify the sentiments of Hindi-English (Hi-En) code-mixed data  ...  Therefore, for the task of sentiment classification using ngram-based features of Hi-En code-mixed user comments, we experimented with word unigram and bigram features, and evaluated SVM and MNB classifiers  ... 
arXiv:1806.04450v1 fatcat:2rki4bcmjzfcvgprtahb3lkwuq

Machine Learning Techniques for Sentiment Analysis of Code-Mixed and Switched Indian Social Media Text Corpus - A Comprehensive Review

Gazi Imtiyaz Ahmad, Jimmy Singla, Anis Ali, Aijaz Ahmad Reshi, Anas A. Salameh
2022 International Journal of Advanced Computer Science and Applications  
In code-mixing and switching, a bilingual person takes one or more words or phrases from one language and introduces them into another language while communicating in that language in spoken or written  ...  A comprehensive review of sentiment analysis for code-mixed and switched text corpus of Indian social media using machine learning (ML) approaches, based on recent research studies has been presented in  ...  I TEXT ANALYSIS AND SENTIMENT CLASSIFICATION OF CODE MIXED TEXT OF INDIAN LANGUAGES GATHERED FROM SOCIAL MEDIA 08 20 33 Sasidhar, T.  ... 
doi:10.14569/ijacsa.2022.0130254 fatcat:43ub7ku5xjeqvcjkpxfutpqgqi

Code-Mixed Sentiment Analysis Using Machine Learning and Neural Network Approaches [article]

Pruthwik Mishra and Prathyusha Danda and Pranav Dhakras
2018 arXiv   pre-print
Sentiment Analysis for Indian Languages (SAIL)-Code Mixed tools contest aimed at identifying the sentence level sentiment polarity of the code-mixed dataset of Indian languages pairs (Hi-En, Ben-Hi-En)  ...  Hi-En dataset is henceforth referred to as HI-EN and Ben-Hi-En dataset as BN-EN respectively. For this, we submitted four models for sentiment analysis of code-mixed HI-EN and BN-EN datasets.  ...  We used 85% of the data for training and rest 15% for validation. Table 2 : Model-wise hyperparameters proaches and the corresponding feature sets from the datasets provided.  ... 
arXiv:1808.03299v1 fatcat:napgaqgfmvbubh55opxi7tisqy

Theedhum Nandrum@Dravidian-CodeMix-FIRE2020: A Sentiment Polarity Classifier for YouTube Comments with Code-switching between Tamil, Malayalam and English [article]

BalaSundaraRaman Lakshmanan, Sanjeeth Kumar Ravindranath
2020 arXiv   pre-print
Our approach utilises language features like use of emoji, choice of scripts and code mixing which appeared quite marked in the datasets specified for the Dravidian Codemix - FIRE 2020 task.  ...  We achieved a weighted average F1 score of 0.77 for Tamil-English using a Logistic Regression based model after the task deadline.  ...  The logo for Theedhum Nandrum software was designed by Tharique Azeez signifying the duality of good and bad using the yin yang metaphor.  ... 
arXiv:2010.03189v2 fatcat:6siktgnrqzhkhdsgirw5prz3ly

De-Mixing Sentiment from Code-Mixed Text

Yash Kumar Lal, Vaibhav Kumar, Mrinal Dhar, Manish Shrivastava, Philipp Koehn
2019 Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop  
Code-mixing is the phenomenon of mixing the vocabulary and syntax of multiple languages in the same sentence.  ...  In this paper, we present a hybrid architecture for the task of Sentiment Analysis of English-Hindi code-mixed data.  ...  It augments our base neural network to boost classification accuracy for the task. Generating Subword Representations Word embeddings are now commonplace but are generally trained for one language.  ... 
doi:10.18653/v1/p19-2052 dblp:conf/acl/LalKDSK19 fatcat:hwguzm7worbqtfofmuhxmucsxq

HIT_SUN@Dravidian-CodeMix-FIRE2020: Sentiment Analysis on Multilingual Code-Mixing Text Base on BERT

Huilin Sun, Jiaming Gao, Fang Sun
2020 Forum for Information Retrieval Evaluation  
This paper mainly introduces the method used in the FIRE2020@Sentiment Analysis for Dravidian Languages in the Code-Mixed Text evaluation task [1].  ...  This paper uses a bidirectional pre-training language model (BERT) to solve the problem of sentiment classification of cross-language text.  ...  For classification tasks, we only use the information in the [CLS] token output by the model as the feature vector of the input sentence.  ... 
dblp:conf/fire/SunGS20 fatcat:zuyzk33vd5dgvmqsgk4dierraq

A Study of the Performance of Embedding Methods for Arabic Short-Text Sentiment Analysis Using Deep Learning Approaches

Ali Alwehaibi, Marwan Bikdash, Mohammad Albogmi, Kaushik Roy
2021 Journal of King Saud University: Computer and Information Sciences  
While most of the studies focus on eliciting features from English text, the research on Arabic is limited due to the morphological and grammatical complexity of Arabic language.  ...  CNN was used for character-level classification, with features proposed for a 136-character alphabet set.  ... 
doi:10.1016/j.jksuci.2021.07.011 fatcat:nhqt4em64vexhjhn4bizhsa6sq

Palomino-Ochoa at SemEval-2020 Task 9: Robust System based on Transformer for Code-Mixed Sentiment Classification [article]

Daniel Palomino, Jose Ochoa-Luna
2020 arXiv   pre-print
We present a transfer learning system to perform a mixed Spanish-English sentiment classification task.  ...  Our proposal uses the state-of-the-art language model BERT and embed it within a ULMFiT transfer learning pipeline.  ...  BERT is the base language model (LM) which is trained on a general domain corpus to capture general features of the language through several layers.  ... 
arXiv:2011.09448v1 fatcat:v7o2fmr7inhvnhnbtlwg3tpig4