Cause Identification from Aviation Safety Incident Reports via Weakly Supervised Semantic Lexicon Construction

M. A. Abedin, V. Ng, L. Khan
2010 The Journal of Artificial Intelligence Research  
Both approaches exploit information provided by a semantic lexicon, which is automatically constructed via Thelen and Riloff's Basilisk framework augmented with our linguistic and algorithmic modifications  ...  The first approach labels a report using a simple heuristic, which looks for the words and phrases acquired during the semantic lexicon learning process in the report.  ...  Below we describe two versions of SVMs: (1) inductive SVMs, which learn a classifier solely from labeled data, and (2) transductive SVMs, which learn a classifier from both labeled and unlabeled data.  ... 
doi:10.1613/jair.2986 fatcat:vw2hmdrp5fezhk7l3ooxh3p3j4

Retrofitting Word Vectors to Semantic Lexicons [article]

Manaal Faruqui and Jesse Dodge and Sujay K. Jauhar and Chris Dyer and Eduard Hovy and Noah A. Smith
2015 arXiv pre-print
Vector space word representations are learned from distributional information of words in large corpora.  ...  This paper proposes a method for refining vector space representations using relational information from semantic lexicons by encouraging linked words to have similar vector representations, and it makes  ...  Acknowledgements This research was supported in part by the National Science Foundation under grants IIS-1143703, IIS-1147810, and IIS-1251131; by IARPA via Department of Interior National Business Center  ... 
arXiv:1411.4166v4 fatcat:4svbdvh3lbgfhjmhpxrqjh3d2e

Retrofitting Word Vectors to Semantic Lexicons

Manaal Faruqui, Jesse Dodge, Sujay Kumar Jauhar, Chris Dyer, Eduard Hovy, Noah A. Smith
2015 Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies  
Vector space word representations are learned from distributional information of words in large corpora.  ...  This paper proposes a method for refining vector space representations using relational information from semantic lexicons by encouraging linked words to have similar vector representations, and it makes  ...  Acknowledgements This research was supported in part by the National Science Foundation under grants IIS-1143703, IIS-1147810, and IIS-1251131; by IARPA via Department of Interior National Business Center  ... 
doi:10.3115/v1/n15-1184 dblp:conf/naacl/FaruquiDJDHS15 fatcat:rad4khio25hrppkmbchp6vydj4

Retrofitting Word Vectors to Semantic Lexicons

Manaal Faruqui, Jesse Dodge, Sujay Kumar Jauhar, Chris Dyer, Eduard Hovy, Noah A. Smith
2018
Vector space word representations are learned from distributional information of words in large corpora.  ...  This paper proposes a method for refining vector space representations using relational information from semantic lexicons by encouraging linked words to have similar vector representations, and it makes  ...  Acknowledgements This research was supported in part by the National Science Foundation under grants IIS-1143703, IIS-1147810, and IIS-1251131; by IARPA via Department of Interior National Business Center  ... 
doi:10.1184/r1/6473660.v1 fatcat:lg2ljow77ngoxnmdka67wljg4i

Extracting Semantics from Multimedia Content: Challenges and Solutions [chapter]

Lexing Xie, Rong Yan
2008 Signals and Communication Technology  
In this chapter, we present a review on extracting semantics from a large amount of multimedia data as a statistical learning problem.  ...  correspondence across modalities, learning structured (generative) models to account for natural data dependency or model hidden topics, handling rare classes, leveraging unlabeled data, scaling to large  ...  This said, learning to extract semantics from multimedia shall be of much broader interest than in the multimedia analysis community.  ... 
doi:10.1007/978-0-387-76569-3_2 fatcat:jul6fw7esfaurct6erjnvpcq6q

Self-training from labeled features for sentiment analysis

Yulan He, Deyu Zhou
2011 Information Processing & Management  
The word-class distributions of such self-learned features are estimated from the pseudolabeled examples and are used to train another classifier by constraining the model's predictions on unlabeled instances  ...  In this paper, we propose a novel framework where an initial classifier is learned by incorporating prior information extracted from an existing sentiment lexicon with preferences on expectations of sentiment  ...  Proposed framework We propose a novel framework for sentiment classifier learning from unlabeled documents as shown in Fig. 1.  ...
doi:10.1016/j.ipm.2010.11.003 fatcat:d2xopnal5fgxtpn5p2qi6ete5a

Bootstrap Domain-Specific Sentiment Classifiers from Unlabeled Corpora

Andrius Mudinas, Dell Zhang, Mark Levene
2018 Transactions of the Association for Computational Linguistics  
Our investigation indicates that in the word embeddings learned from the unlabeled corpus of a given domain, the distributed word representations (vectors) for opposite sentiments form distinct clusters  ...  Exploiting such a clustering structure, we are able to utilize machine learning algorithms to induce a quality domain-specific sentiment lexicon from just a few typical sentiment words ("seeds").  ...  We thank the reviewers for their constructive and helpful comments. We also gratefully acknowledge the support of Geek.AI for this work.  ... 
doi:10.1162/tacl_a_00020 fatcat:5vkbqtktm5cb7fte7ajipuxyk4

LCCT: A Semi-supervised Model for Sentiment Classification

Min Yang, Wenting Tu, Ziyu Lu, Wenpeng Yin, Kam-Pui Chow
2015 Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies  
In addition, we want the algorithm being able to deal with missing labels and learning from incomplete sentiment lexicons.  ...  The proposed method combines the idea of lexicon-based learning and corpus-based learning in a unified cotraining framework.  ...  SemEval-2013 (SemEval) dataset in English This dataset is constructed for the Twitter sentiment analysis task (Task 2) in the Semantic Evaluation of Systems challenge (SemEval-2013).  ... 
doi:10.3115/v1/n15-1057 dblp:conf/naacl/YangTLYC15 fatcat:sna7kjzfjve3rb3tkmzrx6e7sa

Cross Domain Sentiment Classification Techniques: A Review

Parvati Kadli, Vidyavathi B.
2019 International Journal of Computer Applications  
It is difficult to get annotated data of all domains that can be used to train a learning model.  ...  In this paper we present literature review of methods and techniques employed for cross domain sentiment analysis.  ...  Latent Direct Analysis (LDA) In [26], Real time transfer learning framework based on LDA is proposed. Here topic space is learnt from social streams in real time via online streaming LDA.  ...
doi:10.5120/ijca2019918338 fatcat:uubvhfmyg5bmrdjioqis72zlhi

Active Sentiment Domain Adaptation

Fangzhao Wu, Yongfeng Huang, Jun Yan
2017 Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)  
annotated in an active learning mode, as well as the domain-specific sentiment similarities among words mined from unlabeled samples of target domain.  ...  Domain adaptation is an important technology to handle domain dependence problem in sentiment analysis field. Existing methods usually rely on sentiment classifiers trained in source domains.  ...  The first part was used as test data, and the second part was used as the pool of "unlabeled" samples to perform active learning.  ... 
doi:10.18653/v1/p17-1156 dblp:conf/acl/WuHY17 fatcat:uhezbtujcnearcashulbwcpf3i

Cross-domain graph based similarity measurement of workflows

Tahereh Koohi-Var, Morteza Zahedi
2018 Journal of Big Data  
To manage the transfer learning crashes we benefit from LSI when we use some unlabeled data from background knowledge.  ...  We train a model that learns representational embeddings for motifs from a large collection of unlabeled data using a generative model.  ...  The idea was to workflow motif extraction between large amount of structured and unstructured data (Big Data). We use MZ experience to improve our work.  ... 
doi:10.1186/s40537-018-0127-6 fatcat:hv3twitaund25kdjmd5ipwyk7i

Bilingual Co-Training for Sentiment Classification of Chinese Product Reviews

Xiaojun Wan
2011 Computational Linguistics  
the language gap, and then propose a bilingual co-training approach to make use of both the English view and the Chinese view based on additional unlabeled Chinese data.  ...  We first investigate several basic methods (including lexicon-based methods and corpus-based methods) for cross-lingual sentiment classification by simply leveraging machine translation services to eliminate  ...  We are very grateful to the anonymous reviewers for their insightful and constructive comments and suggestions.  ... 
doi:10.1162/coli_a_00061 fatcat:ernhxn5n7jfeddxw7nclyhpreq

Learning Word Ratings for Empathy and Distress from Document-Level User Responses [article]

João Sedoc, Sven Buechel, Yehonathan Nachmany, Anneke Buffone, Lyle Ungar
2020 arXiv pre-print
Emotion analysis of text is increasing in popularity in NLP; however, manually creating lexica for psychological constructs such as empathy has proven difficult.  ...  to create the first-ever empathy lexicon.  ...  Lexicon Learning from Document Labels Few studies address learning word ratings based on document-level supervision.  ... 
arXiv:1912.01079v2 fatcat:ldxussv3rzaofor3gpl6egvphy

Modeling Social Networks using Data Mining Approaches-Review

Fatima Hassan, Suhad Faisal Behadili
2022 Iraqi Journal of Science  
Though, implementations of data mining are still raw and require more work via industry and academic world to prepare the work sufficiently. Bring this analysis to a close.  ...  Getting knowledge from raw data has delivered beneficial information in several domains. The prevalent utilizing of social media produced extraordinary quantities of social information.  ...  understanding inside supervised learning and extent the method to contain unlabeled data.  ... 
doi:10.24996/ijs.2022.63.3.35 fatcat:dnpkdkzssjhvbarxa22hc4cu2i

Two Step CCA: A new spectral method for estimating vector models of words [article]

Paramveer Dhillon, Jordan Rodu, Lyle Ungar
2012 arXiv pre-print
Unlabeled data is often used to learn representations which can be used to supplement baseline features in a supervised learner.  ...  In this paper, we present a new spectral method based on CCA to learn an eigenword dictionary.  ...  All the data was collected from the PERMA lexicon.  ... 
arXiv:1206.6403v1 fatcat:clymoaestngkhbk54ewunksloa