
Combining Labeled and Unlabeled Data for Learning Cross-Document Structural Relationships [chapter]

Zhu Zhang, Dragomir Radev
2005 Lecture Notes in Computer Science  
, exploiting both labeled and unlabeled data.  ...  We investigate a binary classifier for determining existence of structural relationships and a full classifier using the full taxonomy of relationships.  ...  Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.  ... 
doi:10.1007/978-3-540-30211-7_4 fatcat:3iqjywtunndvxbzllnr7qj2nmm

Cross Domain Sentiment Classification Techniques: A Review

Parvati Kadli, Vidyavathi B.
2019 International Journal of Computer Applications  
It is difficult to get annotated data of all domains that can be used to train a learning model.  ...  In this paper we present literature review of methods and techniques employed for cross domain sentiment analysis.  ...  It is challenging as machine learning techniques used for cross domain classification perform well with labeled documents and hence are highly domain sensitive.  ... 
doi:10.5120/ijca2019918338 fatcat:uubvhfmyg5bmrdjioqis72zlhi

A Novel Multi label Text Classification Model using Semi supervised learning

Shweta C Dharmadhikari
2012 International Journal of Data Mining & Knowledge Management Process  
We are proposing a new multi label text classification model for assigning more relevant set of categories to every input text document.  ...  Through this paper a classification model for ATC in multi-label domain is discussed.  ...  In our proposed model, the preprocessing stage exploits the relationship between labeled and unlabeled documents by identifying structural and semantic relationships between them for more relevant classification  ...
doi:10.5121/ijdkp.2012.2402 fatcat:hhn3aa63zjdovnwgbvy25v236a

Cross-Lingual Adaptation using Structural Correspondence Learning [article]

Peter Prettenhofer, Benno Stein
2010 arXiv   pre-print
In this article we describe an extension of Structural Correspondence Learning (SCL), a recently proposed algorithm for domain adaptation, for cross-lingual adaptation.  ...  The proposed method uses unlabeled documents from both languages, along with a word translation oracle, to induce cross-lingual feature correspondences.  ...  The latter emphasizes the relationship to semi-supervised learning-with the crucial difference that labeled and unlabeled data stem from different distributions.  ... 
arXiv:1008.0716v2 fatcat:qygcn7nvuvea3erjgllg7vndlq

Intra-document structural frequency features for semi-supervised domain adaptation

Andrew Arnold, William W. Cohen
2008 Proceeding of the 17th ACM conference on Information and knowledge management - CIKM '08  
By exploiting the explicit and implicit common structure of the different subsections of these documents, including the unlabeled full text, we are able to generate robust features that are insensitive  ...  the target domain data are wholly unlabeled captions.  ...  In the end, we are left with a semi-supervised intra-document representation of the labeled abstract data that is, due to its cross-structural nature, robust to shifts across the various document sections  ...
doi:10.1145/1458082.1458253 dblp:conf/cikm/ArnoldC08 fatcat:pyalxfukvzh5fluzii6lpu4y7m

Using LSI for text classification in the presence of background text

Sarah Zelikovitz, Haym Hirsh
2001 Proceedings of the tenth international conference on Information and knowledge management - CIKM'01  
However, in addition to relying on labeled training data, we improve classification accuracy by also using unlabeled data and other forms of available "background" text in the classification process.  ...  Rather than performing LSI's singular value decomposition (SVD) process solely on the training data, we instead use an expanded term-by-document matrix that includes both the labeled data as well as any  ...  ACKNOWLEDGMENTS We would like to thank Kamal Nigam for his help with rainbow and the WebKB and 20 Newsgroups data sets.  ... 
doi:10.1145/502585.502605 dblp:conf/cikm/ZelikovitzH01 fatcat:5jbuwdivfnehzotvtsai2u6fce
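
The expanded-matrix idea in this snippet — running the SVD over labeled training documents together with unlabeled "background" text, then classifying in the resulting latent space — can be sketched as follows. This is a minimal toy illustration assuming scikit-learn for tf-idf; the documents, the choice of k, and the 1-nearest-neighbour classifier are illustrative stand-ins, not the authors' exact setup:

```python
# Hedged sketch: build an LSI latent space from labeled + background
# documents, then classify a new document by nearest neighbour in it.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

labeled_docs = ["stock market trading gains", "football match final score"]
labels = ["finance", "sports"]
background = ["market prices rise on trading volume",
              "the team won the football final"]
test_doc = "trading volume and market gains"

# Key point from the paper's snippet: the term-by-document matrix fed to
# the SVD includes BOTH the labeled data and the background text.
vec = TfidfVectorizer()
X = vec.fit_transform(labeled_docs + background).toarray()
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2  # number of latent dimensions to keep

def project(M):
    """Map tf-idf rows into the k-dimensional latent space."""
    return M @ Vt[:k].T

Z_lab = project(vec.transform(labeled_docs).toarray())
z = project(vec.transform([test_doc]).toarray())[0]

# 1-nearest-neighbour by cosine similarity in the latent space.
sims = (Z_lab @ z) / (np.linalg.norm(Z_lab, axis=1) * np.linalg.norm(z) + 1e-12)
predicted = labels[int(np.argmax(sims))]
print(predicted)
```

Because the background corpus shares vocabulary with the labeled documents, the latent directions it helps shape can pull a test document toward the right class even when its literal term overlap with the labeled data is small.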

Semi-Supervised Learning with Declaratively Specified Entropy Constraints [article]

Haitian Sun, William W. Cohen, Lidong Bing
2018 arXiv   pre-print
We propose a technique for declaratively specifying strategies for semi-supervised learning (SSL).  ...  The proposed method can be used to specify ensembles of semi-supervised learning, as well as agreement constraints and entropic regularization constraints between these learners, and can be used to model  ...  Results We take 20 labeled examples for each class as training data, and reserve 1,000 examples as test data. Other examples are treated as unlabeled.  ... 
arXiv:1804.09238v2 fatcat:prq6iil3bbhbfgawys2uvyxeqq

Complex Relation Extraction: Challenges and Opportunities [article]

Haiyun Jiang, Qiaoben Bao, Qiao Cheng, Deqing Yang, Li Wang, Yanghua Xiao
2020 arXiv   pre-print
Relation extraction is very important for knowledge base construction and text understanding.  ...  Then we summarize the existing complex relation extraction tasks and present the definition, recent progress, challenges and opportunities for each task.  ...  To leverage the large amount of unlabeled data in the training stage, semi-supervised BiRE tries to learn from both labeled data and unlabeled data.  ... 
arXiv:2012.04821v1 fatcat:6fnhjnpwmrabhhrfit3x63tfnm

A survey on heterogeneous transfer learning

Oscar Day, Taghi M. Khoshgoftaar
2017 Journal of Big Data  
Heterogeneous transfer learning is characterized by the source and target domains having differing feature spaces, but may also be combined with other issues such as differing data distributions and label  ...  These can present significant challenges, as one must develop a method to bridge the feature spaces, data distributions, and other gaps which may be present in these cross-domain learning tasks.  ...  Availability of data and materials Not applicable. Consent for publication Not applicable. Ethics approval and consent to participate Not applicable. Funding Not applicable.  ... 
doi:10.1186/s40537-017-0089-0 fatcat:bpfjycwlkrawzdyyfv2ugle5cy

Improving Probabilistic Models in Text Classification via Active Learning [article]

Mitchell Bosley, Saki Kuzushima, Ted Enamorado, Yuki Shiraito
2022 arXiv   pre-print
We propose a fast new model for text classification that combines information from both labeled and unlabeled data with an active learning component, where a human iteratively labels documents that the  ...  of unlabeled data and iteratively labeling uncertain documents, our model improves performance relative to classifiers that (a) only use information from labeled data and (b) randomly decide which documents  ...  In the model, latent clusters are observed as labels for labeled documents and estimated as a latent variable for unlabeled documents, and active learning iteratively provides observed labels for the documents  ... 
arXiv:2202.02629v1 fatcat:aeajfmebfvg5bixqgvu6pbmoce
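
The loop this abstract describes — train on the labeled pool, then have a human label the documents the model is least certain about — is generic uncertainty sampling. A hedged sketch with toy data follows; the logistic-regression model, bag-of-words features, and documents are illustrative stand-ins, not the authors' probabilistic model:

```python
# Hedged sketch of one round of uncertainty-based active learning:
# fit on labeled documents, then query the most uncertain unlabeled one.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["good great excellent", "bad awful terrible",
        "great movie", "terrible plot",
        "fine I guess", "great bad mixed"]
y = np.array([1, 0, 1, 0, -1, -1])  # -1 marks unlabeled documents

vec = CountVectorizer()
X = vec.fit_transform(docs)
labeled = y != -1
unlabeled = y == -1

clf = LogisticRegression().fit(X[labeled], y[labeled])
proba = clf.predict_proba(X[unlabeled])[:, 1]

# The unlabeled document whose predicted probability is closest to 0.5
# is the one the model is least sure about -- send it to the human next.
query = np.flatnonzero(unlabeled)[np.argmin(np.abs(proba - 0.5))]
print(docs[query])
```

In a full system this query/label/retrain cycle repeats, so annotation effort is spent where the classifier's beliefs are weakest rather than on randomly chosen documents.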

Exploiting Context for Robustness to Label Noise in Active Learning [article]

Sudipta Paul, Shivkumar Chandrasekaran, B.S. Manjunath, Amit K. Roy-Chowdhury
2020 arXiv   pre-print
Several works in computer vision have demonstrated the effectiveness of active learning for adapting the recognition model when new unlabeled data becomes available.  ...  We construct a graphical representation of the unlabeled data to encode these relationships and obtain new beliefs on the graph when noisy labels are available.  ...  ACKNOWLEDGMENT The work was partially supported by ONR grant N00014-12-C-5113 and NSF grant 1901379  ... 
arXiv:2010.09066v1 fatcat:dfobzy3z2vcfzpsrywcy4pmsum

Cross-domain Sentiment Classification using an Adapted Naïve Bayes Approach and Features Derived from Syntax Trees

Srilaxmi Cheeti, Ana Stanescu, Doina Caragea
2013 Proceedings of the International Conference on Knowledge Discovery and Information Retrieval and the International Conference on Knowledge Management and Information Sharing  
To assist customers, supervised learning algorithms can be used to categorize the reviews as either positive or negative, if large amounts of labeled data are available.  ...  However, some domains have few or no labeled instances (i.e., reviews), yet a large number of unlabeled instances.  ...  In cross-domain classification, the general goal is to use labeled data in the source domain and, possibly, some labeled data in the target domain, together with unlabeled data from the target to learn  ... 
doi:10.5220/0004546501690176 dblp:conf/ic3k/CheetiSC13 fatcat:thutkqd2ezestfjfzmm3ad3uvi

Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text

Yi Zhang, Jeff G. Schneider, Artur Dubrawski
2008 Neural Information Processing Systems  
Empirical results show that the proposed approach is a reliable and scalable method for semi-supervised learning, regardless of the source of unlabeled data, the specific task to be enhanced, and the prediction  ...  We suggest and analyze the semantic correlation of words as a generally transferable structure of the language and propose a new method to learn this structure using an appropriately chosen latent variable  ...  Acknowledgments This work was supported by the Centers of Disease Control and Prevention (award R01-PH 000028) and by the National Science Foundation (grant IIS-0325581).  ... 
dblp:conf/nips/ZhangSD08 fatcat:av5rnllubna6zbzc7a6u5qykx4

Semi-Supervised Cross-Modal Retrieval Based on Discriminative Comapping

Li Liu, Xiao Dong, Tianshi Wang
2020 Complexity  
To address the two limitations and make full use of unlabelled data, we propose a novel semi-supervised method for cross-modal retrieval named modal-related retrieval based on discriminative comapping  ...  Most cross-modal retrieval methods based on subspace learning just focus on learning the projection matrices that map different modalities to a common subspace and pay less attention to the retrieval task  ...  semi-supervised learning, task-related learning, and linear discriminative analysis into a unified framework for cross-modal retrieval (2) The class information of labelled data is propagated to unlabelled  ...
doi:10.1155/2020/1462429 fatcat:nvdtfct4crdozf5wa6fppdaaaq