Exploiting unlabeled data for improving accuracy of predictive data mining

Kang Peng, S. Vucetic, Bo Han, Hongbo Xie, Z. Obradovic
Third IEEE International Conference on Data Mining  
Predictive data mining typically relies on labeled data without exploiting a much larger amount of available unlabeled data.  ...  The goal of this paper is to show that using unlabeled data can be beneficial in a range of important prediction problems and therefore should be an integral part of the learning process.  ...  This property of unlabeled data makes it an attractive tool for improving accuracy of predictive data mining [17].  ... 
doi:10.1109/icdm.2003.1250929 dblp:conf/icdm/PengVHXO03 fatcat:kpcikicbqrafjoelifd6puxjh4

A Novel Multi label Text Classification Model using Semi supervised learning

Shweta C Dharmadhikari
2012 International Journal of Data Mining & Knowledge Management Process  
Our experimental results indicate that the use of Semi Supervised Learning in MLTC greatly improves the decision making capability of the classifier.  ...  We are proposing a new multi label text classification model for assigning a more relevant set of categories to every input text document.  ...  Also, the abundantly available unlabeled data contains the joint distribution over features of an input dataset which may improve accuracy of overall classification process when used in conjunction with  ... 
doi:10.5121/ijdkp.2012.2402 fatcat:hhn3aa63zjdovnwgbvy25v236a

Context-Aware Collaborative Data Stream Mining in Ubiquitous Devices [chapter]

João Bártolo Gomes, Mohamed Medhat Gaber, Pedro A. C. Sousa, Ernestina Menasalvas
2011 Lecture Notes in Computer Science  
This paper motivates and describes a novel Context-aware Collaborative data stream mining system CC-Stream that allows intelligent mining and classification of time-changing data streams on-board ubiquitous  ...  CC-Stream explores the knowledge available in other ubiquitous devices to improve local classification accuracy.  ...  Bártolo Gomes is supported by a PhD grant of the Portuguese Foundation for Science and Technology (FCT) and a mobility grant from Consejo Social of UPM that made possible his stay at the University of  ... 
doi:10.1007/978-3-642-24800-9_5 fatcat:2syrymi7nzhjddnj6trmnbmjmy

Identification of Blood Cell Subtypes from Images Using an Improved SSL Algorithm

Ioannis E Livieris
2018 Biomedical Journal of Scientific & Technical Research  
Semi-supervised learning algorithms constitute the appropriate machine learning methodology to exploit the knowledge hidden in the unlabeled data with the explicit classification information of labeled  ...  data for building powerful and effective classifiers.  ...  Naive Bayes (NB) [32] classifier constitutes one of the most popular classification techniques for data mining and machine learning.  ... 
doi:10.26717/bjstr.2018.09.001755 fatcat:gzlxbs7n2rbkpiib3mbybz6hta

A New Collaborative Filtering Recommendation Method Based on Transductive SVM and Active Learning

Xibin Wang, Zhenyu Dai, Hui Li, Jianfeng Yang, Qingyi Zhu
2020 Discrete Dynamics in Nature and Society  
Due to the benefits of the above design, the quality of unlabeled sample annotation can be improved; meanwhile, both the data sparsity and cold start problems are alleviated.  ...  Firstly, a "maximum-minimum segmentation" of version space-based AL strategy is developed to choose the most informative unlabeled samples for human annotation; it aims to choose the least data which is  ...  TSVM is an effective method to solve the lack of labels problem; it can make better use of unlabeled data to improve the prediction accuracy of the classifier.  ... 
doi:10.1155/2020/6480273 fatcat:vytpdbwaljas5lfs2lohhxejrm

Continual Semi-Supervised Learning through Contrastive Interpolation Consistency [article]

Matteo Boschini, Pietro Buzzega, Lorenzo Bonicelli, Angelo Porrello, Simone Calderara
2021 arXiv   pre-print
Subsequently, we design a novel CSSL method that exploits metric learning and consistency regularization to leverage unlabeled examples while learning.  ...  However, this clashes with many real-world applications: gathering labeled data, which is in itself tedious and expensive, becomes infeasible when data flow as a stream.  ...  Continual Learning with Unsupervised Data. Some attempts have been recently made at improving CL methods by exploiting unlabeled data.  ... 
arXiv:2108.06552v2 fatcat:joafoxbrpzh6dmobngbwumh7xm

A Semi-Supervised Method for Predicting Cancer Survival Using Incomplete Clinical Data [article]

Hamid Reza Hassanzadeh, John H. Phan, May D. Wang
2015 arXiv   pre-print
Our method is able to use unlabeled data to improve classification by adopting a semi-supervised training approach to learn an ensemble classifier.  ...  Prediction of survival for cancer patients is an open area of research. However, many of these studies focus on datasets with a large number of patients.  ...  James Cheng for assisting in manuscript preparation.  ... 
arXiv:1509.08888v1 fatcat:cegtutg4tfeexk6yml5yhv2wee

Urban Green Plastic Cover Mapping Based on VHR Remote Sensing Images and a Deep Semi-Supervised Learning Framework

Jiantao Liu, Quanlong Feng, Ying Wang, Bayartungalag Batsaikhan, Jianhua Gong, Yi Li, Chunting Liu, Yin Ma
2020 ISPRS International Journal of Geo-Information  
Afterwards, a semi-supervised learning strategy was proposed to integrate the limited labeled data and massive unlabeled data for model co-training.  ...  Experimental results indicate that the proposed method could accurately identify green plastic-covered regions in Jinan with an overall accuracy (OA) of 91.63%.  ...  Additionally, the authors would like to give special thanks to the anonymous reviewers and editors for their very useful comments and suggestions to help improve the quality of this paper.  ... 
doi:10.3390/ijgi9090527 fatcat:tucmpysn2rcxpgpdga3cwpv4yu

Inferring Gene Regulatory Networks: Challenges and Opportunities

Jason T L Wang
2015 Journal of Data Mining in Genomics & Proteomics  
• employing advanced data mining and machine learning algorithms to improve the accuracy of supervised or semi-supervised methods [11]; • using new techniques to tackle constrained or condition-specific  ...  The accuracy of these algorithms is usually low. However, these algorithms are useful for organisms where training data are not available.  ... 
doi:10.4172/2153-0602.1000e118 fatcat:b74bhkkbr5df7myvrxm7nwoboa

Analysis of Semi Supervised Learning Methods towards Multi Label Text Classification

S. C. Dharmadhikari, Maya Ingle, Parag Kulkarni
2012 International Journal of Computer Applications  
The goal of Semi supervised learning is to reduce the classification errors using readily available unlabeled data in conjunction with available labeled data.  ...  The area of multi label text classification is getting more attention from researchers because of its role in the field of information retrieval, text mining, web mining, etc.  ...  (By solving a Sylvester equation) Merits: It offers effective utilization of a large amount of unlabeled data and is also able to exploit relationships between labels. Significant improvement in the accuracy.  ... 
doi:10.5120/5775-8026 fatcat:oqkzmctw2zao7hzrgjkt5ut2j4

Active Learning Based on Diversity Maximization

Yong Cheng Wu
2013 Applied Mechanics and Materials  
Therefore, as one type of the paradigms for addressing the problem of combining labeled and unlabeled data to boost the performance, active learning has attracted much attention.  ...  In many practical data mining applications, unlabeled training examples are readily available but labeled ones are fairly expensive to obtain.  ...  of (A, B) or the ensemble consists of (C, D); however, if we find that A and B make the same predictions on unlabeled data, while C and D make different predictions on some unlabeled data, then we will  ... 
doi:10.4028/ fatcat:dkgwryf5nrd4hexp24jo6hno5m

Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection [article]

Tiancai Wang, Tong Yang, Jiale Cao, Xiangyu Zhang
2021 arXiv   pre-print
In our Co-mining, two branches of a Siamese network predict the pseudo-label sets for each other.  ...  Most existing methods for sparsely annotated object detection either re-weight the loss of hard negative samples or convert the unlabeled instances into ignored regions to reduce the interference of false  ...  Acknowledgements This work was supported by The National Key Research and Development Program of China (No. 2020AAA0105200) and Beijing Academy of Artificial Intelligence (BAAI).  ... 
arXiv:2012.01950v2 fatcat:evjwqbrahzf5nkm3wbfahzg44y

Semi-Supervised Object Detection with Adaptive Class-Rebalancing Self-Training [article]

Fangyuan Zhang, Tianxiang Pan, Bin Wang
2021 arXiv   pre-print
This study delves into semi-supervised object detection (SSOD) to improve detector performance with additional unlabeled data.  ...  When using only 1% labeled data in MS-COCO, our method achieves 17.02 mAP improvement over supervised baselines, and 5.32 mAP improvement compared with state-of-the-art methods.  ...  Although the recall improvement of the two-stage mining is higher than the accuracy improvement of the two-stage filtering, the AP 50:95 improvement of the latter is 2.73, which is higher compared to the  ... 
arXiv:2107.05031v1 fatcat:fxoe4ze4dzfjfacawqo42qwx6m

Semi-Supervised Target-Dependent Sentiment Classification for Micro-Blogs

Shadi I. Abudalfa, Moataz A. Ahmed
2019 Journal of Computer Science and Technology  
Such techniques need a huge amount of labeled data for increasing classification accuracy. However, preparing labeled data from social media needs a lot of effort.  ...  Many such tools are currently available online for opinion mining in short text, known as micro-blogs, but their efficacy is still limited.  ...  Acknowledgements The authors wish to acknowledge King Fahd University of Petroleum and Minerals (KFUPM) for providing the facilities to carry out this research.  ... 
doi:10.24215/16666038.19.e06 fatcat:sx4wk3rhvnas5arzbmnc24apcy

Exploiting Unlabeled Data to Enhance Ensemble Diversity

Min-Ling Zhang, Zhi-Hua Zhou
2010 2010 IEEE International Conference on Data Mining  
Unlike existing semi-supervised ensemble methods where error-prone pseudo-labels are estimated for unlabeled data to enlarge the labeled data to improve accuracy, UDEED works by maximizing accuracies of  ...  In this paper, unlabeled data is exploited to facilitate ensemble learning by helping augment the diversity among the base learners.  ...  ACKNOWLEDGMENT The authors wish to thank the anonymous reviewers for their helpful comments in improving this paper.  ... 
doi:10.1109/icdm.2010.12 dblp:conf/icdm/ZhangZ10 fatcat:5247wqppbnhb7jjvnogihme3xu