A Novel Over-Sampling Method and its Application to Cancer Classification from Gene Expression Data

Xuan Tho Dang, Osamu Hirose, Duong Hung Bui, Thammakorn Saethang, Vu Anh Tran, Lan Anh T. Nguyen, Tu Kien T. Le, Mamoru Kubo, Yoichi Yamada, Kenji Satou
2013 Chem-Bio Informatics Journal  
One of the most critical and frequent problems in biomedical data classification is imbalanced class distribution, where samples from the majority class significantly outnumber the minority class.  ...  SMOTE is a well-known general over-sampling method used to address this problem; however, in some cases it cannot improve or even reduces classification performance.  ...  This means that if we use a kernel method like SVM, the safety is not guaranteed in the kernel space.  ... 
doi:10.1273/cbij.13.19 fatcat:c3prq5oeundvjk3rgexeatfas4

Imbalanced Data Classification for Multi-source Heterogenous Sensor Networks

Wei Wang, Mengjun Zhang, Li Zhang, Qiong Bai
2020 IEEE Access  
Most of the traditional classification algorithms are based on the uniform distribution of samples, and the effect is not ideal when dealing with such data, which mainly shows that the classification results  ...  Therefore, we propose the imbalanced multi-source heterogeneous data classification algorithms in this paper, which are mainly based on the expansion and extension of Support Vector Machines.  ...  So, the sample size in some datasets is much smaller than in other categories, that is, imbalanced dataset [1].  ... 
doi:10.1109/access.2020.2966324 fatcat:h3eta76sh5hdvphgtz76lfmwiy

An Improved Oversampling Algorithm Based on the Samples' Selection Strategy for Classifying Imbalanced Data

Wenhao Xie, Gongqian Liang, Zhonghui Dong, Baoyu Tan, Baosheng Zhang
2019 Mathematical Problems in Engineering  
In this paper, an improved oversampling algorithm based on the samples' selection strategy for the imbalanced data classification is proposed.  ...  The imbalanced data sets exist widely in the real world, and the classification for them has become one of the hottest issues in the field of data mining.  ...  effect for the imbalanced data.  ... 
doi:10.1155/2019/3526539 fatcat:y3ylyjg37va7la5efvbbam5bra

Customer Credit Scoring Method Based on the SVDD Classification Model with Imbalanced Dataset [chapter]

Bo Tian, Lin Nan, Qin Zheng, Lei Yang
2010 Communications in Computer and Information Science  
Our experimental results confirm that our approach is effective in ranking and classifying customer credit.  ...  Customer credit scoring is a typical class of pattern classification problem with imbalanced dataset.  ...  Acknowledgment The work was partly supported by the National Natural Science Foundation of China (70971083), Leading Academic Discipline Program, 211 Project for Shanghai University of Finance and Economics  ... 
doi:10.1007/978-3-642-16397-5_4 fatcat:gdh5lua3dfd47mujpho3tfq2li

A Comparison of Oversampling Methods on Imbalanced Topic Classification of Korean News Articles

Yirey Suh, Cheongtag Kim, Leegu Song, Jaemyung Yu, Jonghoon Mo
2017 Journal of Cognitive Science  
Machine learning has progressed to match human performance, including the field of text classification. However, when training data are imbalanced, classifiers do not perform well.  ...  Oversampling is one way to overcome the problem of imbalanced data and there are many oversampling methods that can be conveniently implemented.  ...  Minority class samples are randomly selected and replicated in feature space until the number of minority class samples match that of majority class samples.  ... 
doi:10.17791/jcs.2017.18.4.391 fatcat:5mqa6rz6jffn7mlsbwf4o3sg4u
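
The random oversampling described in the snippet above (minority samples are randomly selected and replicated until the class counts match) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation; the function name `random_oversample`, the `seed` parameter, and the toy data are assumptions made for the example.

```python
import random

def random_oversample(samples, labels, seed=0):
    """Replicate randomly chosen samples of each smaller class
    until every class has as many samples as the largest class.
    (Hypothetical helper, written only for illustration.)"""
    rng = random.Random(seed)
    # Count how many samples each class has.
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        # Pool of original samples belonging to this class.
        pool = [x for x, y in zip(samples, labels) if y == cls]
        # Draw with replacement until the class reaches the target size.
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y

# Toy data (made up): four majority samples, one minority sample.
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
```

Because the new samples are exact copies, this approach adds no new information about the minority class, which is one reason synthetic methods such as SMOTE are often considered instead.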

CGMOS: Certainty Guided Minority OverSampling [article]

Xi Zhang and Di Ma and Lin Gan and Shanshan Jiang and Gady Agam
2016 arXiv   pre-print
Handling imbalanced datasets is a challenging problem that if not treated correctly results in reduced classification performance.  ...  In this paper we propose a novel extension to the SMOTE algorithm with a theoretical guarantee for improved classification performance.  ...  Instead, CGMOS computes the Bayes classification certainties for both the majority and minority classes and then synthesizes new samples based on improvement of the certainties for samples in both classes  ... 
arXiv:1607.06525v1 fatcat:evvsuhbp6rhbhicmcmnc7vdgfy

A Hybrid Sampling SVM Approach to Imbalanced Data Classification

Qiang Wang
2014 Abstract and Applied Analysis  
In this paper, a hybrid sampling SVM approach is proposed combining an oversampling technique and an undersampling technique for addressing the imbalanced data classification problem.  ...  Imbalanced datasets are frequently found in many real applications. Resampling is one of the effective solutions due to generating a relatively balanced class distribution.  ...  Furthermore, the dot product Φ(x_i) ⋅ Φ(x_j) in the transformed space can be expressed as the kernel function K(x_i, x_j) = Φ(x_i) ⋅ Φ(x_j).  ... 
doi:10.1155/2014/972786 fatcat:maw524vwgrgi7lluo4i7sweg7i
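
The kernel identity quoted in the snippet above, where a kernel function equals a dot product of mapped points in the transformed space, can be checked numerically on a small case. The sketch below assumes a homogeneous degree-2 polynomial kernel on 2-D inputs, chosen only because its feature map Φ can be written out explicitly; the snippet does not say which kernel the paper uses, and the names `phi`, `dot`, and `poly2_kernel` are hypothetical.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel on 2-D inputs
    (an assumed example kernel, not taken from the paper)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly2_kernel(x, z):
    """K(x, z) = (x . z)^2, computed without ever forming phi(x)."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
lhs = poly2_kernel(x, z)      # kernel evaluated in the input space
rhs = dot(phi(x), phi(z))     # dot product in the transformed space
print(math.isclose(lhs, rhs))
```

The two values agree up to floating-point rounding, which is why a kernel method such as SVM can work in the transformed space without computing Φ directly.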

Towards Fair Cross-Domain Adaptation via Generative Learning [article]

Tongxin Wang, Zhengming Ding, Wei Shao, Haixu Tang, Kun Huang
2020 arXiv   pre-print
Specifically, generative feature augmentation is explored to synthesize effective training data for few-shot source classes, while effective cross-domain alignment aims to adapt knowledge from source to  ...  However, in real-world applications, labeled samples for some categories in the source domain could be extremely few due to the difficulty of data collection and annotation, which leads to decreasing performance  ...  original feature space to a universal Reproducing Kernel Hilbert Space (RKHS) H.  ... 
arXiv:2003.02366v2 fatcat:73q2wegggjhixo462wya2czore

Protein Subnuclear Localization Based on Radius-SMOTE and Kernel Linear Discriminant Analysis Combined with Random Forest

Liwen Wu, Shanshan Huang, Feng Wu, Qian Jiang, Shaowen Yao, Xin Jin
2020 Electronics  
Second, the Radius-SMOTE is used to expand the samples of minority classes to deal with the problem of imbalance in datasets.  ...  The results indicate that the proposed method can achieve better effect compared with other conventional methods, and it can also improve the accuracy for both majority and minority classes effectively  ...  (a) Minority samples synthesized by SMOTE. (b) Overlap problem of minority samples synthesized by SMOTE. Figure 3. (a) Minority samples synthesized by Radius-SMOTE.  ... 
doi:10.3390/electronics9101566 fatcat:gijx2egnajgb7czcal6hsjs5my

A Sampling Method Based on Gauss Kernel Learning and the Expanding Research

Shunzhi Zhu, Kaibiao Lin, Zhiqiang Zeng, Lizhao Liu, Wenxing Hong
2012 Journal of Computers  
The method first preprocesses the data by oversampling the minority class in kernel space, and then the pre-images of the synthetic samples are found based on a distance relation between kernel space and  ...  As a result, the inconsistency which is brought about by processing samples in different spaces is overcome.  ...  Funding from key lab of Spatial Data Mining and Information Sharing, Ministry of Education, Fuzhou University ;Open Fund (BLISSOS2010102) Funding from key lab of of Brain Imitation Intelligent Systems in  ... 
doi:10.4304/jcp.7.2.547-554 fatcat:fb47fbcq6bfxvj74njywzawpiy

Towards Fair Knowledge Transfer for Imbalanced Domain Adaptation [article]

Taotao Jing, Bingrong Xu, Jingjing Li, Zhengming Ding
2020 arXiv   pre-print
To this end, we propose a Towards Fair Knowledge Transfer (TFKT) framework to handle the fairness challenge in imbalanced cross-domain learning.  ...  Unfortunately, they ignore the fairness issue when the auxiliary source is extremely imbalanced across different categories, which results in severe under-presented knowledge adaptation of minority source  ...  synthesized samples, which also proves the effectiveness of the CPA strategy intuitively.  ... 
arXiv:2010.12184v2 fatcat:yenza66qsjfbfggavz7hamjxia

Manifold-based synthetic oversampling with manifold conformance estimation

Colin Bellinger, Christopher Drummond, Nathalie Japkowicz
2017 Machine Learning  
Classification domains such as those in medicine, national security and the environment regularly suffer from a lack of training instances for the class of interest.  ...  generate additional training samples.  ...  In order to limit the impact of sampling during the preprocessing to create imbalanced binary classification problems, we record mean AUC performance results over thirty trials for the baseline classifiers  ... 
doi:10.1007/s10994-017-5670-4 fatcat:562hbc454vhuho2qex6jc6ubyy

A novel over-sampling method and its application to miRNA prediction

Xuan Tho Dang, Osamu Hirose, Thammakorn Saethang, Vu Anh Tran, Lan Anh T. Nguyen, Tu Kien T. Le, Mamoru Kubo, Yoichi Yamada, Kenji Satou
2013 Journal of Biomedical Science and Engineering  
For example, SMOTE is a famous and general over-sampling method addressing this problem, however in some cases it cannot improve or sometimes reduces classification performance.  ...  of imbalanced class distribution in the datasets.  ...  In our research, we used Radial Basis kernel (Gaussian kernel) of kernlab for SVM.  ... 
doi:10.4236/jbise.2013.62a029 fatcat:k7ctiy35eneuhjrltfxz5prqoi

Classification of Imbalanced Data Using Deep Learning with Adding Noise

Wan-Wei Fan, Ching-Hung Lee, Binghua Cao
2021 Journal of Sensors  
In addition, a simple design method for selecting structure of CNN is first introduced and then, we add noise in feature space of CNN to obtain proper features by a training process and to improve the  ...  This paper proposes a method to treat the classification of imbalanced data by adding noise to the feature space of convolutional neural network (CNN) without changing a data set (ratio of majority and  ...  Acknowledgments This work was supported in part by the Ministry of Science and Technology, Taiwan, under contracts MOST 110-2634-F-009-024, 109-2634-F-009-031, and 109-2218-E-005-015.  ... 
doi:10.1155/2021/1735386 fatcat:aioevowddbbmhbsnhuzhvsoebq

Emotion Recognition from Single-Trial EEG Based on Kernel Fisher's Emotion Pattern and Imbalanced Quasiconformal Kernel Support Vector Machine

Yi-Hung Liu, Chien-Te Wu, Wei-Teng Cheng, Yu-Tsung Hsiao, Po-Ming Chen, Jyh-Tong Teng
2014 Sensors  
The feature vector produced by layer 2 is called a kernel Fisher's emotion pattern (KFEP), and is sent into layer 3 for further classification where the proposed imbalanced quasiconformal kernel support  ...  Furthermore, to collect effective training and testing datasets for the current EEG-ER system, we also use an emotion-induction paradigm in which a set of pictures selected from the International Affective  ...  Jyh-Tong Teng helped provide research resources and involved in project discussion.  ... 
doi:10.3390/s140813361 pmid:25061837 pmcid:PMC4179000 fatcat:w4a2lypq5varrlf5omdh6utmce