Ensemble Learning with Attention-Integrated Convolutional Recurrent Neural Network for Imbalanced Speech Emotion Recognition

Xusheng Ai, Victor S. Sheng, Wei Fang, Charles X. Ling, Chunhua Li
2020 IEEE Access  
The average ranks of Bagging and OS (oversampling) are better than the rank of the base method.  ...  The average ranks of Oversampling and Bagging even fall behind the average rank of the benchmark method in terms of UAF.  ... 
doi:10.1109/access.2020.3035910 fatcat:rgf2mnn3vnhifhez6e72a5fihi

Improving Detection of False Data Injection Attacks Using Machine Learning with Feature Selection and Oversampling

Ajit Kumar, Neetesh Saxena, Souhwan Jung, Bong Jun Choi
2021 Energies  
In particular, the injection of false data and commands into communication is one of the most common and fatal cyberattacks in critical infrastructures.  ...  Our results show that the considered minority oversampling techniques can improve the Area Under Curve (AUC) of GradientBoosting, AdaBoost, and kNN by 10–12%.  ...  An IDS using a semi-supervised system for attack localization and deep neural network learning for anomaly detection was proposed in [23].  ... 
doi:10.3390/en15010212 fatcat:7zvo767f7nexvi6ytqyv5uokcy

Predicting Fault-Prone Software Modules with Rank Sum Classification

Jaspar Cahill, James M. Hogan, Richard Thomas
2013 2013 22nd Australian Software Engineering Conference  
This rank sum representation offers improved or at worst comparable performance to earlier approaches for standard data sets, and readily allows the user to choose an appropriate trade-off between precision  ...  The problem is then to identify such modules reliably and automatically, thus making a more intensive local effort worthwhile.  ...  Some of this work appears promising (the weak classifiers inherent in this problem are an ideal candidate for boosting) and the combination of data sets is a very attractive notion.  ... 
doi:10.1109/aswec.2013.33 dblp:conf/aswec/CahillHT13 fatcat:jekxqazjhzbwti2agujfw3hoqq

A Novel Approach for Handling Imbalanced Data in Medical Diagnosis using Undersampling Technique

Varsha Babar, Roshani Ade
2016 Communications on Applied Electronics  
To deal with this type of imbalance, several undersampling as well as oversampling methods were proposed.  ...  Experiments are performed on 5 real-world data sets to evaluate the performance of the proposed work.  ...  Another technique, namely Ranked Minority Oversampling in Boosting (RAMOBoost), has been proposed in [15], which adaptively assigns a rank to each minority instance at every iteration and generates synthetic  ... 
doi:10.5120/cae2016652323 fatcat:mamfwtweqrb3bkzmncnnhexf2q

LinkBoost: A Novel Cost-Sensitive Boosting Framework for Community-Level Network Link Prediction

Prakash Mandayam Comar, Pang-Ning Tan, Anil K. Jain
2011 2011 IEEE 11th International Conference on Data Mining  
Typical link prediction methods can be categorized as either local or global.  ...  This paper presents a community (cluster) level link prediction method without the need to explicitly identify the communities in a network.  ...  This divide-and-conquer strategy is well suited both for the link prediction problem and the boosting framework since link formation is typically a local phenomenon, in the sense that there are several  ... 
doi:10.1109/icdm.2011.93 dblp:conf/icdm/ComarTJ11 fatcat:rsyxeck22fawrala3vwnmm6hx4

EPRENNID: An evolutionary prototype reduction based ensemble for nearest neighbor classification of imbalanced data

Sarah Vluymans, Isaac Triguero, Chris Cornelis, Yvan Saeys
2016 Neurocomputing  
Classification problems with an imbalanced class distribution have received an increased amount of attention within the machine learning community over the last decade.  ...  It uses minority elements located near the decision boundaries as seeds for the construction of artificial instances.  ...  When updating the scale factors F_i, two local searches are used: the golden section search and hill-climbing. We refer to [28] for further details.  ... 
doi:10.1016/j.neucom.2016.08.026 fatcat:zu74vynjsbgujimhw3dlnl6jfa


Nitesh V. Chawla, Nathalie Japkowicz, Aleksander Kotcz
2004 SIGKDD Explorations  
ACKNOWLEDGEMENTS We thank the reviewers for their useful and timely comments on the papers submitted to this Issue.  ...  We would also like to thank the participants and attendees of the previous workshops for the enlightening presentations and discussions.  ...  The important differentiator from other semisupervised problems (e.g., [40] ) is that there are no labeled seed data to initialize the estimation model for the missing class.  ... 
doi:10.1145/1007730.1007733 fatcat:tdpfkg6vgbgqrpjtclkhrz5rne

RB-CCR: Radial-Based Combined Cleaning and Resampling algorithm for imbalanced data classification [article]

Michał Koziarski, Colin Bellinger, Michał Woźniak
2021 arXiv   pre-print
In particular, RB-CCR exploits the class potential to accurately locate sub-regions of the data-space for synthetic oversampling.  ...  The category sub-region for oversampling can be specified as an input parameter to meet domain-specific needs or be automatically selected via cross-validation.  ...  It is worth also mentioning SMOTEBoost, which is based on a combination of the SMOTE algorithm and the boosting procedure [31] .  ... 
arXiv:2105.04009v1 fatcat:5to34dfbmfdz3bu3yqzq5rjxwa

(ISSBM) Improved Synthetic Sampling based on Model for Imbalance Data

Ragini Gour, Ramratan Ahirwal
2021 International Journal of Computer Applications  
Forecast performance usually degrades when classifiers learn from imbalanced data, as most classifiers presume the class division is balanced or the costs for different types of classification  ...  b) Second different random seeds for performing sampling method.  ...  methods for each splitting data with another 8 different seeds.  ... 
doi:10.5120/ijca2021921342 fatcat:bcqju5utgzaolhw4pjyh3wvk5a

Imbalanced Big Data Oversampling: Taxonomy, Algorithms, Software, Guidelines and Future Directions [article]

William C. Sleeman IV, Bartosz Krawczyk
2021 arXiv   pre-print
While oversampling algorithms are an effective way of handling class imbalance, they have not been designed for distributed environments.  ...  In this paper, we propose a holistic look at oversampling algorithms for imbalanced big data.  ...  We have also created guidelines and future directions for designing novel oversampling algorithms for imbalanced big data that can be adopted by the research community.  ... 
arXiv:2107.11508v1 fatcat:7t4h7k5qujhk7idbjqx5fgjwyu

Do David and Goliath Play the Same Game? Explanation of the Abundance of Rare and Frequent Invasive Alien Plants in Urban Woodlands in Warsaw, Poland

Artur Obidziński, Piotr Mędrzycki, Ewa Kołaczkowska, Wojciech Ciurzycki, Katarzyna Marciszewska, RunGuo Zang
2016 PLoS ONE  
Both frequent and rare taxa share a similar hierarchy of predictors' importance: Land use > Tree stand > Seed source and, for frequent taxa, Forest properties as well.  ...  The aim of this paper is an estimation of the influence of invasive plants frequency on the explanation of their local abundance.  ...  ., for providing us with forest management plans and data about the history of Warsaw's municipal urban woodlands. We also thank our students for their help in field data collection.  ... 
doi:10.1371/journal.pone.0168365 pmid:27992516 pmcid:PMC5161360 fatcat:xqcow4ozfzadhim5cnjgtrqdfe

QSAR Models for Active Substances against Pseudomonas aeruginosa Using Disk-Diffusion Test Data

Cosmin Alexandru Bugeac, Robert Ancuceanu, Mihaela Dinu
2021 Molecules  
used for this purpose in the literature and we decided to explore their use in this sense.  ...  In total, 32 models were built for each set of descriptors or fingerprint and balancing method, of which 28 were selected and stacked to create meta-models.  ...  The importance attributed to these two descriptors was relatively low, though (the highest rank for BCUTi-1h was 5, whereas for VSA_EState1 the highest rank was 20).  ... 
doi:10.3390/molecules26061734 pmid:33808845 pmcid:PMC8003670 fatcat:xgblk7sfpbbh5bklp3bo4go3yq

Classification Techniques for Intrusion Detection An Overview

P. Amudha, S. Karthik, S. Sivakumari
2013 International Journal of Computer Applications  
Sheng Chen et al. [38] presented Ranked Minority Oversampling in Boosting (RAMOBoost), which is an integration of ensemble learning methodology with the RAMO technique.  ...  Boosting has attracted much attention in the machine learning community as well as in statistics mainly because of its excellent performance and computational attractiveness for large datasets.  ... 
doi:10.5120/13334-0928 fatcat:pcjh5osxvbdbzfdkmiq5f7svga

A geometric approach to characterize the functional identity of single cells

Shahin Mohammadi, Vikram Ravindra, David F. Gleich, Ananth Grama
2018 Nature Communications  
Acknowledgements This work is supported by the NSF Center for Science of Information STC (CCF-0939370), NSF Grants BIO 1124962, NSF IIS-1546488, NSF CCF-1149756, IIS-1422918, the DARPA SIMPLEX program,  ...  In this graph, oversampling can be identified by the emergence of dense local regions.  ...  For the Melanoma dataset, however, there is no consensus among the top-ranked methods.  ... 
doi:10.1038/s41467-018-03933-2 pmid:29666373 pmcid:PMC5904143 fatcat:q7a4ygolo5cvzitnnhx3fjsbym

Intent Classification of Short-Text on Social Media

Hemant Purohit, Guozhu Dong, Valerie Shalin, Krishnaprasad Thirunarayan, Amit Sheth
2015 2015 IEEE International Conference on Smart City/SocialCom/SustainCom (SmartCity)  
Social media platforms facilitate the emergence of citizen communities that discuss real-world events.  ...  Hence, mining intent from social data can aid in filtering social media to support organizations, such as an emergency management unit for resource planning.  ...  We also acknowledge our colleagues at Kno.e.sis Center, and collaborators at OSU and QCRI for the invaluable discussion and feedback to improve our results.  ... 
doi:10.1109/smartcity.2015.75 dblp:conf/smartcity/PurohitDSTS15 fatcat:atcyriyfenbftf3eeyjb775dzu