Machine Learning Methods for Detecting Fraud in Online Marketplaces

Raoul Dekou, Sabljic Savo, Simon Kufeld, Diana Francesca, Ricardo Kawase
2021 International Conference on Information and Knowledge Management  
Connecting buyers and sellers in a safe and secure environment is one of the biggest challenges in online marketplaces. Probabilistic models built upon user-item databases address this challenge, but often suffer from a lack of stability and robustness. These issues are magnified in fraud scenarios, where datasets are highly imbalanced and noisy, and malicious users deliberately adapt their behavior to avoid detection. In this context, we leveraged the existing open-source machine learning libraries H2O and CatBoost and designed a pipeline to collect and process data and to predict the likelihood that a private seller's listing is fraudulent. We found that the stacked ensemble model provides the best performance (F1=0.73) compared with other models commonly used in the field. Further, we benchmarked our models on a public Kaggle dataset, the TalkingData AdTracking Fraud Detection Challenge, where we compared them with other studies and highlighted their generalizability and effectiveness at handling online fraud.
dblp:conf/cikm/DekouSKFK21 fatcat:kx3ms2g6unddtimtrxe5ed7pce