The Sensitivity Conundrum – Random Forest or Boosting

Dhimant Ganatra
2020 International Journal of Emerging Trends in Engineering Research  
In the classification context, tree-based models are simple and useful for interpretation. However, when it comes to model accuracy, the single-tree model does not match the power of other supervised learning approaches. By aggregating trees, a model's accuracy can be improved. Ensemble methods like random forest and boosting combine predictions from multiple models into one that is far superior to the individual models. Depending on the business goal, the accuracy paradox may come into play. The ... class statistics, precision and recall, may be more important than the overall accuracy. The True Positive Rate varies with the type of ensemble used, among other factors. While both random forest and boosting lead to some loss of interpretability, the improvement in sensitivity will outweigh this loss. By tuning several of the model's parameters, it is possible to achieve higher levels on any of the four counts in the confusion matrix. While both random forest and boosting use trees as base learners, they differ primarily in the way the trees are built. A comparison of the two approaches is made to identify the superior performer on the positive class.
doi:10.30534/ijeter/2020/56872020 fatcat:frmxpxtksbcgvbs6qqdf4ngieq
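
The accuracy paradox mentioned in the abstract can be illustrated with a minimal sketch that derives accuracy, precision, and sensitivity (recall, or True Positive Rate) from the four confusion-matrix counts. The function name `class_metrics` and the counts for the two models are hypothetical, chosen only for illustration; they are not results from the paper.

```python
def class_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, sensitivity) from confusion-matrix counts.

    Sensitivity is the same quantity as recall or the True Positive Rate.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a model predicts no positives
    # or when the data contain no positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, sensitivity


# Hypothetical counts on 1000 cases, 50 of which are truly positive.
# These numbers are assumptions for illustration, not data from the paper.
model_a = class_metrics(tp=5, fp=5, fn=45, tn=945)
model_b = class_metrics(tp=40, fp=90, fn=10, tn=860)

for name, (acc, prec, sens) in (("A", model_a), ("B", model_b)):
    print(f"model {name}: accuracy={acc:.2f} precision={prec:.2f} sensitivity={sens:.2f}")
```

In this made-up case, model A has the higher overall accuracy but finds only a small fraction of the positive cases, while model B gives up some accuracy and precision for a much higher sensitivity. Which trade-off is preferable depends on the business goal.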