Acceleration of a Feature Selection Algorithm Using High Performance Computing

Bieito Beceiro, Jorge González-Domínguez, Juan Touriño
2020 Proceedings (MDPI)  
Feature selection is a subfield of data analysis that focuses on reducing the dimensionality of datasets, so that subsequent analyses over them can be performed in affordable execution times while keeping the same results. Joint Mutual Information (JMI) is a widely used feature selection method that removes irrelevant and redundant features. Nevertheless, it has high computational complexity. In this work, we present a multithreaded MPI parallel implementation of JMI to accelerate its execution on distributed memory systems, reaching speedups of up to 198.60 when running on 256 cores, and allowing for the analysis of very large datasets that do not fit in the main memory of a single node.
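For readers unfamiliar with JMI, the following is a minimal serial sketch of the greedy criterion in its commonly cited form: the first feature is the one with the highest mutual information with the class, and each later feature X_i is chosen to maximise the sum of I((X_i, X_j); Y) over the already selected features X_j. It is not the authors' multithreaded MPI implementation; the function names `mutual_information` and `jmi_select` are hypothetical, and the sketch assumes discrete feature values held in memory as Python lists.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # p(x,y) * log2(p(x,y) / (p(x) p(y))), written with raw counts.
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def jmi_select(features, labels, k):
    """Greedy JMI selection (illustrative): return indices of k features.

    features: list of columns, each a list of discrete values.
    labels:   list of discrete class labels, same length as each column.
    """
    remaining = list(range(len(features)))
    # First pick: the feature with the highest I(X_i; Y).
    first = max(remaining, key=lambda i: mutual_information(features[i], labels))
    selected = [first]
    remaining.remove(first)

    while remaining and len(selected) < k:
        def score(i):
            # JMI score: sum over selected j of I((X_i, X_j); Y),
            # treating the pair (X_i, X_j) as one joint variable.
            return sum(
                mutual_information(list(zip(features[i], features[j])), labels)
                for j in selected
            )

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each greedy step evaluates every remaining feature against every selected one, which is the cost that makes the method expensive on large datasets and a natural target for parallelisation.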
doi:10.3390/proceedings2020054054 fatcat:dod5tsfeljhdjhwzowj5ly3goy