A logic approach for reducing the computational complexity of the attribute reduction problem

Sirzat Kahramanli, Mehmet Hacibeyoglu
2011, 2011 5th International Conference on Application of Information and Communication Technologies (AICT)
The goal of attribute reduction is to find a minimal subset (MS) R of the condition attribute set C of a dataset such that R has the same classification power as C. It has been proved that the number of MSs for a dataset with n attributes may be as large as $\binom{n}{n/2}$ and that generating all of them is an NP-hard problem. The main reason for this is the intractable space complexity of converting the discernibility function (DF) of a dataset to disjunctive normal form (DNF). Our analysis of many DF-to-DNF conversion processes showed that approximately $\left(1 - 2/\binom{n}{n/2}\right) \times 100\,\%$ of the implicants generated in the DF-to-DNF process are redundant. We prevented their generation based on the Boolean inverse distribution law. Due to this property, the proposed method generates $0.5 \times \binom{n}{n/2}$ times fewer implicants than other Boolean logic-based attribute reduction methods. Hence, it can process most of the datasets that cannot be processed by other attribute reduction methods.

... or discernibility functions (DFs) have been developed [2, 4, 5]. Attribute reduction provides the following benefits for processing datasets: reducing the dimensionality of the feature space, improving the efficiency and precision of data classification rules, speeding up data mining algorithms, facilitating the data collection process, and reducing the amount of memory needed for storing the datasets [2, 5-7]. For instance, applying attribute reduction to the "Lung Cancer" dataset [8], which has 56 attributes and 32 objects, showed that this dataset could be classified with only 4 of the original 56 attributes. That is, the dataset could be reduced from 56 to 4 columns and classified by rules with only 4 conditions instead of 56. Owing to these benefits, attribute reduction is widely used for preprocessing datasets in many fields, including data mining, decision support systems, knowledge acquisition and discovery, pattern recognition, machine learning, text categorization, customer relationship management, intrusion detection, weather forecasting, economic forecasting, fault diagnosis, and forecasting [2, 6, 7].
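To make the DF-to-DNF conversion mentioned in the abstract concrete, the following is a minimal sketch of the textbook distribute-then-absorb procedure on a toy decision table. It is not the inverse-distribution method proposed by the authors, whose details are not given here. The function names (`discernibility_clauses`, `absorb`, `reducts`) and the four-attribute table are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import comb


def discernibility_clauses(rows, decisions):
    """Build the clauses of the discernibility function (in CNF).

    Each clause is the set of attribute indices on which two objects
    with different decision values differ.
    """
    clauses = set()
    for (r1, d1), (r2, d2) in combinations(zip(rows, decisions), 2):
        if d1 != d2:
            diff = frozenset(i for i, (a, b) in enumerate(zip(r1, r2)) if a != b)
            if diff:  # skip inconsistent pairs (same attributes, different decision)
                clauses.add(diff)
    return clauses


def absorb(terms):
    """Absorption law: drop every set that strictly contains another set."""
    terms = set(terms)
    return {t for t in terms if not any(o < t for o in terms)}


def reducts(clauses):
    """Convert the CNF clauses to a minimal DNF; each implicant is a reduct."""
    implicants = {frozenset()}
    for clause in sorted(absorb(clauses), key=len):
        # Distribute the clause over the current implicants, then absorb
        # the redundant (non-minimal) products immediately.
        implicants = absorb({imp | {a} for imp in implicants for a in clause})
    return implicants


# Hypothetical toy table: 4 condition attributes (indices 0..3), 4 objects.
rows = [
    (0, 0, 0, 0),
    (1, 1, 0, 0),
    (0, 0, 1, 1),
    (0, 1, 1, 0),
]
decisions = [0, 1, 1, 0]

result = reducts(discernibility_clauses(rows, decisions))
print(sorted(sorted(r) for r in result))

# Upper bound on the number of minimal subsets quoted in the abstract,
# evaluated for n = 10 attributes.
print(comb(10, 10 // 2))
```

In this sketch the intermediate implicant set is pruned after every clause, which keeps it small on the toy table; without such pruning, naive distribution of the four two-literal clauses would produce up to 16 products, most of them redundant, which is the kind of blow-up the abstract refers to.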
doi:10.1109/icaict.2011.6111006 fatcat:qas7uvfaevfh3ibjus7j6g2hme