An interval-valued data classification method based on the unified representation frame

Xiaobo Qi, Husheng Guo, Zadorozhnyi Artem, Wenjian Wang
2020 IEEE Access  
Interval-valued data (IVD) is a kind of data in which each feature is an interval. The midpoint and boundary methods are the two most commonly used ways of representing IVD. However, because only the midpoint or the endpoints are used, the structural information of the intervals (such as location and size) may be captured incompletely, which can lead to poor data-processing results. To better depict the structural information of IVD, a unified representation frame (URF) for IVD is proposed. It takes into account not only the size and location information but also the relationship between them. This frame can also represent the midpoint and boundary methods in a unified way. In addition, symmetrical uncertainty (SU) is adopted to quantitatively measure the relationship between features and classes, and irrelevant features are eliminated based on SU. The proposed URF_SU is applied with traditional classifiers such as LIBSVM, CART, and KNN. The experimental results on synthetic and real-world datasets demonstrate that the proposed approach is more effective than other representation methods of IVD in classification tasks.
Index terms: Interval-valued data, unified representation frame, symmetrical uncertainty, feature selection.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/ Volume 8, 2020.
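
The midpoint and boundary representations and the symmetrical uncertainty measure named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the form of the URF, so the URF is not reproduced here. The function names are hypothetical. The SU formula used is the standard definition, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), and it assumes features that are already discrete; how the paper discretizes features is not stated in the abstract.

```python
import math
from collections import Counter


def midpoint_repr(sample):
    """Represent each interval (lo, hi) by its midpoint (location only)."""
    return [(lo + hi) / 2 for lo, hi in sample]


def boundary_repr(sample):
    """Represent each interval (lo, hi) by its two endpoints, flattened."""
    return [v for lo, hi in sample for v in (lo, hi)]


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """Standard SU between two discrete sequences; the result lies in [0, 1].

    Assumption: x and y are already discrete (e.g. binned features and
    class labels). Returns 0.0 when both sequences are constant.
    """
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    joint = entropy(list(zip(x, y)))
    mutual_info = hx + hy - joint  # I(X; Y) = H(X) + H(Y) - H(X, Y)
    return 2 * mutual_info / (hx + hy)


# One sample with two interval-valued features (illustrative values only).
sample = [(1, 3), (2, 6)]
mid = midpoint_repr(sample)
bnd = boundary_repr(sample)
```

Under this definition, a feature whose SU with the class labels is close to 0 carries almost no information about the class and is a candidate for elimination. The threshold for doing so is a design choice that the abstract does not specify.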
doi:10.1109/access.2020.2967780 fatcat:mthmlognbnbnncmwi6keb4r3we