Evolutionary Active Constrained Clustering for Obstructive Sleep Apnea Analysis
Data Science and Engineering
We introduce a novel interactive framework that handles both instance-level and temporal smoothness constraints for clustering large longitudinal data and for tracking how clusters evolve over time. It consists of a constrained clustering algorithm, called CVQE+, which jointly optimizes the clustering quality, the constraint violation cost, and the historical cost between consecutive data snapshots. At the center of our framework is a simple yet effective active learning technique, named Border, which iteratively selects the most informative pairs of objects to query users about and updates the clustering with the new constraints. These constraints are then propagated within each data snapshot and between snapshots via two schemes, called constraint inheritance and constraint propagation, to further enhance the results. Moreover, a historical constraint is enforced between consecutive snapshots to keep their results consistent. Experiments show clustering results better than or comparable to those of state-of-the-art techniques, as well as high scalability on large datasets. Finally, we apply our algorithm to cluster phenotypes of patients with Obstructive Sleep Apnea and to track how these clusters evolve over time.
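To make the three-part objective concrete, the following is a minimal sketch of a cost that sums clustering distortion, a penalty for violated pairwise constraints, and a historical cost between consecutive snapshots. It is an illustration only, not the CVQE+ formulation itself: the function name `constrained_cost`, the `history_weight` parameter, the CVQE-style centroid-distance penalties, and the assumption that centers keep the same index across snapshots are all hypothetical choices made for this example.

```python
from math import dist


def constrained_cost(points, labels, centers, prev_centers,
                     must_link, cannot_link, history_weight=1.0):
    """Toy objective: distortion + constraint violation + historical cost.

    Hypothetical sketch; the penalty forms are assumptions, not CVQE+ itself.
    """
    # Clustering quality: squared distance of each point to its assigned center.
    distortion = sum(dist(p, centers[l]) ** 2 for p, l in zip(points, labels))

    violation = 0.0
    # A must-link pair is violated when its two points land in different clusters;
    # here the penalty is the squared distance between the two centers.
    for i, j in must_link:
        if labels[i] != labels[j]:
            violation += dist(centers[labels[i]], centers[labels[j]]) ** 2

    # A cannot-link pair is violated when its two points share a cluster;
    # here the penalty is the squared distance to the nearest other center.
    for i, j in cannot_link:
        if labels[i] == labels[j]:
            c = labels[i]
            others = [dist(centers[c], centers[k]) ** 2
                      for k in range(len(centers)) if k != c]
            violation += min(others) if others else 0.0

    # Historical cost: squared drift of each center from the previous snapshot
    # (assumes center k in this snapshot corresponds to center k in the last one).
    history = sum(dist(c, p) ** 2 for c, p in zip(centers, prev_centers))

    return distortion + violation + history_weight * history


# Two well-separated clusters in the plane.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels = [0, 0, 1, 1]
centers = [(0, 1), (10, 1)]

base = constrained_cost(points, labels, centers, centers, [], [])
with_violation = constrained_cost(points, labels, centers, centers, [(0, 2)], [])
```

In this sketch, a larger `history_weight` favors clusterings whose centers stay close to those of the previous snapshot, which is one simple way to express temporal smoothness.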