A Graph Coarsening Algorithm for Compressing Representations of Single-Cell Data with Clinical or Experimental Attributes [article]

Chi-Jane Chen, Emma Crawford, Natalie Stanley
2022 bioRxiv   pre-print
Graph-based algorithms have become essential in the analysis of single-cell data for numerous tasks, such as automated cell-phenotyping and identifying cellular correlates of experimental perturbations or disease states. In large multi-patient, multi-sample single-cell datasets, the analysis of cell-cell similarity graph representations of these data becomes computationally prohibitive. Here, we introduce cytocoarsening, a novel graph-coarsening algorithm that significantly reduces the size of single-cell graph representations, which can then be used as input to downstream bioinformatics algorithms for improved computational efficiency. Uniquely, cytocoarsening considers both the phenotypical similarity of cells and the similarity of cells' associated clinical or experimental attributes in order to more readily identify condition-specific cell populations. The resulting coarse graph representations were evaluated based on both their structural correctness and the capacity of downstream algorithms to uncover the same biological conclusions as if the full graph had been used. Cytocoarsening is provided as open source code at https://github.com/ChenCookie/cytocoarsening.
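To make the general idea concrete, the following is a minimal sketch of one pass of attribute-aware graph coarsening. It is not the cytocoarsening implementation (see the repository above for that); it swaps in a simple greedy pairwise matching as the merging rule. The function names `knn_graph` and `coarsen_once`, the weighting parameter `alpha`, and the synthetic data are all assumptions introduced for illustration.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric k-nearest-neighbor adjacency; weight = exp(-Euclidean distance)."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # exclude self-neighbors
    nbrs = np.argsort(d, axis=1)[:, :k]      # k closest cells per row
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    A = np.zeros((n, n))
    A[rows, cols] = np.exp(-d[rows, cols])
    return np.maximum(A, A.T)                # symmetrize

def coarsen_once(A, labels, alpha=1.0):
    """Greedily merge pairs of adjacent nodes (hypothetical merging rule).

    Edges are ranked by phenotypic similarity (the edge weight), boosted by
    a factor (1 + alpha) when both endpoints share the same attribute label.
    """
    n = A.shape[0]
    same = (labels[:, None] == labels[None, :]).astype(float)
    score = A * (1.0 + alpha * same)
    i_idx, j_idx = np.nonzero(np.triu(score, k=1))
    order = np.argsort(-score[i_idx, j_idx])  # best-scoring edges first
    assign = -np.ones(n, dtype=int)
    next_id = 0
    for e in order:
        i, j = i_idx[e], j_idx[e]
        if assign[i] < 0 and assign[j] < 0:   # both endpoints still unmatched
            assign[i] = assign[j] = next_id
            next_id += 1
    for i in range(n):                        # leftovers become singletons
        if assign[i] < 0:
            assign[i] = next_id
            next_id += 1
    P = np.zeros((n, next_id))                # node-to-supernode membership
    P[np.arange(n), assign] = 1.0
    A_c = P.T @ A @ P                         # sum edge weights between supernodes
    np.fill_diagonal(A_c, 0.0)                # drop self-loops
    return A_c, assign

# Synthetic example: 40 "cells", 5 features, one binary attribute per cell.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
labels = rng.integers(0, 2, size=40)
A = knn_graph(X, k=5)
A_c, assign = coarsen_once(A, labels)
```

Each pass at most halves the number of nodes; applying `coarsen_once` repeatedly would require defining an attribute label for merged supernodes, which this sketch leaves out.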
doi:10.1101/2022.07.30.502142 fatcat:fqj7b3l5rndapc2fqntkkuvbja