Counterfactual inference for single-cell gene expression analysis [article]

Yongjin Park, Manolis Kellis
2021 medRxiv pre-print
Finding a causal gene from case-control studies is a classic and fundamental problem in genomics. With single-cell sequencing data, we still ask which genes are differentially regulated by a disease, but now in a cell-type-specific way. Here, we present a causal inference framework that effectively adjusts for confounding effects without requiring prior knowledge of control genes or cells. We demonstrate that our causal inference algorithm substantially improves statistical power in simulations and real-world data analysis of 70k brain cells, collected for dissecting Alzheimer's disease (AD) mechanisms. We identified 377 causal genes that are differentially regulated by the disease in various brain cell types, including highly relevant AD genes with a proper cell type annotation, such as DGKD in neurons, SNCA in microglia, PIAS in oligodendrocyte progenitor cells, and FGFR2 in astrocytes. Causal genes in different cell types are also enriched in distinctive pathways, highlighting multiple components of disease progression.
doi:10.1101/2021.01.21.21249765 fatcat:mg452jju7zddjhdhwsxctibota