Robustness and applicability of functional genomics tools on scRNA-seq data [article]

Christian H. Holland, Jovan Tanevski, Jan Gleixner, Manu P. Kumar, Elisabetta Mereu, Javier Perales-Paton, Brian A. Joughin, Oliver Stegle, Douglas A. Lauffenburger, Holger Heyn, Bence Szalai, Julio Saez-Rodriguez
2019 bioRxiv   pre-print
Many tools have been developed to extract functional and mechanistic insight from bulk transcriptome profiling data. With the advent of single-cell RNA sequencing (scRNA-seq), it is in principle possible to perform such an analysis for single cells. However, scRNA-seq data has characteristics such as drop-out events, low library sizes and a comparatively large number of samples/cells. It is thus not clear whether functional genomics tools established for bulk sequencing can be applied to scRNA-seq in a meaningful way. To address this question, we performed benchmark studies on in silico and in vitro single-cell RNA-seq data. We included the bulk-RNA tools PROGENy and GO enrichment, which estimate pathway activities, and DoRothEA, which estimates transcription factor (TF) activities, and compared them against the tools AUCell and metaVIPER, which were designed for scRNA-seq. For the in silico study we simulated single cells from TF/pathway perturbation bulk RNA-seq experiments. Our simulation strategy guarantees that the information of the original perturbation is preserved while resembling the characteristics of scRNA-seq data. We complemented the in silico data with in vitro scRNA-seq data obtained after CRISPR-mediated knock-out. Our benchmarks on both the simulated and real data revealed performance comparable to that on the original bulk data. Additionally, we showed that the TF and pathway activities preserve cell-type specific variability by analysing a mixture sample sequenced with 13 different scRNA-seq protocols. Our analyses suggest that bulk functional genomics tools can be applied to scRNA-seq data, outperforming dedicated single-cell tools. Furthermore, we provide a benchmark for further methods development by the community.
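The abstract does not spell out the simulation procedure, so the following is only a minimal sketch of one common way to derive single-cell-like counts from a bulk profile: drawing a small number of reads per cell from the bulk gene proportions. It is not the authors' method. The function name `simulate_cells_from_bulk`, the toy count vector, and the library size of 50 are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_cells_from_bulk(bulk_counts, n_cells, library_size, seed=0):
    """Hypothetical helper: sample sparse per-cell counts from one bulk profile.

    Each simulated cell receives `library_size` reads drawn from a multinomial
    distribution whose probabilities are the bulk gene proportions. A small
    library size yields many zeros, loosely resembling drop-out events.
    """
    rng = np.random.default_rng(seed)
    bulk_counts = np.asarray(bulk_counts, dtype=float)
    gene_probs = bulk_counts / bulk_counts.sum()
    # Result has shape (n_cells, n_genes); every row sums to library_size.
    return rng.multinomial(library_size, gene_probs, size=n_cells)

# Toy bulk profile with six genes (made-up numbers, not real data).
bulk = np.array([5000, 1200, 300, 40, 5, 0])
cells = simulate_cells_from_bulk(bulk, n_cells=100, library_size=50)

# Fraction of zero entries across the simulated cell-by-gene matrix.
dropout_rate = float((cells == 0).mean())
```

Because every cell is sampled from the same bulk proportions, the relative expression signal of the bulk sample is retained on average, while individual cells become sparse.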
doi:10.1101/753319 fatcat:dgzg77k37zcsnjlffzsiwxuv2e