Statistical Methods for Deconvolution in Cancer Genomics [thesis]

Liuqing Yang
With the advance of deep sequencing techniques, intratumor heterogeneity has become a prevalent confounding factor in tumor genomic profiling studies. The heterogeneous composition of a tumor tissue can lead to false-positive differential expression conclusions and can influence patients' clinical outcomes and therapeutic responses. Many deconvolution methods that aim to separate the subcomponent signals have been developed over the past decades, modeling the tumor genomic profiling as a ... combination of the abundance of the mixing components. In this dissertation, we characterize a two-component (tumor versus non-tumor) model and develop a Fast Tumor Deconvolution (FasTD) pipeline to address the heterogeneity issue. We build a semi-parametric, regression-based framework that takes raw measured gene expression values as input and provides mixing proportions and individual component genomic profiles as outputs. We demonstrate our method and show that it is more than a thousand times faster than several current probabilistic models. Applications to both simulated and real data are provided to demonstrate the effectiveness of the proposed method. The method is then extended to deconvolve heterogeneous tumor samples with more than two subcomponents. The extended pipeline (FasTDK) can effectively deconvolve an unknown component in K-subcomponent mixtures given K-1 reference profiles.
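As a rough illustration of the two-component (tumor versus non-tumor) setting described in the abstract, the sketch below assumes a simple linear mixing model, y = pi * t + (1 - pi) * n, on raw expression values. This is not the FasTD pipeline: the linear form, the function names (`mix`, `estimate_proportion`, `recover_tumor_profile`), and the toy profiles are all assumptions made for illustration. It shows only how a mixing proportion could be estimated by least squares when both profiles are known, and how an unknown tumor profile could be recovered when the proportion and a non-tumor reference are given.

```python
def mix(tumor, normal, pi):
    """Assumed forward model: per-gene expression of a sample with tumor fraction pi."""
    return [pi * t + (1.0 - pi) * n for t, n in zip(tumor, normal)]


def estimate_proportion(mixed, tumor, normal):
    """Least-squares estimate of pi from y - n = pi * (t - n), clipped to [0, 1]."""
    num = sum((y - n) * (t - n) for y, t, n in zip(mixed, tumor, normal))
    den = sum((t - n) ** 2 for t, n in zip(tumor, normal))
    if den == 0:
        raise ValueError("tumor and normal profiles are identical; pi is not identifiable")
    return min(1.0, max(0.0, num / den))


def recover_tumor_profile(mixed, normal, pi):
    """Invert the assumed forward model for the tumor profile, given pi and the normal reference."""
    if pi <= 0:
        raise ValueError("pi must be positive to recover the tumor profile")
    return [(y - (1.0 - pi) * n) / pi for y, n in zip(mixed, normal)]


# Toy, noise-free profiles over four hypothetical genes.
tumor = [10.0, 2.0, 7.0, 0.5]
normal = [4.0, 6.0, 7.0, 3.0]
mixed = mix(tumor, normal, 0.3)

pi_hat = estimate_proportion(mixed, tumor, normal)
tumor_hat = recover_tumor_profile(mixed, normal, pi_hat)
```

With real, noisy data the proportion estimate would be only approximate, and the unknown profile would need to be estimated jointly across many samples rather than inverted from a single one.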
doi:10.17615/28pr-1g32 fatcat:4qyj3fdiqrgtnpckuooeceomju