CNCDatabase: a database of non-coding cancer drivers [article]

Eric Minwei Liu, Alexander Martinez-Fundichely, Rajesh Bollapragada, Maurice Spiwack, Ekta Khurana
2020 bioRxiv   pre-print
Most mutations in cancer genomes occur in the non-coding regions with unknown impact to tumor development. Although the increase in number of cancer whole-genome sequences has revealed numerous putative non-coding cancer drivers, their information is dispersed across multiple studies and thus it is difficult to bridge the understanding of non-coding alterations, the genes they impact and the supporting evidence for their role in tumorigenesis across multiple cancer types. To address this gap,
more » ... address this gap, we have developed CNCDatabase, Cornell Non-Coding Cancer driver Database (https://cncdatabase.med.cornell.edu/) that contains detailed information about predicted non-coding drivers at gene promoters, 5- and 3-UTRs (untranslated regions), enhancers, CTCF insulators and non-coding RNAs. CNCDatabase documents 1,111 protein-coding genes and 90 non-coding RNAs with reported drivers in their non-coding regions from 32 cancer types by computational predictions of positive selection in whole-genome sequences; differential gene expression in samples with and without mutations; or another set of experimental validations including luciferase reporter assays and genome editing. The database can be easily modified and scaled as lists of non-coding drivers are revised in the community with larger whole-genome sequencing studies, CRISPR screens and further experimental validations. Overall, CNCDatabase provides a helpful resource for researchers to explore the pathological role of non-coding alterations and their associations with gene expression in human cancers.
doi:10.1101/2020.04.29.069047 fatcat:3e7u22couzdjtlzln2wytep5r4