The CINSARC signature as a prognostic marker for clinical outcome in multiple neoplasms

Tom Lesluyes, Lucile Delespaul, Jean-Michel Coindre, Frédéric Chibon
2017 Scientific Reports  
We previously reported the CINSARC signature as a prognostic marker for metastatic events in soft tissue sarcomas, breast carcinomas and lymphomas through genomic instability, acting as a major factor for tumor aggressiveness. In this study, we used a published resource to investigate CINSARC enrichment in poor outcome-associated genes at pan-cancer level and in 39 cancer types. CINSARC outperformed more than 15,000 defined signatures (including cancer-related ones), being enriched in top-ranked poor outcome-associated genes of 21 cancer types, the widest coverage reached among all tested signatures. Independently, this signature demonstrated significant survival differences between risk groups in 33 published studies, representing 17 tumor types. As a consequence, we propose CINSARC prognostication as a general marker for tumor aggressiveness to optimize the clinical management of patients.

From the first report of a gene expression quantification method by Schena et al. in 1995 [1] to RNA sequencing (RNA-seq), now extensively used by international consortia to decipher transcriptomic abnormalities [2, 3], gene expression has become an essential tool in cancer research. For two decades, microarrays provided much information, at both gene and transcript levels, on various oncogenic factors [4-7]. Subsequently, next-generation sequencing (NGS) permitted sequencing of RNA fragments at single-base-pair resolution, where RNA abundance is directly related to the proportion of sequenced reads mapped to a given gene [8]. Moreover, dedicated RNA-seq algorithms can extract genomic information such as point mutations, insertions/deletions, translocations and genomic integrations from foreign organisms, all of which are well-known oncogenic and tumor progression mechanisms [9-11]. However, standard measurements are yet to be defined, since numerous software tools exist for RNA-seq processing, including many different gene expression normalization methods [12].

Such transcriptomic investigations generated a large amount of data. Consequently, databases were set up to standardize all information associated with expression matrices, notably: organism, platform identifier, normalization, unit of measurement and study-specific information (e.g. cell types, treatments, time series, sampling and culture conditions).
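To make concrete the statement that RNA abundance relates to the proportion of sequenced reads mapped to a given gene, here is a minimal sketch of counts-per-million (CPM) scaling. It is only one simple normalization among the many alluded to above, it ignores gene length, and it is not presented as the method used in this study. The function name `counts_per_million` and the gene names and counts are hypothetical.

```python
from typing import Dict


def counts_per_million(read_counts: Dict[str, int]) -> Dict[str, float]:
    """Scale per-gene mapped read counts to counts per million (CPM).

    Each gene's value is its share of all mapped reads, multiplied by 1e6.
    Gene length is deliberately ignored in this simple sketch.
    """
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no mapped reads to normalize")
    return {gene: count * 1_000_000 / total for gene, count in read_counts.items()}


# Hypothetical mapped read counts for three genes in one sample.
counts = {"GENE_A": 500, "GENE_B": 1500, "GENE_C": 8000}
cpm = counts_per_million(counts)
```

Because every value is divided by the same library total, CPM values from samples sequenced at different depths become comparable in proportion, which is the property the sentence on read proportions relies on.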
Among the many available gene expression databases, the two most used are Gene Expression Omnibus from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/geo) [13] and ArrayExpress from the European Bioinformatics Institute (http://www.ebi.ac.uk/arrayexpress) [14]. By gathering results from microarray and RNA-seq experiments, these resources provide easy access to millions of cancer-related transcriptomic profiles (cell lines, primary tumors and metastases/relapses).

Since Golub et al. identified a specific gene set capable of distinguishing acute myeloid leukemia from acute lymphoblastic leukemia in the late 1990s [15], establishing gene expression signatures has remained a key part of cancer research. This first study demonstrated that gene expression could be used as a new in silico classifier, whereas previous options were limited to clinical observations and immunohistochemistry experiments. Subsequently, multiple gene sets were defined, not only to differentiate entities but also to try to predict disease evolution. Two publications in the early 2000s demonstrated the usefulness of transcriptomic profiles as survival indicators in breast cancer by focusing on specific genes [16, 17]. A few years later, a two-gene expression ratio was found to be a good predictor of tamoxifen response in breast cancer [18]. Then, inferring chromosomal instability from gene expression became a promising predictor of clinical outcome in various cancers [19].

Scientific Reports 7: 5480
doi:10.1038/s41598-017-05726-x pmid:28710396 pmcid:PMC5511191 fatcat:g4yndf36ffcx3nprnei5bvbgeq