Accurate prognosis for localized prostate cancer through coherent voting networks and multi-omic data
2022
Abstract
Background: Prostate cancer is a highly heterogeneous disease, from both a clinical and a biological/biochemical point of
view, which makes stratifying patients into risk classes remarkably challenging. In particular, it is
important to detect the disease early and to discriminate its indolent forms from the aggressive ones, which require closer
surveillance and timely treatment decisions.
Methods: We extend a recently developed supervised machine learning (ML) technique, called coherent voting networks (CVN),
by incorporating a novel model-selection technique to counter model overfitting. The CVN method is then applied to the problem
of predicting an accurate prognosis (with a time granularity of 1 year) for patients affected by prostate cancer. The CVN is
developed on a discovery cohort of 495 patients from the TCGA-PRAD collection, and validated on several other independent
cohorts, comprising a total of 744 patients.
Findings: We uncover seven multi-gene fingerprints, each comprising six to seven genes, that correspond to different input
data types (mRNA expression, proteomic assays, or methylation) and different time points, for the event of progression-free
survival (PFS) in patients diagnosed with prostate adenocarcinoma, who had not received prior treatment for their disease.
On the test set for the discovery cohort, we attain Odds Ratios ranging from a minimum of 12.0 to a maximum of 21.0, with
average 16.8 and geometric mean p-value 0.01; Cohen kappa values ranging from a minimum of 0.29 to a maximum of 0.59,
with average 0.47; and AUC ranging from a minimum of 0.62 to a maximum of 0.79, with average 0.72 and geometric mean
p-value 0.01. Significant (< 0.05) p-values for the log-rank tests are found in six cases, with geometric mean p-value 0.0006.
On seven independent cohorts, for 21 combinations of cohort vs fingerprint, we report Odds Ratios ranging from a minimum of
9.0 to a maximum of 40.0, with average 17.5 and geometric mean p-value 0.003; Cohen kappa values ranging from a minimum
of 0.18 to a maximum of 0.65, with average 0.4; and AUC ranging from a minimum of 0.61 to a maximum of 0.88, with average
0.76 and geometric mean p-value 0.001. Many of the genes in our fingerprints have recorded prognostic power in some form of
cancer, and have been studied for their functional roles in cancer in animal models or cell lines.
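For readers unfamiliar with the summary statistics quoted in the Findings, the following is a minimal sketch of how an odds ratio, Cohen's kappa, AUC, and a geometric mean of p-values can be computed for a binary prognostic call. It uses the standard textbook definitions; the function names are hypothetical and the abstract does not state how the authors computed these quantities, so their exact procedure may differ.

```python
import math


def odds_ratio(tp, fp, fn, tn):
    """Odds ratio of a 2x2 table: (tp * tn) / (fp * fn)."""
    if fp == 0 or fn == 0:
        raise ValueError("odds ratio is undefined when fp or fn is zero")
    return (tp * tn) / (fp * fn)


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement between prediction and outcome beyond chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column marginals of the 2x2 table.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def geometric_mean(values):
    """Geometric mean of strictly positive values (e.g. p-values)."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))
```

Here tp, fp, fn, and tn are the counts of true positives, false positives, false negatives, and true negatives for one fingerprint on one cohort. The geometric mean is the natural way to average p-values that span several orders of magnitude, which is presumably why the abstract reports it instead of an arithmetic mean.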
Interpretation: The development of novel ML techniques tailored to the problem of uncovering effective multi-gene prognostic
biomarkers is a promising new line of attack for sharpening our capability to diversify and personalize cancer patient treatments.
For the challenging problem of discriminating between indolent and aggressive types of non-metastatic prostate cancer, we
show that it is possible to attain accurate prognostic prediction with a granularity within a year, which is an improvement beyond
the current state of the art.
Archived at: www.medrxiv.org (repository); web.archive.org (webarchive)
Date 2022-07-31