TY - JOUR
T1 - Prognostic gene expression signatures of breast cancer are lacking a sensible biological meaning
AU - Manjang, Kalifa
AU - Tripathi, Shailesh
AU - Yli-Harja, Olli
AU - Dehmer, Matthias
AU - Glazko, Galina
AU - Emmert-Streib, Frank
N1 - Funding Information:
Kalifa Manjang is supported by Tampere University via the Prostate Cancer Center. Matthias Dehmer thanks the Austrian Science Funds for supporting this work (Project P30031).
Publisher Copyright:
© 2021, The Author(s).
PY - 2021/12
Y1 - 2021/12
N2 - The identification of prognostic biomarkers for predicting cancer progression is an important problem for two reasons. First, such biomarkers find practical application in a clinical context for the treatment of patients. Second, interrogation of the biomarkers themselves is assumed to lead to novel insights into disease mechanisms and the underlying molecular processes that cause the pathological behavior. For breast cancer, many signatures based on gene expression values have been reported to be associated with overall survival. Consequently, such signatures have been used to suggest biological explanations of breast cancer and drug mechanisms. In this paper, we demonstrate for a large number of breast cancer signatures that such an implication is not justified. Our approach systematically eliminates all traces of biological meaning of signature genes and shows that among the remaining genes, surrogate gene sets can be formed with indistinguishable prognostic prediction capabilities and opposite biological meaning. Hence, our results demonstrate that none of the studied signatures has a sensible biological interpretation or meaning with respect to disease etiology. Overall, this shows that prognostic signatures are black-box models with sensible predictions of breast cancer outcome but no value for revealing causal connections. Furthermore, we show that the number of such surrogate gene sets is not small but very large.
KW - Biomarkers, Tumor/genetics
KW - Breast Neoplasms/diagnosis
KW - Female
KW - Gene Expression Profiling
KW - Gene Expression Regulation, Neoplastic
KW - Humans
KW - Prognosis
KW - Transcriptome
UR - http://www.scopus.com/inward/record.url?scp=85099000909&partnerID=8YFLogxK
U2 - 10.1038/s41598-020-79375-y
DO - 10.1038/s41598-020-79375-y
M3 - Article
C2 - 33420139
AN - SCOPUS:85099000909
SN - 2045-2322
VL - 11
SP - 156
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 156
ER -