Appropriate sample size for standardization parameters estimation reduces misdiagnoses of molecular-based risk predictors in breast cancer

Authors
González Montoro, Aldana María; Prato, Laura; Casares, Federico; Balzarini, Monica Graciela; Fernandez, Elmer Andres
Publication Year
2017
Language
English
Format
article
Status
Published version
Description
Background: Accurate risk/outcome prediction, in which molecular signatures (MS) play an increasingly important role, is crucial for personalized therapy. Patients require an accurate diagnosis and an appropriate therapy assignment as soon as they arrive at the clinic. However, most MS require gene-based standardization through parameters estimated from an available population sample, so the estimation of gene standardization parameters (SP) is crucial to avoid misdiagnoses. Although the dependency on SP has been recognized, the effect of sample size on SP estimation, and its impact on therapy management, has not been reported. Because this is key for clinical application, the present study evaluated how the sample size used to estimate SP affects outcome prediction error. For this, 2 well-known breast cancer (BC) subtype/risk predictors were applied to real data under different recruitment scenarios.
Material/Methods: The PAM50 and Gene70 MS were fed with standardized gene expression profiles, using SP estimated from different sample sizes, to predict BC intrinsic subtypes and progression, respectively. Error sensitivity analysis was based on outcome prediction error rates measured against the predictions obtained using SP estimated with all the patients in the cohort (our criterion standard). Seven BC cohorts including TCGA data (2014 subjects in total) were used.
Results: We found that BC outcome prediction is very sensitive to the sample size used to estimate the MS standardization parameters. More than 20% of predicted classes can change when using small sample sizes to compute SP, and more than 20% of subjects can have their predicted outcome changed.
Conclusions: Patients might receive inappropriate therapy if the SP are not carefully dealt with. A pilot study to provide SP that yield a stable prediction is necessary. A method to evaluate the sufficiency of the size of the available sample for parameter estimation is proposed to guide prior pilot study development.
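Illustrative sketch (not the authors' code): the sensitivity analysis described in the abstract can be outlined as follows. Everything here is an assumption made for illustration: the SP are taken to be the per-gene mean and standard deviation, a simple nearest-centroid rule stands in for the PAM50 and Gene70 predictors, and the names estimate_sp, standardize, nearest_centroid, predict_all and disagreement_rate are hypothetical. The function reports how often predictions made with SP estimated from a subsample of size n differ from predictions made with SP estimated from the whole cohort.

```python
import random
import statistics


def estimate_sp(rows):
    """Per-gene (mean, standard deviation) from a list of samples.

    Each sample is a list of gene expression values. Assumption: SP are the
    mean and standard deviation; the record does not state which parameters
    the study used.
    """
    sp = []
    for gene_values in zip(*rows):
        sd = statistics.stdev(gene_values)
        # Guard against constant genes to avoid division by zero.
        sp.append((statistics.fmean(gene_values), sd if sd > 0 else 1.0))
    return sp


def standardize(row, sp):
    """Standardize one sample gene by gene with the given SP."""
    return [(x - mu) / sd for x, (mu, sd) in zip(row, sp)]


def nearest_centroid(z, centroids):
    """Index of the closest centroid (squared Euclidean distance).

    Hypothetical stand-in for a molecular signature, not PAM50 or Gene70.
    """
    dists = [sum((a - b) ** 2 for a, b in zip(z, c)) for c in centroids]
    return dists.index(min(dists))


def predict_all(rows, sp, centroids):
    """Predict a class for every sample after standardizing with sp."""
    return [nearest_centroid(standardize(r, sp), centroids) for r in rows]


def disagreement_rate(rows, centroids, n, n_rep=100, seed=0):
    """Mean fraction of subjects whose predicted class changes when SP are
    estimated from n randomly drawn subjects instead of the whole cohort."""
    if not 2 <= n <= len(rows):
        raise ValueError("n must be between 2 and the cohort size")
    rng = random.Random(seed)
    # Reference predictions: SP estimated from all subjects in the cohort.
    reference = predict_all(rows, estimate_sp(rows), centroids)
    total = 0.0
    for _ in range(n_rep):
        subsample = rng.sample(rows, n)  # drawn without replacement
        pred = predict_all(rows, estimate_sp(subsample), centroids)
        total += sum(p != r for p, r in zip(pred, reference)) / len(rows)
    return total / n_rep
```

Calling disagreement_rate over a range of n values gives a curve of prediction instability against sample size, which is one possible way to judge whether an available sample is large enough for stable SP estimation.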
Affiliation: González Montoro, Aldana María. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Centro de Investigación y Estudios de Matemática. Universidad Nacional de Córdoba. Centro de Investigación y Estudios de Matemática; Argentina
Affiliation: Prato, Laura. Universidad Nacional de Villa María; Argentina
Affiliation: Casares, Federico. Lisra Institute; United States
Affiliation: Balzarini, Monica Graciela. Universidad Nacional de Córdoba; Argentina
Affiliation: Fernandez, Elmer Andres. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro de Investigación y Desarrollo en Inmunología y Enfermedades Infecciosas. Universidad Católica de Córdoba. Centro de Investigación y Desarrollo en Inmunología y Enfermedades Infecciosas; Argentina
Subject
DECISION SUPPORT TECHNIQUES
EARLY DETECTION OF CANCER
GENE EXPRESSION PROFILING
TRANSCRIPTOME
Medical Engineering
ENGINEERING AND TECHNOLOGIES
Access level
Open access
License
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI Identifier
oai:ri.conicet.gov.ar:11336/59994