Nonparametric statistical tests: friend or foe?

Authors
Politi, Teresa; Carvalho Ferreira, Juliana; Patino, Cecilia María
Year of publication
2021
Language
English
Resource type
article
Status
published version
Description
The head of an ICU would like to assess whether obese patients admitted for a COPD exacerbation have a longer hospital length of stay (LOS) than non-obese patients. After recruiting 200 patients, she finds that the distribution of LOS is strongly skewed to the right (Figure 1A). If she were to perform a hypothesis test, would it be appropriate to use a t-test to compare LOS between obese and non-obese patients with a COPD exacerbation?

PARAMETRIC VS. NONPARAMETRIC TESTS IN STATISTICS

Parametric tests assume that the distribution of the data is normal, or bell-shaped (Figure 1B), in order to test hypotheses. For example, the t-test is a parametric test that assumes that the outcome of interest has a normal distribution, which can be characterized by two parameters(1): the mean and the standard deviation (Figure 1B). Nonparametric tests do not require the outcome variable to fulfill this restrictive distributional assumption. They are therefore more flexible and can be applied to a wide variety of distributions. Nonparametric techniques use ranks(1) instead of the actual values of the observations. For this reason, in addition to continuous data, they can be used to analyze ordinal data, for which parametric tests are usually inappropriate.(2)

What are the pitfalls? If the outcome variable is normally distributed and the assumptions for using parametric tests are met, nonparametric techniques have lower statistical power than the comparable parametric tests. This means that nonparametric tests are less likely to detect a true difference as statistically significant (i.e., less likely than a parametric test to yield a p-value < 0.05). Additionally, parametric tests provide parameter estimates (in the case of the t-test, the mean and the standard deviation) and a confidence interval for these parameters. For example, in our practical scenario, if the difference in LOS between the groups were analyzed with a t-test, it would report a sample mean difference in LOS between the groups and the standard deviation of that difference in LOS. Finally, the 95% confidence interval of the sample mean difference could be reported to express the range of plausible values for the mean difference in the population. Conversely, nonparametric tests do not estimate parameters such as the mean or the standard deviation, nor do they provide confidence intervals for them. They only calculate a p-value.(2)

HOW TO CHOOSE BETWEEN PARAMETRIC AND NONPARAMETRIC TESTS?

When sample sizes are large, that is, greater than 100, parametric tests can usually be applied regardless of the distribution of the outcome variable. This is due to the central limit theorem, which states that if the sample size is large enough, the sampling distribution of the mean is approximately normal, whatever the distribution of the variable itself. The farther the distribution departs from normality, the larger the sample size needed for this approximation to hold. When sample sizes are small and the distribution of the outcome variable is extremely non-normal, nonparametric tests are more appropriate. For example, some variables are naturally skewed, such as hospital LOS or the number of asthma exacerbations per year. In these cases, extremely skewed variables should always be analyzed with nonparametric tests, even with large sample sizes.(2) In our practical scenario, because the distribution of LOS is strongly skewed to the right, the relationship between obesity and LOS among the patients hospitalized for COPD exacerbations should be analyzed with a nonparametric test (Wilcoxon rank sum test or Mann-Whitney test) instead of a t-test.
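To make the idea of a rank-based test concrete, the following is a minimal sketch of the Mann-Whitney (Wilcoxon rank sum) test using only the Python standard library. It is an illustration, not the authors' analysis: the function names `average_ranks` and `mann_whitney_u` are hypothetical, the p-value uses the large-sample normal approximation with a tie correction and no continuity correction, and the LOS values are simulated from log-normal distributions with arbitrary parameters chosen only to produce right-skewed data.

```python
import math
import random
from statistics import NormalDist, median


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, p-value). Only the ranks of the pooled
    observations enter the calculation, not their actual values.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())

    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:  # every observation identical: no evidence of a shift
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, p_value


# Simulated, hypothetical LOS data (days): 100 patients per group,
# log-normal so that both samples are strongly right-skewed.
rng = random.Random(0)
non_obese = [rng.lognormvariate(1.6, 0.8) for _ in range(100)]
obese = [rng.lognormvariate(1.9, 0.8) for _ in range(100)]

u_stat, p_val = mann_whitney_u(obese, non_obese)
print(f"median LOS: obese={median(obese):.1f}, non-obese={median(non_obese):.1f}")
print(f"Mann-Whitney U={u_stat:.0f}, two-sided p={p_val:.4f}")
```

Because the test depends only on ranks, replacing the largest LOS value with an even more extreme one leaves U and the p-value unchanged, which is why heavy right tails do not distort the result. In practice an established statistics package would normally be used instead of hand-written code.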
Affiliation: Politi, Teresa. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Fisiología y Biofísica Bernardo Houssay. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Fisiología y Biofísica Bernardo Houssay; Argentina
Affiliation: Carvalho Ferreira, Juliana. Universidade de Sao Paulo; Brazil
Affiliation: Patino, Cecilia María. University of Southern California; United States
Subject
Parametric tests
Statistics
Epidemiology
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/172541

id CONICETDig_79b4ea1f0d6e040da9d3bbbde4e5daf7
oai_identifier_str oai:ri.conicet.gov.ar:11336/172541
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv Nonparametric statistical tests: friend or foe?
dc.creator.none.fl_str_mv Politi, Teresa
Carvalho Ferreira, Juliana
Patino, Cecilia María
dc.subject.none.fl_str_mv Parametric tests
Statistics
Epidemiology
purl_subject.fl_str_mv https://purl.org/becyt/ford/3.3
https://purl.org/becyt/ford/3
publishDate 2021
dc.date.none.fl_str_mv 2021-09
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/172541
Politi, Teresa; Carvalho Ferreira, Juliana; Patino, Cecilia María; Nonparametric statistical tests: friend or foe?; Sociedade Brasileira de Pneumologia e Tisiologia; Jornal Brasileiro de Pneumologia; 47; 4; 9-2021; 1-2
1806-3756
CONICET Digital
CONICET
url http://hdl.handle.net/11336/172541
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/doi/10.36416/1806-3756/e20210292
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc/2.5/ar/
dc.format.none.fl_str_mv application/pdf
application/pdf
dc.publisher.none.fl_str_mv Sociedade Brasileira de Pneumologia e Tisiologia
publisher.none.fl_str_mv Sociedade Brasileira de Pneumologia e Tisiologia
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str CONICET Digital (CONICET)
collection CONICET Digital (CONICET)
instname_str Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar