Nonparametric statistical tests: friend or foe?
- Authors
- Politi, Teresa; Carvalho Ferreira, Juliana; Patino, Cecilia María
- Publication year
- 2021
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- The head of an ICU would like to assess whether obese patients admitted for a COPD exacerbation have a longer hospital length of stay (LOS) than non-obese patients. After recruiting 200 patients, she finds that the distribution of LOS is strongly skewed to the right (Figure 1A). If she were to perform a hypothesis test, would it be appropriate to use a t-test to compare LOS between obese and non-obese patients with a COPD exacerbation?

PARAMETRIC VS. NONPARAMETRIC TESTS IN STATISTICS
Parametric tests assume that the data follow a normal, or bell-shaped, distribution (Figure 1B). For example, the t-test is a parametric test that assumes that the outcome of interest has a normal distribution, which can be characterized by two parameters(1): the mean and the standard deviation (Figure 1B). Nonparametric tests do not require the outcome variable to fulfill this restrictive distributional assumption. They are therefore more flexible and can be applied to a wide variety of distributions. Nonparametric techniques use ranks(1) instead of the actual values of the observations. For this reason, in addition to continuous data, they can be used to analyze ordinal data, for which parametric tests are usually inappropriate.(2)

What are the pitfalls? If the outcome variable is normally distributed and the assumptions for using parametric tests are met, nonparametric techniques have lower statistical power than the comparable parametric tests. This means that nonparametric tests are less likely to detect a difference that truly exists (i.e., less likely than a parametric test to yield a p-value < 0.05). Additionally, parametric tests provide parameter estimates (in the case of the t-test, the mean and the standard deviation are the calculated parameters) and a confidence interval for these parameters. For example, in our practical scenario, if the difference in LOS between the groups were analyzed with a t-test, it would report a sample mean difference in LOS between the groups and the standard deviation of that difference in LOS. Finally, the 95% confidence interval of the sample mean difference could be reported to express the plausible range of values for the mean difference in the population. Conversely, nonparametric tests do not estimate parameters such as the mean or standard deviation, nor do they provide confidence intervals. They only calculate a p-value.(2)

HOW TO CHOOSE BETWEEN PARAMETRIC AND NONPARAMETRIC TESTS?
When sample sizes are large, that is, greater than 100, parametric tests can usually be applied regardless of the distribution of the outcome variable. This is due to the central limit theorem, which states that if the sample size is large enough, the sampling distribution of the sample mean is approximately normal, whatever the distribution of the variable itself. The farther the distribution departs from normality, the larger the sample size needed for this approximation to hold. When sample sizes are small and outcome variable distributions are extremely non-normal, nonparametric tests are more appropriate. Some variables are naturally skewed, such as hospital LOS or the number of asthma exacerbations per year. Such extremely skewed variables should always be analyzed with nonparametric tests, even with large sample sizes.(2) In our practical scenario, because the distribution of LOS is strongly skewed to the right, the relationship between obesity and LOS among the patients hospitalized for COPD exacerbations should be analyzed with a nonparametric test (the Wilcoxon rank sum test, also known as the Mann-Whitney test) instead of a t-test.
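To illustrate how a rank-based test works, here is a minimal Python sketch of the Wilcoxon rank sum (Mann-Whitney) test that uses only the standard library. It is not taken from the article. The function names `average_ranks` and `mann_whitney_u`, the choice of a normal approximation with tie correction and no continuity correction, and the LOS values in the demo are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic for sample x, z statistic, two-sided p-value).
    Assumption: no continuity correction; ties handled by average ranks.
    """
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = average_ranks(pooled)
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Tie correction: sum of (t^3 - t) over each group of t tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        # All observations identical: the ranks carry no information.
        return u1, 0.0, 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u1, z, p


if __name__ == "__main__":
    # Hypothetical, right-skewed LOS values in days (not data from the article).
    los_obese = [4, 6, 7, 9, 12, 15, 21, 30]
    los_non_obese = [2, 3, 3, 4, 5, 5, 6, 8]
    u, z, p = mann_whitney_u(los_obese, los_non_obese)
    print(f"U = {u:.1f}, z = {z:.2f}, two-sided p = {p:.4f}")
```

Because only the ranks enter the calculation, a single very long stay changes the result no more than a slightly longer one would, which is why the test tolerates right-skewed outcomes such as LOS. The normal approximation is rough for small samples, so an established statistics package would be the better choice for a real analysis.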
Affiliation: Politi, Teresa. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Fisiología y Biofísica Bernardo Houssay. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Fisiología y Biofísica Bernardo Houssay; Argentina
Affiliation: Carvalho Ferreira, Juliana. Universidade de Sao Paulo; Brazil
Affiliation: Patino, Cecilia María. University of Southern California; United States
- Subject
- Parametric tests; Statistics; Epidemiology
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/172541
id | CONICETDig_79b4ea1f0d6e040da9d3bbbde4e5daf7 |
---|---|
oai_identifier_str | oai:ri.conicet.gov.ar:11336/172541 |
network_acronym_str | CONICETDig |
repository_id_str | 3498 |
network_name_str | CONICET Digital (CONICET) |
dc.title.none.fl_str_mv | Nonparametric statistical tests: friend or foe? |
dc.creator.none.fl_str_mv | Politi, Teresa; Carvalho Ferreira, Juliana; Patino, Cecilia María |
author | Politi, Teresa |
author_facet | Politi, Teresa; Carvalho Ferreira, Juliana; Patino, Cecilia María |
author_role | author |
author2 | Carvalho Ferreira, Juliana; Patino, Cecilia María |
author2_role | author; author |
dc.subject.none.fl_str_mv | Parametric tests; Statistics; Epidemiology |
topic | Parametric tests; Statistics; Epidemiology |
purl_subject.fl_str_mv | https://purl.org/becyt/ford/3.3 https://purl.org/becyt/ford/3 |
publishDate | 2021 |
dc.date.none.fl_str_mv | 2021-09 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format | article |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/172541 Politi, Teresa; Carvalho Ferreira, Juliana; Patino, Cecilia María; Nonparametric statistical tests: friend or foe?; Sociedade Brasileira de Pneumologia e Tisiologia; Jornal Brasileiro de Pneumologia; 47; 4; 9-2021; 1-2 1806-3756 CONICET Digital CONICET |
url | http://hdl.handle.net/11336/172541 |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/doi/10.36416/1806-3756/e20210292 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc/2.5/ar/ |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc/2.5/ar/ |
dc.format.none.fl_str_mv | application/pdf application/pdf |
dc.publisher.none.fl_str_mv | Sociedade Brasileira de Pneumologia e Tisiologia |
publisher.none.fl_str_mv | Sociedade Brasileira de Pneumologia e Tisiologia |
dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str | CONICET Digital (CONICET) |
collection | CONICET Digital (CONICET) |
instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |