Correcting MM estimates for "fat" data sets

Authors
Maronna, Ricardo Antonio; Yohai, Victor Jaime
Publication year
2010
Language
English
Resource type
article
Status
published version
Description
Regression MM estimates require estimating the error scale and determining a constant that controls the efficiency. Both steps rely on asymptotic results derived under the assumption that the number of predictors p remains fixed while the number of observations n tends to infinity, which amounts to assuming that the ratio p/n is "small". However, many high-dimensional data sets have a "large" value of p/n (say, ≥0.2). It is shown that the standard asymptotic results do not hold if p/n is large: (a) the estimated scale underestimates the true error scale, and (b) even if the scale is correctly estimated, the actual efficiency can be much lower than the nominal one. To overcome these drawbacks, simple corrections for the scale and for the efficiency-controlling constant are proposed, and it is demonstrated that these corrections improve the estimate's performance under both normal and contaminated data.
Affiliation: Maronna, Ricardo Antonio. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de La Plata. Facultad de Ciencias Exactas; Argentina
Affiliation: Yohai, Victor Jaime. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
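As an illustration of the "error scale" step mentioned in the description, the following is a minimal sketch of a standard M-scale computed from a set of residuals. It is not the authors' code and does not reproduce the corrections proposed in the article. The function names `rho_bisquare` and `m_scale`, the bisquare rho function, and the constants `c = 1.54764` and `b = 0.5` (a conventional choice giving consistency at the normal model) are assumptions made for this example only.

```python
import math
import random
import statistics


def rho_bisquare(u, c=1.54764):
    """Tukey bisquare rho, scaled so that its maximum value is 1."""
    t = min(abs(u) / c, 1.0)
    return 1.0 - (1.0 - t * t) ** 3


def m_scale(residuals, b=0.5, c=1.54764, tol=1e-10, max_iter=200):
    """Solve mean(rho(r_i / s)) = b for s by fixed-point iteration.

    Hypothetical helper for illustration; no finite-sample or p/n
    correction is applied here.
    """
    n = len(residuals)
    # Starting value: normalized median of the absolute residuals.
    s = statistics.median(abs(r) for r in residuals) / 0.6745
    if s == 0:
        return 0.0
    for _ in range(max_iter):
        mean_rho = sum(rho_bisquare(r / s, c) for r in residuals) / n
        # Rescale s so that the mean of rho moves towards b.
        s_new = s * math.sqrt(mean_rho / b)
        if abs(s_new - s) <= tol * s:
            return s_new
        s = s_new
    return s


if __name__ == "__main__":
    random.seed(0)
    # Simulated errors with true scale 2; in a regression these would be
    # the residuals from an initial robust fit.
    errors = [random.gauss(0.0, 2.0) for _ in range(2000)]
    print(m_scale(errors))
```

When the residuals come from a fit with many predictors relative to the number of observations, they tend to be smaller than the true errors, which is consistent with the underestimation described in point (a) of the description; the sketch above makes no attempt to compensate for that.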
Subject
MM estimators
M-scale
High-dimensional data
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/98688

Subject classification
https://purl.org/becyt/ford/1.1
https://purl.org/becyt/ford/1
Publication date
2010-12
Citation
Maronna, Ricardo Antonio; Yohai, Victor Jaime; Correcting MM estimates for "fat" data sets; Elsevier Science; Computational Statistics and Data Analysis; 54; 12; 12-2010; 3168-3173
ISSN
0167-9473
Publisher
Elsevier Science
Handle
http://hdl.handle.net/11336/98688
Publisher URL
https://www.sciencedirect.com/science/article/pii/S0167947309003314
DOI
10.1016/j.csda.2009.09.015
Format
application/pdf
Repository contact
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar