Correcting MM estimates for "fat" data sets

Authors: Maronna, Ricardo Antonio; Yohai, Victor Jaime
Publication year: 2010
Language: English
Resource type: article
Status: published version
Description: Regression MM estimates require the estimation of the error scale, and the determination of a constant that controls the efficiency. These two steps are based on the asymptotic results that are derived assuming that the number of predictors p remains fixed while the number of observations n tends to infinity, which means assuming that the ratio p/n is "small". However, many high-dimensional data sets have a "large" value of p/n (say, ≥0.2). It is shown that the standard asymptotic results do not hold if p/n is large; namely that (a) the estimated scale underestimates the true error scale, and (b) that even if the scale is correctly estimated, the actual efficiency can be much lower than the nominal one. To overcome these drawbacks simple corrections for the scale and for the efficiency controlling constant are proposed, and it is demonstrated that these corrections improve on the estimate's performance under both normal and contaminated data.
Affiliation: Maronna, Ricardo Antonio. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de La Plata. Facultad de Ciencias Exactas; Argentina
Affiliation: Yohai, Victor Jaime. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Subject: MM estimators
M-scale
High-dimensional data
Access level: open access
Terms of use: https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
Repository
Institution: Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier: oai:ri.conicet.gov.ar:11336/98688


id	CONICETDig_79a59df4cd3343cfb05d5a0329d3a7af
oai_identifier_str	oai:ri.conicet.gov.ar:11336/98688
network_acronym_str	CONICETDig
repository_id_str	3498
network_name_str	CONICET Digital (CONICET)
dc.title.none.fl_str_mv	Correcting MM estimates for "fat" data sets
dc.creator.none.fl_str_mv	Maronna, Ricardo Antonio Yohai, Victor Jaime
author	Maronna, Ricardo Antonio
author_facet	Maronna, Ricardo Antonio Yohai, Victor Jaime
author_role	author
author2	Yohai, Victor Jaime
author2_role	author
dc.subject.none.fl_str_mv	MM estimators M-scale High-dimensional data
topic	MM estimators M-scale High-dimensional data
purl_subject.fl_str_mv	https://purl.org/becyt/ford/1.1 https://purl.org/becyt/ford/1
description	Regression MM estimates require the estimation of the error scale, and the determination of a constant that controls the efficiency. These two steps are based on the asymptotic results that are derived assuming that the number of predictors p remains fixed while the number of observations n tends to infinity, which means assuming that the ratio p/n is "small". However, many high-dimensional data sets have a "large" value of p/n (say, ≥0.2). It is shown that the standard asymptotic results do not hold if p/n is large; namely that (a) the estimated scale underestimates the true error scale, and (b) that even if the scale is correctly estimated, the actual efficiency can be much lower than the nominal one. To overcome these drawbacks simple corrections for the scale and for the efficiency controlling constant are proposed, and it is demonstrated that these corrections improve on the estimate's performance under both normal and contaminated data.
publishDate	2010
dc.date.none.fl_str_mv	2010-12
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://hdl.handle.net/11336/98688 Maronna, Ricardo Antonio; Yohai, Victor Jaime; Correcting MM estimates for "fat" data sets; Elsevier Science; Computational Statistics and Data Analysis; 54; 12; 12-2010; 3168-3173 0167-9473 CONICET Digital CONICET
url	http://hdl.handle.net/11336/98688
identifier_str_mv	Maronna, Ricardo Antonio; Yohai, Victor Jaime; Correcting MM estimates for "fat" data sets; Elsevier Science; Computational Statistics and Data Analysis; 54; 12; 12-2010; 3168-3173 0167-9473 CONICET Digital CONICET
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0167947309003314 info:eu-repo/semantics/altIdentifier/doi/10.1016/j.csda.2009.09.015
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
eu_rights_str_mv	openAccess
rights_invalid_str_mv	https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
dc.format.none.fl_str_mv	application/pdf application/pdf application/pdf
dc.publisher.none.fl_str_mv	Elsevier Science
publisher.none.fl_str_mv	Elsevier Science
dc.source.none.fl_str_mv	reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str	CONICET Digital (CONICET)
collection	CONICET Digital (CONICET)
instname_str	Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv	CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv	dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar
_version_	1856403229324083200
score	13.106097
