Multivariate location and scatter matrix estimation under cellwise and casewise contamination
- Authors
- Leung, Andy; Yohai, Victor Jaime; Zamar, Ruben Horacio
- Publication year
- 2017
- Language
- English
- Resource type
- article
- Status
- published version
- Description
Real data may contain both cellwise outliers and casewise outliers. There is a vast literature on robust estimation for casewise outliers, but only a scant literature for cellwise outliers and almost none for both types of outliers. Estimation of multivariate location and scatter matrix is a cornerstone of multivariate data analysis. A two-step approach was recently proposed to perform robust estimation of multivariate location and scatter matrix in the presence of cellwise and casewise outliers. In the first step, a univariate filter was applied to remove cellwise outliers. In the second step, a generalized S-estimator was used to downweight casewise outliers. This proposal can be further improved in three main directions. First, through the introduction of a consistent bivariate filter to be used in combination with the univariate filter in the first step. Second, through the proposal of a new fast subsampling procedure to generate starting points for the generalized S-estimator in the second step. Third, through the use of a non-monotonic weight function for the generalized S-estimator to better handle casewise outliers in high dimension. A simulation study and a real data example show that, unlike the original two-step procedure, the modified two-step approach performs and scales well in high dimension. Moreover, they show that the modified procedure outperforms the original one and other state-of-the-art robust procedures under cellwise and casewise data contamination.
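As a rough illustration of the first step described in the abstract, the following Python sketch flags cellwise outliers column by column and replaces them with missing values. It is not the authors' filter: it substitutes a simple median/MAD rule, and the function name `flag_cells` and the default cutoff of 3 are assumptions made only for this example.

```python
import numpy as np


def flag_cells(X, cutoff=3.0):
    """Replace cells that look like univariate outliers with NaN.

    Simplified stand-in for a univariate filter (an assumption of this
    sketch): each column is standardized by its median and its
    normal-consistent MAD, and cells whose absolute robust z-score
    exceeds `cutoff` are set to NaN.
    """
    X = np.asarray(X, dtype=float)
    med = np.nanmedian(X, axis=0)
    # 1.4826 makes the MAD consistent for the standard deviation
    # under a normal model.
    mad = 1.4826 * np.nanmedian(np.abs(X - med), axis=0)
    # A column with zero MAD gets an infinite scale, so none of its
    # cells are flagged.
    scale = np.where(mad > 0, mad, np.inf)
    z = np.abs(X - med) / scale
    filtered = X.copy()
    filtered[z > cutoff] = np.nan
    return filtered


if __name__ == "__main__":
    data = np.array([
        [1.0, 10.0],
        [1.1, 10.2],
        [0.9, 9.8],
        [1.2, 10.1],
        [0.8, 9.9],
        [1.0, 500.0],  # one contaminated cell in the second column
    ])
    print(flag_cells(data))
```

Only the contaminated cell is removed and the rest of its row is kept, which is the point of cellwise filtering. A second step would then estimate location and scatter from the resulting incomplete data, for example with a generalized S-estimator as in the abstract; that step is not sketched here.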
Affiliation: Leung, Andy. University of British Columbia; Canada
Affiliation: Yohai, Victor Jaime. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Matemática; Argentina
Affiliation: Zamar, Ruben Horacio. University of British Columbia; Canada
- Subject
Cellwise Outliers
Componentwise Contamination
Multivariate Location And Scatter
Robust Estimation
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- CONICET Digital (CONICET)
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/66009
Full record metadata
| id | CONICETDig_a65e885b38690ba4bdb3723535aa6bf9 |
|---|---|
| oai_identifier_str | oai:ri.conicet.gov.ar:11336/66009 |
| network_acronym_str | CONICETDig |
| repository_id_str | 3498 |
| network_name_str | CONICET Digital (CONICET) |
| dc.title.none.fl_str_mv | Multivariate location and scatter matrix estimation under cellwise and casewise contamination |
| dc.creator.none.fl_str_mv | Leung, Andy; Yohai, Victor Jaime; Zamar, Ruben Horacio |
| author | Leung, Andy |
| author_role | author |
| author2 | Yohai, Victor Jaime; Zamar, Ruben Horacio |
| author2_role | author; author |
| dc.subject.none.fl_str_mv | Cellwise Outliers; Componentwise Contamination; Multivariate Location And Scatter; Robust Estimation |
| purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.1; https://purl.org/becyt/ford/1 |
| publishDate | 2017 |
| dc.date.none.fl_str_mv | 2017-07 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
| format | article |
| status_str | publishedVersion |
| url | http://hdl.handle.net/11336/66009 |
| identifier_str_mv | Leung, Andy; Yohai, Victor Jaime; Zamar, Ruben Horacio; Multivariate location and scatter matrix estimation under cellwise and casewise contamination; Elsevier Science; Computational Statistics and Data Analysis; 111; 7-2017; 59-76 0167-9473 CONICET Digital CONICET |
| dc.language.none.fl_str_mv | eng |
| dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/doi/10.1016/j.csda.2017.02.007; info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0167947317300270 |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | application/pdf |
| dc.publisher.none.fl_str_mv | Elsevier Science |
| dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET); instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |