Improved multiclass feature selection via list combination
- Autores
- Izetta Riera, Carlos Javier; Verdes, Pablo Fabian; Granitto, Pablo Miguel
- Año de publicación
- 2017
- Idioma
- inglés
- Tipo de recurso
- artículo
- Estado
- versión publicada
- Descripción
- Feature selection is a crucial machine learning technique aimed at reducing the dimensionality of the input space. By discarding useless or redundant variables, not only it improves model performance but also facilitates its interpretability. The well-known Support Vector Machines–Recursive Feature Elimination (SVM-RFE) algorithm provides good performance with moderate computational efforts, in particular for wide datasets. When using SVM-RFE on a multiclass classification problem, the usual strategy is to decompose it into a series of binary ones, and to generate an importance statistics for each feature on each binary problem. These importances are then averaged over the set of binary problems to synthesize a single value for feature ranking. In some cases, however, this procedure can lead to poor selection. In this paper we discuss six new strategies, based on list combination, designed to yield improved selections starting from the importances given by the binary problems. We evaluate them on artificial and real-world datasets, using both One–Vs–One (OVO) and One–Vs–All (OVA) strategies. Our results suggest that the OVO decomposition is most effective for feature selection on multiclass problems. We also find that in most situations the new K-First strategy can find better subsets of features than the traditional weight average approach.
Fil: Izetta Riera, Carlos Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
Fil: Verdes, Pablo Fabian. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
Fil: Granitto, Pablo Miguel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina - Materia
-
FEATURE SELECTION
MULTICLASS PROBLEMS
SUPPORT VECTOR MACHINE - Nivel de accesibilidad
- acceso embargado
- Condiciones de uso
- https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
- Repositorio
- Institución
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI Identificador
- oai:ri.conicet.gov.ar:11336/50349
Ver los metadatos del registro completo
id |
CONICETDig_37b6226af028997b3541724d9f04e76b |
---|---|
oai_identifier_str |
oai:ri.conicet.gov.ar:11336/50349 |
network_acronym_str |
CONICETDig |
repository_id_str |
3498 |
network_name_str |
CONICET Digital (CONICET) |
spelling |
Improved multiclass feature selection via list combinationIzetta Riera, Carlos JavierVerdes, Pablo FabianGranitto, Pablo MiguelFEATURE SELECTIONMULTICLASS PROBLEMSSUPPORT VECTOR MACHINEhttps://purl.org/becyt/ford/1.2https://purl.org/becyt/ford/1Feature selection is a crucial machine learning technique aimed at reducing the dimensionality of the input space. By discarding useless or redundant variables, not only it improves model performance but also facilitates its interpretability. The well-known Support Vector Machines–Recursive Feature Elimination (SVM-RFE) algorithm provides good performance with moderate computational efforts, in particular for wide datasets. When using SVM-RFE on a multiclass classification problem, the usual strategy is to decompose it into a series of binary ones, and to generate an importance statistics for each feature on each binary problem. These importances are then averaged over the set of binary problems to synthesize a single value for feature ranking. In some cases, however, this procedure can lead to poor selection. In this paper we discuss six new strategies, based on list combination, designed to yield improved selections starting from the importances given by the binary problems. We evaluate them on artificial and real-world datasets, using both One–Vs–One (OVO) and One–Vs–All (OVA) strategies. Our results suggest that the OVO decomposition is most effective for feature selection on multiclass problems. We also find that in most situations the new K-First strategy can find better subsets of features than the traditional weight average approach.Fil: Izetta Riera, Carlos Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; ArgentinaFil: Verdes, Pablo Fabian. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; ArgentinaFil: Granitto, Pablo Miguel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; ArgentinaPergamon-Elsevier Science Ltd2017-12info:eu-repo/date/embargoEnd/2018-07-01info:eu-repo/semantics/articleinfo:eu-repo/semantics/publishedVersionhttp://purl.org/coar/resource_type/c_6501info:ar-repo/semantics/articuloapplication/pdfapplication/pdfhttp://hdl.handle.net/11336/50349Izetta Riera, Carlos Javier; Verdes, Pablo Fabian; Granitto, Pablo Miguel; Improved multiclass feature selection via list combination; Pergamon-Elsevier Science Ltd; Expert Systems with Applications; 88; 12-2017; 205-2160957-4174CONICET DigitalCONICETenginfo:eu-repo/semantics/altIdentifier/doi/10.1016/j.eswa.2017.06.043info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0957417417304670info:eu-repo/semantics/embargoedAccesshttps://creativecommons.org/licenses/by-nc-nd/2.5/ar/reponame:CONICET Digital (CONICET)instname:Consejo Nacional de Investigaciones Científicas y Técnicas2025-09-29T09:50:37Zoai:ri.conicet.gov.ar:11336/50349instacron:CONICETInstitucionalhttp://ri.conicet.gov.ar/Organismo científico-tecnológicoNo correspondehttp://ri.conicet.gov.ar/oai/requestdasensio@conicet.gov.ar; lcarlino@conicet.gov.arArgentinaNo correspondeNo correspondeNo correspondeopendoar:34982025-09-29 09:50:38.166CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicasfalse |
dc.title.none.fl_str_mv |
Improved multiclass feature selection via list combination |
title |
Improved multiclass feature selection via list combination |
spellingShingle |
Improved multiclass feature selection via list combination Izetta Riera, Carlos Javier FEATURE SELECTION MULTICLASS PROBLEMS SUPPORT VECTOR MACHINE |
title_short |
Improved multiclass feature selection via list combination |
title_full |
Improved multiclass feature selection via list combination |
title_fullStr |
Improved multiclass feature selection via list combination |
title_full_unstemmed |
Improved multiclass feature selection via list combination |
title_sort |
Improved multiclass feature selection via list combination |
dc.creator.none.fl_str_mv |
Izetta Riera, Carlos Javier Verdes, Pablo Fabian Granitto, Pablo Miguel |
author |
Izetta Riera, Carlos Javier |
author_facet |
Izetta Riera, Carlos Javier Verdes, Pablo Fabian Granitto, Pablo Miguel |
author_role |
author |
author2 |
Verdes, Pablo Fabian Granitto, Pablo Miguel |
author2_role |
author author |
dc.subject.none.fl_str_mv |
FEATURE SELECTION MULTICLASS PROBLEMS SUPPORT VECTOR MACHINE |
topic |
FEATURE SELECTION MULTICLASS PROBLEMS SUPPORT VECTOR MACHINE |
purl_subject.fl_str_mv |
https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1 |
dc.description.none.fl_txt_mv |
Feature selection is a crucial machine learning technique aimed at reducing the dimensionality of the input space. By discarding useless or redundant variables, not only it improves model performance but also facilitates its interpretability. The well-known Support Vector Machines–Recursive Feature Elimination (SVM-RFE) algorithm provides good performance with moderate computational efforts, in particular for wide datasets. When using SVM-RFE on a multiclass classification problem, the usual strategy is to decompose it into a series of binary ones, and to generate an importance statistics for each feature on each binary problem. These importances are then averaged over the set of binary problems to synthesize a single value for feature ranking. In some cases, however, this procedure can lead to poor selection. In this paper we discuss six new strategies, based on list combination, designed to yield improved selections starting from the importances given by the binary problems. We evaluate them on artificial and real-world datasets, using both One–Vs–One (OVO) and One–Vs–All (OVA) strategies. Our results suggest that the OVO decomposition is most effective for feature selection on multiclass problems. We also find that in most situations the new K-First strategy can find better subsets of features than the traditional weight average approach. Fil: Izetta Riera, Carlos Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina Fil: Verdes, Pablo Fabian. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina Fil: Granitto, Pablo Miguel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina |
description |
Feature selection is a crucial machine learning technique aimed at reducing the dimensionality of the input space. By discarding useless or redundant variables, not only it improves model performance but also facilitates its interpretability. The well-known Support Vector Machines–Recursive Feature Elimination (SVM-RFE) algorithm provides good performance with moderate computational efforts, in particular for wide datasets. When using SVM-RFE on a multiclass classification problem, the usual strategy is to decompose it into a series of binary ones, and to generate an importance statistics for each feature on each binary problem. These importances are then averaged over the set of binary problems to synthesize a single value for feature ranking. In some cases, however, this procedure can lead to poor selection. In this paper we discuss six new strategies, based on list combination, designed to yield improved selections starting from the importances given by the binary problems. We evaluate them on artificial and real-world datasets, using both One–Vs–One (OVO) and One–Vs–All (OVA) strategies. Our results suggest that the OVO decomposition is most effective for feature selection on multiclass problems. We also find that in most situations the new K-First strategy can find better subsets of features than the traditional weight average approach. |
publishDate |
2017 |
dc.date.none.fl_str_mv |
2017-12 info:eu-repo/date/embargoEnd/2018-07-01 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format |
article |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/50349 Izetta Riera, Carlos Javier; Verdes, Pablo Fabian; Granitto, Pablo Miguel; Improved multiclass feature selection via list combination; Pergamon-Elsevier Science Ltd; Expert Systems with Applications; 88; 12-2017; 205-216 0957-4174 CONICET Digital CONICET |
url |
http://hdl.handle.net/11336/50349 |
identifier_str_mv |
Izetta Riera, Carlos Javier; Verdes, Pablo Fabian; Granitto, Pablo Miguel; Improved multiclass feature selection via list combination; Pergamon-Elsevier Science Ltd; Expert Systems with Applications; 88; 12-2017; 205-216 0957-4174 CONICET Digital CONICET |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/doi/10.1016/j.eswa.2017.06.043 info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0957417417304670 |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/embargoedAccess https://creativecommons.org/licenses/by-nc-nd/2.5/ar/ |
eu_rights_str_mv |
embargoedAccess |
rights_invalid_str_mv |
https://creativecommons.org/licenses/by-nc-nd/2.5/ar/ |
dc.format.none.fl_str_mv |
application/pdf application/pdf |
dc.publisher.none.fl_str_mv |
Pergamon-Elsevier Science Ltd |
publisher.none.fl_str_mv |
Pergamon-Elsevier Science Ltd |
dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str |
CONICET Digital (CONICET) |
collection |
CONICET Digital (CONICET) |
instname_str |
Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv |
CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |
_version_ |
1844613560592760832 |
score |
13.070432 |