Improved multiclass feature selection via list combination

Authors: Izetta Riera, Carlos Javier; Verdes, Pablo Fabian; Granitto, Pablo Miguel
Publication year: 2017
Language: English
Resource type: article
Status: published version
Description: Feature selection is a crucial machine learning technique aimed at reducing the dimensionality of the input space. By discarding useless or redundant variables, it not only improves model performance but also facilitates model interpretability. The well-known Support Vector Machines–Recursive Feature Elimination (SVM-RFE) algorithm provides good performance with moderate computational effort, in particular for wide datasets. When using SVM-RFE on a multiclass classification problem, the usual strategy is to decompose it into a series of binary problems and to generate an importance statistic for each feature on each binary problem. These importances are then averaged over the set of binary problems to synthesize a single value for feature ranking. In some cases, however, this procedure can lead to poor selection. In this paper we discuss six new strategies, based on list combination, designed to yield improved selections starting from the importances given by the binary problems. We evaluate them on artificial and real-world datasets, using both One–Vs–One (OVO) and One–Vs–All (OVA) strategies. Our results suggest that the OVO decomposition is most effective for feature selection on multiclass problems. We also find that in most situations the new K-First strategy can find better subsets of features than the traditional weight average approach.
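As an illustration of the traditional weight average approach mentioned in the description, the following is a minimal sketch, not the authors' implementation. It assumes that each One–Vs–One binary problem has already produced one non-negative importance value per feature; the function names, the input format, and the numbers are hypothetical and made up for this example. It does not implement K-First or any of the six list-combination strategies, which the record does not define.

```python
from itertools import combinations


def ovo_problems(classes):
    """Enumerate the One-Vs-One binary problems for a list of class labels."""
    return list(combinations(classes, 2))


def average_importances(importances):
    """Average per-problem feature importances into one score per feature.

    `importances` maps each binary problem to a list with one importance
    value per feature (hypothetical input format for this sketch).
    """
    n_problems = len(importances)
    n_features = len(next(iter(importances.values())))
    return [
        sum(scores[j] for scores in importances.values()) / n_problems
        for j in range(n_features)
    ]


def rank_features(scores):
    """Return feature indices sorted from most to least important."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Made-up importances for 3 classes (3 OVO problems) and 4 features.
problems = ovo_problems(["a", "b", "c"])
importances = {
    problems[0]: [0.9, 0.1, 0.0, 0.2],
    problems[1]: [0.8, 0.1, 0.0, 0.3],
    problems[2]: [0.0, 0.1, 1.0, 0.4],
}

avg_scores = average_importances(importances)
ranking = rank_features(avg_scores)
print(ranking)
```

In this toy input, feature 2 is the top feature for one binary problem and irrelevant for the other two, so averaging dilutes its score to one third of its peak value. This only illustrates how averaging mixes the per-problem lists; whether it harms selection on real data is what the article evaluates.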
Affiliation: Izetta Riera, Carlos Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
Affiliation: Verdes, Pablo Fabian. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
Affiliation: Granitto, Pablo Miguel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
Subject: FEATURE SELECTION
MULTICLASS PROBLEMS
SUPPORT VECTOR MACHINE
Access level: embargoed access
Terms of use: https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
Repository
Institution: Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier: oai:ri.conicet.gov.ar:11336/50349

id	CONICETDig_37b6226af028997b3541724d9f04e76b
oai_identifier_str	oai:ri.conicet.gov.ar:11336/50349
network_acronym_str	CONICETDig
repository_id_str	3498
network_name_str	CONICET Digital (CONICET)
dc.title.none.fl_str_mv	Improved multiclass feature selection via list combination
dc.creator.none.fl_str_mv	Izetta Riera, Carlos Javier Verdes, Pablo Fabian Granitto, Pablo Miguel
author	Izetta Riera, Carlos Javier
author_facet	Izetta Riera, Carlos Javier Verdes, Pablo Fabian Granitto, Pablo Miguel
author_role	author
author2	Verdes, Pablo Fabian Granitto, Pablo Miguel
author2_role	author author
dc.subject.none.fl_str_mv	FEATURE SELECTION MULTICLASS PROBLEMS SUPPORT VECTOR MACHINE
topic	FEATURE SELECTION MULTICLASS PROBLEMS SUPPORT VECTOR MACHINE
purl_subject.fl_str_mv	https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1
publishDate	2017
dc.date.none.fl_str_mv	2017-12 info:eu-repo/date/embargoEnd/2018-07-01
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://hdl.handle.net/11336/50349 Izetta Riera, Carlos Javier; Verdes, Pablo Fabian; Granitto, Pablo Miguel; Improved multiclass feature selection via list combination; Pergamon-Elsevier Science Ltd; Expert Systems with Applications; 88; 12-2017; 205-216 0957-4174 CONICET Digital CONICET
url	http://hdl.handle.net/11336/50349
identifier_str_mv	Izetta Riera, Carlos Javier; Verdes, Pablo Fabian; Granitto, Pablo Miguel; Improved multiclass feature selection via list combination; Pergamon-Elsevier Science Ltd; Expert Systems with Applications; 88; 12-2017; 205-216 0957-4174 CONICET Digital CONICET
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/doi/10.1016/j.eswa.2017.06.043 info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0957417417304670
dc.rights.none.fl_str_mv	info:eu-repo/semantics/embargoedAccess https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
eu_rights_str_mv	embargoedAccess
rights_invalid_str_mv	https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
dc.format.none.fl_str_mv	application/pdf application/pdf
dc.publisher.none.fl_str_mv	Pergamon-Elsevier Science Ltd
publisher.none.fl_str_mv	Pergamon-Elsevier Science Ltd
dc.source.none.fl_str_mv	reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str	CONICET Digital (CONICET)
collection	CONICET Digital (CONICET)
instname_str	Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv	CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv	dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar