How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters
- Authors
- Baya, Ariel Emilio; Granitto, Pablo Miguel
- Year of publication
- 2013
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Clustering validation indexes are intended to assess the goodness of clustering results. Many methods used to estimate the number of clusters rely on a validation index as a key element to find the correct answer. This paper presents a new validation index based on graph concepts, which has been designed to find arbitrary shaped clusters by exploiting the spatial layout of the patterns and their clustering label. This new clustering index is combined with a solid statistical detection framework, the Gap Statistic. The resulting method is able to find the right number of arbitrary shaped clusters in diverse situations, as we show with examples where this information is available. A comparison with several relevant validation methods is carried out using artificial and gene expression datasets. The results are very encouraging, showing that the underlying structure in the data can be more accurately detected with the new clustering index. Our gene expression data results also indicate that this new index is stable under perturbation of the input data.
Affiliation: Baya, Ariel Emilio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico - Conicet - Rosario. Instituto Rosario de Investigaciones En Ciencias de la Educación; Argentina
Affiliation: Granitto, Pablo Miguel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico - Conicet - Rosario. Instituto Rosario de Investigaciones En Ciencias de la Educación; Argentina
- Subject
- CLUSTERING; GENOMIC DATA; VALIDATION INDEX
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- CONICET Digital (CONICET)
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/1459
Full record metadata
| Field | Value |
|---|---|
| id | CONICETDig_c63088dc940c96e55212676509e97a0a |
| oai_identifier_str | oai:ri.conicet.gov.ar:11336/1459 |
| network_acronym_str | CONICETDig |
| repository_id_str | 3498 |
| network_name_str | CONICET Digital (CONICET) |
| spelling | How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters; Baya, Ariel Emilio; Granitto, Pablo Miguel; CLUSTERING; GENOMIC DATA; VALIDATION INDEX; https://purl.org/becyt/ford/2.2; https://purl.org/becyt/ford/2; Clustering validation indexes are intended to assess the goodness of clustering results. Many methods used to estimate the number of clusters rely on a validation index as a key element to find the correct answer. This paper presents a new validation index based on graph concepts, which has been designed to find arbitrary shaped clusters by exploiting the spatial layout of the patterns and their clustering label. This new clustering index is combined with a solid statistical detection framework, the Gap Statistic. The resulting method is able to find the right number of arbitrary shaped clusters in diverse situations, as we show with examples where this information is available. A comparison with several relevant validation methods is carried out using artificial and gene expression datasets. The results are very encouraging, showing that the underlying structure in the data can be more accurately detected with the new clustering index. Our gene expression data results also indicate that this new index is stable under perturbation of the input data. Affiliation: Baya, Ariel Emilio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico - Conicet - Rosario. Instituto Rosario de Investigaciones En Ciencias de la Educación; Argentina. Affiliation: Granitto, Pablo Miguel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico - Conicet - Rosario. Instituto Rosario de Investigaciones En Ciencias de la Educación; Argentina; IEEE Computer Society; 2013-04; info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo; application/pdf; application/pdf; http://hdl.handle.net/11336/1459; Baya, Ariel Emilio; Granitto, Pablo Miguel; How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters; IEEE Computer Society; Ieee-acm Transactions On Computational Biology And Bioinformatics; 10; 2; 4-2013; 401-414; 1545-5963; eng; info:eu-repo/semantics/altIdentifier/doi/10.1109/TCBB.2013.32; info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-sa/2.5/ar/; reponame:CONICET Digital (CONICET); instname:Consejo Nacional de Investigaciones Científicas y Técnicas; 2025-10-15T14:39:10Z; oai:ri.conicet.gov.ar:11336/1459; instacron:CONICET; Institutional; http://ri.conicet.gov.ar/; Scientific-technological organization; Not applicable; http://ri.conicet.gov.ar/oai/request; dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar; Argentina; Not applicable; Not applicable; Not applicable; opendoar:3498; 2025-10-15 14:39:10.484; CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas; false |
| dc.title.none.fl_str_mv | How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters |
| title | How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters |
| spellingShingle | How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters; Baya, Ariel Emilio; CLUSTERING; GENOMIC DATA; VALIDATION INDEX |
| title_short | How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters |
| title_full | How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters |
| title_fullStr | How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters |
| title_full_unstemmed | How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters |
| title_sort | How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters |
| dc.creator.none.fl_str_mv | Baya, Ariel Emilio; Granitto, Pablo Miguel |
| author | Baya, Ariel Emilio |
| author_facet | Baya, Ariel Emilio; Granitto, Pablo Miguel |
| author_role | author |
| author2 | Granitto, Pablo Miguel |
| author2_role | author |
| dc.subject.none.fl_str_mv | CLUSTERING; GENOMIC DATA; VALIDATION INDEX |
| topic | CLUSTERING; GENOMIC DATA; VALIDATION INDEX |
| purl_subject.fl_str_mv | https://purl.org/becyt/ford/2.2; https://purl.org/becyt/ford/2 |
| dc.description.none.fl_txt_mv | Clustering validation indexes are intended to assess the goodness of clustering results. Many methods used to estimate the number of clusters rely on a validation index as a key element to find the correct answer. This paper presents a new validation index based on graph concepts, which has been designed to find arbitrary shaped clusters by exploiting the spatial layout of the patterns and their clustering label. This new clustering index is combined with a solid statistical detection framework, the Gap Statistic. The resulting method is able to find the right number of arbitrary shaped clusters in diverse situations, as we show with examples where this information is available. A comparison with several relevant validation methods is carried out using artificial and gene expression datasets. The results are very encouraging, showing that the underlying structure in the data can be more accurately detected with the new clustering index. Our gene expression data results also indicate that this new index is stable under perturbation of the input data. Affiliation: Baya, Ariel Emilio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico - Conicet - Rosario. Instituto Rosario de Investigaciones En Ciencias de la Educación; Argentina. Affiliation: Granitto, Pablo Miguel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico - Conicet - Rosario. Instituto Rosario de Investigaciones En Ciencias de la Educación; Argentina |
| description | Clustering validation indexes are intended to assess the goodness of clustering results. Many methods used to estimate the number of clusters rely on a validation index as a key element to find the correct answer. This paper presents a new validation index based on graph concepts, which has been designed to find arbitrary shaped clusters by exploiting the spatial layout of the patterns and their clustering label. This new clustering index is combined with a solid statistical detection framework, the Gap Statistic. The resulting method is able to find the right number of arbitrary shaped clusters in diverse situations, as we show with examples where this information is available. A comparison with several relevant validation methods is carried out using artificial and gene expression datasets. The results are very encouraging, showing that the underlying structure in the data can be more accurately detected with the new clustering index. Our gene expression data results also indicate that this new index is stable under perturbation of the input data. |
| publishDate | 2013 |
| dc.date.none.fl_str_mv | 2013-04 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
| format | article |
| status_str | publishedVersion |
| dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/1459; Baya, Ariel Emilio; Granitto, Pablo Miguel; How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters; IEEE Computer Society; IEEE/ACM Transactions on Computational Biology and Bioinformatics; 10; 2; 4-2013; 401-414; 1545-5963 |
| url | http://hdl.handle.net/11336/1459 |
| identifier_str_mv | Baya, Ariel Emilio; Granitto, Pablo Miguel; How Many Clusters: A Validation Index for Arbitrary-Shaped Clusters; IEEE Computer Society; IEEE/ACM Transactions on Computational Biology and Bioinformatics; 10; 2; 4-2013; 401-414; 1545-5963 |
| dc.language.none.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/doi/10.1109/TCBB.2013.32 |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
| eu_rights_str_mv | openAccess |
| rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
| dc.format.none.fl_str_mv | application/pdf; application/pdf |
| dc.publisher.none.fl_str_mv | IEEE Computer Society |
| publisher.none.fl_str_mv | IEEE Computer Society |
| dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET); instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
| reponame_str | CONICET Digital (CONICET) |
| collection | CONICET Digital (CONICET) |
| instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |
| _version_ | 1846082875010580480 |
| score | 13.22299 |
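
Illustrative note: the description in this record says the new validation index is combined with the Gap Statistic to estimate the number of clusters. The record does not define the graph-based index itself, so the sketch below is not the authors' method. It assumes the standard Gap Statistic with a plain k-means clustering and the within-cluster sum of squares as the dispersion measure, compared against reference data drawn uniformly from the bounding box of the input. All function names (`kmeans_labels`, `log_dispersion`, `gap_statistic`, `choose_k`) and parameters are hypothetical and chosen only for this example.

```python
import math
import random


def kmeans_labels(points, k, rng, iters=25):
    """Plain Lloyd's k-means on a list of tuples; returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members)
                                   for axis in zip(*members))
    return labels


def log_dispersion(points, labels):
    """log W_k, with W_k the within-cluster sum of squared distances to centroids."""
    total = 0.0
    for c in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = tuple(sum(axis) / len(members) for axis in zip(*members))
        total += sum(math.dist(p, centroid) ** 2 for p in members)
    # Guard against log(0) when every cluster is a single point.
    return math.log(max(total, 1e-12))


def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return (gaps, errs) for k = 1..k_max; index 0 corresponds to k = 1."""
    rng = random.Random(seed)
    dims = list(zip(*points))
    lows = [min(d) for d in dims]
    highs = [max(d) for d in dims]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = log_dispersion(points, kmeans_labels(points, k, rng))
        ref_logs = []
        for _ in range(n_refs):
            # Reference set: uniform draws over the bounding box of the data.
            ref = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                   for _ in points]
            ref_logs.append(log_dispersion(ref, kmeans_labels(ref, k, rng)))
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((x - mean_ref) ** 2 for x in ref_logs) / n_refs)
        gaps.append(mean_ref - log_w)
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    return gaps, errs


def choose_k(gaps, errs):
    """Smallest k with Gap(k) >= Gap(k+1) - s(k+1); falls back to the largest k."""
    for k in range(1, len(gaps)):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k
    return len(gaps)


if __name__ == "__main__":
    demo_rng = random.Random(1)
    blobs = [(demo_rng.gauss(cx, 0.3), demo_rng.gauss(cy, 0.3))
             for cx, cy in [(0, 0), (5, 5)] for _ in range(30)]
    gaps, errs = gap_statistic(blobs, k_max=4)
    print("estimated k:", choose_k(gaps, errs))
```

In this sketch the dispersion function is the only part tied to compact, roughly spherical clusters. A validation index aimed at arbitrary-shaped clusters, such as the one the description refers to, would presumably take the place of `log_dispersion` while the comparison against reference data stays the same.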