Silhouette + Attraction: A Simple and Effective Method for Text Clustering
- Authors
- Errecalde, Marcelo L.; Cagnina, Leticia Cecilia; Rosso, Paolo
- Publication year
- 2015
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- This article presents Sil-Att, a simple and effective method for text clustering based on two main concepts: the silhouette coefficient and the idea of attraction. Combining both principles yields a general technique that can be used either as a boosting method, which improves the results of other clustering algorithms, or as an independent clustering algorithm. The experimental work shows that Sil-Att obtains high-quality results on text corpora with very different characteristics. Furthermore, its stable performance on all the corpora considered indicates that it is a very robust method. This is a notable advantage of Sil-Att over the other algorithms used in the experiments, whose performance depends heavily on specific characteristics of the corpora being considered.
Affiliation: Errecalde, Marcelo L. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo En Inteligencia Computacional; Argentina
Affiliation: Cagnina, Leticia Cecilia. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo En Inteligencia Computacional; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Rosso, Paolo. Universidad Politecnica de Valencia; Spain
- Subject
- Clustering; Short Texts Corpora; Attraction; Silhouette
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- CONICET Digital (CONICET)
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/7135
View the full record metadata
| Field | Value |
|---|---|
| id | CONICETDig_a0eafe318e475318ec978473581364fb |
| oai_identifier_str | oai:ri.conicet.gov.ar:11336/7135 |
| network_acronym_str | CONICETDig |
| repository_id_str | 3498 |
| network_name_str | CONICET Digital (CONICET) |
| dc.title.none.fl_str_mv | Silhouette + Attraction: A Simple and Effective Method for Text Clustering |
| title | Silhouette + Attraction: A Simple and Effective Method for Text Clustering |
| spellingShingle | Silhouette + Attraction: A Simple and Effective Method for Text Clustering; Errecalde, Marcelo L.; Clustering; Short Texts Corpora; Attraction; Silhouette |
| title_short | Silhouette + Attraction: A Simple and Effective Method for Text Clustering |
| title_full | Silhouette + Attraction: A Simple and Effective Method for Text Clustering |
| title_fullStr | Silhouette + Attraction: A Simple and Effective Method for Text Clustering |
| title_full_unstemmed | Silhouette + Attraction: A Simple and Effective Method for Text Clustering |
| title_sort | Silhouette + Attraction: A Simple and Effective Method for Text Clustering |
| dc.creator.none.fl_str_mv | Errecalde, Marcelo L.; Cagnina, Leticia Cecilia; Rosso, Paolo |
| author | Errecalde, Marcelo L. |
| author_facet | Errecalde, Marcelo L.; Cagnina, Leticia Cecilia; Rosso, Paolo |
| author_role | author |
| author2 | Cagnina, Leticia Cecilia; Rosso, Paolo |
| author2_role | author; author |
| dc.subject.none.fl_str_mv | Clustering; Short Texts Corpora; Attraction; Silhouette |
| topic | Clustering; Short Texts Corpora; Attraction; Silhouette |
| purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1 |
| dc.description.none.fl_txt_mv | This article presents Sil-Att, a simple and effective method for text clustering based on two main concepts: the silhouette coefficient and the idea of attraction. Combining both principles yields a general technique that can be used either as a boosting method, which improves the results of other clustering algorithms, or as an independent clustering algorithm. The experimental work shows that Sil-Att obtains high-quality results on text corpora with very different characteristics. Furthermore, its stable performance on all the corpora considered indicates that it is a very robust method. This is a notable advantage of Sil-Att over the other algorithms used in the experiments, whose performance depends heavily on specific characteristics of the corpora being considered. Affiliation: Errecalde, Marcelo L. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo En Inteligencia Computacional; Argentina. Affiliation: Cagnina, Leticia Cecilia. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo En Inteligencia Computacional; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Affiliation: Rosso, Paolo. Universidad Politecnica de Valencia; Spain |
| description | This article presents Sil-Att, a simple and effective method for text clustering based on two main concepts: the silhouette coefficient and the idea of attraction. Combining both principles yields a general technique that can be used either as a boosting method, which improves the results of other clustering algorithms, or as an independent clustering algorithm. The experimental work shows that Sil-Att obtains high-quality results on text corpora with very different characteristics. Furthermore, its stable performance on all the corpora considered indicates that it is a very robust method. This is a notable advantage of Sil-Att over the other algorithms used in the experiments, whose performance depends heavily on specific characteristics of the corpora being considered. |
| publishDate | 2015 |
| dc.date.none.fl_str_mv | 2015-08-14 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
| format | article |
| status_str | publishedVersion |
| dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/7135 Errecalde, Marcelo L.; Cagnina, Leticia Cecilia; Rosso, Paolo; Silhouette + Attraction: A Simple and Effective Method for Text Clustering; Cambridge University Press; Natural Language Engineering; 1; 14-8-2015; 1-40 1351-3249 |
| url | http://hdl.handle.net/11336/7135 |
| identifier_str_mv | Errecalde, Marcelo L.; Cagnina, Leticia Cecilia; Rosso, Paolo; Silhouette + Attraction: A Simple and Effective Method for Text Clustering; Cambridge University Press; Natural Language Engineering; 1; 14-8-2015; 1-40 1351-3249 |
| dc.language.none.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/doi/10.1017/S1351324915000273 info:eu-repo/semantics/altIdentifier/url/http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=9910907 |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
| eu_rights_str_mv | openAccess |
| rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
| dc.format.none.fl_str_mv | application/pdf application/zip application/pdf |
| dc.publisher.none.fl_str_mv | Cambridge University Press |
| publisher.none.fl_str_mv | Cambridge University Press |
| dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
| reponame_str | CONICET Digital (CONICET) |
| collection | CONICET Digital (CONICET) |
| instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |
| _version_ | 1848598409746841600 |
| score | 13.25334 |
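
The description in this record names the silhouette coefficient as one of the two concepts behind Sil-Att but does not define it. As a reading aid only, the following is a minimal sketch of the standard silhouette coefficient, s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from item i to the other members of its own cluster and b(i) is the smallest mean distance from i to the members of any other cluster. It is not the Sil-Att algorithm and does not cover the attraction step. The function names, the Euclidean distance, and the toy 2-D points are assumptions made for illustration; text clustering would normally use a distance defined on document vectors.

```python
from math import sqrt


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette(points, labels, dist=euclidean):
    """Return the silhouette coefficient s(i) of every point.

    s(i) = (b - a) / max(a, b), with a the mean distance to the other
    members of the point's own cluster and b the lowest mean distance
    to the members of any other cluster.
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        # Common convention: singleton clusters (or a single cluster) score 0.
        if not own or len(clusters) < 2:
            scores.append(0.0)
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Toy data (hypothetical): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette(points, [0, 0, 1, 1])  # sensible grouping
bad = silhouette(points, [0, 1, 0, 1])   # grouping that splits each pair
```

Values close to 1 indicate that a point sits well inside its cluster, and negative values indicate that it is closer on average to another cluster, which is why the sensible grouping scores high for every point and the split grouping scores below zero.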