A PSO-based clustering approach assisted by initial clustering information
- Authors
- Velázquez, Carlos; Cagnina, Leticia; Errecalde, Marcelo Luis
- Publication year
- 2012
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- Clustering of short texts is an important research area because of its applicability to information retrieval and text mining. CLUDIPSO, a discrete Particle Swarm Optimization algorithm, was proposed to cluster short texts. Initial results showed that CLUDIPSO performed well on small collections of short texts. However, later work revealed some drawbacks when dealing with larger collections. In this paper we present a hybridization of CLUDIPSO that overcomes these drawbacks by providing information in the initial cycles of the algorithm, which avoids a random search and thus speeds up convergence. This is achieved by including in the algorithm's initial population a pre-clustering obtained with the Expectation-Maximization method. The results obtained with the hybrid version show a significant improvement over those obtained with the original version.
Track: Workshop Bases de datos y minería de datos (WBDDM)
Red de Universidades con Carreras en Informática (RedUNCI)
- Subject
- Computer Science
Short-Text Clustering
Bio-Inspired Methods
PSO-based Clustering
Hybrid Methods
Expectation-Maximization
Initialization Approaches
Clustering
databases
Data mining
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/23753
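
The seeding idea summarized in the description (placing a pre-clustering into the initial population of the swarm) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not describe CLUDIPSO's particle encoding, so the sketch assumes each particle is a list holding one cluster label per document, assumes a single seeded particle, and takes the pre-clustering as given (for example, produced by an Expectation-Maximization run). The function name `seed_population` and all of its parameters are hypothetical.

```python
import random


def seed_population(pre_clustering, pop_size, n_clusters, rng=None):
    """Build an initial swarm in which one particle is the pre-clustering.

    Assumption: a particle is a list of cluster labels, one per document.
    """
    rng = rng or random.Random()
    n_docs = len(pre_clustering)

    # Particle 0 carries the externally obtained clustering (e.g. from EM).
    population = [list(pre_clustering)]

    # The remaining particles are random assignments, as in an unseeded start.
    while len(population) < pop_size:
        population.append([rng.randrange(n_clusters) for _ in range(n_docs)])
    return population


# Hypothetical pre-clustering of five documents into three clusters.
pre = [0, 0, 1, 1, 2]
swarm = seed_population(pre, pop_size=4, n_clusters=3, rng=random.Random(0))
```

Keeping the other particles random is a design choice in this sketch: it preserves diversity in the swarm while the seeded particle supplies a non-random starting point for the search.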
Full record metadata
id |
SEDICI_9ff018b85fa42be4a6c03d547a9f9461 |
---|---|
oai_identifier_str |
oai:sedici.unlp.edu.ar:10915/23753 |
network_acronym_str |
SEDICI |
repository_id_str |
1329 |
network_name_str |
SEDICI (UNLP) |
dc.title.none.fl_str_mv |
A PSO-based clustering approach assisted by initial clustering information |
dc.creator.none.fl_str_mv |
Velázquez, Carlos; Cagnina, Leticia; Errecalde, Marcelo Luis |
dc.subject.none.fl_str_mv |
Computer Science; Short-Text Clustering; Bio-Inspired Methods; PSO-based Clustering; Hybrid Methods; Expectation-Maximization; Initialization Approaches; Clustering; databases; Data mining |
dc.description.none.fl_txt_mv |
Clustering of short texts is an important research area because of its applicability to information retrieval and text mining. CLUDIPSO, a discrete Particle Swarm Optimization algorithm, was proposed to cluster short texts. Initial results showed that CLUDIPSO performed well on small collections of short texts. However, later work revealed some drawbacks when dealing with larger collections. In this paper we present a hybridization of CLUDIPSO that overcomes these drawbacks by providing information in the initial cycles of the algorithm, which avoids a random search and thus speeds up convergence. This is achieved by including in the algorithm's initial population a pre-clustering obtained with the Expectation-Maximization method. The results obtained with the hybrid version show a significant improvement over those obtained with the original version. Track: Workshop Bases de datos y minería de datos (WBDDM); Red de Universidades con Carreras en Informática (RedUNCI) |
publishDate |
2012 |
dc.date.none.fl_str_mv |
2012-10 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://sedici.unlp.edu.ar/handle/10915/23753 |
dc.language.none.fl_str_mv |
eng |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/2.5/ar/; Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5) |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
repository.name.fl_str_mv |
SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv |
alira@sedici.unlp.edu.ar |