A PSO-based clustering approach assisted by initial clustering information

Authors
Velázquez, Carlos; Cagnina, Leticia; Errecalde, Marcelo Luis
Publication Year
2012
Language
English
Format
conference paper
Status
Published version
Description
Clustering of short texts is an important research area because of its applicability to information retrieval and text mining. CLUDIPSO, a discrete Particle Swarm Optimization algorithm, was proposed for this task. Initial results showed that CLUDIPSO performed well on small collections of short texts. However, later work revealed some drawbacks when dealing with larger collections. In this paper we present a hybridization of CLUDIPSO to overcome these drawbacks by providing information in the initial cycles of the algorithm, which avoids a purely random search and thus speeds up convergence. This is achieved by including in the initial population of the algorithm a pre-clustering obtained with the Expectation-Maximization method. The results obtained with the hybrid version show a significant improvement over those obtained with the original version.
Track: Workshop on Databases and Data Mining (Workshop Bases de datos y minería de datos, WBDDM)
Red de Universidades con Carreras en Informática (RedUNCI)
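The seeding idea described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `seed_population`, the particle encoding (one cluster label per document), the choice to seed exactly one particle, and the sample labels are all assumptions made for illustration. The pre-clustering is taken as given, standing in for the output of an Expectation-Maximization run.

```python
import random


def seed_population(pre_clustering, k, swarm_size, rng=None):
    """Build an initial swarm of discrete particles (illustrative only).

    Each particle is a list in which position i holds the cluster label
    (0 .. k-1) assigned to document i. The first particle is a copy of the
    supplied pre-clustering; the remaining ones are drawn at random.
    Assumption: a single seeded particle; the paper may seed differently.
    """
    rng = rng or random.Random()
    if swarm_size < 1:
        raise ValueError("swarm_size must be at least 1")
    if any(not 0 <= label < k for label in pre_clustering):
        raise ValueError("pre-clustering labels must lie in 0..k-1")

    n_docs = len(pre_clustering)
    swarm = [list(pre_clustering)]  # informed particle
    for _ in range(swarm_size - 1):
        # Uninformed particles: uniformly random label per document.
        swarm.append([rng.randrange(k) for _ in range(n_docs)])
    return swarm


# Hypothetical pre-clustering of six short texts into k = 3 groups,
# standing in for labels produced by an EM run.
em_labels = [0, 0, 1, 1, 2, 2]
swarm = seed_population(em_labels, k=3, swarm_size=5, rng=random.Random(0))
```

With this layout the swarm starts with at least one candidate that already reflects the pre-clustering, while the random particles keep some diversity in the search.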
Subject
Computer Science
Access level
Open access
License
Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5)
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI Identifier
oai:sedici.unlp.edu.ar:10915/23753