Comparison of SVM and some older classification algorithms in text classification tasks
- Autores
- Colas, Fabrice; Brazdil, Pavel
- Año de publicación
- 2006
- Idioma
- inglés
- Tipo de recurso
- documento de conferencia
- Estado
- versión publicada
- Descripción
- Document classification has already been widely studied. In fact, some studies compared feature selection techniques or feature space transformation whereas some others compared the performance of different algorithms. Recently, following the rising interest towards the Support Vector Machine, various studies showed that SVM outperforms other classification algorithms. So should we just not bother about other classification algorithms and opt always for SVM We have decided to investigate this issue and compared SVM to kNN and naive Bayes on binary classification tasks. An important issue is to compare optimized versions of these algorithms, which is what we have done. Our results show all the classifiers achieved comparable performance on most problems. One surprising result is that SVM was not a clear winner, despite quite good overall performance. If a suitable preprocessing is used with kNN, this algorithm continues to achieve very good results and scales up well with the number of documents, which is not the case for SVM. As for naive Bayes, it also achieved good performance.
IFIP International Conference on Artificial Intelligence in Theory and Practice - Knowledge Acquisition and Data Mining
Red de Universidades con Carreras en Informática (RedUNCI) - Materia
-
Ciencias Informáticas
gestión de documentos
support vector machine
Model classification
Algorithms - Nivel de accesibilidad
- acceso abierto
- Condiciones de uso
- http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repositorio
.jpg)
- Institución
- Universidad Nacional de La Plata
- OAI Identificador
- oai:sedici.unlp.edu.ar:10915/23885
Ver los metadatos del registro completo
| id |
SEDICI_afe57df1e295b21d0676ec694e5314d3 |
|---|---|
| oai_identifier_str |
oai:sedici.unlp.edu.ar:10915/23885 |
| network_acronym_str |
SEDICI |
| repository_id_str |
1329 |
| network_name_str |
SEDICI (UNLP) |
| spelling |
Comparison of SVM and some older classification algorithms in text classification tasksColas, FabriceBrazdil, PavelCiencias Informáticasgestión de documentossupport vector machineModel classificationAlgorithmsDocument classification has already been widely studied. In fact, some studies compared feature selection techniques or feature space transformation whereas some others compared the performance of different algorithms. Recently, following the rising interest towards the Support Vector Machine, various studies showed that SVM outperforms other classification algorithms. So should we just not bother about other classification algorithms and opt always for SVM We have decided to investigate this issue and compared SVM to kNN and naive Bayes on binary classification tasks. An important issue is to compare optimized versions of these algorithms, which is what we have done. Our results show all the classifiers achieved comparable performance on most problems. One surprising result is that SVM was not a clear winner, despite quite good overall performance. If a suitable preprocessing is used with kNN, this algorithm continues to achieve very good results and scales up well with the number of documents, which is not the case for SVM. As for naive Bayes, it also achieved good performance.IFIP International Conference on Artificial Intelligence in Theory and Practice - Knowledge Acquisition and Data MiningRed de Universidades con Carreras en Informática (RedUNCI)2006-08info:eu-repo/semantics/conferenceObjectinfo:eu-repo/semantics/publishedVersionObjeto de conferenciahttp://purl.org/coar/resource_type/c_5794info:ar-repo/semantics/documentoDeConferenciaapplication/pdfhttp://sedici.unlp.edu.ar/handle/10915/23885enginfo:eu-repo/semantics/altIdentifier/isbn/0-387-34654-6info:eu-repo/semantics/openAccesshttp://creativecommons.org/licenses/by-nc-sa/2.5/ar/Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5)reponame:SEDICI (UNLP)instname:Universidad Nacional de La Platainstacron:UNLP2025-10-22T16:37:10Zoai:sedici.unlp.edu.ar:10915/23885Institucionalhttp://sedici.unlp.edu.ar/Universidad públicaNo correspondehttp://sedici.unlp.edu.ar/oai/snrdalira@sedici.unlp.edu.arArgentinaNo correspondeNo correspondeNo correspondeopendoar:13292025-10-22 16:37:10.308SEDICI (UNLP) - Universidad Nacional de La Platafalse |
| dc.title.none.fl_str_mv |
Comparison of SVM and some older classification algorithms in text classification tasks |
| title |
Comparison of SVM and some older classification algorithms in text classification tasks |
| spellingShingle |
Comparison of SVM and some older classification algorithms in text classification tasks Colas, Fabrice Ciencias Informáticas gestión de documentos support vector machine Model classification Algorithms |
| title_short |
Comparison of SVM and some older classification algorithms in text classification tasks |
| title_full |
Comparison of SVM and some older classification algorithms in text classification tasks |
| title_fullStr |
Comparison of SVM and some older classification algorithms in text classification tasks |
| title_full_unstemmed |
Comparison of SVM and some older classification algorithms in text classification tasks |
| title_sort |
Comparison of SVM and some older classification algorithms in text classification tasks |
| dc.creator.none.fl_str_mv |
Colas, Fabrice Brazdil, Pavel |
| author |
Colas, Fabrice |
| author_facet |
Colas, Fabrice Brazdil, Pavel |
| author_role |
author |
| author2 |
Brazdil, Pavel |
| author2_role |
author |
| dc.subject.none.fl_str_mv |
Ciencias Informáticas gestión de documentos support vector machine Model classification Algorithms |
| topic |
Ciencias Informáticas gestión de documentos support vector machine Model classification Algorithms |
| dc.description.none.fl_txt_mv |
Document classification has already been widely studied. In fact, some studies compared feature selection techniques or feature space transformation whereas some others compared the performance of different algorithms. Recently, following the rising interest towards the Support Vector Machine, various studies showed that SVM outperforms other classification algorithms. So should we just not bother about other classification algorithms and opt always for SVM We have decided to investigate this issue and compared SVM to kNN and naive Bayes on binary classification tasks. An important issue is to compare optimized versions of these algorithms, which is what we have done. Our results show all the classifiers achieved comparable performance on most problems. One surprising result is that SVM was not a clear winner, despite quite good overall performance. If a suitable preprocessing is used with kNN, this algorithm continues to achieve very good results and scales up well with the number of documents, which is not the case for SVM. As for naive Bayes, it also achieved good performance. IFIP International Conference on Artificial Intelligence in Theory and Practice - Knowledge Acquisition and Data Mining Red de Universidades con Carreras en Informática (RedUNCI) |
| description |
Document classification has already been widely studied. In fact, some studies compared feature selection techniques or feature space transformation whereas some others compared the performance of different algorithms. Recently, following the rising interest towards the Support Vector Machine, various studies showed that SVM outperforms other classification algorithms. So should we just not bother about other classification algorithms and opt always for SVM We have decided to investigate this issue and compared SVM to kNN and naive Bayes on binary classification tasks. An important issue is to compare optimized versions of these algorithms, which is what we have done. Our results show all the classifiers achieved comparable performance on most problems. One surprising result is that SVM was not a clear winner, despite quite good overall performance. If a suitable preprocessing is used with kNN, this algorithm continues to achieve very good results and scales up well with the number of documents, which is not the case for SVM. As for naive Bayes, it also achieved good performance. |
| publishDate |
2006 |
| dc.date.none.fl_str_mv |
2006-08 |
| dc.type.none.fl_str_mv |
info:eu-repo/semantics/conferenceObject info:eu-repo/semantics/publishedVersion Objeto de conferencia http://purl.org/coar/resource_type/c_5794 info:ar-repo/semantics/documentoDeConferencia |
| format |
conferenceObject |
| status_str |
publishedVersion |
| dc.identifier.none.fl_str_mv |
http://sedici.unlp.edu.ar/handle/10915/23885 |
| url |
http://sedici.unlp.edu.ar/handle/10915/23885 |
| dc.language.none.fl_str_mv |
eng |
| language |
eng |
| dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/isbn/0-387-34654-6 |
| dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by-nc-sa/2.5/ar/ Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5) |
| eu_rights_str_mv |
openAccess |
| rights_invalid_str_mv |
http://creativecommons.org/licenses/by-nc-sa/2.5/ar/ Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5) |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.source.none.fl_str_mv |
reponame:SEDICI (UNLP) instname:Universidad Nacional de La Plata instacron:UNLP |
| reponame_str |
SEDICI (UNLP) |
| collection |
SEDICI (UNLP) |
| instname_str |
Universidad Nacional de La Plata |
| instacron_str |
UNLP |
| institution |
UNLP |
| repository.name.fl_str_mv |
SEDICI (UNLP) - Universidad Nacional de La Plata |
| repository.mail.fl_str_mv |
alira@sedici.unlp.edu.ar |
| _version_ |
1846782832275357696 |
| score |
12.982451 |