Quality Flaws Prediction in Wikipedia by Using Deep Learning Approaches

Authors
Capodici, Gianfranco; Bazán Pereyra, Gerónimo; Bonnin, Rodolfo; Ferretti, Edgardo
Year of publication
2022
Language
Spanish
Resource type
conference paper
Status
published version
Description
Quality flaws prediction in Wikipedia is an ongoing research trend. In this work we tackle the problem of automatically predicting four of the ten most frequent quality flaws, namely No footnotes, Notability, Primary Sources and Refimprove. Different state-of-the-art deep learning approaches were evaluated on the test corpus from the 1st International Competition on Quality Flaw Prediction in Wikipedia, a well-known uniform evaluation corpus in this research field. In particular, the results show that TabNet reaches or improves on the existing benchmarks for the Notability and Refimprove flaws, and performs very competitively for the other two flaws.
XIX Workshop base de datos y Minería de datos (WBDMD)
Red de Universidades con Carreras en Informática
Subject
Computer Science
Wikipedia
Information Quality
Quality Flaws Prediction
Deep Learning
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/4.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/149435

Publication date
2022-10
Pages
375-384
Format
application/pdf
URL
http://sedici.unlp.edu.ar/handle/10915/149435
ISBN
978-987-1364-31-2
Related handle
10915/149102
License
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Repository contact
alira@sedici.unlp.edu.ar