A study on influential features for predicting best answers in community question-answering forums

Authors
Zoratto, Valeria; Godoy, Daniela Lis; Aranda, Gabriela Noemi
Publication year
2023
Language
English
Resource type
article
Status
published version
Description
The knowledge provided by user communities in question-answering (QA) forums is a highly valuable source of information for satisfying user information needs. However, finding the best answer for a posted question can be challenging. User-generated content in forums can be of unequal quality given the free nature of natural language and the varied levels of user expertise. Answers to a question posted in a forum are compiled in a discussion thread, which also gathers subsequent activity such as comments and votes. There are usually multiple reasons why an answer successfully fulfills a certain information need and is accepted as the best answer among a (possibly) large number of answers. In this work, we study the influence that different aspects of answers have on the prediction of the best answers in a QA forum. We collected the discussion threads of a real-world forum concerning computer programming, and we evaluated different features for representing the answers and the context in which they appear in a thread. Multiple classification models were used to compare the performance of the different features, finding that readability is one of the most important factors for detecting the best answers. The goal of this study is to shed some light on the reasons why some answers are more likely to receive more votes and be selected as the best answer for a posted question. Such knowledge enables users to enhance their answers, which in turn improves the overall quality of the content produced on a platform.
Affiliation: Zoratto, Valeria. Universidad Nacional del Comahue. Facultad de Informatica; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Godoy, Daniela Lis. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil. Instituto Superior de Ingeniería del Software. Universidad Nacional del Centro de la Provincia de Buenos Aires. Instituto Superior de Ingeniería del Software; Argentina
Affiliation: Aranda, Gabriela Noemi. Universidad Nacional del Comahue. Facultad de Informatica; Argentina
Subject
CQA forums
best answer prediction
information retrieval
Access level
open access
Terms of use
https://creativecommons.org/licenses/by/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/231426

id CONICETDig_0fde489486d550aae04d9e81db11e3d9
oai_identifier_str oai:ri.conicet.gov.ar:11336/231426
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv A study on influential features for predicting best answers in community question-answering forums
dc.creator.none.fl_str_mv Zoratto, Valeria
Godoy, Daniela Lis
Aranda, Gabriela Noemi
dc.subject.none.fl_str_mv CQA forums
best answer prediction
information retrieval
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
publishDate 2023
dc.date.none.fl_str_mv 2023-09
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/231426
Zoratto, Valeria; Godoy, Daniela Lis; Aranda, Gabriela Noemi; A study on influential features for predicting best answers in community question-answering forums; MDPI; Information; 14; 9; 9-2023; 1-22
2078-2489
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/https://www.mdpi.com/2078-2489/14/9/496
info:eu-repo/semantics/altIdentifier/doi/10.3390/info14090496
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by/2.5/ar/
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv MDPI
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar