A study on influential features for predicting best answers in community question-answering forums
- Authors
- Zoratto, Valeria; Godoy, Daniela Lis; Aranda, Gabriela Noemi
- Publication year
- 2023
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- The knowledge provided by user communities in question-answering (QA) forums is a highly valuable source of information for satisfying user information needs. However, finding the best answer for a posted question can be challenging. User-generated content in forums can be of unequal quality given the free nature of natural language and the varied levels of user expertise. Answers to a question posted in a forum are compiled in a discussion thread, which also gathers subsequent activity such as comments and votes. There are usually multiple reasons why an answer successfully fulfills a certain information need and gets accepted as the best answer among a (possibly) large number of answers. In this work, we study the influence that different aspects of answers have on the prediction of the best answers in a QA forum. We collected the discussion threads of a real-world forum concerning computer programming, and we evaluated different features for representing the answers and the context in which they appear in a thread. Multiple classification models were used to compare the performance of the different features, finding that readability is one of the most important factors for detecting the best answers. The goal of this study is to shed some light on the reasons why answers are more likely to receive more votes and be selected as the best answer for a posted question. Such knowledge enables users to enhance their answers, which leads, in turn, to an improvement in the overall quality of the content produced on a platform.
Affiliation: Zoratto, Valeria. Universidad Nacional del Comahue. Facultad de Informatica; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Godoy, Daniela Lis. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil. Instituto Superior de Ingeniería del Software. Universidad Nacional del Centro de la Provincia de Buenos Aires. Instituto Superior de Ingeniería del Software; Argentina
Affiliation: Aranda, Gabriela Noemi. Universidad Nacional del Comahue. Facultad de Informatica; Argentina
- Subject
- CQA forums; best answer prediction; information retrieval
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/231426
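The OAI identifier above is the key used to request this record over OAI-PMH. The following is a minimal sketch of how a `GetRecord` request URL could be assembled for it. The base URL `http://ri.conicet.gov.ar/oai/request` is the OAI endpoint listed in this record's harvesting metadata; the helper name `build_get_record_url` is hypothetical, and `oai_dc` is assumed as the metadata prefix because it is the format the protocol requires every repository to support. The sketch only builds the URL and makes no network call, so whether the endpoint currently responds is not verified here.

```python
from urllib.parse import urlencode

# Assumption: OAI-PMH endpoint as listed in this record's harvesting metadata.
OAI_BASE_URL = "http://ri.conicet.gov.ar/oai/request"


def build_get_record_url(identifier: str, metadata_prefix: str = "oai_dc") -> str:
    """Return an OAI-PMH GetRecord request URL for a single record.

    Hypothetical helper for illustration; it only formats the query string.
    """
    query = urlencode(
        {
            "verb": "GetRecord",               # OAI-PMH verb for fetching one record
            "identifier": identifier,          # the record's OAI identifier
            "metadataPrefix": metadata_prefix, # oai_dc = unqualified Dublin Core
        }
    )
    return f"{OAI_BASE_URL}?{query}"


url = build_get_record_url("oai:ri.conicet.gov.ar:11336/231426")
print(url)
```

`urlencode` percent-encodes the colons and the slash in the identifier, which keeps the query string valid when it is passed to an HTTP client.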
Full record metadata

| Field | Value |
|---|---|
| id | CONICETDig_0fde489486d550aae04d9e81db11e3d9 |
| oai_identifier_str | oai:ri.conicet.gov.ar:11336/231426 |
| network_acronym_str | CONICETDig |
| repository_id_str | 3498 |
| network_name_str | CONICET Digital (CONICET) |
| dc.title.none.fl_str_mv | A study on influential features for predicting best answers in community question-answering forums |
| dc.creator.none.fl_str_mv | Zoratto, Valeria; Godoy, Daniela Lis; Aranda, Gabriela Noemi |
| author | Zoratto, Valeria |
| author_facet | Zoratto, Valeria; Godoy, Daniela Lis; Aranda, Gabriela Noemi |
| author_role | author |
| author2 | Godoy, Daniela Lis; Aranda, Gabriela Noemi |
| author2_role | author; author |
| dc.subject.none.fl_str_mv | CQA forums; best answer prediction; information retrieval |
| topic | CQA forums; best answer prediction; information retrieval |
| purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.2; https://purl.org/becyt/ford/1 |
| publishDate | 2023 |
| dc.date.none.fl_str_mv | 2023-09 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
| format | article |
| status_str | publishedVersion |
| dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/231426 Zoratto, Valeria; Godoy, Daniela Lis; Aranda, Gabriela Noemi; A study on influential features for predicting best answers in community question-answering forums; MDPI; Information; 14; 9; 9-2023; 1-22 2078-2489 CONICET Digital CONICET |
| url | http://hdl.handle.net/11336/231426 |
| identifier_str_mv | Zoratto, Valeria; Godoy, Daniela Lis; Aranda, Gabriela Noemi; A study on influential features for predicting best answers in community question-answering forums; MDPI; Information; 14; 9; 9-2023; 1-22 2078-2489 CONICET Digital CONICET |
| dc.language.none.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/https://www.mdpi.com/2078-2489/14/9/496; info:eu-repo/semantics/altIdentifier/doi/10.3390/info14090496 |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by/2.5/ar/ |
| eu_rights_str_mv | openAccess |
| rights_invalid_str_mv | https://creativecommons.org/licenses/by/2.5/ar/ |
| dc.format.none.fl_str_mv | application/pdf; application/pdf; application/pdf |
| dc.publisher.none.fl_str_mv | MDPI |
| publisher.none.fl_str_mv | MDPI |
| dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET); instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
| reponame_str | CONICET Digital (CONICET) |
| collection | CONICET Digital (CONICET) |
| instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |