Toxicity, polarizations and cultural diversity in social networks : Using machine learning and natural language processing to analyze these phenomena in social networks
- Authors
- Oppenheim, Abi; Albanese, Federico; Feuerstein, Esteban
- Publication year
- 2022
- Language
- English
- Resource type
- conference document
- Status
- published version
- Description
- Social media have increased the amount of information that people consume as well as the number of interactions between them. Nevertheless, most people tend to promote their favored narratives and hence form polarized groups. This encourages polarization and extremism, resulting in extreme violence. Against this backdrop, it is in our interest to find environments, strategies and mechanisms that allow us to reduce toxicity on social media (defining “toxicity” as a rude, disrespectful or unreasonable comment that is likely to make people leave a discussion). We address the hypothesis that higher cultural diversity among a community's users reduces the toxicity of their messages. We use Reddit as a case study, since this platform is characterized by a variety of discussion sub-forums where users debate political and cultural issues. Using community2vec, we generate an embedding for each community that allows us to characterize users along demographic and ideological dimensions. To analyze each user statement, we process the data with different models, thereby identifying the topics of debate and the levels of aggressiveness and negativity in them. Finally, we seek to corroborate the hypothesis by analyzing the relationship between the cultural diversity present in each discussion group and the toxicity found in its posts.
Sociedad Argentina de Informática e Investigación Operativa
- Subject
- Computer Science; machine learning; social media; Reddit; data mining; toxicity
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/4.0/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/151588
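The final step described in the abstract, relating the cultural diversity of each community to the toxicity of its posts, can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes that each user is already represented by an embedding vector (for instance, one derived from community2vec), that diversity is measured as the mean pairwise cosine distance between the users of a community, and that the relationship is summarized with a Pearson correlation. The abstract specifies none of these choices; the function names, community names, and toy numbers below are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def diversity(user_vectors):
    """Assumed diversity proxy: mean pairwise cosine distance between users."""
    pairs = combinations(user_vectors, 2)
    return mean(cosine_distance(u, v) for u, v in pairs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy data (hypothetical): user embeddings and per-post toxicity scores in [0, 1].
communities = {
    "community_a": {"users": [[1.0, 0.0], [0.9, 0.1], [1.0, 0.1]],
                    "toxicity": [0.6, 0.7, 0.5]},
    "community_b": {"users": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
                    "toxicity": [0.2, 0.3, 0.1]},
    "community_c": {"users": [[1.0, 0.0], [0.6, 0.4], [0.8, 0.3]],
                    "toxicity": [0.4, 0.5, 0.3]},
}

diversity_scores = [diversity(c["users"]) for c in communities.values()]
mean_toxicity = [mean(c["toxicity"]) for c in communities.values()]
r = pearson(diversity_scores, mean_toxicity)
print(f"Pearson r between diversity and mean toxicity: {r:.3f}")
```

A negative coefficient on real data would be consistent with the hypothesis, although a correlation alone does not establish that diversity causes lower toxicity. Other diversity measures (for example, the entropy of user clusters) could be substituted without changing the overall structure.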
id | SEDICI_a17ee98ed0062bc2f7d8bc3398ae8157 |
---|---|
oai_identifier_str | oai:sedici.unlp.edu.ar:10915/151588 |
network_acronym_str | SEDICI |
repository_id_str | 1329 |
network_name_str | SEDICI (UNLP) |
dc.date.none.fl_str_mv | 2022-10 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Resumen; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/151588 |
dc.language.none.fl_str_mv | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/https://publicaciones.sadio.org.ar/index.php/JAIIO/article/download/248/207; info:eu-repo/semantics/altIdentifier/issn/2451-7496 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
dc.format.none.fl_str_mv | application/pdf; 28-29 |
dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |