Toxicity, polarizations and cultural diversity in social networks : Using machine learning and natural language processing to analyze these phenomena in social networks

Authors
Oppenheim, Abi; Albanese, Federico; Feuerstein, Esteban
Publication year
2022
Language
English
Resource type
conference paper
Status
published version
Description
Social media have increased both the amount of information people consume and the number of interactions between them. Nevertheless, most people tend to promote their favored narratives and hence form polarized groups. This encourages polarization and extremism, resulting in extreme violence. Against this backdrop, we are interested in finding environments, strategies and mechanisms that reduce toxicity on social media, where a “toxic” comment is defined as a rude, disrespectful or unreasonable comment that is likely to make people leave a discussion. We address the hypothesis that higher cultural diversity among a community's users reduces the toxicity of their messages. We use Reddit as a case study, since the platform is characterized by a variety of discussion sub-forums where users debate political and cultural issues. Using community2vec, we generate an embedding for each community, which allows us to characterize users along demographic and ideological dimensions. To analyze each user statement, we process the data with different models to identify the topics of debate and the levels of aggressiveness and negativity in them. Finally, we seek to corroborate the hypothesis by analyzing the relationship between the cultural diversity present in each discussion group and the toxicity found in its posts.
Sociedad Argentina de Informática e Investigación Operativa
Subject
Computer Science
machine learning
social media
Reddit
data mining
toxicity
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/4.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/151588

dc.title.none.fl_str_mv Toxicity, polarizations and cultural diversity in social networks : Using machine learning and natural language processing to analyze these phenomena in social networks
dc.creator.none.fl_str_mv Oppenheim, Abi
Albanese, Federico
Feuerstein, Esteban
dc.subject.none.fl_str_mv Computer Science
machine learning
social media
Reddit
data mining
toxicity
dc.date.none.fl_str_mv 2022-10
dc.type.none.fl_str_mv info:eu-repo/semantics/conferenceObject
info:eu-repo/semantics/publishedVersion
Abstract
http://purl.org/coar/resource_type/c_5794
info:ar-repo/semantics/documentoDeConferencia
dc.identifier.none.fl_str_mv http://sedici.unlp.edu.ar/handle/10915/151588
dc.language.none.fl_str_mv eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/https://publicaciones.sadio.org.ar/index.php/JAIIO/article/download/248/207
info:eu-repo/semantics/altIdentifier/issn/2451-7496
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by-nc-sa/4.0/
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.format.none.fl_str_mv application/pdf
28-29
dc.source.none.fl_str_mv reponame:SEDICI (UNLP)
instname:Universidad Nacional de La Plata
instacron:UNLP
repository.name.fl_str_mv SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv alira@sedici.unlp.edu.ar