Assessing Causality Structures learned from Digital Text Media

Authors
Maisonnave, Mariano; Delbianco, Fernando Andrés; Tohmé, Fernando Abel; Maguitman, Ana Gabriela; Milios, Evangelos E.
Publication year
2020
Language
English
Resource type
conference paper
Status
published version
Description
In this paper we describe a framework to uncover potential causal relations between event mentions from streaming text of news media. This framework relies on a dataset of manually labeled events to train a recurrent neural network for event detection. It then creates a time series of event clusters, where clusters are based on BERT contextual word embedding representations of the identified events. Using this time series dataset, we assess four methods based on Granger causality for inferring causal relations. Granger causality is a statistical concept of causality that is based on forecasting. It states that a cause occurs before the effect, and the cause produces unique changes in the effect, so past values of the cause help predict future values of the effect. The four analyzed methods are the pairwise Granger test, VAR(1), BigVar and SiMoNe. The framework is applied to the New York Times dataset, which covers news for a period of 246 months. This preliminary analysis delivers important insights into the nature of each method, identifies differences and commonalities, and points out some of their strengths and weaknesses.
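The forecasting idea behind the pairwise Granger test can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `granger_f_stat`, the single-lag default, and the synthetic series are assumptions made here for illustration (the length 246 only mirrors the number of monthly observations mentioned above). The sketch fits two least-squares models for the effect, one using only its own past and one that also uses the past of the candidate cause, and compares their residual sums of squares with an F statistic.

```python
import numpy as np


def granger_f_stat(cause, effect, lag=1):
    """F statistic for the hypothesis 'cause Granger-causes effect'.

    Hypothetical helper for illustration: compares a restricted model
    (effect regressed on its own lags) with an unrestricted model that
    also includes lags of the candidate cause.
    """
    cause = np.asarray(cause, dtype=float)
    effect = np.asarray(effect, dtype=float)
    n = len(effect) - lag
    target = effect[lag:]

    # Column k holds the series shifted back by k steps (k = 1..lag).
    own_lags = np.column_stack(
        [effect[lag - k: len(effect) - k] for k in range(1, lag + 1)]
    )
    cause_lags = np.column_stack(
        [cause[lag - k: len(cause) - k] for k in range(1, lag + 1)]
    )
    restricted = np.hstack([np.ones((n, 1)), own_lags])
    unrestricted = np.hstack([restricted, cause_lags])

    def rss(design):
        # Residual sum of squares of the ordinary least-squares fit.
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    dof = n - unrestricted.shape[1]
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic data (an assumption, not the New York Times series):
# y at time t depends on x at time t - 1, so x should help predict y.
rng = np.random.default_rng(0)
x = rng.normal(size=246)
y = np.zeros(246)
for t in range(1, 246):
    y[t] = 0.8 * x[t - 1] + 0.1 * rng.normal()

f_xy = granger_f_stat(x, y)  # evidence that x helps predict y
f_yx = granger_f_stat(y, x)  # evidence that y helps predict x
```

A large `f_xy` relative to `f_yx` is what the test reads as evidence that `x` Granger-causes `y` and not the reverse. In practice the statistic would be compared against an F distribution to obtain a p-value, and the series would first be checked for stationarity.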
Affiliation: Maisonnave, Mariano. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
Affiliation: Delbianco, Fernando Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina. Universidad Nacional del Sur. Departamento de Economía; Argentina
Affiliation: Tohmé, Fernando Abel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina. Universidad Nacional del Sur. Departamento de Economía; Argentina
Affiliation: Maguitman, Ana Gabriela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
Affiliation: Milios, Evangelos E. Dalhousie University. Faculty of Computer Science; Canada
DocEng '20: ACM Symposium on Document Engineering 2020
New York
United States
Association for Computing Machinery
Subject
GRANGER CAUSALITY
EVENT DETECTION
TIME SERIES
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/138139

Subject classification (FORD)
https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
Handle
http://hdl.handle.net/11336/138139
Citation
Assessing Causality Structures learned from Digital Text Media; DocEng '20: ACM Symposium on Document Engineering 2020; New York; United States; 2020; 1-4
ISBN
978-1-4503-8000-3
DOI
https://dl.acm.org/doi/10.1145/3395027.3419594