Assessing Causality Structures learned from Digital Text Media
- Authors
- Maisonnave, Mariano; Delbianco, Fernando Andrés; Tohmé, Fernando Abel; Maguitman, Ana Gabriela; Milios, Evangelos E.
- Publication year
- 2020
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- In this paper we describe a framework to uncover potential causal relations between event mentions from streaming text of news media. This framework relies on a dataset of manually labeled events to train a recurrent neural network for event detection. It then creates a time series of event clusters, where clusters are based on BERT contextual word embedding representations of the identified events. Using this time series dataset, we assess four methods based on Granger causality for inferring causal relations. Granger causality is a statistical concept of causality that is based on forecasting. It states that a cause occurs before the effect, and the cause produces unique changes in the effect, so past values of the cause help predict future values of the effect. The four analyzed methods are the pairwise Granger test, VAR(1), BigVar and SiMoNe. The framework is applied to the New York Times dataset, which covers news for a period of 246 months. This preliminary analysis delivers important insights into the nature of each method, identifies differences and commonalities, and points out some of their strengths and weaknesses.
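The forecasting idea behind Granger causality described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single lag, uses synthetic series rather than the New York Times data, and the helper names (`solve`, `rss`, `granger_f`) are hypothetical. It compares a restricted model (the effect predicted from its own past) with an unrestricted one (the past of the candidate cause added) and reports the resulting F statistic.

```python
import random


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(rows, targets):
    """Residual sum of squares of an ordinary least squares fit."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f(x, y):
    """F statistic for the hypothesis 'x Granger-causes y' (one lag, assumed)."""
    targets = y[1:]
    # Restricted model: y_t explained by an intercept and y_{t-1} only.
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    # Unrestricted model: additionally uses the past value x_{t-1}.
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = rss(restricted, targets)
    rss_u = rss(unrestricted, targets)
    dof = len(targets) - 3  # observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / dof)


# Synthetic example (assumption): y depends on the previous value of x.
random.seed(0)
n = 246  # chosen only to mirror the number of months mentioned above
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.0]
for t in range(1, n):
    y.append(0.8 * x[t - 1] + random.gauss(0, 0.5))

f_xy = granger_f(x, y)  # should be large: past x helps predict y
f_yx = granger_f(y, x)  # should be small: past y does not help predict x
print(f"F(x -> y) = {f_xy:.2f}, F(y -> x) = {f_yx:.2f}")
```

A large F statistic in one direction and a small one in the other is the asymmetry that a pairwise Granger test looks for. In practice, the statistic would be compared against an F distribution to obtain a p-value, and the lag order would be chosen from the data.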
Affiliation: Maisonnave, Mariano. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
Affiliation: Delbianco, Fernando Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina. Universidad Nacional del Sur. Departamento de Economía; Argentina
Affiliation: Tohmé, Fernando Abel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina. Universidad Nacional del Sur. Departamento de Economía; Argentina
Affiliation: Maguitman, Ana Gabriela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
Affiliation: Milios, Evangelos E. Dalhousie University. Faculty of Computer Science; Canada
DocEng '20: ACM Symposium on Document Engineering 2020
New York
United States
Association for Computing Machinery
- Subject
- GRANGER CAUSALITY; EVENT DETECTION; TIME SERIES
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/138139
View the full record metadata
id | CONICETDig_efa6df222d8c43086d6cac9630a21097 |
---|---|
oai_identifier_str | oai:ri.conicet.gov.ar:11336/138139 |
network_acronym_str | CONICETDig |
repository_id_str | 3498 |
network_name_str | CONICET Digital (CONICET) |
dc.title.none.fl_str_mv | Assessing Causality Structures learned from Digital Text Media |
title | Assessing Causality Structures learned from Digital Text Media |
spellingShingle | Assessing Causality Structures learned from Digital Text Media Maisonnave, Mariano GRANGER CAUSALITY EVENT DETECTION TIME SERIES |
title_short | Assessing Causality Structures learned from Digital Text Media |
title_full | Assessing Causality Structures learned from Digital Text Media |
title_fullStr | Assessing Causality Structures learned from Digital Text Media |
title_full_unstemmed | Assessing Causality Structures learned from Digital Text Media |
title_sort | Assessing Causality Structures learned from Digital Text Media |
dc.creator.none.fl_str_mv | Maisonnave, Mariano Delbianco, Fernando Andrés Tohmé, Fernando Abel Maguitman, Ana Gabriela Milios, Evangelos E. |
author | Maisonnave, Mariano |
author_facet | Maisonnave, Mariano Delbianco, Fernando Andrés Tohmé, Fernando Abel Maguitman, Ana Gabriela Milios, Evangelos E. |
author_role | author |
author2 | Delbianco, Fernando Andrés Tohmé, Fernando Abel Maguitman, Ana Gabriela Milios, Evangelos E. |
author2_role | author author author author |
dc.subject.none.fl_str_mv | GRANGER CAUSALITY EVENT DETECTION TIME SERIES |
topic | GRANGER CAUSALITY EVENT DETECTION TIME SERIES |
purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1 |
publishDate | 2020 |
dc.date.none.fl_str_mv | 2020 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/publishedVersion info:eu-repo/semantics/conferenceObject Simposio Book http://purl.org/coar/resource_type/c_5794 info:ar-repo/semantics/documentoDeConferencia |
status_str | publishedVersion |
format | conferenceObject |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/138139 Assessing Causality Structures learned from Digital Text Media; DocEng '20: ACM Symposium on Document Engineering 2020; New York; Estados Unidos; 2020; 1-4 978-1-4503-8000-3 CONICET Digital CONICET |
url | http://hdl.handle.net/11336/138139 |
identifier_str_mv | Assessing Causality Structures learned from Digital Text Media; DocEng '20: ACM Symposium on Document Engineering 2020; New York; Estados Unidos; 2020; 1-4 978-1-4503-8000-3 CONICET Digital CONICET |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/https://dl.acm.org/doi/10.1145/3395027.3419594 info:eu-repo/semantics/altIdentifier/doi/10.1145/3395027.3419594 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv | application/pdf application/pdf |
dc.coverage.none.fl_str_mv | International |
dc.publisher.none.fl_str_mv | Association for Computing Machinery |
publisher.none.fl_str_mv | Association for Computing Machinery |
dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str | CONICET Digital (CONICET) |
collection | CONICET Digital (CONICET) |
instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |
_version_ | 1844613061416058880 |
score | 13.070432 |