Causal graph extraction from news: a comparative study of time-series causality learning techniques
- Autores
- Maisonnave, Mariano; Delbianco, Fernando Andrés; Tohmé, Fernando Abel; Milios, Evangelos; Maguitman, Ana Gabriela
- Año de publicación
- 2022
- Idioma
- inglés
- Tipo de recurso
- artículo
- Estado
- versión publicada
- Descripción
- Causal graph extraction from news has the potential to aid in the understanding of complex scenarios. In particular, it can help explain and predict events, as well as conjecture about possible cause-effect connections. However, limited work has addressed the problem of large-scale extraction of causal graphs from news articles. This article presents a novel framework for extracting causal graphs from digital text media. The framework relies on topic-relevant variables representing terms and ongoing events that are selected from a domain under analysis by applying specially developed information retrieval and natural language processing methods. Events are represented as event-phrase embeddings, which make it possible to group similar events into semantically cohesive clusters. A time series of the selected variables is given as input to a causal structure learning techniques to learn a causal graph associated with the topic that is being examined. The complete framework is applied to the New York Times dataset, which covers news for a period of 246 months (roughly 20 years), and is illustrated through a case study. An initial evaluation based on synthetic data is carried out to gain insight into the most effective time-series causality learning techniques. This evaluation comprises a systematic analysis of nine state-of-the-art causal structure learning techniques and two novel ensemble methods derived from the most effective techniques. Subsequently, the complete framework based on the most promising causal structure learning technique is evaluated with domain experts in a real-world scenario through the use of the presented case study. The proposed analysis offers valuable insights into the problems of identifying topic-relevant variables from large volumes of news and learning causal graphs from time series
Fil: Maisonnave, Mariano. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina. Dalhousie University Halifax; Canadá
Fil: Delbianco, Fernando Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina
Fil: Tohmé, Fernando Abel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina
Fil: Milios, Evangelos. Dalhousie University Halifax; Canadá
Fil: Maguitman, Ana Gabriela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina - Materia
-
CAUSAL GRAPH EXTRACTION
DIGITAL TEXT MEDIA
INFORMATION EXTRACTION FROM NEWS
TIME-SERIES CAUSALITY LEARNING
VARIABLE EXTRACTION - Nivel de accesibilidad
- acceso abierto
- Condiciones de uso
- https://creativecommons.org/licenses/by/2.5/ar/
- Repositorio
- Institución
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI Identificador
- oai:ri.conicet.gov.ar:11336/197131
Ver los metadatos del registro completo
id |
CONICETDig_d204e208bcfe4d1ca3153449a09da2e5 |
---|---|
oai_identifier_str |
oai:ri.conicet.gov.ar:11336/197131 |
network_acronym_str |
CONICETDig |
repository_id_str |
3498 |
network_name_str |
CONICET Digital (CONICET) |
spelling |
Causal graph extraction from news: a comparative study of time-series causality learning techniquesMaisonnave, MarianoDelbianco, Fernando AndrésTohmé, Fernando AbelMilios, EvangelosMaguitman, Ana GabrielaCAUSAL GRAPH EXTRACTIONDIGITAL TEXT MEDIAINFORMATION EXTRACTION FROM NEWSTIME-SERIES CAUSALITY LEARNINGVARIABLE EXTRACTIONhttps://purl.org/becyt/ford/1.2https://purl.org/becyt/ford/1Causal graph extraction from news has the potential to aid in the understanding of complex scenarios. In particular, it can help explain and predict events, as well as conjecture about possible cause-effect connections. However, limited work has addressed the problem of large-scale extraction of causal graphs from news articles. This article presents a novel framework for extracting causal graphs from digital text media. The framework relies on topic-relevant variables representing terms and ongoing events that are selected from a domain under analysis by applying specially developed information retrieval and natural language processing methods. Events are represented as event-phrase embeddings, which make it possible to group similar events into semantically cohesive clusters. A time series of the selected variables is given as input to a causal structure learning techniques to learn a causal graph associated with the topic that is being examined. The complete framework is applied to the New York Times dataset, which covers news for a period of 246 months (roughly 20 years), and is illustrated through a case study. An initial evaluation based on synthetic data is carried out to gain insight into the most effective time-series causality learning techniques. This evaluation comprises a systematic analysis of nine state-of-the-art causal structure learning techniques and two novel ensemble methods derived from the most effective techniques. Subsequently, the complete framework based on the most promising causal structure learning technique is evaluated with domain experts in a real-world scenario through the use of the presented case study. The proposed analysis offers valuable insights into the problems of identifying topic-relevant variables from large volumes of news and learning causal graphs from time seriesFil: Maisonnave, Mariano. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina. Dalhousie University Halifax; CanadáFil: Delbianco, Fernando Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; ArgentinaFil: Tohmé, Fernando Abel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; ArgentinaFil: Milios, Evangelos. Dalhousie University Halifax; CanadáFil: Maguitman, Ana Gabriela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; ArgentinaPeerJ2022-08-03info:eu-repo/semantics/articleinfo:eu-repo/semantics/publishedVersionhttp://purl.org/coar/resource_type/c_6501info:ar-repo/semantics/articuloapplication/pdfapplication/pdfapplication/pdfhttp://hdl.handle.net/11336/197131Maisonnave, Mariano; Delbianco, Fernando Andrés; Tohmé, Fernando Abel; Milios, Evangelos; Maguitman, Ana Gabriela; Causal graph extraction from news: a comparative study of time-series causality learning techniques; PeerJ; PeerJ Computer Science; 8; 3-8-2022; 2-282376-5992CONICET DigitalCONICETenginfo:eu-repo/semantics/altIdentifier/doi/10.7717/PEERJ-CS.1066info:eu-repo/semantics/altIdentifier/url/https://peerj.com/articles/cs-1066/info:eu-repo/semantics/openAccesshttps://creativecommons.org/licenses/by/2.5/ar/reponame:CONICET Digital (CONICET)instname:Consejo Nacional de Investigaciones Científicas y Técnicas2025-09-03T09:50:43Zoai:ri.conicet.gov.ar:11336/197131instacron:CONICETInstitucionalhttp://ri.conicet.gov.ar/Organismo científico-tecnológicoNo correspondehttp://ri.conicet.gov.ar/oai/requestdasensio@conicet.gov.ar; lcarlino@conicet.gov.arArgentinaNo correspondeNo correspondeNo correspondeopendoar:34982025-09-03 09:50:44.211CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicasfalse |
dc.title.none.fl_str_mv |
Causal graph extraction from news: a comparative study of time-series causality learning techniques |
title |
Causal graph extraction from news: a comparative study of time-series causality learning techniques |
spellingShingle |
Causal graph extraction from news: a comparative study of time-series causality learning techniques Maisonnave, Mariano CAUSAL GRAPH EXTRACTION DIGITAL TEXT MEDIA INFORMATION EXTRACTION FROM NEWS TIME-SERIES CAUSALITY LEARNING VARIABLE EXTRACTION |
title_short |
Causal graph extraction from news: a comparative study of time-series causality learning techniques |
title_full |
Causal graph extraction from news: a comparative study of time-series causality learning techniques |
title_fullStr |
Causal graph extraction from news: a comparative study of time-series causality learning techniques |
title_full_unstemmed |
Causal graph extraction from news: a comparative study of time-series causality learning techniques |
title_sort |
Causal graph extraction from news: a comparative study of time-series causality learning techniques |
dc.creator.none.fl_str_mv |
Maisonnave, Mariano Delbianco, Fernando Andrés Tohmé, Fernando Abel Milios, Evangelos Maguitman, Ana Gabriela |
author |
Maisonnave, Mariano |
author_facet |
Maisonnave, Mariano Delbianco, Fernando Andrés Tohmé, Fernando Abel Milios, Evangelos Maguitman, Ana Gabriela |
author_role |
author |
author2 |
Delbianco, Fernando Andrés Tohmé, Fernando Abel Milios, Evangelos Maguitman, Ana Gabriela |
author2_role |
author author author author |
dc.subject.none.fl_str_mv |
CAUSAL GRAPH EXTRACTION DIGITAL TEXT MEDIA INFORMATION EXTRACTION FROM NEWS TIME-SERIES CAUSALITY LEARNING VARIABLE EXTRACTION |
topic |
CAUSAL GRAPH EXTRACTION DIGITAL TEXT MEDIA INFORMATION EXTRACTION FROM NEWS TIME-SERIES CAUSALITY LEARNING VARIABLE EXTRACTION |
purl_subject.fl_str_mv |
https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1 |
dc.description.none.fl_txt_mv |
Causal graph extraction from news has the potential to aid in the understanding of complex scenarios. In particular, it can help explain and predict events, as well as conjecture about possible cause-effect connections. However, limited work has addressed the problem of large-scale extraction of causal graphs from news articles. This article presents a novel framework for extracting causal graphs from digital text media. The framework relies on topic-relevant variables representing terms and ongoing events that are selected from a domain under analysis by applying specially developed information retrieval and natural language processing methods. Events are represented as event-phrase embeddings, which make it possible to group similar events into semantically cohesive clusters. A time series of the selected variables is given as input to a causal structure learning techniques to learn a causal graph associated with the topic that is being examined. The complete framework is applied to the New York Times dataset, which covers news for a period of 246 months (roughly 20 years), and is illustrated through a case study. An initial evaluation based on synthetic data is carried out to gain insight into the most effective time-series causality learning techniques. This evaluation comprises a systematic analysis of nine state-of-the-art causal structure learning techniques and two novel ensemble methods derived from the most effective techniques. Subsequently, the complete framework based on the most promising causal structure learning technique is evaluated with domain experts in a real-world scenario through the use of the presented case study. The proposed analysis offers valuable insights into the problems of identifying topic-relevant variables from large volumes of news and learning causal graphs from time series Fil: Maisonnave, Mariano. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina. Dalhousie University Halifax; Canadá Fil: Delbianco, Fernando Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina Fil: Tohmé, Fernando Abel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina Fil: Milios, Evangelos. Dalhousie University Halifax; Canadá Fil: Maguitman, Ana Gabriela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina |
description |
Causal graph extraction from news has the potential to aid in the understanding of complex scenarios. In particular, it can help explain and predict events, as well as conjecture about possible cause-effect connections. However, limited work has addressed the problem of large-scale extraction of causal graphs from news articles. This article presents a novel framework for extracting causal graphs from digital text media. The framework relies on topic-relevant variables representing terms and ongoing events that are selected from a domain under analysis by applying specially developed information retrieval and natural language processing methods. Events are represented as event-phrase embeddings, which make it possible to group similar events into semantically cohesive clusters. A time series of the selected variables is given as input to a causal structure learning techniques to learn a causal graph associated with the topic that is being examined. The complete framework is applied to the New York Times dataset, which covers news for a period of 246 months (roughly 20 years), and is illustrated through a case study. An initial evaluation based on synthetic data is carried out to gain insight into the most effective time-series causality learning techniques. This evaluation comprises a systematic analysis of nine state-of-the-art causal structure learning techniques and two novel ensemble methods derived from the most effective techniques. Subsequently, the complete framework based on the most promising causal structure learning technique is evaluated with domain experts in a real-world scenario through the use of the presented case study. The proposed analysis offers valuable insights into the problems of identifying topic-relevant variables from large volumes of news and learning causal graphs from time series |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-08-03 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format |
article |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/197131 Maisonnave, Mariano; Delbianco, Fernando Andrés; Tohmé, Fernando Abel; Milios, Evangelos; Maguitman, Ana Gabriela; Causal graph extraction from news: a comparative study of time-series causality learning techniques; PeerJ; PeerJ Computer Science; 8; 3-8-2022; 2-28 2376-5992 CONICET Digital CONICET |
url |
http://hdl.handle.net/11336/197131 |
identifier_str_mv |
Maisonnave, Mariano; Delbianco, Fernando Andrés; Tohmé, Fernando Abel; Milios, Evangelos; Maguitman, Ana Gabriela; Causal graph extraction from news: a comparative study of time-series causality learning techniques; PeerJ; PeerJ Computer Science; 8; 3-8-2022; 2-28 2376-5992 CONICET Digital CONICET |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/doi/10.7717/PEERJ-CS.1066 info:eu-repo/semantics/altIdentifier/url/https://peerj.com/articles/cs-1066/ |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by/2.5/ar/ |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
https://creativecommons.org/licenses/by/2.5/ar/ |
dc.format.none.fl_str_mv |
application/pdf application/pdf application/pdf |
dc.publisher.none.fl_str_mv |
PeerJ |
publisher.none.fl_str_mv |
PeerJ |
dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str |
CONICET Digital (CONICET) |
collection |
CONICET Digital (CONICET) |
instname_str |
Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv |
CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |
_version_ |
1842269050044940288 |
score |
13.13397 |