Causal graph extraction from news: a comparative study of time-series causality learning techniques

Autores: Maisonnave, Mariano; Delbianco, Fernando Andrés; Tohmé, Fernando Abel; Milios, Evangelos; Maguitman, Ana Gabriela
Año de publicación: 2022
Idioma: inglés
Tipo de recurso: artículo
Estado: versión publicada
Descripción: Causal graph extraction from news has the potential to aid in the understanding of complex scenarios. In particular, it can help explain and predict events, as well as conjecture about possible cause-effect connections. However, limited work has addressed the problem of large-scale extraction of causal graphs from news articles. This article presents a novel framework for extracting causal graphs from digital text media. The framework relies on topic-relevant variables representing terms and ongoing events that are selected from a domain under analysis by applying specially developed information retrieval and natural language processing methods. Events are represented as event-phrase embeddings, which make it possible to group similar events into semantically cohesive clusters. A time series of the selected variables is given as input to a causal structure learning techniques to learn a causal graph associated with the topic that is being examined. The complete framework is applied to the New York Times dataset, which covers news for a period of 246 months (roughly 20 years), and is illustrated through a case study. An initial evaluation based on synthetic data is carried out to gain insight into the most effective time-series causality learning techniques. This evaluation comprises a systematic analysis of nine state-of-the-art causal structure learning techniques and two novel ensemble methods derived from the most effective techniques. Subsequently, the complete framework based on the most promising causal structure learning technique is evaluated with domain experts in a real-world scenario through the use of the presented case study. The proposed analysis offers valuable insights into the problems of identifying topic-relevant variables from large volumes of news and learning causal graphs from time series
Fil: Maisonnave, Mariano. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina. Dalhousie University Halifax; Canadá
Fil: Delbianco, Fernando Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina
Fil: Tohmé, Fernando Abel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina
Fil: Milios, Evangelos. Dalhousie University Halifax; Canadá
Fil: Maguitman, Ana Gabriela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
Materia: CAUSAL GRAPH EXTRACTION
DIGITAL TEXT MEDIA
INFORMATION EXTRACTION FROM NEWS
TIME-SERIES CAUSALITY LEARNING
VARIABLE EXTRACTION
Nivel de accesibilidad: acceso abierto
Condiciones de uso: https://creativecommons.org/licenses/by/2.5/ar/
Repositorio
Institución: Consejo Nacional de Investigaciones Científicas y Técnicas
OAI Identificador: oai:ri.conicet.gov.ar:11336/197131

Acceder

id	CONICETDig_d204e208bcfe4d1ca3153449a09da2e5
oai_identifier_str	oai:ri.conicet.gov.ar:11336/197131
network_acronym_str	CONICETDig
repository_id_str	3498
network_name_str	CONICET Digital (CONICET)
spelling	Causal graph extraction from news: a comparative study of time-series causality learning techniquesMaisonnave, MarianoDelbianco, Fernando AndrésTohmé, Fernando AbelMilios, EvangelosMaguitman, Ana GabrielaCAUSAL GRAPH EXTRACTIONDIGITAL TEXT MEDIAINFORMATION EXTRACTION FROM NEWSTIME-SERIES CAUSALITY LEARNINGVARIABLE EXTRACTIONhttps://purl.org/becyt/ford/1.2https://purl.org/becyt/ford/1Causal graph extraction from news has the potential to aid in the understanding of complex scenarios. In particular, it can help explain and predict events, as well as conjecture about possible cause-effect connections. However, limited work has addressed the problem of large-scale extraction of causal graphs from news articles. This article presents a novel framework for extracting causal graphs from digital text media. The framework relies on topic-relevant variables representing terms and ongoing events that are selected from a domain under analysis by applying specially developed information retrieval and natural language processing methods. Events are represented as event-phrase embeddings, which make it possible to group similar events into semantically cohesive clusters. A time series of the selected variables is given as input to a causal structure learning techniques to learn a causal graph associated with the topic that is being examined. The complete framework is applied to the New York Times dataset, which covers news for a period of 246 months (roughly 20 years), and is illustrated through a case study. An initial evaluation based on synthetic data is carried out to gain insight into the most effective time-series causality learning techniques. This evaluation comprises a systematic analysis of nine state-of-the-art causal structure learning techniques and two novel ensemble methods derived from the most effective techniques. Subsequently, the complete framework based on the most promising causal structure learning technique is evaluated with domain experts in a real-world scenario through the use of the presented case study. The proposed analysis offers valuable insights into the problems of identifying topic-relevant variables from large volumes of news and learning causal graphs from time seriesFil: Maisonnave, Mariano. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina. Dalhousie University Halifax; CanadáFil: Delbianco, Fernando Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; ArgentinaFil: Tohmé, Fernando Abel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; ArgentinaFil: Milios, Evangelos. Dalhousie University Halifax; CanadáFil: Maguitman, Ana Gabriela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; ArgentinaPeerJ2022-08-03info:eu-repo/semantics/articleinfo:eu-repo/semantics/publishedVersionhttp://purl.org/coar/resource_type/c_6501info:ar-repo/semantics/articuloapplication/pdfapplication/pdfapplication/pdfhttp://hdl.handle.net/11336/197131Maisonnave, Mariano; Delbianco, Fernando Andrés; Tohmé, Fernando Abel; Milios, Evangelos; Maguitman, Ana Gabriela; Causal graph extraction from news: a comparative study of time-series causality learning techniques; PeerJ; PeerJ Computer Science; 8; 3-8-2022; 2-282376-5992CONICET DigitalCONICETenginfo:eu-repo/semantics/altIdentifier/doi/10.7717/PEERJ-CS.1066info:eu-repo/semantics/altIdentifier/url/https://peerj.com/articles/cs-1066/info:eu-repo/semantics/openAccesshttps://creativecommons.org/licenses/by/2.5/ar/reponame:CONICET Digital (CONICET)instname:Consejo Nacional de Investigaciones Científicas y Técnicas2026-02-26T10:05:05Zoai:ri.conicet.gov.ar:11336/197131instacron:CONICETInstitucionalhttp://ri.conicet.gov.ar/Organismo científico-tecnológicoNo correspondehttp://ri.conicet.gov.ar/oai/requestdasensio@conicet.gov.ar; lcarlino@conicet.gov.arArgentinaNo correspondeNo correspondeNo correspondeopendoar:34982026-02-26 10:05:05.834CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicasfalse
dc.title.none.fl_str_mv	Causal graph extraction from news: a comparative study of time-series causality learning techniques
title	Causal graph extraction from news: a comparative study of time-series causality learning techniques
spellingShingle	Causal graph extraction from news: a comparative study of time-series causality learning techniques Maisonnave, Mariano CAUSAL GRAPH EXTRACTION DIGITAL TEXT MEDIA INFORMATION EXTRACTION FROM NEWS TIME-SERIES CAUSALITY LEARNING VARIABLE EXTRACTION
title_short	Causal graph extraction from news: a comparative study of time-series causality learning techniques
title_full	Causal graph extraction from news: a comparative study of time-series causality learning techniques
title_fullStr	Causal graph extraction from news: a comparative study of time-series causality learning techniques
title_full_unstemmed	Causal graph extraction from news: a comparative study of time-series causality learning techniques
title_sort	Causal graph extraction from news: a comparative study of time-series causality learning techniques
dc.creator.none.fl_str_mv	Maisonnave, Mariano Delbianco, Fernando Andrés Tohmé, Fernando Abel Milios, Evangelos Maguitman, Ana Gabriela
author	Maisonnave, Mariano
author_facet	Maisonnave, Mariano Delbianco, Fernando Andrés Tohmé, Fernando Abel Milios, Evangelos Maguitman, Ana Gabriela
author_role	author
author2	Delbianco, Fernando Andrés Tohmé, Fernando Abel Milios, Evangelos Maguitman, Ana Gabriela
author2_role	author author author author
dc.subject.none.fl_str_mv	CAUSAL GRAPH EXTRACTION DIGITAL TEXT MEDIA INFORMATION EXTRACTION FROM NEWS TIME-SERIES CAUSALITY LEARNING VARIABLE EXTRACTION
topic	CAUSAL GRAPH EXTRACTION DIGITAL TEXT MEDIA INFORMATION EXTRACTION FROM NEWS TIME-SERIES CAUSALITY LEARNING VARIABLE EXTRACTION
purl_subject.fl_str_mv	https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1
dc.description.none.fl_txt_mv	Causal graph extraction from news has the potential to aid in the understanding of complex scenarios. In particular, it can help explain and predict events, as well as conjecture about possible cause-effect connections. However, limited work has addressed the problem of large-scale extraction of causal graphs from news articles. This article presents a novel framework for extracting causal graphs from digital text media. The framework relies on topic-relevant variables representing terms and ongoing events that are selected from a domain under analysis by applying specially developed information retrieval and natural language processing methods. Events are represented as event-phrase embeddings, which make it possible to group similar events into semantically cohesive clusters. A time series of the selected variables is given as input to a causal structure learning techniques to learn a causal graph associated with the topic that is being examined. The complete framework is applied to the New York Times dataset, which covers news for a period of 246 months (roughly 20 years), and is illustrated through a case study. An initial evaluation based on synthetic data is carried out to gain insight into the most effective time-series causality learning techniques. This evaluation comprises a systematic analysis of nine state-of-the-art causal structure learning techniques and two novel ensemble methods derived from the most effective techniques. Subsequently, the complete framework based on the most promising causal structure learning technique is evaluated with domain experts in a real-world scenario through the use of the presented case study. The proposed analysis offers valuable insights into the problems of identifying topic-relevant variables from large volumes of news and learning causal graphs from time series Fil: Maisonnave, Mariano. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina. Dalhousie University Halifax; Canadá Fil: Delbianco, Fernando Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina Fil: Tohmé, Fernando Abel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Matemática Bahía Blanca. Universidad Nacional del Sur. Departamento de Matemática. Instituto de Matemática Bahía Blanca; Argentina Fil: Milios, Evangelos. Dalhousie University Halifax; Canadá Fil: Maguitman, Ana Gabriela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Bahía Blanca. Instituto de Ciencias e Ingeniería de la Computación. Universidad Nacional del Sur. Departamento de Ciencias e Ingeniería de la Computación. Instituto de Ciencias e Ingeniería de la Computación; Argentina
description	Causal graph extraction from news has the potential to aid in the understanding of complex scenarios. In particular, it can help explain and predict events, as well as conjecture about possible cause-effect connections. However, limited work has addressed the problem of large-scale extraction of causal graphs from news articles. This article presents a novel framework for extracting causal graphs from digital text media. The framework relies on topic-relevant variables representing terms and ongoing events that are selected from a domain under analysis by applying specially developed information retrieval and natural language processing methods. Events are represented as event-phrase embeddings, which make it possible to group similar events into semantically cohesive clusters. A time series of the selected variables is given as input to a causal structure learning techniques to learn a causal graph associated with the topic that is being examined. The complete framework is applied to the New York Times dataset, which covers news for a period of 246 months (roughly 20 years), and is illustrated through a case study. An initial evaluation based on synthetic data is carried out to gain insight into the most effective time-series causality learning techniques. This evaluation comprises a systematic analysis of nine state-of-the-art causal structure learning techniques and two novel ensemble methods derived from the most effective techniques. Subsequently, the complete framework based on the most promising causal structure learning technique is evaluated with domain experts in a real-world scenario through the use of the presented case study. The proposed analysis offers valuable insights into the problems of identifying topic-relevant variables from large volumes of news and learning causal graphs from time series
publishDate	2022
dc.date.none.fl_str_mv	2022-08-03
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://hdl.handle.net/11336/197131 Maisonnave, Mariano; Delbianco, Fernando Andrés; Tohmé, Fernando Abel; Milios, Evangelos; Maguitman, Ana Gabriela; Causal graph extraction from news: a comparative study of time-series causality learning techniques; PeerJ; PeerJ Computer Science; 8; 3-8-2022; 2-28 2376-5992 CONICET Digital CONICET
url	http://hdl.handle.net/11336/197131
identifier_str_mv	Maisonnave, Mariano; Delbianco, Fernando Andrés; Tohmé, Fernando Abel; Milios, Evangelos; Maguitman, Ana Gabriela; Causal graph extraction from news: a comparative study of time-series causality learning techniques; PeerJ; PeerJ Computer Science; 8; 3-8-2022; 2-28 2376-5992 CONICET Digital CONICET
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/doi/10.7717/PEERJ-CS.1066 info:eu-repo/semantics/altIdentifier/url/https://peerj.com/articles/cs-1066/
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by/2.5/ar/
eu_rights_str_mv	openAccess
rights_invalid_str_mv	https://creativecommons.org/licenses/by/2.5/ar/
dc.format.none.fl_str_mv	application/pdf application/pdf application/pdf
dc.publisher.none.fl_str_mv	PeerJ
publisher.none.fl_str_mv	PeerJ
dc.source.none.fl_str_mv	reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str	CONICET Digital (CONICET)
collection	CONICET Digital (CONICET)
instname_str	Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv	CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv	dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar
_version_	1858305146875805696
score	13.176822

Causal graph extraction from news: a comparative study of time-series causality learning techniques

Publicaciones similares