Improvement and generalization of ABCD method with Bayesian inference

Authors
Alvarez, Ezequiel; Da Rold, Leandro; Szewc, Manuel; Szynkman, Alejandro Andrés; Tanco, Santiago Andrés; Tarutina, Tatiana
Publication year
2024
Language
English
Resource type
article
Status
published version
Description
To find New Physics or to refine our knowledge of the Standard Model at the LHC is an enterprise that involves many factors, such as the capabilities and the performance of the accelerator and detectors, the use and exploitation of the available information, the design of search strategies and observables, as well as the proposal of new models. We focus on the use of the information and put our effort into rethinking the usual data-driven ABCD method to improve it and to generalize it using Bayesian Machine Learning techniques and tools. We propose that a dataset consisting of a signal and many backgrounds is well described through a mixture model. Signal, backgrounds and their relative fractions in the sample can be well extracted by exploiting the prior knowledge and the dependence between the different observables at the event-by-event level with Bayesian tools. We show how, in contrast to the ABCD method, one can take advantage of understanding some properties of the different backgrounds and of having more than two independent observables to measure in each event. In addition, instead of regions defined through hard cuts, the Bayesian framework uses the information of continuous distributions to obtain soft assignments of the events, which are statistically more robust. To compare both methods we use a toy problem inspired by pp → hh → bb̄bb̄, selecting a reduced and simplified number of processes and analysing the flavor of the four jets and the invariant mass of the jet pairs, modeled with simplified distributions. Taking advantage of all this information, and starting from a combination of biased and agnostic priors, leads us to a very good posterior once we use the Bayesian framework to exploit the data and the mutual information of the observables at the event-by-event level. We show how, in this simplified model, the Bayesian framework outperforms the sensitivity of the ABCD method in obtaining the signal fraction in scenarios with 1% and 0.5% true signal fractions in the dataset. We also show that the method is robust against the absence of signal. We discuss potential prospects for taking this Bayesian data-driven paradigm into more realistic scenarios.
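The contrast drawn in the abstract between hard-cut regions and soft assignments can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the event counts, and the single Gaussian observable with its means and widths are all hypothetical, chosen only to show the classical ABCD background estimate next to the per-event signal probability of a two-component mixture.

```python
import math


def abcd_background_estimate(n_b, n_c, n_d):
    """Classical ABCD estimate of the background yield in signal region A.

    Assumes the two observables are independent for the background,
    so that N_A / N_B = N_C / N_D.
    """
    if n_d <= 0:
        raise ValueError("control region D must be populated")
    return n_b * n_c / n_d


def gaussian_pdf(x, mean, sigma):
    """Normal density, used here as a stand-in for a simplified distribution."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def signal_responsibility(x, signal_fraction,
                          signal=(125.0, 10.0), background=(100.0, 40.0)):
    """Soft assignment: probability that an event with observable x is signal.

    The (mean, sigma) pairs are hypothetical illustration values.
    """
    s = signal_fraction * gaussian_pdf(x, *signal)
    b = (1.0 - signal_fraction) * gaussian_pdf(x, *background)
    return s / (s + b)


# Hypothetical counts in the three control regions B, C and D.
estimate = abcd_background_estimate(n_b=200, n_c=50, n_d=1000)
print(estimate)  # 10.0

# With a 1% signal fraction, an event near the signal peak receives a larger
# signal probability than one far from it, but neither is assigned outright.
near_peak = signal_responsibility(125.0, signal_fraction=0.01)
far_from_peak = signal_responsibility(60.0, signal_fraction=0.01)
print(near_peak > far_from_peak)  # True
```

In a full Bayesian treatment the signal fraction and the shape parameters would themselves carry priors and be inferred from the data; here they are fixed by hand to keep the sketch short.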
Affiliation: Alvarez, Ezequiel. Universidad Nacional de San Martín. Escuela de Ciencia y Tecnología. Centro Internacional de Estudios Avanzados; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Ciencias Físicas. - Universidad Nacional de San Martín. Instituto de Ciencias Físicas; Argentina
Affiliation: Da Rold, Leandro. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Comisión Nacional de Energía Atómica. Gerencia del Área de Energía Nuclear. Instituto Balseiro. Archivo Histórico del Centro Atómico Bariloche e Instituto Balseiro | Universidad Nacional de Cuyo. Instituto Balseiro. Archivo Histórico del Centro Atómico Bariloche e Instituto Balseiro; Argentina
Affiliation: Szewc, Manuel. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. University of Cincinnati; United States
Affiliation: Szynkman, Alejandro Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Física La Plata. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Física La Plata; Argentina
Affiliation: Tanco, Santiago Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Física La Plata. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Física La Plata; Argentina
Affiliation: Tarutina, Tatiana. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Física La Plata. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Física La Plata; Argentina
Subject
NEW PHYSICS
BAYESIAN INFERENCE
ABCD METHOD
Access level
open access
Terms of use
https://creativecommons.org/licenses/by/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/265958

id CONICETDig_2a47523fb0075f5572466e510805cd25
oai_identifier_str oai:ri.conicet.gov.ar:11336/265958
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv Improvement and generalization of ABCD method with Bayesian inference
dc.creator.none.fl_str_mv Alvarez, Ezequiel
Da Rold, Leandro
Szewc, Manuel
Szynkman, Alejandro Andrés
Tanco, Santiago Andrés
Tarutina, Tatiana
dc.subject.none.fl_str_mv NEW PHYSICS
BAYESIAN INFERENCE
ABCD METHOD
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.3
https://purl.org/becyt/ford/1
publishDate 2024
dc.date.none.fl_str_mv 2024-07
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/265958
Alvarez, Ezequiel; Da Rold, Leandro; Szewc, Manuel; Szynkman, Alejandro Andrés; Tanco, Santiago Andrés; et al.; Improvement and generalization of ABCD method with Bayesian inference; SciPost Foundation; SciPost Physics Core; 7; 3; 7-2024; 1-24
2666-9366
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/https://scipost.org/10.21468/SciPostPhysCore.7.3.043
info:eu-repo/semantics/altIdentifier/doi/10.21468/SciPostPhysCore.7.3.043
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by/2.5/ar/
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv SciPost Foundation
publisher.none.fl_str_mv SciPost Foundation
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar