Exploring unsupervised top tagging using Bayesian inference

Authors
Alvarez, Ezequiel; Szewc, Manuel; Szynkman, Alejandro Andrés; Tanco, Santiago Andrés; Tarutina, Tatiana
Publication year
2023
Language
English
Resource type
article
Status
published version
Description
Recognizing hadronically decaying top-quark jets in a sample of jets, or even estimating their total fraction in the sample, is an important step in many LHC searches for both Standard Model and Beyond Standard Model physics. Although outstanding top-tagger algorithms exist, their construction and expected performance rely on Monte Carlo simulations, which may introduce biases. For these reasons we develop two simple unsupervised top-tagger algorithms based on Bayesian inference on a mixture model. One uses as its observed variable a new geometrically based observable Ã3; the other uses the more traditional N-subjettiness ratio τ3/τ2, which yields better performance. As expected, the unsupervised taggers perform below existing supervised taggers, reaching an expected Area Under Curve (AUC) of about 0.80 to 0.81 and accuracies of about 69% to 75% over the full range of sample purity. However, this performance is more robust to possible biases in the Monte Carlo than that of their supervised counterparts. Our findings are a step towards exploring and considering simpler and unbiased taggers.
Affiliation: Alvarez, Ezequiel. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Szewc, Manuel. University of Cincinnati; United States. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Szynkman, Alejandro Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Física La Plata. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Física La Plata; Argentina
Affiliation: Tanco, Santiago Andrés. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Física La Plata. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Física La Plata; Argentina
Affiliation: Tarutina, Tatiana. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Física La Plata. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Física La Plata; Argentina
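The description mentions Bayesian inference on a mixture model without giving details. The following is a minimal illustrative sketch of that general idea, not the authors' implementation. It assumes a two-component Gaussian mixture with a known common width `sigma`, a Beta(1, 1) prior on the top fraction, and Normal(0.5, 1) priors on the component means, sampled with a simple Gibbs sampler. The function name `gibbs_two_component`, the priors, and the synthetic stand-in for τ3/τ2 are all hypothetical choices made for illustration; real jet observables are not Gaussian.

```python
import numpy as np


def gibbs_two_component(x, sigma=0.1, n_iter=600, burn_in=200, seed=0):
    """Gibbs sampler for a two-component Gaussian mixture (illustrative only).

    Component 0 is initialised at the low end of the data and plays the role
    of the "top-like" class. Returns posterior means of the component-0
    fraction, of the two component means, and of each point's probability
    of belonging to component 0.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    mu = np.array([x.min(), x.max()])   # start the means at the data extremes
    frac = 0.5                          # initial component-0 fraction
    prior_mean, prior_var = 0.5, 1.0    # assumed Normal prior on each mean
    frac_sum, mu_sum, p0_sum = 0.0, np.zeros(2), np.zeros(n)
    for it in range(n_iter):
        # 1) class probabilities and sampled labels given current parameters
        log_odds = (np.log(frac) - np.log1p(-frac)
                    - 0.5 * ((x - mu[0]) / sigma) ** 2
                    + 0.5 * ((x - mu[1]) / sigma) ** 2)
        p0 = 1.0 / (1.0 + np.exp(-log_odds))
        z0 = rng.random(n) < p0
        # 2) fraction from its conjugate Beta posterior (Beta(1, 1) prior)
        n0 = int(z0.sum())
        frac = rng.beta(1 + n0, 1 + n - n0)
        # 3) each mean from its conjugate Normal posterior
        for k, mask in enumerate((z0, ~z0)):
            post_var = 1.0 / (1.0 / prior_var + mask.sum() / sigma**2)
            post_mean = post_var * (prior_mean / prior_var
                                    + x[mask].sum() / sigma**2)
            mu[k] = rng.normal(post_mean, np.sqrt(post_var))
        if it >= burn_in:               # accumulate after burn-in
            frac_sum += frac
            mu_sum += mu
            p0_sum += p0
    kept = n_iter - burn_in
    return frac_sum / kept, mu_sum / kept, p0_sum / kept


# Synthetic stand-in data: "top" jets at lower values, "QCD" jets at higher.
data_rng = np.random.default_rng(1)
n_top, n_qcd = 600, 1400
x = np.concatenate([data_rng.normal(0.35, 0.1, n_top),
                    data_rng.normal(0.75, 0.1, n_qcd)])
is_top = np.arange(x.size) < n_top      # truth labels, used only for scoring

frac_hat, mu_hat, p_top = gibbs_two_component(x)
accuracy = np.mean((p_top > 0.5) == is_top)
print(f"fraction={frac_hat:.3f}, means={mu_hat}, accuracy={accuracy:.3f}")
```

No truth labels enter the inference; they are used only afterwards to score the result, which is the sense in which such a tagger is unsupervised. The toy data are far better separated than real top and QCD jets, so the accuracy printed here says nothing about the performance figures quoted in the description.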
Subject
Jets
machine learning
top quark
Access level
open access
Terms of use
https://creativecommons.org/licenses/by/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/243762

id CONICETDig_ec1f8dc3fb7d3f60df92c019682c2339
oai_identifier_str oai:ri.conicet.gov.ar:11336/243762
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv Exploring unsupervised top tagging using Bayesian inference
dc.creator.none.fl_str_mv Alvarez, Ezequiel
Szewc, Manuel
Szynkman, Alejandro Andrés
Tanco, Santiago Andrés
Tarutina, Tatiana
dc.subject.none.fl_str_mv Jets
machine learning
top quark
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.3
https://purl.org/becyt/ford/1
publishDate 2023
dc.date.none.fl_str_mv 2023-04
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/243762
Alvarez, Ezequiel; Szewc, Manuel; Szynkman, Alejandro Andrés; Tanco, Santiago Andrés; Tarutina, Tatiana; Exploring unsupervised top tagging using Bayesian inference; SciPost Foundation; SciPost Physics Core; 6; 2; 4-2023; 1-19
2666-9366
CONICET Digital
CONICET
url http://hdl.handle.net/11336/243762
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/https://scipost.org/10.21468/SciPostPhysCore.6.2.046
info:eu-repo/semantics/altIdentifier/doi/10.21468/SCIPOSTPHYSCORE.6.2.046
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by/2.5/ar/
dc.format.none.fl_str_mv application/pdf
application/pdf
dc.publisher.none.fl_str_mv SciPost Foundation
publisher.none.fl_str_mv SciPost Foundation
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar