Null hypothesis test for anomaly detection
- Authors
- Kamenik, Jernej F.; Szewc, Manuel
- Publication year
- 2023
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- We extend the use of Classification Without Labels for anomaly detection with a hypothesis test designed to exclude the background-only hypothesis. By testing for statistical independence of the two discriminating dataset regions, we are able to exclude the background-only hypothesis without relying on fixed anomaly score cuts or extrapolations of background estimates between regions. The method relies on the assumption of conditional independence of anomaly score features and dataset regions, which can be ensured using existing decorrelation techniques. As a benchmark example, we consider the LHC Olympics dataset, where we show that mutual information represents a suitable test for statistical independence and that our method exhibits excellent and robust performance at different signal fractions, even in the presence of realistic feature correlations.
Affiliation: Kamenik, Jernej F. University of Ljubljana; Slovenia
Affiliation: Szewc, Manuel. University of Ljubljana; Slovenia. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Ciencias Físicas. - Universidad Nacional de San Martín. Instituto de Ciencias Físicas; Argentina
- Subject
- LHC; ANOMALY DETECTION; CWOLA; UNSUPERVISED
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/237150
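
The description above tests the background-only hypothesis by checking whether an anomaly score is statistically independent of the dataset region, using mutual information as the test statistic. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a binned plug-in mutual information estimate and a permutation test to build the null distribution; the record specifies neither. The function names (`mutual_information`, `independence_p_value`, `toy_data`) are hypothetical, and the toy data are synthetic Gaussians, not the LHC Olympics dataset.

```python
import numpy as np


def mutual_information(scores, regions, n_bins=10):
    """Plug-in estimate (in nats) of I(score bin; region) from a contingency table."""
    # Bin scores at empirical quantiles so each bin holds roughly equal counts.
    edges = np.quantile(scores, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    bins = np.digitize(scores, edges)  # integer bin index in 0..n_bins-1
    counts = np.zeros((n_bins, 2))
    np.add.at(counts, (bins, regions), 1.0)  # regions must be 0 or 1
    joint = counts / counts.sum()
    # Product of the marginals: the joint distribution expected under independence.
    outer = joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / outer[nz])))


def independence_p_value(scores, regions, n_perm=500, n_bins=10, seed=0):
    """Permutation p-value for the null 'score is independent of region'."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    regions = np.asarray(regions, dtype=int)
    observed = mutual_information(scores, regions, n_bins)
    # Shuffling the region labels breaks any score-region dependence,
    # which samples the statistic under the independence hypothesis.
    null = np.array([
        mutual_information(scores, rng.permutation(regions), n_bins)
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, float(p_value)


def toy_data(n=4000, signal_fraction=0.0, seed=1):
    """Synthetic example (assumption): signal appears only in region 1."""
    rng = np.random.default_rng(seed)
    regions = rng.integers(0, 2, size=n)
    scores = rng.normal(size=n)
    is_signal = (regions == 1) & (rng.random(n) < signal_fraction)
    scores[is_signal] += 2.0  # signal events receive higher anomaly scores
    return scores, regions


if __name__ == "__main__":
    for fraction in (0.0, 0.2):
        mi, p = independence_p_value(*toy_data(signal_fraction=fraction))
        print(f"signal fraction {fraction}: MI = {mi:.4f} nats, p = {p:.3f}")
```

The permutation step is one way to obtain a null distribution without a fixed score cut: the test uses the full binned score distribution, and a small p-value disfavours independence, which in this setting corresponds to disfavouring the background-only hypothesis. The sketch presupposes that, for background alone, the score does not depend on the region, which is the conditional independence assumption the description refers to.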
Full record metadata

id | CONICETDig_29853a537cec447e1ad11fc1558d9078 |
---|---|
oai_identifier_str | oai:ri.conicet.gov.ar:11336/237150 |
network_acronym_str | CONICETDig |
repository_id_str | 3498 |
network_name_str | CONICET Digital (CONICET) |
dc.title.none.fl_str_mv | Null hypothesis test for anomaly detection |
dc.creator.none.fl_str_mv | Kamenik, Jernej F.; Szewc, Manuel |
dc.subject.none.fl_str_mv | LHC; ANOMALY DETECTION; CWOLA; UNSUPERVISED |
purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.3 https://purl.org/becyt/ford/1 |
publishDate | 2023 |
dc.date.none.fl_str_mv | 2023-05 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format | article |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/237150 Kamenik, Jernej F.; Szewc, Manuel; Null hypothesis test for anomaly detection; Elsevier Science; Physics Letters B; 840; 5-2023; 1-8 0370-2693 CONICET Digital CONICET |
url | http://hdl.handle.net/11336/237150 |
dc.language.none.fl_str_mv | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/https://linkinghub.elsevier.com/retrieve/pii/S0370269323001703 info:eu-repo/semantics/altIdentifier/doi/10.1016/j.physletb.2023.137836 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by/2.5/ar/ |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | application/pdf application/pdf |
dc.publisher.none.fl_str_mv | Elsevier Science |
dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |
_version_ | 1844614311007223808 |
score | 13.070432 |