Inferring a Property of a Large System from a Small Number of Samples
- Authors
- Hernández Lahme, Damián Gabriel; Samengo, Ines
- Publication year
- 2022
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Inferring the value of a property of a large stochastic system is a difficult task when the number of samples is insufficient to reliably estimate the probability distribution. The Bayesian estimator of the property of interest requires knowledge of the prior distribution, and in many situations it is not clear which prior should be used. Several estimators have been developed so far in which the proposed prior is individually tailored for each property of interest; such is the case, for example, for the entropy, the amount of mutual information, or the correlation between pairs of variables. In this paper, we propose a general framework to select priors that is valid for arbitrary properties. We first demonstrate that only certain aspects of the prior distribution actually affect the inference process. We then expand the sought prior as a linear combination of a one-dimensional family of indexed priors, each of which is obtained through a maximum entropy approach with constrained mean values of the property under study. In many cases of interest, only one or very few components of the expansion turn out to contribute to the Bayesian estimator, so it is often valid to keep only a single component. The relevant component is selected by the data, so no handcrafted priors are required. We test the performance of this approximation with a few paradigmatic examples and show that it performs well in comparison to the ad hoc methods previously proposed in the literature. Our method highlights the connection between Bayesian inference and equilibrium statistical mechanics, since the most relevant component of the expansion can be argued to be that with the right temperature.
Affiliation: Hernández Lahme, Damián Gabriel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Patagonia Norte; Argentina. Comisión Nacional de Energía Atómica. Centro Atómico Bariloche; Argentina. Comisión Nacional de Energía Atómica. Gerencia del Área de Energía Nuclear. Instituto Balseiro; Argentina
Affiliation: Samengo, Ines. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Patagonia Norte; Argentina. Comisión Nacional de Energía Atómica. Centro Atómico Bariloche; Argentina. Comisión Nacional de Energía Atómica. Gerencia del Área de Energía Nuclear. Instituto Balseiro; Argentina
- Subject
- BAYESIAN; ENTROPY; INFERENCE; MUTUAL INFORMATION; UNDERSAMPLED
- Accessibility level
- open access
- Terms of use
- https://creativecommons.org/licenses/by/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/188161
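The single-component idea summarized in the description above can be illustrated with a small numerical sketch. It is not the authors' implementation: it assumes a family of priors of the maximum entropy form p_β(q) ∝ exp(β F(q)) relative to the uniform measure on the probability simplex, uses the Shannon entropy as the example property F, picks β on an arbitrary grid by maximizing the evidence, and evaluates all integrals by plain Monte Carlo, which is only practical for a handful of states. The function names, the grid, and the sample size are hypothetical choices made for this illustration.

```python
import numpy as np


def entropy(q):
    """Shannon entropy (in nats) of each row of q; 0*log(0) is treated as 0."""
    safe = np.where(q > 0, q, 1.0)
    return -np.sum(q * np.log(safe), axis=-1)


def log_mean_exp(a):
    """Numerically stable log of the mean of exp(a)."""
    m = np.max(a)
    return m + np.log(np.mean(np.exp(a - m)))


def single_component_estimate(counts, prop=entropy, betas=None,
                              n_draws=20000, seed=0):
    """Estimate prop(q) from counts using one member of the prior family
    p_beta(q) ~ exp(beta * prop(q)), with beta chosen by maximum evidence.

    Assumption of this sketch: integrals over the simplex are approximated
    by averaging over uniform (flat Dirichlet) draws.
    """
    counts = np.asarray(counts, dtype=float)
    if betas is None:
        betas = np.linspace(-20.0, 20.0, 41)  # arbitrary illustrative grid
    rng = np.random.default_rng(seed)

    # Uniform draws on the simplex and the property value of each draw.
    q = rng.dirichlet(np.ones(counts.size), size=n_draws)
    f = prop(q)

    # Multinomial log-likelihood up to a constant that does not depend on q.
    log_like = np.sum(counts * np.log(np.clip(q, 1e-300, None)), axis=1)

    # log evidence(beta) = log E[exp(beta f) L] - log E[exp(beta f)].
    log_evidence = np.array([
        log_mean_exp(b * f + log_like) - log_mean_exp(b * f) for b in betas
    ])
    best = float(betas[int(np.argmax(log_evidence))])

    # Posterior mean of the property under the selected component.
    log_w = best * f + log_like
    w = np.exp(log_w - np.max(log_w))
    estimate = float(np.sum(w * f) / np.sum(w))
    return best, estimate


if __name__ == "__main__":
    beta, h = single_component_estimate([5, 3, 1, 1])
    print(f"selected beta = {beta:.1f}, estimated entropy = {h:.3f} nats")
```

In this sketch β plays the role of an inverse temperature, which is the analogy with equilibrium statistical mechanics mentioned in the description; the data select it through the evidence, so no property-specific prior is written by hand.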
View the full record metadata
id |
CONICETDig_98d7f667dac859e63af53c464de1342b |
---|---|
oai_identifier_str |
oai:ri.conicet.gov.ar:11336/188161 |
network_acronym_str |
CONICETDig |
repository_id_str |
3498 |
network_name_str |
CONICET Digital (CONICET) |
dc.title.none.fl_str_mv |
Inferring a Property of a Large System from a Small Number of Samples |
dc.creator.none.fl_str_mv |
Hernández Lahme, Damián Gabriel; Samengo, Ines |
author |
Hernández Lahme, Damián Gabriel |
author_facet |
Hernández Lahme, Damián Gabriel; Samengo, Ines |
author_role |
author |
author2 |
Samengo, Ines |
author2_role |
author |
dc.subject.none.fl_str_mv |
BAYESIAN; ENTROPY; INFERENCE; MUTUAL INFORMATION; UNDERSAMPLED |
topic |
BAYESIAN; ENTROPY; INFERENCE; MUTUAL INFORMATION; UNDERSAMPLED |
purl_subject.fl_str_mv |
https://purl.org/becyt/ford/1.3 https://purl.org/becyt/ford/1 https://purl.org/becyt/ford/1.1 https://purl.org/becyt/ford/1 |
publishDate |
2022 |
dc.date.none.fl_str_mv |
2022-01 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format |
article |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/188161; Hernández Lahme, Damián Gabriel; Samengo, Ines; Inferring a Property of a Large System from a Small Number of Samples; Molecular Diversity Preservation International; Entropy; 24; 1; 1-2022; 1-17; 1099-4300; CONICET Digital; CONICET |
url |
http://hdl.handle.net/11336/188161 |
identifier_str_mv |
Hernández Lahme, Damián Gabriel; Samengo, Ines; Inferring a Property of a Large System from a Small Number of Samples; Molecular Diversity Preservation International; Entropy; 24; 1; 1-2022; 1-17; 1099-4300; CONICET Digital; CONICET |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/url/https://www.mdpi.com/1099-4300/24/1/125/htm info:eu-repo/semantics/altIdentifier/doi/10.3390/e24010125 |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by/2.5/ar/ |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
https://creativecommons.org/licenses/by/2.5/ar/ |
dc.format.none.fl_str_mv |
application/pdf application/pdf |
dc.publisher.none.fl_str_mv |
Molecular Diversity Preservation International |
publisher.none.fl_str_mv |
Molecular Diversity Preservation International |
dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str |
CONICET Digital (CONICET) |
collection |
CONICET Digital (CONICET) |
instname_str |
Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv |
CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |