The Positive Matching Index: A new similarity measure with optimal characteristics

Authors
Dos Santos, Daniel Andrés; Deutsch, Reena
Year of publication
2010
Language
English
Resource type
article
Status
published version
Description
Despite the many coefficients that measure the resemblance between pairs of objects from presence/absence data, no single measure shows optimal characteristics. This work proposes the Positive Matching Index (PMI) as a new measure of similarity between lists of attributes. PMI fulfills Tulloss' theoretical prerequisites for similarity coefficients, is easy to calculate, and has an intrinsic meaning that can be expressed in natural language. PMI is bounded between 0 and 1 and represents the mean proportion of positive matches relative to the size of the attribute lists, with that size ranging continuously from the cardinality of the smaller list to that of the larger one. PMI behaves correctly where alternative indices either fail or only approximate the desirable properties of a similarity index. Empirical examples from biomedical research show that PMI outperforms standard indices such as the Jaccard and Dice coefficients.
Affiliation: Dos Santos, Daniel Andrés. Universidad Nacional de Tucumán. Facultad de Ciencias Naturales e Instituto Miguel Lillo; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tucumán; Argentina
Affiliation: Deutsch, Reena. University of California at San Diego; United States
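The verbal definition in the abstract can be turned into a small numerical sketch. The code below is an illustration only, not the authors' published formulation: it assumes that "mean proportion of positive matches" means averaging a/x over every list size x between the smaller cardinality m and the larger cardinality M, where a is the number of shared attributes. Under that assumption the average works out to a/(M - m) * ln(M/m), and reduces to a/m when both lists have the same size. The function names `jaccard`, `dice`, and `pmi` are hypothetical; the Jaccard and Dice formulas are the standard set-based ones.

```python
import math


def jaccard(x: set, y: set) -> float:
    """Standard Jaccard coefficient: shared attributes over the union."""
    union = len(x | y)
    return len(x & y) / union if union else 0.0


def dice(x: set, y: set) -> float:
    """Standard Dice coefficient: twice the shared attributes over the summed sizes."""
    total = len(x) + len(y)
    return 2 * len(x & y) / total if total else 0.0


def pmi(x: set, y: set) -> float:
    """Sketch of the Positive Matching Index as read from the abstract.

    Assumption (not confirmed by the record): PMI is the mean of a / s
    for s ranging continuously over [m, M], where a = |x & y|,
    m = min(|x|, |y|) and M = max(|x|, |y|).
    """
    a = len(x & y)
    if a == 0:
        return 0.0  # no positive matches; also covers empty lists
    m, big = sorted((len(x), len(y)))
    if m == big:
        return a / m  # limit of the mean when both lists have equal size
    # Mean of a / s over [m, big]: (1 / (big - m)) * integral of a / s ds
    return a / (big - m) * math.log(big / m)


if __name__ == "__main__":
    first = {"a1", "a2", "a3", "a4"}
    second = {"a3", "a4", "a5", "a6", "a7", "a8"}
    for name, fn in (("Jaccard", jaccard), ("Dice", dice), ("PMI", pmi)):
        print(f"{name}: {fn(first, second):.4f}")
```

Because a can never exceed the smaller list size, every averaged ratio lies between 0 and 1, so this sketch respects the bounds stated in the abstract. The published article should be consulted for the exact formula and its derivation.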
Subject
Association Coefficient
Binary Data
Dice Index
Jaccard Index
Similarity
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI Identifier
oai:ri.conicet.gov.ar:11336/75324

Handle
http://hdl.handle.net/11336/75324
Citation
Dos Santos, Daniel Andrés; Deutsch, Reena; The Positive Matching Index: A new similarity measure with optimal characteristics; Elsevier Science; Pattern Recognition Letters; 31; 12; 9-2010; 1570-1576
ISSN
0167-8655
Publisher URL
https://www.sciencedirect.com/science/article/pii/S0167865510000917
DOI
10.1016/j.patrec.2010.03.010