Similarity Searching using Hybrid Technique Permutation Graph and Clustering
- Authors
- Rocha, Gerardo; Figueroa, Karina; Reyes, Nora Susana
- Year of publication
- 2025
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- Similarity searches aim to find the elements of a database that are most similar to a query. One way to model the problem is to consider only the distances between the query and the elements of the database, which allows it to be defined over a metric space. The advantage is that this approach does not require large amounts of training data, which is the main challenge today. The obvious strategy is to compare the query against the entire database; however, with computationally expensive distance functions, this can consume considerable time and resources. This work proposes building a data structure over the database and searching on it, to reduce the number of distance calculations performed. In particular, we propose combining strategies that have proven efficient: permutation-based algorithms and those based on clustering and graphs. Experiments show that we can achieve reductions of up to 20% in the number of distance calculations needed by the permutation-based algorithm.
Red de Universidades con Carreras en Informática
- Subject
- Computer Science; Similarity Search; Metric Space; Distance Calculations
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/4.0/
- Repository
- SEDICI (UNLP)
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/191306
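
The description names permutation-based algorithms but this record does not detail them. As a rough illustration only, the sketch below shows a generic permutation-based index over a metric space and counts distance calculations against an exhaustive scan. It is not the authors' hybrid method (no clustering or graph component is included), and every name in it (`CountingDistance`, `build_index`, `knn_search`, the `fraction` parameter, the random 2-D data) is an assumption made for this example.

```python
import random


def euclidean(a, b):
    """Plain Euclidean distance; stands in for any expensive metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class CountingDistance:
    """Wraps a distance function and counts how often it is evaluated."""

    def __init__(self, dist):
        self.dist = dist
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return self.dist(a, b)


def permutation(obj, permutants, dist):
    """Rank of each permutant when permutants are sorted by distance to obj."""
    order = sorted(range(len(permutants)), key=lambda i: dist(obj, permutants[i]))
    ranks = [0] * len(permutants)
    for position, i in enumerate(order):
        ranks[i] = position
    return ranks


def footrule(p, q):
    """Spearman footrule: sum of absolute rank differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def build_index(db, permutants, dist):
    """Store one permutation per database object (paid once, at build time)."""
    return [permutation(obj, permutants, dist) for obj in db]


def knn_search(query, db, index, permutants, dist, k, fraction):
    """Approximate k-NN: compare only the objects with the most similar permutations."""
    q_perm = permutation(query, permutants, dist)
    # Ordering candidates uses stored permutations only, no metric evaluations.
    candidates = sorted(range(len(db)), key=lambda i: footrule(q_perm, index[i]))
    budget = max(k, int(fraction * len(db)))
    scored = sorted((dist(query, db[i]), i) for i in candidates[:budget])
    return scored[:k]


def brute_force(query, db, dist, k):
    """Exhaustive scan: one distance calculation per database object."""
    return sorted((dist(query, obj), i) for i, obj in enumerate(db))[:k]


rng = random.Random(0)
db = [(rng.random(), rng.random()) for _ in range(1000)]  # hypothetical data
permutants = rng.sample(db, 8)
dist = CountingDistance(euclidean)
index = build_index(db, permutants, dist)

query = (0.5, 0.5)
dist.calls = 0
approx = knn_search(query, db, index, permutants, dist, k=5, fraction=0.1)
approx_calls = dist.calls  # 8 permutants + 100 candidates = 108

dist.calls = 0
exact = brute_force(query, db, dist, k=5)
exact_calls = dist.calls  # 1000, one per object

print(approx_calls, exact_calls)
```

The search is approximate: a smaller `fraction` saves distance calculations but may miss true neighbours, while `fraction=1.0` degenerates to the exhaustive scan plus the cost of computing the query's permutation.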
Full record metadata
| id | SEDICI_37f6c77452bc4c510da5f00b7b8f43af |
|---|---|
| oai_identifier_str | oai:sedici.unlp.edu.ar:10915/191306 |
| network_acronym_str | SEDICI |
| repository_id_str | 1329 |
| network_name_str | SEDICI (UNLP) |
| dc.title.none.fl_str_mv | Similarity Searching using Hybrid Technique Permutation Graph and Clustering |
| dc.creator.none.fl_str_mv | Rocha, Gerardo; Figueroa, Karina; Reyes, Nora Susana |
| author | Rocha, Gerardo |
| author_facet | Rocha, Gerardo; Figueroa, Karina; Reyes, Nora Susana |
| author_role | author |
| author2 | Figueroa, Karina; Reyes, Nora Susana |
| author2_role | author; author |
| dc.subject.none.fl_str_mv | Computer Science; Similarity Search; Metric Space; Distance Calculations |
| topic | Computer Science; Similarity Search; Metric Space; Distance Calculations |
| dc.description.none.fl_txt_mv | Similarity searches aim to find the elements of a database that are most similar to a query. One way to model the problem is to consider only the distances between the query and the elements of the database, which allows it to be defined over a metric space. The advantage is that this approach does not require large amounts of training data, which is the main challenge today. The obvious strategy is to compare the query against the entire database; however, with computationally expensive distance functions, this can consume considerable time and resources. This work proposes building a data structure over the database and searching on it, to reduce the number of distance calculations performed. In particular, we propose combining strategies that have proven efficient: permutation-based algorithms and those based on clustering and graphs. Experiments show that we can achieve reductions of up to 20% in the number of distance calculations needed by the permutation-based algorithm.; Red de Universidades con Carreras en Informática |
| publishDate | 2025 |
| dc.date.none.fl_str_mv | 2025-10 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
| format | conferenceObject |
| status_str | publishedVersion |
| dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/191306 |
| url | http://sedici.unlp.edu.ar/handle/10915/191306 |
| dc.language.none.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/isbn/978-987-8258-99-7; info:eu-repo/semantics/reference/hdl/10915/189846 |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
| eu_rights_str_mv | openAccess |
| rights_invalid_str_mv | http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
| dc.format.none.fl_str_mv | application/pdf; 527-537 |
| dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
| reponame_str | SEDICI (UNLP) |
| collection | SEDICI (UNLP) |
| instname_str | Universidad Nacional de La Plata |
| instacron_str | UNLP |
| institution | UNLP |
| repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
| repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |