Generic LSH Families for the Angular Distance Based on Johnson-Lindenstrauss Projections and Feature Hashing LSH
- Autores
- Argerich, Luis; Golmar, Natalia
- Año de publicación
- 2017
- Idioma
- inglés
- Tipo de recurso
- documento de conferencia
- Estado
- versión publicada
- Descripción
- In this paper we propose the creation of generic LSH families for the angular distance based on Johnson-Lindenstrauss projections. We show that feature hashing is a valid J-L projection and propose two new LSH families based on feature hashing. These new LSH families are tested on both synthetic and real datasets with very good results and a considerable performance improvement over other LSH families. While the theoretical analysis is done for the angular distance, these families can also be used in practice for the euclidean distance with excellent results [2]. Our tests using real datasets show that the proposed LSH functions work well for the euclidean distance.
Sociedad Argentina de Informática e Investigación Operativa (SADIO) - Materia
-
Ciencias Informáticas
Locality Sensitive Hashing - Nivel de accesibilidad
- acceso abierto
- Condiciones de uso
- http://creativecommons.org/licenses/by-sa/3.0/
- Repositorio
- Institución
- Universidad Nacional de La Plata
- OAI Identificador
- oai:sedici.unlp.edu.ar:10915/63169
Ver los metadatos del registro completo
id |
SEDICI_c409c45bc5ad229d5a8d327da47166df |
---|---|
oai_identifier_str |
oai:sedici.unlp.edu.ar:10915/63169 |
network_acronym_str |
SEDICI |
repository_id_str |
1329 |
network_name_str |
SEDICI (UNLP) |
spelling |
Generic LSH Families for the Angular Distance Based on Johnson-Lindenstrauss Projections and Feature Hashing LSHArgerich, LuisGolmar, NataliaCiencias InformáticasLocality Sensitive HashingIn this paper we propose the creation of generic LSH families for the angular distance based on Johnson-Lindenstrauss projections. We show that feature hashing is a valid J-L projection and propose two new LSH families based on feature hashing. These new LSH families are tested on both synthetic and real datasets with very good results and a considerable performance improvement over other LSH families. While the theoretical analysis is done for the angular distance, these families can also be used in practice for the euclidean distance with excellent results [2]. Our tests using real datasets show that the proposed LSH functions work well for the euclidean distance.Sociedad Argentina de Informática e Investigación Operativa (SADIO)2017-09info:eu-repo/semantics/conferenceObjectinfo:eu-repo/semantics/publishedVersionObjeto de conferenciahttp://purl.org/coar/resource_type/c_5794info:ar-repo/semantics/documentoDeConferenciaapplication/pdf1-6http://sedici.unlp.edu.ar/handle/10915/63169enginfo:eu-repo/semantics/altIdentifier/url/http://www.clei2017-46jaiio.sadio.org.ar/sites/default/files/Mem/AGRANDA/AGRANDA-01.pdfinfo:eu-repo/semantics/altIdentifier/issn/2451-7569info:eu-repo/semantics/openAccesshttp://creativecommons.org/licenses/by-sa/3.0/Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)reponame:SEDICI (UNLP)instname:Universidad Nacional de La Platainstacron:UNLP2025-10-15T11:00:47Zoai:sedici.unlp.edu.ar:10915/63169Institucionalhttp://sedici.unlp.edu.ar/Universidad públicaNo correspondehttp://sedici.unlp.edu.ar/oai/snrdalira@sedici.unlp.edu.arArgentinaNo correspondeNo correspondeNo correspondeopendoar:13292025-10-15 11:00:48.132SEDICI (UNLP) - Universidad Nacional de La Platafalse |
dc.title.none.fl_str_mv |
Generic LSH Families for the Angular Distance Based on Johnson-Lindenstrauss Projections and Feature Hashing LSH |
title |
Generic LSH Families for the Angular Distance Based on Johnson-Lindenstrauss Projections and Feature Hashing LSH |
spellingShingle |
Generic LSH Families for the Angular Distance Based on Johnson-Lindenstrauss Projections and Feature Hashing LSH Argerich, Luis Ciencias Informáticas Locality Sensitive Hashing |
title_short |
Generic LSH Families for the Angular Distance Based on Johnson-Lindenstrauss Projections and Feature Hashing LSH |
title_full |
Generic LSH Families for the Angular Distance Based on Johnson-Lindenstrauss Projections and Feature Hashing LSH |
title_fullStr |
Generic LSH Families for the Angular Distance Based on Johnson-Lindenstrauss Projections and Feature Hashing LSH |
title_full_unstemmed |
Generic LSH Families for the Angular Distance Based on Johnson-Lindenstrauss Projections and Feature Hashing LSH |
title_sort |
Generic LSH Families for the Angular Distance Based on Johnson-Lindenstrauss Projections and Feature Hashing LSH |
dc.creator.none.fl_str_mv |
Argerich, Luis Golmar, Natalia |
author |
Argerich, Luis |
author_facet |
Argerich, Luis Golmar, Natalia |
author_role |
author |
author2 |
Golmar, Natalia |
author2_role |
author |
dc.subject.none.fl_str_mv |
Ciencias Informáticas Locality Sensitive Hashing |
topic |
Ciencias Informáticas Locality Sensitive Hashing |
dc.description.none.fl_txt_mv |
In this paper we propose the creation of generic LSH families for the angular distance based on Johnson-Lindenstrauss projections. We show that feature hashing is a valid J-L projection and propose two new LSH families based on feature hashing. These new LSH families are tested on both synthetic and real datasets with very good results and a considerable performance improvement over other LSH families. While the theoretical analysis is done for the angular distance, these families can also be used in practice for the euclidean distance with excellent results [2]. Our tests using real datasets show that the proposed LSH functions work well for the euclidean distance. Sociedad Argentina de Informática e Investigación Operativa (SADIO) |
description |
In this paper we propose the creation of generic LSH families for the angular distance based on Johnson-Lindenstrauss projections. We show that feature hashing is a valid J-L projection and propose two new LSH families based on feature hashing. These new LSH families are tested on both synthetic and real datasets with very good results and a considerable performance improvement over other LSH families. While the theoretical analysis is done for the angular distance, these families can also be used in practice for the euclidean distance with excellent results [2]. Our tests using real datasets show that the proposed LSH functions work well for the euclidean distance. |
publishDate |
2017 |
dc.date.none.fl_str_mv |
2017-09 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/conferenceObject info:eu-repo/semantics/publishedVersion Objeto de conferencia http://purl.org/coar/resource_type/c_5794 info:ar-repo/semantics/documentoDeConferencia |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://sedici.unlp.edu.ar/handle/10915/63169 |
url |
http://sedici.unlp.edu.ar/handle/10915/63169 |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/url/http://www.clei2017-46jaiio.sadio.org.ar/sites/default/files/Mem/AGRANDA/AGRANDA-01.pdf info:eu-repo/semantics/altIdentifier/issn/2451-7569 |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by-sa/3.0/ Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
http://creativecommons.org/licenses/by-sa/3.0/ Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) |
dc.format.none.fl_str_mv |
application/pdf 1-6 |
dc.source.none.fl_str_mv |
reponame:SEDICI (UNLP) instname:Universidad Nacional de La Plata instacron:UNLP |
reponame_str |
SEDICI (UNLP) |
collection |
SEDICI (UNLP) |
instname_str |
Universidad Nacional de La Plata |
instacron_str |
UNLP |
institution |
UNLP |
repository.name.fl_str_mv |
SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv |
alira@sedici.unlp.edu.ar |
_version_ |
1846064058083573760 |
score |
13.22299 |