Using Big Data Analysis to Improve Cache Performance in Search Engines

Authors: Tolosa, Gabriel Hernán; Feuerstein, Esteban
Publication year: 2015
Language: Spanish (Castilian)
Resource type: conference paper
Status: published version
Description: Web search engines process huge amounts of data to support search, yet they must run under strict performance requirements (answering a query in a fraction of a second). To meet those requirements they implement optimization techniques such as caching, which may be applied at several levels. One of these levels is the intersection cache, which attempts to exploit frequently occurring pairs of terms by keeping the results of intersecting the corresponding inverted lists in the memory of the search node. In this work we propose an optimization step that uses data mining techniques to decide which items should be cached and which should not. Our preliminary results show that it is possible to achieve extra cost savings in this already hyper-optimized field.
Sociedad Argentina de Informática e Investigación Operativa (SADIO)
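Illustrative sketch (not taken from the paper): the description above mentions an intersection cache that keeps the result of intersecting the inverted lists of frequent term pairs, plus a step that decides which items to cache. The Python below is a minimal sketch of that idea under stated assumptions. The class name `IntersectionCache`, the LRU eviction policy, the toy `index`, and the `admit` callback are all hypothetical; in particular, `admit` only stands in for whatever data-mining-based decision the authors propose, which this record does not specify.

```python
from collections import OrderedDict
from typing import Callable, Dict, FrozenSet, List, Optional


def intersect(a: List[int], b: List[int]) -> List[int]:
    """Intersect two sorted posting lists of document IDs (merge-style scan)."""
    i = j = 0
    out: List[int] = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


class IntersectionCache:
    """Hypothetical LRU cache of term-pair intersections (an assumption, not the paper's design)."""

    def __init__(
        self,
        index: Dict[str, List[int]],
        capacity: int,
        admit: Optional[Callable[[FrozenSet[str]], bool]] = None,
    ) -> None:
        self.index = index
        self.capacity = capacity
        # Placeholder admission policy: by default every pair is cached.
        self.admit = admit or (lambda pair: True)
        self.entries: "OrderedDict[FrozenSet[str], List[int]]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, t1: str, t2: str) -> List[int]:
        pair = frozenset((t1, t2))  # the order of the two terms does not matter
        if pair in self.entries:
            self.hits += 1
            self.entries.move_to_end(pair)  # mark as most recently used
            return self.entries[pair]
        self.misses += 1
        result = intersect(self.index.get(t1, []), self.index.get(t2, []))
        if self.admit(pair):
            self.entries[pair] = result
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict the least recently used pair
        return result


# Toy inverted index: term -> sorted list of document IDs (made-up data).
index = {"big": [1, 2, 5, 8], "data": [2, 3, 5, 9], "cache": [5, 8, 9]}
cache = IntersectionCache(index, capacity=2)
print(cache.lookup("big", "data"))   # → [2, 5] (computed, then cached)
print(cache.lookup("data", "big"))   # → [2, 5] (served from the cache)
print(cache.hits, cache.misses)      # → 1 1
```

Passing a stricter `admit` function, for example one that accepts only pairs judged frequent in a past query log, would be one way to model a selective caching decision; how the authors actually make that decision is not described in this record.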
Subject: Computer Science
big data
Web Search Engines (WSE)
intersection caching
Search process
Access level: open access
Terms of use: http://creativecommons.org/licenses/by-sa/3.0/
Repository
Institution: Universidad Nacional de La Plata
OAI identifier: oai:sedici.unlp.edu.ar:10915/51952


id	SEDICI_22746514526f0262d97e12244b41e126
oai_identifier_str	oai:sedici.unlp.edu.ar:10915/51952
network_acronym_str	SEDICI
repository_id_str	1329
network_name_str	SEDICI (UNLP)
publishDate	2015
dc.date.none.fl_str_mv	2015-09
dc.type.none.fl_str_mv	info:eu-repo/semantics/conferenceObject info:eu-repo/semantics/publishedVersion Objeto de conferencia http://purl.org/coar/resource_type/c_5794 info:ar-repo/semantics/documentoDeConferencia
format	conferenceObject
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://sedici.unlp.edu.ar/handle/10915/51952
url	http://sedici.unlp.edu.ar/handle/10915/51952
dc.language.none.fl_str_mv	spa
language	spa
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/url/http://44jaiio.sadio.org.ar/sites/default/files/agranda7-10.pdf info:eu-repo/semantics/altIdentifier/issn/2451-7569
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by-sa/3.0/ Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
eu_rights_str_mv	openAccess
rights_invalid_str_mv	http://creativecommons.org/licenses/by-sa/3.0/ Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
dc.format.none.fl_str_mv	application/pdf 7-10
dc.source.none.fl_str_mv	reponame:SEDICI (UNLP) instname:Universidad Nacional de La Plata instacron:UNLP
reponame_str	SEDICI (UNLP)
collection	SEDICI (UNLP)
instname_str	Universidad Nacional de La Plata
instacron_str	UNLP
institution	UNLP
repository.name.fl_str_mv	SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv	alira@sedici.unlp.edu.ar
