Reducing hardware hit by queries in web search engines

Authors: Mendoza, Marcelo; Marin, Mauricio; Gil Costa, Graciela Verónica; Ferrarotti, Flavio
Publication year: 2016
Language: English
Resource type: article
Status: published version
Description: In this paper, we introduce a new collection selection strategy for search engines with document-partitioned indexes. Our method selects the document partitions that are most likely to deliver the best results for each query, reducing the number of queries submitted to each partition. It employs learning algorithms that rank the partitions, maximizing the probability of retrieving documents with high gain. The method operates by building vector representations of each partition in the term space spanned by the queries. It generalizes to new queries and produces high-precision document lists for queries not seen during the training phase. To update the representation of each partition, the method employs incremental learning strategies: beginning with an inversion test of the partition lists, we identify queries that contribute new information and add them to the training phase. The experimental results show that our collection selection method compares favorably with state-of-the-art methods. In addition, the method achieves suitable performance with low parameter sensitivity, making it applicable to search engines with hundreds of partitions.
Affiliation: Mendoza, Marcelo. Universidad Técnica Federico Santa María; Chile
Affiliation: Marin, Mauricio. Universidad de Santiago de Chile; Chile
Affiliation: Gil Costa, Graciela Verónica. Universidad Nacional de San Luis; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Ferrarotti, Flavio. Software Competence Center Hagenberg; Austria
Subject: Distributed Information Retrieval; Incremental Learning; Query Routing
Access level: open access
Terms of use: https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
Repository
Institution: Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier: oai:ri.conicet.gov.ar:11336/60466

id	CONICETDig_cf0e49ee97fb9310f16709a1c4948672
oai_identifier_str	oai:ri.conicet.gov.ar:11336/60466
network_acronym_str	CONICETDig
repository_id_str	3498
network_name_str	CONICET Digital (CONICET)
dc.title.none.fl_str_mv	Reducing hardware hit by queries in web search engines
dc.creator.none.fl_str_mv	Mendoza, Marcelo Marin, Mauricio Gil Costa, Graciela Verónica Ferrarotti, Flavio
dc.subject.none.fl_str_mv	Distributed Information Retrieval Incremental Learning Query Routing
purl_subject.fl_str_mv	https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1
publishDate	2016
dc.date.none.fl_str_mv	2016-11
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://hdl.handle.net/11336/60466 Mendoza, Marcelo; Marin, Mauricio; Gil Costa, Graciela Verónica; Ferrarotti, Flavio; Reducing hardware hit by queries in web search engines; Pergamon-Elsevier Science Ltd; Information Processing & Management; 52; 6; 11-2016; 1031-1052 0306-4573 CONICET Digital CONICET
url	http://hdl.handle.net/11336/60466
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/doi/10.1016/j.ipm.2016.04.008 info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0306457316300899
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
eu_rights_str_mv	openAccess
rights_invalid_str_mv	https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
dc.format.none.fl_str_mv	application/pdf application/pdf
dc.publisher.none.fl_str_mv	Pergamon-Elsevier Science Ltd
publisher.none.fl_str_mv	Pergamon-Elsevier Science Ltd
dc.source.none.fl_str_mv	reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str	CONICET Digital (CONICET)
collection	CONICET Digital (CONICET)
instname_str	Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv	CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv	dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar