A parallel view for search engines

Authors
Gil Costa, Graciela Verónica; Persico, Andrea; Printista, Alicia Marcela
Year of publication
2005
Language
English
Resource type
conference paper
Status
published version
Description
Engineering a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the Web, very little academic research has been done on them. Furthermore, due to rapid advances in technology and the proliferation of the web, creating a web search engine today is very different from what it was years ago. In most papers the index simply "is", without discussion of how it was created. But for an indexing scheme to be useful, it must be possible to construct the index in a reasonable amount of time, so papers describing complex indexing methods should also describe and analyze a mechanism by which the index can be built. Scalability is a concern during index construction as well as during query processing. This paper describes the cooperative work among the Crawler, the Indexer, and the Searcher.
VI Workshop de Procesamiento Distribuido y Paralelo (WPDP)
Red de Universidades con Carreras en Informática (RedUNCI)
Subject
Computer Science
Indexing methods
Search engine
Search process
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/23175

Publication date
2005-10
Format
application/pdf
URL
http://sedici.unlp.edu.ar/handle/10915/23175
License
Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5)
Repository contact
alira@sedici.unlp.edu.ar