Genetic algorithms for topical web search: A study of different mutation rates

Authors
Cecchini, Rocío L.; Lorenzetti, Carlos M.; Maguitman, Ana Gabriela; Brignole, Nélida B.
Year of publication
2007
Language
English
Resource type
conference paper
Status
published version
Description
Topical content can be harvested by formulating topic-relevant queries and submitting them to a search engine. The quality of the material collected through this process is highly dependent on the vocabulary used to generate the search queries. In this scenario, selecting good query terms can be seen as an optimization problem whose objective function is based on how effectively a query retrieves relevant material. Three characteristics of this optimization problem are (1) the high dimensionality of the search space, where candidate solutions are queries and each term corresponds to a different dimension, (2) the existence of acceptable suboptimal solutions, and (3) the possibility of finding multiple solutions. This paper describes optimization techniques based on Genetic Algorithms to evolve “good query terms” in the context of a given topic. We discuss the use of a mutation pool to allow the generation of queries with novel terms, and study the effect of different mutation rates on the exploration of the query space.
Red de Universidades con Carreras en Informática (RedUNCI)
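The mutation-pool idea in the description can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy `fitness` function (fraction of query terms found in a hand-picked relevant set, standing in for retrieval effectiveness against a search engine), the truncation selection, the single-point crossover, and all names and parameter values. It only shows how a mutation rate controls whether terms from outside the initial vocabulary can enter a query.

```python
import random


def fitness(query, relevant_terms):
    """Toy stand-in for retrieval effectiveness (assumption, not the paper's measure):
    the fraction of query terms that belong to a known relevant set."""
    return sum(term in relevant_terms for term in query) / len(query)


def mutate(query, mutation_pool, rate, rng):
    """Replace each term, with probability `rate`, by a random term from the pool."""
    return [rng.choice(mutation_pool) if rng.random() < rate else term
            for term in query]


def crossover(a, b, rng):
    """Single-point crossover of two equal-length queries (length >= 2)."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def evolve(initial_terms, mutation_pool, relevant_terms, rate,
           generations=30, pop_size=20, query_len=4, seed=0):
    """Evolve a population of queries and return the fittest one found."""
    rng = random.Random(seed)
    population = [[rng.choice(initial_terms) for _ in range(query_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the better half unchanged (elitism).
        ranked = sorted(population,
                        key=lambda q: fitness(q, relevant_terms),
                        reverse=True)
        parents = ranked[: pop_size // 2]
        # Refill the population with mutated offspring of random parent pairs.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   mutation_pool, rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=lambda q: fitness(q, relevant_terms))


if __name__ == "__main__":
    initial = ["genetic", "search", "web", "page"]
    pool = ["mutation", "query", "crossover", "banana", "weather"]
    relevant = {"genetic", "mutation", "query", "crossover", "search"}
    for rate in (0.0, 0.1, 0.5):
        best = evolve(initial, pool, relevant, rate)
        print(rate, best, fitness(best, relevant))
```

With a rate of 0 the population can only recombine the initial vocabulary, so no novel term ever appears; higher rates draw more terms from the pool, widening exploration at the cost of disrupting good queries more often.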
Subject
Computer Science
Informatics
Information Search and Retrieval
Search process
Query processing
topical web search
genetic algorithms
query formulation
query optimization
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/23579

Publication date
2007-10
Pages
1585-1596
Format
application/pdf
URL
http://sedici.unlp.edu.ar/handle/10915/23579
License
Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Argentina (CC BY-NC-SA 2.5)
Repository contact
alira@sedici.unlp.edu.ar