Modelling Efficient Novelty-based Search Result Diversification in Metric Spaces
- Authors
- Gil Costa, Graciela Verónica; Santos, Rodrygo L. T.; Macdonald, Craig; Ounis, Iadh
- Publication year
- 2013
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Novelty-based diversification provides a way to tackle ambiguous queries by re-ranking a set of retrieved documents. Current approaches are typically greedy, requiring O(n²) document–document comparisons in order to diversify a ranking of n documents. In this article, we introduce a new approach for novelty-based search result diversification to reduce the overhead incurred by document–document comparisons. To this end, we model novelty promotion as a similarity search in a metric space, exploiting the properties of this space to efficiently identify novel documents. We investigate three different approaches: pivoting-based, clustering-based, and permutation-based. In the first two, a novel document is one that lies outside the range of a pivot or outside a cluster. In the third, a novel document is one that has a different signature (i.e., the document's relative distance to a distinguished set of fixed objects called permutants) compared to previously selected documents. Thorough experiments using two TREC test collections for diversity evaluation, as well as a large sample of the query stream of a commercial search engine, show that our approaches perform at least as effectively as well-known novelty-based diversification approaches in the literature, while dramatically improving their efficiency.
Affiliation: Gil Costa, Graciela Verónica. Yahoo; Mexico. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico San Luis; Argentina
Affiliation: Santos, Rodrygo L. T. University of Glasgow; United Kingdom
Affiliation: Macdonald, Craig. University of Glasgow; United Kingdom
Affiliation: Ounis, Iadh. University of Glasgow; United Kingdom
- Subject
- Similarity Search; Diversification
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/7075
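The permutation-based idea summarized in the description can be illustrated with a small sketch. This is not the authors' implementation: it assumes documents are plain numeric vectors, that the metric is Euclidean distance, and that "a different signature" means the ordering of permutants by distance differs exactly from every signature already selected. The function names `permutation_signature` and `diversify_by_signature` are hypothetical.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def permutation_signature(doc, permutants):
    """Return the permutant indices ordered by increasing distance to doc."""
    return tuple(sorted(range(len(permutants)),
                        key=lambda i: dist(doc, permutants[i])))


def diversify_by_signature(ranked_docs, permutants, k):
    """Walk the ranking once and keep up to k documents with unseen signatures.

    Assumption for illustration: a document counts as novel when no
    previously selected document has exactly the same signature.
    """
    seen = set()
    selected = []
    for doc in ranked_docs:
        sig = permutation_signature(doc, permutants)
        if sig not in seen:
            seen.add(sig)
            selected.append(doc)
            if len(selected) == k:
                break
    return selected


# Toy ranking: three loose groups of 2-D "documents", best-ranked first.
ranked = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (0.2, 0.2), (5.1, 4.9), (0.0, 5.5)]
permutants = [(0.0, 0.0), (5.0, 5.0), (0.0, 6.0)]

result = diversify_by_signature(ranked, permutants, k=3)
print(result)
```

With n documents and m permutants, this sketch performs n·m distance computations and n set lookups instead of comparing documents pairwise, which is the kind of saving the description refers to. The exact-match novelty test is the simplest possible choice and is only an assumption here.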
id |
CONICETDig_140cf715b87d675670815af1f391c51d |
---|---|
oai_identifier_str |
oai:ri.conicet.gov.ar:11336/7075 |
network_acronym_str |
CONICETDig |
repository_id_str |
3498 |
network_name_str |
CONICET Digital (CONICET) |
purl_subject.fl_str_mv |
https://purl.org/becyt/ford/2.2 https://purl.org/becyt/ford/2 |
publishDate |
2013 |
dc.date.none.fl_str_mv |
2013-01 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format |
article |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/7075 Gil Costa, Graciela Verónica; Santos, Rodrygo L. T.; Macdonald, Craig; Ounis, Iadh; Modelling Efficient Novelty-based Search Result Diversification in Metric Spaces; Elsevier; Journal of Discrete Algorithms; 18; 1-2013; 75-88 1570-8667 |
url |
http://hdl.handle.net/11336/7075 |
identifier_str_mv |
Gil Costa, Graciela Verónica; Santos, Rodrygo L. T.; Macdonald, Craig; Ounis, Iadh; Modelling Efficient Novelty-based Search Result Diversification in Metric Spaces; Elsevier; Journal of Discrete Algorithms; 18; 1-2013; 75-88 1570-8667 |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/url/http://www.sciencedirect.com/science/article/pii/S1570866712001074 info:eu-repo/semantics/altIdentifier/doi/10.1016/j.jda.2012.07.004 |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-nd/2.5/ar/ |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
https://creativecommons.org/licenses/by-nc-nd/2.5/ar/ |
dc.format.none.fl_str_mv |
application/pdf application/pdf application/pdf |
dc.publisher.none.fl_str_mv |
Elsevier |
dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |