Composite Retrieval of Diverse and Complementary Bundles
- Authors
- Amer Yahia, Sihem; Bonchi, Francesco; Castillo, Carlos; Feuerstein, Esteban Zindel; Méndez-Díaz, Isabel; Zabala, Paula Lorena
- Year of publication
- 2014
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Users are often faced with the problem of finding complementary items that together achieve a single common goal (e.g., a starter kit for a novice astronomer, a collection of questions and answers related to low-carb nutrition, a set of places to visit on holidays). In this paper, we argue that for some application scenarios returning item bundles is more appropriate than returning ranked lists. We therefore define composite retrieval as the problem of finding k bundles of complementary items. Beyond complementarity of items, the bundles must be valid w.r.t. a given budget, and the answer set of k bundles must exhibit diversity. We formally define the problem and show that in its general form it is NP-hard, and that the special cases in which each bundle is formed by only one item, or only one bundle is sought, are also hard. Our characterization, however, suggests a two-phase approach (Produce-and-Choose, or PAC) in which we first produce many valid bundles and then choose k among them. For the first phase we devise two ad-hoc clustering algorithms, while for the second phase we adapt heuristics with approximation guarantees for a related problem. We also devise another approach, which first finds a k-clustering and then selects a valid bundle from each of the produced clusters (Cluster-and-Pick, or CAP). We experimentally compare the proposed methods on two real-world data sets: the first is a sample of tourist attractions in 10 large European cities, while the second is a large database of user-generated restaurant reviews from Yahoo! Local. Our experiments show that when diversity is highly important, CAP is the best option, while when diversity is less important, a PAC approach that constructs bundles around randomly chosen pivots is better.
Affiliation: Amer Yahia, Sihem. Centre National de la Recherche Scientifique; France
Affiliation: Bonchi, Francesco. Yahoo Labs; Spain
Affiliation: Castillo, Carlos. Qatar Computing Research Institute; Qatar
Affiliation: Feuerstein, Esteban Zindel. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina
Affiliation: Méndez-Díaz, Isabel. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina
Affiliation: Zabala, Paula Lorena. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Computación; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
- Subject
- Composite Retrieval; Complementarity; Diversity; Maximum Edge Subgraph
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/33093
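The two-phase Produce-and-Choose (PAC) idea summarized in the description can be illustrated with a minimal sketch. This is not the authors' algorithm: the record gives no formulas, so the pairwise `affinity` score, the Jaccard-based diversity measure, the trade-off weight `gamma`, and the toy data below are all assumptions made for illustration only.

```python
import random
from itertools import combinations


def produce_bundles(items, cost, affinity, budget, n_pivots, rng):
    """Phase 1 (produce): grow one budget-valid bundle around each random pivot."""
    affordable = [i for i in items if cost[i] <= budget]
    pivots = rng.sample(affordable, min(n_pivots, len(affordable)))
    bundles = []
    for pivot in pivots:
        bundle, spent = [pivot], cost[pivot]
        # Visit the other items from highest to lowest affinity with the pivot.
        others = sorted((i for i in items if i != pivot),
                        key=lambda i: affinity(pivot, i), reverse=True)
        for item in others:
            if spent + cost[item] <= budget:
                bundle.append(item)
                spent += cost[item]
        bundles.append(frozenset(bundle))
    return list(dict.fromkeys(bundles))  # drop duplicates, keep first-seen order


def quality(bundle, affinity):
    """Assumed bundle score: sum of pairwise affinities inside the bundle."""
    return sum(affinity(a, b) for a, b in combinations(sorted(bundle), 2))


def jaccard_distance(a, b):
    """Assumed diversity measure between two bundles (1 means disjoint)."""
    return 1.0 - len(a & b) / len(a | b)


def choose_bundles(bundles, affinity, k, gamma):
    """Phase 2 (choose): greedily pick up to k bundles, weighing quality vs. diversity."""
    chosen, remaining = [], list(bundles)
    while remaining and len(chosen) < k:
        def gain(b):
            diversity = sum(jaccard_distance(b, c) for c in chosen)
            return (1 - gamma) * quality(b, affinity) + gamma * diversity
        best = max(remaining, key=gain)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical toy data: attractions with a visiting cost and a position on a line.
items = ["museum", "park", "cathedral", "market", "tower", "gallery"]
cost = {"museum": 4, "park": 1, "cathedral": 3, "market": 2, "tower": 5, "gallery": 4}
position = {"museum": 0, "park": 1, "cathedral": 2, "market": 5, "tower": 6, "gallery": 7}
BUDGET = 8


def affinity(a, b):
    # Assumed score: nearby attractions go well together.
    return 1.0 / (1.0 + abs(position[a] - position[b]))


candidates = produce_bundles(items, cost, affinity, BUDGET, n_pivots=4,
                             rng=random.Random(0))
picked = choose_bundles(candidates, affinity, k=2, gamma=0.5)
for bundle in picked:
    print(sorted(bundle), "cost:", sum(cost[i] for i in bundle))
```

Raising `gamma` toward 1 makes the second phase favor bundles that overlap less with those already chosen; lowering it favors internally cohesive bundles. The greedy selection here is a stand-in for the approximation heuristics the description mentions, which this record does not spell out.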
View the full record metadata
id |
CONICETDig_24bbdc5879e551f72c531a7c98f07e84 |
---|---|
oai_identifier_str |
oai:ri.conicet.gov.ar:11336/33093 |
network_acronym_str |
CONICETDig |
repository_id_str |
3498 |
network_name_str |
CONICET Digital (CONICET) |
dc.title.none.fl_str_mv |
Composite Retrieval of Diverse and Complementary Bundles |
dc.creator.none.fl_str_mv |
Amer Yahia, Sihem; Bonchi, Francesco; Castillo, Carlos; Feuerstein, Esteban Zindel; Méndez-Díaz, Isabel; Zabala, Paula Lorena |
author |
Amer Yahia, Sihem |
author_facet |
Amer Yahia, Sihem; Bonchi, Francesco; Castillo, Carlos; Feuerstein, Esteban Zindel; Méndez-Díaz, Isabel; Zabala, Paula Lorena |
author_role |
author |
author2 |
Bonchi, Francesco; Castillo, Carlos; Feuerstein, Esteban Zindel; Méndez-Díaz, Isabel; Zabala, Paula Lorena |
author2_role |
author; author; author; author; author |
dc.subject.none.fl_str_mv |
Composite Retrieval; Complementarity; Diversity; Maximum Edge Subgraph |
topic |
Composite Retrieval; Complementarity; Diversity; Maximum Edge Subgraph |
purl_subject.fl_str_mv |
https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1 |
publishDate |
2014 |
dc.date.none.fl_str_mv |
2014-11 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format |
article |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/33093 Bonchi, Francesco; Castillo, Carlos; Zabala, Paula Lorena; Amer Yahia, Sihem; Feuerstein, Esteban Zindel; Méndez-Díaz, Isabel; et al.; Composite Retrieval of Diverse and Complementary Bundles; IEEE Computer Society; IEEE Transactions on Knowledge and Data Engineering; 26; 11; 11-2014; 2662-2675 1041-4347 CONICET Digital CONICET |
url |
http://hdl.handle.net/11336/33093 |
identifier_str_mv |
Bonchi, Francesco; Castillo, Carlos; Zabala, Paula Lorena; Amer Yahia, Sihem; Feuerstein, Esteban Zindel; Méndez-Díaz, Isabel; et al.; Composite Retrieval of Diverse and Complementary Bundles; IEEE Computer Society; IEEE Transactions on Knowledge and Data Engineering; 26; 11; 11-2014; 2662-2675 1041-4347 CONICET Digital CONICET |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/doi/10.1109/TKDE.2014.2306678 info:eu-repo/semantics/altIdentifier/url/http://ieeexplore.ieee.org/document/6742606/ |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv |
application/pdf application/pdf |
dc.publisher.none.fl_str_mv |
IEEE Computer Society |
publisher.none.fl_str_mv |
IEEE Computer Society |
dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str |
CONICET Digital (CONICET) |
collection |
CONICET Digital (CONICET) |
instname_str |
Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv |
CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |
_version_ |
1846082841202393088 |
score |
13.22299 |