Data stream treatment using sliding windows with MapReduce

Authors
Basgall, María José; Hasperué, Waldo; Naiouf, Ricardo Marcelo
Publication year
2016
Language
Spanish
Resource type
article
Status
published version
Description
Knowledge Discovery in Databases (KDD) techniques have limitations when the volume of data to process is very large. Any KDD algorithm needs several iterations over the complete data set in order to carry out its work. For continuous data stream processing, part of the stream must be stored in a temporal window. In this paper, we present a technique that adjusts the size of the temporal window dynamically, based on the frequency of data arrival and the response time of the KDD task. The results obtained show that this technique reaches a large window size, in which each example of the stream is used in more than one iteration of the KDD task.
Affiliation: Basgall, María José. Universidad Nacional de La Plata. Facultad de Informática. Instituto de Investigación en Informática Lidi; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata; Argentina
Affiliation: Hasperué, Waldo. Universidad Nacional de La Plata. Facultad de Informática. Instituto de Investigación en Informática Lidi; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata; Argentina
Affiliation: Naiouf, Ricardo Marcelo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata; Argentina. Universidad Nacional de La Plata. Facultad de Informática. Instituto de Investigación en Informática Lidi; Argentina
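The abstract states that the window size depends on the data arrival frequency and the response time of the KDD task, but this record does not give the rule itself. The following is only a minimal sketch under one assumed rule, not the authors' method: if `arrival_rate * response_time` examples arrive while one run executes, a window of `min_iterations` times that many examples keeps each example for about `min_iterations` runs. The names `window_size`, `SlidingWindow`, and `simulate` are hypothetical, and the MapReduce job is replaced by a plain loop.

```python
import math
from collections import deque


def window_size(arrival_rate, response_time, min_iterations=2):
    """Return a window size (in examples) for the next KDD run.

    Assumed rule (not taken from the paper): keep `min_iterations` times
    the number of examples that arrive during one run.
    """
    arrivals_per_run = arrival_rate * response_time
    return max(1, math.ceil(min_iterations * arrivals_per_run))


class SlidingWindow:
    """Keeps only the most recent `size` examples of the stream."""

    def __init__(self, size):
        self._items = deque(maxlen=size)

    def resize(self, size):
        # Rebuilding the deque keeps the newest `size` examples.
        self._items = deque(self._items, maxlen=size)

    def extend(self, examples):
        self._items.extend(examples)

    def snapshot(self):
        return list(self._items)


def simulate(arrival_rate, response_time, runs):
    """Count how many simulated runs see each example (integers 0, 1, 2, ...)."""
    window = SlidingWindow(window_size(arrival_rate, response_time))
    per_run = round(arrival_rate * response_time)
    seen = {}
    next_id = 0
    for _ in range(runs):
        # Examples that arrived while the previous run was executing.
        window.extend(range(next_id, next_id + per_run))
        next_id += per_run
        # Stand-in for one KDD run over the current window contents.
        for example in window.snapshot():
            seen[example] = seen.get(example, 0) + 1
    return seen
```

With 100 examples per second and a 0.5 second response time, this rule gives a window of 100 examples, and in the simulation every example except those in the most recent batch is used in two runs.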
Subject
BIG DATA
MAPREDUCE
STREAM PROCESSING
Access level
open access
Terms of use
https://creativecommons.org/licenses/by/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/115823

Subject classification
https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
Publication date
2016-11
Handle
http://hdl.handle.net/11336/115823
Citation
Basgall, María José; Hasperué, Waldo; Naiouf, Ricardo Marcelo; Data stream treatment using sliding windows with MapReduce; Universidad Nacional de La Plata. Facultad de Informática; Journal of Computer Science and Technology; 16; 2; 11-2016; 76-83
ISSN
1666-6046
1666-6038
Alternative identifier
http://sedici.unlp.edu.ar/handle/10915/57265
Publisher
Universidad Nacional de La Plata. Facultad de Informática