Data stream treatment using sliding windows with MapReduce

Authors
Basgall, María José; Hasperué, Waldo; Naiouf, Marcelo
Year of publication
2016
Language
English
Resource type
article
Status
published version
Description
Knowledge Discovery in Databases (KDD) techniques have limitations when the volume of data to process is very large. Any KDD algorithm needs to iterate several times over the complete data set to carry out its work. For continuous data stream processing, part of the stream must be stored in a temporal window. In this paper, we present a technique that adjusts the size of the temporal window dynamically, based on the frequency of data arrival and the response time of the KDD task. The results show that this technique reaches a large window size, in which each example of the stream is used in more than one iteration of the KDD task.
Facultad de Informática
Subject
Computer Science
big data
mapreduce
stream processing
Access level
open access
Terms of use
http://creativecommons.org/licenses/by/3.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/57265

id SEDICI_ac02bd5f902bbf83bb3454838a5d794a
oai_identifier_str oai:sedici.unlp.edu.ar:10915/57265
network_acronym_str SEDICI
repository_id_str 1329
network_name_str SEDICI (UNLP)
publishDate 2016
dc.date.none.fl_str_mv 2016-11
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
Articulo
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://sedici.unlp.edu.ar/handle/10915/57265
url http://sedici.unlp.edu.ar/handle/10915/57265
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/http://journal.info.unlp.edu.ar/wp-content/uploads/2016/12/JCST-43-Paper-2.pdf
info:eu-repo/semantics/altIdentifier/issn/1666-6038
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by/3.0/
Creative Commons Attribution 3.0 Unported (CC BY 3.0)
dc.format.none.fl_str_mv application/pdf
76-83
dc.source.none.fl_str_mv reponame:SEDICI (UNLP)
instname:Universidad Nacional de La Plata
instacron:UNLP
repository.mail.fl_str_mv alira@sedici.unlp.edu.ar