Towards Smart Data Technologies for Big Data Analytics

Authors
Basgall, María José; Naiouf, Marcelo; Herrera, Francisco; Fernández, Alberto
Publication year
2020
Language
English
Resource type
conference paper
Status
published version
Description
Currently, the publicly available datasets for Big Data Analytics vary in quality, and obtaining the expected behavior from Machine Learning algorithms is crucial. Furthermore, since working with huge amounts of data is usually a time-demanding task, high-quality data is required. Smart Data refers to the process of transforming Big Data into clean and reliable data; this can be accomplished by converting the data, reducing unnecessary data volume, or applying preprocessing techniques that improve its quality while still yielding trustworthy results. We present the properties that affect data quality. We also review the available proposals for analyzing the quality of huge amounts of data and for coping with low-quality datasets in a scalable way. Finally, we highlight the need for a methodology towards Smart Data.
Instituto de Investigación en Informática
Subject
Computer Science
Big Data
Smart Data
Data Complexity
Data Quality
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/4.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/104775

id SEDICI_9d1a70e1ede290e768214299644e3ef5
oai_identifier_str oai:sedici.unlp.edu.ar:10915/104775
network_acronym_str SEDICI
repository_id_str 1329
network_name_str SEDICI (UNLP)
dc.title.none.fl_str_mv Towards Smart Data Technologies for Big Data Analytics
dc.creator.none.fl_str_mv Basgall, María José
Naiouf, Marcelo
Herrera, Francisco
Fernández, Alberto
dc.subject.none.fl_str_mv Computer Science
Big Data
Smart Data
Data Complexity
Data Quality
publishDate 2020
dc.date.none.fl_str_mv 2020-09
dc.type.none.fl_str_mv info:eu-repo/semantics/conferenceObject
info:eu-repo/semantics/publishedVersion
Conference object
http://purl.org/coar/resource_type/c_5794
info:ar-repo/semantics/documentoDeConferencia
format conferenceObject
status_str publishedVersion
dc.identifier.none.fl_str_mv http://sedici.unlp.edu.ar/handle/10915/104775
url http://sedici.unlp.edu.ar/handle/10915/104775
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/isbn/978-950-34-1927-4
info:eu-repo/semantics/reference/hdl/10915/103585
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by-nc-sa/4.0/
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.format.none.fl_str_mv application/pdf
44-47
dc.source.none.fl_str_mv reponame:SEDICI (UNLP)
instname:Universidad Nacional de La Plata
instacron:UNLP
repository.name.fl_str_mv SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv alira@sedici.unlp.edu.ar