Towards Smart Data Technologies for Big Data Analytics
- Authors
- Basgall, María José; Naiouf, Marcelo; Herrera, Francisco; Fernández, Alberto
- Publication year
- 2020
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- Currently, the publicly available datasets for Big Data Analytics vary in quality, and obtaining the expected behavior from Machine Learning algorithms is crucial. Furthermore, since working with a huge amount of data is usually a time-consuming task, having high-quality data is required. Smart Data refers to the process of transforming Big Data into clean and reliable data. This can be accomplished by converting the data, reducing unnecessary data volume, or applying preprocessing techniques that improve data quality while still yielding trustworthy results. We present the properties that affect data quality. We also discuss the available proposals for analyzing the quality of huge amounts of data and for coping with low-quality datasets in a scalable way. Finally, we highlight the need for a methodology towards Smart Data.
Instituto de Investigación en Informática
- Subject
- Computer Science
Big Data
Smart Data
Data Complexity
Data Quality
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/4.0/
- Repository
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/104775
id |
SEDICI_9d1a70e1ede290e768214299644e3ef5 |
---|---|
oai_identifier_str |
oai:sedici.unlp.edu.ar:10915/104775 |
network_acronym_str |
SEDICI |
repository_id_str |
1329 |
network_name_str |
SEDICI (UNLP) |
dc.title.none.fl_str_mv |
Towards Smart Data Technologies for Big Data Analytics |
dc.creator.none.fl_str_mv |
Basgall, María José; Naiouf, Marcelo; Herrera, Francisco; Fernández, Alberto |
dc.subject.none.fl_str_mv |
Computer Science; Big Data; Smart Data; Data Complexity; Data Quality |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-09 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://sedici.unlp.edu.ar/handle/10915/104775 |
url |
http://sedici.unlp.edu.ar/handle/10915/104775 |
dc.language.none.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/isbn/978-950-34-1927-4; info:eu-repo/semantics/reference/hdl/10915/103585 |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
eu_rights_str_mv |
openAccess |
rights_invalid_str_mv |
http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
dc.format.none.fl_str_mv |
application/pdf; 44-47 |
dc.source.none.fl_str_mv |
reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
reponame_str |
SEDICI (UNLP) |
collection |
SEDICI (UNLP) |
instname_str |
Universidad Nacional de La Plata |
instacron_str |
UNLP |
institution |
UNLP |
repository.name.fl_str_mv |
SEDICI (UNLP) - Universidad Nacional de La Plata |
repository.mail.fl_str_mv |
alira@sedici.unlp.edu.ar |