An end-user pipeline for scrapping and visualizing semi-structured data over the Web

Authors
Bosetti, Gabriela Alejandra; Firmenich, Sergio Damián; Winckler, Marco; Rossi, Gustavo Héctor; Cornejo Fandos, Ulises Jeremías; Egyed-Zsigmond, Elöd
Publication year
2019
Language
English
Resource type
conference paper
Status
published version
Description
The Web is a vast source of semi-structured data sets that are readily available to support the construction of new knowledge. Information visualization techniques have proven to be a suitable means of helping users analyze and understand large amounts of data. However, visualizing semi-structured data obtained from the Web is not straightforward: the data requires proper treatment before information visualization techniques can be applied. In this work, we present a visualization pipeline that describes the fundamental operations required for visualizing semi-structured data from the Web. To that end, we employ Web scraping and Web augmentation techniques to support interactive visualizations and task solving without changing the context in which the data is used. Our approach is supported by a framework that includes scraping, augmentation, and visualization tools, and it has been applied to different kinds of websites to demonstrate its validity and feasibility. Our ultimate goal is to expand the limits of our technology to improve user interaction with websites and to create new experiences for better understanding large data sets.
Published in the Lecture Notes in Computer Science book series (LNCS, vol. 11496)
Laboratorio de Investigación y Formación en Informática Avanzada
Subject
Computer Science
Infovis
Web augmentation
Web scrapping
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/4.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/119026

URL
http://sedici.unlp.edu.ar/handle/10915/119026
Related identifiers
ISBN: 978-3-030-19274-7
ISSN: 0302-9743
DOI: 10.1007/978-3-030-19274-7_17
Handle: 11746/10709
Format
application/pdf
Pages
223-237