An end-user pipeline for scraping and visualizing semi-structured data over the Web

Authors
Bosetti, Gabriela Alejandra; Firmenich, Sergio; Winckler, Marco; Rossi, Gustavo Héctor; Cornejo Fandos, Ulises; Egyed Zsigmond, Elöd
Publication year
2019
Language
English
Resource type
conference paper
Status
published version
Description
The Web is a vast source of semi-structured data sets that are made readily available to support the construction of new knowledge. Information visualization techniques have proven to be a suitable means of allowing users to analyze and understand large amounts of data. However, the steps required for visualizing semi-structured data obtained from the Web are not straightforward, and the data require proper treatment before information visualization techniques can be applied. In this work, we present a visualization pipeline that describes the fundamental operations required for visualizing semi-structured data over the Web. To that end, we employ Web scraping and Web augmentation techniques to support interactive visualizations and task solving without changing the context of use of the data. Our approach is supported by a framework that includes scraping, augmentation, and visualization tools, and it has been applied to different kinds of websites to demonstrate its validity and feasibility. Our ultimate goal is to expand the limits of our technology to improve user interaction with websites and to create new experiences for better understanding large data sets.
Subject
Computer and Information Sciences
Infovis
Web augmentation
Web scraping
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-sa/4.0/
Repository
CIC Digital (CICBA)
Institution
Comisión de Investigaciones Científicas de la Provincia de Buenos Aires
OAI identifier
oai:digital.cic.gba.gob.ar:11746/10709

id CICBA_7a140de471c136729b49bb398e7aaf86
oai_identifier_str oai:digital.cic.gba.gob.ar:11746/10709
network_acronym_str CICBA
repository_id_str 9441
network_name_str CIC Digital (CICBA)
dc.title.none.fl_str_mv An end-user pipeline for scraping and visualizing semi-structured data over the Web
dc.creator.none.fl_str_mv Bosetti, Gabriela Alejandra
Firmenich, Sergio
Winckler, Marco
Rossi, Gustavo Héctor
Cornejo Fandos, Ulises
Egyed Zsigmond, Elöd
dc.subject.none.fl_str_mv Computer and Information Sciences
Infovis
Web augmentation
Web scraping
dc.date.none.fl_str_mv 2019
dc.type.none.fl_str_mv info:eu-repo/semantics/conferenceObject
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_5794
info:ar-repo/semantics/documentoDeConferencia
dc.identifier.none.fl_str_mv https://digital.cic.gba.gob.ar/handle/11746/10709
dc.language.none.fl_str_mv eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/doi/10.1007/978-3-030-19274-7_17
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:CIC Digital (CICBA)
instname:Comisión de Investigaciones Científicas de la Provincia de Buenos Aires
instacron:CICBA
repository.name.fl_str_mv CIC Digital (CICBA) - Comisión de Investigaciones Científicas de la Provincia de Buenos Aires
repository.mail.fl_str_mv marisa.degiusti@sedici.unlp.edu.ar