Platform for collection from heterogeneous web sources and its application to a semantic repository organization at SeDiCI: Preliminaries

Authors
Vosou, Agustín; De Giusti, Marisa Raquel; Sobrado, Ariel; Villarreal, Gonzalo Luján
Publication year
2009
Language
English
Resource type
article
Status
submitted version
Description
Presentation of a web collection platform designed to relate and unify information available from web sources that use different standards, with a view to creating a user-browsable thematic repository. The platform will be used at the Intellectual Creation Diffusion Service (SeDiCI), combined with ontologies and thesauri, to provide improved data sorting. Data is currently spread across web resources, and traditional search engines return ranked lists with no semantic relation among documents. Users have to spend a great deal of time relating documents and trying to figure out which ones fully address the domain of interest. It is only after locating similarities and differences that information fragments are applied to the user's work, enabling knowledge creation. The proposed platform separates the modules that depend on the thematic domain so that it can be used in various knowledge areas. Development includes two agents that search URLs stored in a database: one identifies bookmarked pages, interprets labels, and provides rules for extracting information and storing it in an RDF data file; the other retrieves URLs related to a given one. After this stage, the information is homogenized and the transformed information is sorted according to domain ontologies. The platform allows for more efficient automatic extraction and information search among heterogeneous sources that represent the same concepts using different standards.
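As an illustration only, the agent that retrieves related URLs and the later homogenization step might be sketched as follows in Python. The names `LinkCollector`, `related_urls`, `homogenize` and `FIELD_MAP`, as well as the example field names, are assumptions made for this sketch; the record does not describe the platform's actual implementation.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the <a href="..."> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def related_urls(base_url, html):
    """Return the URLs linked from `html`, resolved against `base_url`."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical mapping from source-specific field names to shared ones.
FIELD_MAP = {
    "dc:title": "title",
    "DC.title": "title",
    "og:title": "title",
    "dc:creator": "author",
    "DC.creator": "author",
    "author": "author",
}


def homogenize(record):
    """Rename known fields to the shared vocabulary; drop unknown ones."""
    unified = {}
    for key, value in record.items():
        target = FIELD_MAP.get(key)
        if target is not None:
            unified.setdefault(target, []).append(value)
    return unified
```

In this sketch, sources that name the same concept differently (for example `DC.title` and `og:title`) end up under one shared key, which is the kind of unification the description attributes to the homogenization stage.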
Subject
Computer and Information Sciences
Library Science
Information Systems
Internetworking
semantic repository
ontology and thesaurus
SEDICI
Access level
open access
Terms of use
http://creativecommons.org/licenses/by/4.0/
Repository
CIC Digital (CICBA)
Institution
Comisión de Investigaciones Científicas de la Provincia de Buenos Aires
OAI identifier
oai:digital.cic.gba.gob.ar:11746/3818

dc.date.none.fl_str_mv 2009-10-01
dc.identifier.none.fl_str_mv https://digital.cic.gba.gob.ar/handle/11746/3818
dc.format.none.fl_str_mv application/pdf