Platform for collection from heterogeneous web sources and its application to a semantic repository organization at SeDiCI: Preliminaries

Authors
De Giusti, Marisa Raquel; Sobrado, Ariel; Vosou, Agustín; Villarreal, Gonzalo Luján
Publication year
2009
Language
English
Resource type
article
Status
published version
Description
This paper presents a web collection platform designed to relate and unify information available from web sources that follow different standards, with a view to creating a user-browseable thematic repository. The platform will be used at the Intellectual Creation Diffusion Service (SeDiCI), combined with ontologies and thesauri, to provide improved data sorting. Data is currently spread across web resources, and traditional search engines return ranked lists with no semantic relation among documents. Users have to spend a great deal of time relating documents and trying to figure out which ones fully address the domain at issue. It is only after locating similarities and differences that information fragments are applied to the user's work, enabling knowledge creation. The proposed platform separates out the functional modules of the different thematic domains to allow their use in various knowledge areas. Development includes two agents that search URLs stored in a database: one is capable of identifying bookmarked pages, interpreting labels, and providing rules for extracting information and storing it in an RDF data file; the other is in charge of getting related URLs from a given one. After this stage, the information is homogenized and the transformed information is sorted according to domain ontologies. The platform allows for more efficient automatic extraction and information search among heterogeneous sources that represent the same concepts using different standards.
Dirección PREBI-SEDICI
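
The two-agent design outlined in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that the "labels" are HTML meta tags, that the extraction rules are a simple mapping from tag name to RDF predicate, and that the RDF file is written as N-Triples lines. The names RULES, PageScanner, extraction_agent and link_agent are hypothetical, and the sketch works on an HTML string rather than fetching pages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical extraction rules: meta-tag name -> RDF predicate URI.
RULES = {
    "dc.title": "http://purl.org/dc/elements/1.1/title",
    "dc.creator": "http://purl.org/dc/elements/1.1/creator",
}


class PageScanner(HTMLParser):
    """Collects labelled metadata and outgoing links from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.metadata = []  # (lower-cased meta name, content) pairs
        self.links = []     # absolute URLs found in <a href="...">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.metadata.append((attrs["name"].lower(), attrs["content"]))
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, attrs["href"]))


def scan(url, html):
    """Parse one page and return the populated scanner."""
    scanner = PageScanner(url)
    scanner.feed(html)
    return scanner


def extraction_agent(url, html):
    """Agent 1 (assumed behaviour): apply RULES, return N-Triples lines."""
    triples = []
    for name, value in scan(url, html).metadata:
        predicate = RULES.get(name)
        if predicate:
            # Escape backslashes and quotes for an N-Triples literal.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'<{url}> <{predicate}> "{literal}" .')
    return triples


def link_agent(url, html):
    """Agent 2 (assumed behaviour): distinct linked URLs, in page order."""
    return list(dict.fromkeys(scan(url, html).links))
```

In this sketch, the lines returned by extraction_agent could be appended to a file to form the RDF store, and the URLs returned by link_agent could be added to the database of URLs still to be visited. The homogenization and ontology-based sorting stages are outside its scope.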
Subject
Computer Science (Ciencias Informáticas)
Library Science (Bibliotecología)
Information Systems
Internetworking
SeDiCI; semantic repository; ontology and thesaurus
Access level
open access
Terms of use
http://creativecommons.org/licenses/by/3.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/5525

id SEDICI_1d78539591a1d71716f803c0bd4b236c
oai_identifier_str oai:sedici.unlp.edu.ar:10915/5525
network_acronym_str SEDICI
repository_id_str 1329
network_name_str SEDICI (UNLP)
dc.title.none.fl_str_mv Platform for collection from heterogeneous web sources and its application to a semantic repository organization at SeDiCI: Preliminaries
title Platform for collection from heterogeneous web sources and its application to a semantic repository organization at SeDiCI: Preliminaries
dc.creator.none.fl_str_mv De Giusti, Marisa Raquel
Sobrado, Ariel
Vosou, Agustín
Villarreal, Gonzalo Luján
author De Giusti, Marisa Raquel
author_facet De Giusti, Marisa Raquel
Sobrado, Ariel
Vosou, Agustín
Villarreal, Gonzalo Luján
author_role author
author2 Sobrado, Ariel
Vosou, Agustín
Villarreal, Gonzalo Luján
author2_role author
author
author
dc.subject.none.fl_str_mv Ciencias Informáticas
Bibliotecología
Information Systems
Internetworking
SeDiCI; semantic repository; ontology and thesaurus
topic Ciencias Informáticas
Bibliotecología
Information Systems
Internetworking
SeDiCI; semantic repository; ontology and thesaurus
dc.description.none.fl_txt_mv This paper presents a web collection platform designed to relate and unify information available from web sources that follow different standards, with a view to creating a user-browseable thematic repository. The platform will be used at the Intellectual Creation Diffusion Service (SeDiCI), combined with ontologies and thesauri, to provide improved data sorting. Data is currently spread across web resources, and traditional search engines return ranked lists with no semantic relation among documents. Users have to spend a great deal of time relating documents and trying to figure out which ones fully address the domain at issue. It is only after locating similarities and differences that information fragments are applied to the user's work, enabling knowledge creation. The proposed platform separates out the functional modules of the different thematic domains to allow their use in various knowledge areas. Development includes two agents that search URLs stored in a database: one is capable of identifying bookmarked pages, interpreting labels, and providing rules for extracting information and storing it in an RDF data file; the other is in charge of getting related URLs from a given one. After this stage, the information is homogenized and the transformed information is sorted according to domain ontologies. The platform allows for more efficient automatic extraction and information search among heterogeneous sources that represent the same concepts using different standards.
Dirección PREBI-SEDICI
publishDate 2009
dc.date.none.fl_str_mv 2009-10
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
Articulo
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://sedici.unlp.edu.ar/handle/10915/5525
url http://sedici.unlp.edu.ar/handle/10915/5525
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/http://journal.info.unlp.edu.ar/journal/journal26/papers/JCST-Oct09-7.pdf
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by/3.0/
Creative Commons Attribution 3.0 Unported (CC BY 3.0)
eu_rights_str_mv openAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by/3.0/
Creative Commons Attribution 3.0 Unported (CC BY 3.0)
dc.format.none.fl_str_mv application/pdf
89-92
dc.source.none.fl_str_mv reponame:SEDICI (UNLP)
instname:Universidad Nacional de La Plata
instacron:UNLP
reponame_str SEDICI (UNLP)
collection SEDICI (UNLP)
instname_str Universidad Nacional de La Plata
instacron_str UNLP
institution UNLP
repository.name.fl_str_mv SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv alira@sedici.unlp.edu.ar