Leveraging Large Language Models for Ontology-Based Data Access: A Preliminary Analysis
- Authors
- Gómez, Sergio Alejandro; Fillottrani, Pablo Rubén
- Publication year
- 2024
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- In Ontology-Based Data Access (OBDA), we study how to represent legacy data sources using ontologies. This enables a modern, distributed, uniform data representation format that supports intelligent querying and processing. The task requires developing software that interprets the data and expresses it as ontologies, which takes considerable time. Large language models (LLMs), on the other hand, have recently proven to be capable solution providers because they can generate solutions from input that an end user specifies in natural language. In this paper, we explore the potential of LLMs to perform OBDA automatically. Our research hypothesis is that it is possible to use an LLM tool such as ChatGPT to perform OBDA. To test it, we studied ChatGPT's responses to different problems associated with OBDA. We found that ChatGPT can generate ontologies from free text as well as from tables expressed as text or in CSV format. ChatGPT can also generate SPARQL queries, and it successfully expresses relational tables as ontologies, correcting violations of integrity constraints when appropriately directed.
Red de Universidades con Carreras en Informática
- Subject
- Computer Science; Ontology-Based Data Access; Large Language Models; Ontologies; CSV
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/4.0/
- Repository
- SEDICI (UNLP)
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/176820
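
The description above mentions expressing tables in CSV format as ontologies. As a rough illustration of what that kind of mapping involves, and not the authors' method or ChatGPT's output, the following minimal sketch turns each CSV row into an individual and each remaining column into a data property, emitted as N-Triples. The `http://example.org/` namespace, the sample table, and the function name `csv_to_ntriples` are all assumptions made for this example.

```python
import csv
import io

# Hypothetical namespace and sample table, assumed for illustration only.
BASE = "http://example.org/"

SAMPLE_CSV = """id,name,dept
1,Ana,Sales
2,Luis,IT
"""


def csv_to_ntriples(text, class_name, key_column):
    """Map each CSV row to an individual of class_name and each other
    column to a data property, returning a list of N-Triples lines."""
    rdf_type = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        # The key column identifies the individual, much like a primary key.
        subject = f"<{BASE}{class_name}/{row[key_column]}>"
        triples.append(f"{subject} <{rdf_type}> <{BASE}{class_name}> .")
        for column, value in row.items():
            if column == key_column:
                continue
            # Escape backslashes and quotes so the literal stays valid.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} <{BASE}{column}> "{literal}" .')
    return triples


if __name__ == "__main__":
    for line in csv_to_ntriples(SAMPLE_CSV, "Employee", "id"):
        print(line)
```

A hand-written mapping like this has to be adapted for every new source schema, which is the development effort the description refers to when it considers delegating the task to an LLM.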
Full record metadata
| Field | Value |
|---|---|
| id | SEDICI_dfcbebceee531776b5b4d6fb00440d28 |
| oai_identifier_str | oai:sedici.unlp.edu.ar:10915/176820 |
| network_acronym_str | SEDICI |
| repository_id_str | 1329 |
| network_name_str | SEDICI (UNLP) |
| dc.title.none.fl_str_mv | Leveraging Large Language Models for Ontology-Based Data Access: A Preliminary Analysis |
| dc.creator.none.fl_str_mv | Gómez, Sergio Alejandro; Fillottrani, Pablo Rubén |
| author | Gómez, Sergio Alejandro |
| author_role | author |
| author2 | Fillottrani, Pablo Rubén |
| author2_role | author |
| dc.subject.none.fl_str_mv | Computer Science; Ontology-Based Data Access; Large Language Models; Ontologies; CSV |
| publishDate | 2024 |
| dc.date.none.fl_str_mv | 2024-10 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
| format | conferenceObject |
| status_str | publishedVersion |
| dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/176820 |
| dc.language.none.fl_str_mv | eng |
| dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/isbn/978-950-34-2428-5; info:eu-repo/semantics/reference/hdl/10915/172755 |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | application/pdf; 996-1005 |
| dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
| repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
| repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |
| _version_ | 1846064408449515520 |