Evaluating Information Extraction Approaches in the Construction of a Real Estate Observatory
- Authors
- Tanevitch, Luciana; Antonelli, Leandro; Torres, Diego
- Publication year
- 2025
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- A real estate observatory plays a significant role in aggregating and analyzing real estate market data. The information contained in real estate advertisements can be leveraged to populate such an observatory. However, this data appears in both structured and unstructured form. Unstructured data is difficult to process automatically because it lacks a predefined structure, so techniques are needed to give it structure. Information Extraction (IE) is the process of deriving structured data from unstructured data. Natural Language Processing techniques enable machines to interpret text, which makes them particularly relevant to IE. This work evaluates both rule-based and machine-learning-based IE approaches for extracting features from the descriptions in real estate advertisements. These features are relevant to the construction of a real estate observatory. The performance of each approach is measured using precision, recall, and F1-score.
- Subject
- Ciencias de la Computación e Información (Computer and Information Sciences)
- Information Extraction
- Natural Language Processing
- Real Estate Observatory
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-nd/4.0/
- Repository
- Institution
- Comisión de Investigaciones Científicas de la Provincia de Buenos Aires
- OAI identifier
- oai:digital.cic.gba.gob.ar:11746/12550
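The description above says the approaches are compared with precision, recall, and F1-score. As a rough illustration only, the sketch below applies one rule-based extractor to a few descriptions and scores it against hand-labelled values. The feature (bedroom count), the regular expression, the sample advertisements, and the function names are all hypothetical; none of them come from the paper, whose actual rules, features, and data are not given in this record.

```python
import re

# Hypothetical rule: capture the number that directly precedes "bedroom(s)".
BEDROOMS = re.compile(r"(\d+)\s*bedrooms?", re.IGNORECASE)


def extract_bedrooms(description):
    """Return the bedroom count found by the rule, or None if the rule does not fire."""
    match = BEDROOMS.search(description)
    return int(match.group(1)) if match else None


def precision_recall_f1(predicted, gold):
    """Score extracted values against gold values; None means 'no value'.

    A wrong value counts as both a false positive (a spurious extraction)
    and a false negative (the true value was missed), which is a common
    convention in slot-filling evaluation and an assumption made here.
    """
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p is not None and p == g)
    fp = sum(1 for p, g in pairs if p is not None and p != g)
    fn = sum(1 for p, g in pairs if g is not None and p != g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up advertisement descriptions paired with hand-labelled bedroom counts.
ads = [
    ("Bright apartment with 2 bedrooms and a balcony.", 2),
    ("House with three bedrooms and a large garden.", 3),
    ("Studio near the park.", None),
    ("Main house has 1 bedroom, plus 1 bedroom in the annex.", 2),
]

predicted = [extract_bedrooms(text) for text, _ in ads]
gold = [label for _, label in ads]
p, r, f = precision_recall_f1(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this toy set the rule gets one description right, misses the count written as a word, and returns a wrong count for the two-part listing, so precision is 1/2 and recall is 1/3. A machine-learning extractor could be scored with the same function by swapping in its predictions.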
Full record metadata
id |
CICBA_9d7ed196963ca41961f3094fcfed1bc4 |
---|---|
oai_identifier_str |
oai:digital.cic.gba.gob.ar:11746/12550 |
network_acronym_str |
CICBA |
repository_id_str |
9441 |
network_name_str |
CIC Digital (CICBA) |
dc.title.none.fl_str_mv |
Evaluating Information Extraction Approaches in the Construction of a Real Estate Observatory |
dc.creator.none.fl_str_mv |
Tanevitch, Luciana Antonelli, Leandro Torres, Diego |
dc.subject.none.fl_str_mv |
Ciencias de la Computación e Información Information Extraction Natural Language Processing Real Estate Observatory |
dc.date.none.fl_str_mv |
2025 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/conferenceObject info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_5794 info:ar-repo/semantics/documentoDeConferencia |
format |
conferenceObject |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
https://digital.cic.gba.gob.ar/handle/11746/12550 |
dc.language.none.fl_str_mv |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/isbn/978-3-031-91690-8 info:eu-repo/semantics/altIdentifier/doi/10.1007/978-3-031-91690-8_6 |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by-nc-nd/4.0/ |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:CIC Digital (CICBA) instname:Comisión de Investigaciones Científicas de la Provincia de Buenos Aires instacron:CICBA |
repository.name.fl_str_mv |
CIC Digital (CICBA) - Comisión de Investigaciones Científicas de la Provincia de Buenos Aires |
repository.mail.fl_str_mv |
marisa.degiusti@sedici.unlp.edu.ar |