Evaluating Information Extraction Approaches in the Construction of a Real Estate Observatory

Authors
Tanevitch, Luciana; Antonelli, Leandro; Torres, Diego
Publication year
2025
Language
English
Resource type
conference document
Status
published version
Description
A real estate observatory plays a significant role in the aggregation and analysis of real estate market data. The information contained in real estate advertisements can be leveraged to populate such an observatory. However, this data can be either structured or unstructured. Unstructured data is difficult to process automatically because it lacks a predefined structure, so techniques are needed to give it structure. Information Extraction (IE) is the process of deriving structured data from unstructured data. Natural Language Processing techniques enable machines to understand texts, making them particularly significant in the context of IE. This work evaluates both rule-based and machine-learning-based IE approaches to extract features from real estate descriptions within advertisements. Those features are relevant in the context of real estate observatory construction. The performance of each approach is measured using precision, recall, and F1-score.
Subject
Computer and Information Sciences
Information Extraction
Natural Language Processing
Real Estate Observatory
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-nc-nd/4.0/
Repository
CIC Digital (CICBA)
Institution
Comisión de Investigaciones Científicas de la Provincia de Buenos Aires
OAI identifier
oai:digital.cic.gba.gob.ar:11746/12550

id CICBA_9d7ed196963ca41961f3094fcfed1bc4
oai_identifier_str oai:digital.cic.gba.gob.ar:11746/12550
network_acronym_str CICBA
repository_id_str 9441
network_name_str CIC Digital (CICBA)
dc.type.none.fl_str_mv info:eu-repo/semantics/conferenceObject
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_5794
info:ar-repo/semantics/documentoDeConferencia
dc.identifier.none.fl_str_mv https://digital.cic.gba.gob.ar/handle/11746/12550
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/isbn/978-3-031-91690-8
info:eu-repo/semantics/altIdentifier/doi/10.1007/978-3-031-91690-8_6
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:CIC Digital (CICBA)
instname:Comisión de Investigaciones Científicas de la Provincia de Buenos Aires
instacron:CICBA
repository.mail.fl_str_mv marisa.degiusti@sedici.unlp.edu.ar