Legal NERC with ontologies, Wikipedia and curriculum learning

Authors
Cardellino, Cristian; Teruel, Milagro; Alonso Alemany, Laura; Villata, Serena
Publication year
2017
Language
English
Resource type
conference paper
Status
published version
Description
Paper presented at the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), Valencia, Spain, 2017, pp. 254-259.
Affiliation: Cardellino, Cristian. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía, Física y Computación; Argentina.
Affiliation: Teruel, Milagro. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía, Física y Computación; Argentina.
Affiliation: Alonso Alemany, Laura. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía, Física y Computación; Argentina.
Affiliation: Villata, Serena. Université Côte d’Azur; France.
In this paper, we present a Wikipedia-based approach to developing resources for the legal domain. We establish a mapping between a legal domain ontology, LKIF (Hoekstra et al., 2007), and a Wikipedia-based ontology, YAGO (Suchanek et al., 2007), and use that mapping to populate LKIF. Moreover, we use the mentions of those entities in Wikipedia text to train a specific Named Entity Recognizer and Classifier. We find that this classifier works well on Wikipedia but, as could be expected, performance decreases on a corpus of judgments of the European Court of Human Rights. However, this tool will be used as a preprocessing step for human annotation. We resort to a technique called curriculum learning, which aims to overcome problems of overfitting by learning increasingly complex concepts. However, we find that in this particular setting the method works best by learning from the most specific to the most general concepts, not the other way round.
http://aclanthology.info/papers/E17-2041/legal-nerc-with-ontologies-wikipedia-and-curriculum-learning
Other Computer and Information Sciences
Subject
Ontologies
Natural language processing
Legal informatics
Information extraction
Access level
open access
Terms of use
Repository
Repositorio Digital Universitario (UNC)
Institution
Universidad Nacional de Córdoba
OAI identifier
oai:rdu.unc.edu.ar:11086/552665

id RDUUNC_1e91776dbcf34bf7cbc94dc95be9c087
oai_identifier_str oai:rdu.unc.edu.ar:11086/552665
network_acronym_str RDUUNC
repository_id_str 2572
network_name_str Repositorio Digital Universitario (UNC)
dc.title.none.fl_str_mv Legal NERC with ontologies, Wikipedia and curriculum learning
dc.creator.none.fl_str_mv Cardellino, Cristian
Teruel, Milagro
Alonso Alemany, Laura
Villata, Serena
dc.subject.none.fl_str_mv Ontologies
Natural language processing
Legal informatics
Information extraction
dc.description.none.fl_txt_mv Paper presented at the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), Valencia, Spain, 2017, pp. 254-259.
publishDate 2017
dc.date.none.fl_str_mv 2017
dc.type.none.fl_str_mv info:eu-repo/semantics/conferenceObject
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_5794
info:ar-repo/semantics/documentoDeConferencia
format conferenceObject
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11086/552665
url http://hdl.handle.net/11086/552665
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv https://hal.science/hal-01572444
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositorio Digital Universitario (UNC)
instname:Universidad Nacional de Córdoba
instacron:UNC
reponame_str Repositorio Digital Universitario (UNC)
collection Repositorio Digital Universitario (UNC)
instname_str Universidad Nacional de Córdoba
instacron_str UNC
institution UNC
repository.name.fl_str_mv Repositorio Digital Universitario (UNC) - Universidad Nacional de Córdoba
repository.mail.fl_str_mv oca.unc@gmail.com