Legal NERC with ontologies, Wikipedia and curriculum learning
- Authors
- Cardellino, Cristian; Teruel, Milagro; Alonso Alemany, Laura; Villata, Serena
- Publication year
- 2017
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- Paper presented at the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), Valencia, Spain, 2017, pp. 254-259.
Affiliation: Cardellino, Cristian. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía, Física y Computación; Argentina.
Affiliation: Teruel, Milagro. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía, Física y Computación; Argentina.
Affiliation: Alonso Alemany, Laura. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía, Física y Computación; Argentina.
Affiliation: Villata, Serena. Université Côte d’Azur; France.
In this paper, we present a Wikipedia-based approach to developing resources for the legal domain. We establish a mapping between a legal domain ontology, LKIF (Hoekstra et al., 2007), and a Wikipedia-based ontology, YAGO (Suchanek et al., 2007), and use that mapping to populate LKIF. Moreover, we use the mentions of those entities in Wikipedia text to train a specific Named Entity Recognizer and Classifier. We find that this classifier works well on Wikipedia but, as could be expected, its performance decreases on a corpus of judgments of the European Court of Human Rights. However, this tool will be used as a preprocessing step for human annotation. We resort to a technique called curriculum learning, which aims to overcome overfitting by learning increasingly complex concepts. However, we find that in this particular setting the method works best when learning from the most specific to the most general concepts, not the other way round.
http://aclanthology.info/papers/E17-2041/legal-nerc-with-ontologies-wikipedia-and-curriculum-learning
Other Computer and Information Sciences
- Subject
- Ontologies
- Natural language processing
- Legal informatics
- Information extraction
- Access level
- open access
- Terms of use
- Repository
- Institution
- Universidad Nacional de Córdoba
- OAI identifier
- oai:rdu.unc.edu.ar:11086/552665
Full record metadata
id | RDUUNC_1e91776dbcf34bf7cbc94dc95be9c087 |
---|---|
oai_identifier_str | oai:rdu.unc.edu.ar:11086/552665 |
network_acronym_str | RDUUNC |
repository_id_str | 2572 |
network_name_str | Repositorio Digital Universitario (UNC) |
dc.title.none.fl_str_mv | Legal NERC with ontologies, Wikipedia and curriculum learning |
title | Legal NERC with ontologies, Wikipedia and curriculum learning |
spellingShingle | Legal NERC with ontologies, Wikipedia and curriculum learning Cardellino, Cristian Ontologies Natural language processing Legal informatics Information extraction |
title_short | Legal NERC with ontologies, Wikipedia and curriculum learning |
title_full | Legal NERC with ontologies, Wikipedia and curriculum learning |
title_fullStr | Legal NERC with ontologies, Wikipedia and curriculum learning |
title_full_unstemmed | Legal NERC with ontologies, Wikipedia and curriculum learning |
title_sort | Legal NERC with ontologies, Wikipedia and curriculum learning |
dc.creator.none.fl_str_mv | Cardellino, Cristian; Teruel, Milagro; Alonso Alemany, Laura; Villata, Serena |
author | Cardellino, Cristian |
author_facet | Cardellino, Cristian; Teruel, Milagro; Alonso Alemany, Laura; Villata, Serena |
author_role | author |
author2 | Teruel, Milagro; Alonso Alemany, Laura; Villata, Serena |
author2_role | author; author; author |
dc.subject.none.fl_str_mv | Ontologies; Natural language processing; Legal informatics; Information extraction |
topic | Ontologies; Natural language processing; Legal informatics; Information extraction |
description | Paper presented at the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), Valencia, Spain, 2017, pp. 254-259. |
publishDate | 2017 |
dc.date.none.fl_str_mv | 2017 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
format | conferenceObject |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11086/552665 |
url | http://hdl.handle.net/11086/552665 |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | https://hal.science/hal-01572444 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess |
eu_rights_str_mv | openAccess |
dc.format.none.fl_str_mv | application/pdf |
dc.source.none.fl_str_mv | reponame:Repositorio Digital Universitario (UNC); instname:Universidad Nacional de Córdoba; instacron:UNC |
reponame_str | Repositorio Digital Universitario (UNC) |
collection | Repositorio Digital Universitario (UNC) |
instname_str | Universidad Nacional de Córdoba |
instacron_str | UNC |
institution | UNC |
repository.name.fl_str_mv | Repositorio Digital Universitario (UNC) - Universidad Nacional de Córdoba |
repository.mail.fl_str_mv | oca.unc@gmail.com |
_version_ | 1844618937412616192 |
score | 13.070432 |