TeXTracT: a Web-based Tool for Building NLP-enabled Applications

Authors
Rago, Alejandro; Ramos, Facundo M.; Vélez, Juan I.; Díaz Pace, J. Andrés; Marcos, Claudio
Publication year
2016
Language
English
Resource type
conference paper
Status
published version
Description
Over the last few years, the software industry has shown an increasing interest in applications with Natural Language Processing (NLP) capabilities. Several cloud-based solutions have emerged with the purpose of simplifying and streamlining the integration of NLP techniques via Web services. These NLP techniques cover tasks such as language detection, entity recognition, sentiment analysis, and classification, among others. However, the services provided are not always as extensible and configurable as a developer may want, which prevents their use in industry-grade developments and limits their adoption in specialized domains (e.g., for analyzing technical documentation). In this context, we have developed a tool called TeXTracT that is designed to be composable, extensible, configurable and accessible. In our tool, NLP techniques can be accessed independently and orchestrated in a pipeline via RESTful Web services. Moreover, the architecture supports the setup and deployment of NLP techniques on demand. The NLP infrastructure is built upon the UIMA framework, which defines communication protocols and uniform service interfaces for text analysis modules. TeXTracT has been evaluated in two case studies to assess its pros and cons.
Sociedad Argentina de Informática e Investigación Operativa (SADIO)
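As a rough illustration of the pipeline idea mentioned in the description (NLP techniques that can be called on their own or chained in sequence), the following Python sketch composes two toy analysis steps over a shared document object. It is not TeXTracT's API and makes no calls to UIMA or to any Web service: the names `Document`, `REGISTRY`, `tokenizer`, `capitalized_words` and `run_pipeline` are all hypothetical and exist only for this example.

```python
from typing import Callable, Dict, List


class Document:
    """Hypothetical shared structure that every step reads from and writes to."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.annotations: Dict[str, list] = {}


def tokenizer(doc: Document) -> Document:
    # Toy tokenizer: splits on whitespace only.
    doc.annotations["tokens"] = doc.text.split()
    return doc


def capitalized_words(doc: Document) -> Document:
    # Toy "entity" step: keeps tokens that start with an uppercase letter.
    # It depends on the output of the tokenizer step.
    tokens = doc.annotations.get("tokens", [])
    doc.annotations["entities"] = [
        t.strip(".,;:") for t in tokens if t[:1].isupper()
    ]
    return doc


# Hypothetical registry: each step can be looked up and run independently.
REGISTRY: Dict[str, Callable[[Document], Document]] = {
    "tokenizer": tokenizer,
    "entities": capitalized_words,
}


def run_pipeline(text: str, steps: List[str]) -> Document:
    """Run the named steps in order over a single document."""
    doc = Document(text)
    for name in steps:
        doc = REGISTRY[name](doc)
    return doc


if __name__ == "__main__":
    result = run_pipeline("TeXTracT builds on UIMA.", ["tokenizer", "entities"])
    print(result.annotations["entities"])
```

In a service-based setting each registry entry would presumably sit behind its own endpoint, and the list of step names would be the part a client configures; that mapping is an assumption of this sketch, not something the record states.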
Subject
Computer Science
Web-based services
Natural Language Processing
Frameworks
Software development
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-sa/3.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/57251

publishDate 2016
dc.date.none.fl_str_mv 2016-09
dc.type.none.fl_str_mv info:eu-repo/semantics/conferenceObject
info:eu-repo/semantics/publishedVersion
Conference object
http://purl.org/coar/resource_type/c_5794
info:ar-repo/semantics/documentoDeConferencia
format conferenceObject
status_str publishedVersion
dc.identifier.none.fl_str_mv http://sedici.unlp.edu.ar/handle/10915/57251
url http://sedici.unlp.edu.ar/handle/10915/57251
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/http://45jaiio.sadio.org.ar/sites/default/files/asse-14.pdf
info:eu-repo/semantics/altIdentifier/issn/2451-7593
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by-sa/3.0/
Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
dc.format.none.fl_str_mv application/pdf
123-134
dc.source.none.fl_str_mv reponame:SEDICI (UNLP)
instname:Universidad Nacional de La Plata
instacron:UNLP
repository.name.fl_str_mv SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv alira@sedici.unlp.edu.ar