Word frequency–rank relationship in tagged texts
- Authors
- Chacoma, Andrés Alberto; Zanette, Damian Horacio
- Publication year
- 2021
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- We analyze the frequency–rank relationship in sub-vocabularies corresponding to three different grammatical classes (nouns, verbs, and others) in a collection of literary works in English, whose words have been automatically tagged according to their grammatical role. Comparing with a null hypothesis which assumes that words belonging to each class are uniformly distributed across the frequency–ranked vocabulary of the whole work, we disclose statistically significant differences between the three classes. These results point to the fact that frequency–rank relationships may reflect linguistic features associated with grammatical function.
Affiliation: Chacoma, Andrés Alberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Física Enrique Gaviola. Universidad Nacional de Córdoba. Instituto de Física Enrique Gaviola; Argentina
Affiliation: Zanette, Damian Horacio. Comisión Nacional de Energía Atómica. Gerencia del Área de Energía Nuclear. Instituto Balseiro; Argentina. Comisión Nacional de Energía Atómica. Centro Atómico Bariloche; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Patagonia Norte; Argentina
- Subject
- FREQUENCY–RANK STATISTICS
  GRAMMATICAL FUNCTION
  LANGUAGE PROCESSING
  LINGUISTIC REGULARITIES
  QUANTITATIVE LINGUISTICS
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/185074
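
The null hypothesis stated in the description (words of each grammatical class are uniformly distributed across the frequency-ranked vocabulary of the whole work) can be illustrated with a minimal sketch. This is not the authors' procedure: the function names `class_ranks` and `null_mean_rank_pvalue`, the treatment of each (word, class) pair as one vocabulary entry, and the choice of the mean rank as test statistic with a Monte Carlo comparison are all assumptions made here for illustration.

```python
import random
from collections import Counter


def class_ranks(tagged_tokens):
    """Return ({class: [overall ranks]}, vocabulary size).

    tagged_tokens is an iterable of (word, grammatical_class) pairs.
    Assumption: each distinct (word, class) pair is one vocabulary entry.
    """
    counts = Counter(tagged_tokens)
    # Rank 1 is the most frequent entry; ties are broken alphabetically
    # so that the ranking is deterministic.
    ordered = sorted(counts, key=lambda entry: (-counts[entry], entry))
    ranks = {}
    for rank, (_word, cls) in enumerate(ordered, start=1):
        ranks.setdefault(cls, []).append(rank)
    return ranks, len(ordered)


def null_mean_rank_pvalue(observed_ranks, vocab_size, n_shuffles=2000, seed=0):
    """Two-sided Monte Carlo p-value for the mean rank of one class.

    Under the null hypothesis the class occupies a uniformly random
    subset of the positions 1..vocab_size (sampled without replacement).
    The mean rank is an illustrative statistic chosen for this sketch.
    """
    rng = random.Random(seed)
    k = len(observed_ranks)
    expected = (vocab_size + 1) / 2  # mean rank under uniform placement
    observed_dev = abs(sum(observed_ranks) / k - expected)
    hits = 0
    for _ in range(n_shuffles):
        sample = rng.sample(range(1, vocab_size + 1), k)
        if abs(sum(sample) / k - expected) >= observed_dev:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Toy input, invented for this example only.
    tokens = (
        [("the", "other")] * 5
        + [("dog", "noun")] * 3
        + [("runs", "verb")] * 2
        + [("cat", "noun")]
    )
    ranks, vocab_size = class_ranks(tokens)
    for cls, cls_ranks in sorted(ranks.items()):
        print(cls, cls_ranks, null_mean_rank_pvalue(cls_ranks, vocab_size))
```

A small p-value for a class would indicate that its words sit systematically higher or lower in the overall ranking than uniform placement predicts. With a real corpus, the tagged tokens would come from an automatic part-of-speech tagger, which this sketch does not include.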
View the full record metadata
id | CONICETDig_934bdffe8d3ca72c3af53515b46c702a |
---|---|
oai_identifier_str | oai:ri.conicet.gov.ar:11336/185074 |
network_acronym_str | CONICETDig |
repository_id_str | 3498 |
network_name_str | CONICET Digital (CONICET) |
dc.title.none.fl_str_mv | Word frequency–rank relationship in tagged texts |
title | Word frequency–rank relationship in tagged texts |
spellingShingle | Word frequency–rank relationship in tagged texts Chacoma, Andrés Alberto FREQUENCY–RANK STATISTICS GRAMMATICAL FUNCTION LANGUAGE PROCESSING LINGUISTIC REGULARITIES QUANTITATIVE LINGUISTICS |
title_short | Word frequency–rank relationship in tagged texts |
title_full | Word frequency–rank relationship in tagged texts |
title_fullStr | Word frequency–rank relationship in tagged texts |
title_full_unstemmed | Word frequency–rank relationship in tagged texts |
title_sort | Word frequency–rank relationship in tagged texts |
dc.creator.none.fl_str_mv | Chacoma, Andrés Alberto Zanette, Damian Horacio |
author | Chacoma, Andrés Alberto |
author_facet | Chacoma, Andrés Alberto Zanette, Damian Horacio |
author_role | author |
author2 | Zanette, Damian Horacio |
author2_role | author |
dc.subject.none.fl_str_mv | FREQUENCY–RANK STATISTICS GRAMMATICAL FUNCTION LANGUAGE PROCESSING LINGUISTIC REGULARITIES QUANTITATIVE LINGUISTICS |
topic | FREQUENCY–RANK STATISTICS GRAMMATICAL FUNCTION LANGUAGE PROCESSING LINGUISTIC REGULARITIES QUANTITATIVE LINGUISTICS |
purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.3 https://purl.org/becyt/ford/1 |
dc.description.none.fl_txt_mv | We analyze the frequency–rank relationship in sub-vocabularies corresponding to three different grammatical classes (nouns, verbs, and others) in a collection of literary works in English, whose words have been automatically tagged according to their grammatical role. Comparing with a null hypothesis which assumes that words belonging to each class are uniformly distributed across the frequency–ranked vocabulary of the whole work, we disclose statistically significant differences between the three classes. These results point to the fact that frequency–rank relationships may reflect linguistic features associated with grammatical function. Affiliation: Chacoma, Andrés Alberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Física Enrique Gaviola. Universidad Nacional de Córdoba. Instituto de Física Enrique Gaviola; Argentina Affiliation: Zanette, Damian Horacio. Comisión Nacional de Energía Atómica. Gerencia del Área de Energía Nuclear. Instituto Balseiro; Argentina. Comisión Nacional de Energía Atómica. Centro Atómico Bariloche; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Patagonia Norte; Argentina |
description | We analyze the frequency–rank relationship in sub-vocabularies corresponding to three different grammatical classes (nouns, verbs, and others) in a collection of literary works in English, whose words have been automatically tagged according to their grammatical role. Comparing with a null hypothesis which assumes that words belonging to each class are uniformly distributed across the frequency–ranked vocabulary of the whole work, we disclose statistically significant differences between the three classes. These results point to the fact that frequency–rank relationships may reflect linguistic features associated with grammatical function. |
publishDate | 2021 |
dc.date.none.fl_str_mv | 2021-07-15 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format | article |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/185074 Chacoma, Andrés Alberto; Zanette, Damian Horacio; Word frequency–rank relationship in tagged texts; Elsevier Science; Physica A: Statistical Mechanics and its Applications; 574; 15-7-2021; 1-10 0378-4371 1873-2119 CONICET Digital CONICET |
url | http://hdl.handle.net/11336/185074 |
identifier_str_mv | Chacoma, Andrés Alberto; Zanette, Damian Horacio; Word frequency–rank relationship in tagged texts; Elsevier Science; Physica A: Statistical Mechanics and its Applications; 574; 15-7-2021; 1-10 0378-4371 1873-2119 CONICET Digital CONICET |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0378437121002922?via%3Dihub info:eu-repo/semantics/altIdentifier/doi/10.1016/j.physa.2021.126020 info:eu-repo/semantics/altIdentifier/url/https://arxiv.org/abs/2102.10992 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv | application/pdf application/pdf application/pdf |
dc.publisher.none.fl_str_mv | Elsevier Science |
publisher.none.fl_str_mv | Elsevier Science |
dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str | CONICET Digital (CONICET) |
collection | CONICET Digital (CONICET) |
instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |
_version_ | 1844614145938292736 |
score | 13.070432 |