Coherent oscillations in word-use data from 1700 to 2008
- Authors
- Montemurro, Marcelo Alejandro; Zanette, Damian Horacio
- Publication year
- 2016
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- In written language, the choice of specific words is constrained by both grammatical requirements and the specific semantic context of the message to be transmitted. To a significant degree, the semantic context is in turn affected by a broad cultural and historical environment, which also influences matters of style and manners. Over time, those environmental factors leave an imprint in the statistics of language use, with some words becoming more common and other words being preferred less. Here we characterize the patterns of language use over time based on word statistics extracted from more than 4.5 million books written over a period of 308 years. We find evidence of novel systematic oscillatory patterns in word use with a consistent period narrowly distributed around 14 years. The specific phase relationships between different words show structure at two independent levels: first, there is a weak global phase modulation that is primarily linked to overall shifts in the vocabulary across time; and second, a stronger component dependent on well defined semantic relationships between words. In particular, complex network analysis reveals that semantically related words show strong phase coherence. Ultimately, these previously unknown patterns in the statistics of language may be a consequence of changes in the cultural framework that influences the thematic focus of writers.
Affiliation: Montemurro, Marcelo Alejandro. University of Manchester; United Kingdom
Affiliation: Zanette, Damian Horacio. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Patagonia Norte; Argentina. Comisión Nacional de Energía Atómica. Gerencia del Area de Investigación y Aplicaciones No Nucleares. Gerencia de Física (Centro Atómico Bariloche); Argentina. Comisión Nacional de Energía Atómica. Gerencia del Área de Energía Nuclear. Instituto Balseiro; Argentina
- Subject
- LANGUAGE STATISTICS; WORD USE; GOOGLE NGRAMMS
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/74505
Full record metadata
id | CONICETDig_db18dbac7e24a17a2c18857effd20d9a |
---|---|
oai_identifier_str | oai:ri.conicet.gov.ar:11336/74505 |
network_acronym_str | CONICETDig |
repository_id_str | 3498 |
network_name_str | CONICET Digital (CONICET) |
dc.title.none.fl_str_mv | Coherent oscillations in word-use data from 1700 to 2008 |
title | Coherent oscillations in word-use data from 1700 to 2008 |
dc.creator.none.fl_str_mv | Montemurro, Marcelo Alejandro Zanette, Damian Horacio |
author | Montemurro, Marcelo Alejandro |
author_facet | Montemurro, Marcelo Alejandro Zanette, Damian Horacio |
author_role | author |
author2 | Zanette, Damian Horacio |
author2_role | author |
dc.subject.none.fl_str_mv | LANGUAGE STATISTICS WORD USE GOOGLE NGRAMMS |
topic | LANGUAGE STATISTICS WORD USE GOOGLE NGRAMMS |
purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1 |
publishDate | 2016 |
dc.date.none.fl_str_mv | 2016-12-05 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format | article |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/74505 Montemurro, Marcelo Alejandro; Zanette, Damian Horacio; Coherent oscillations in word-use data from 1700 to 2008; Palgrave Macmillan Ltd; Palgrave Communications; 2; 5-12-2016; 1-9 2055-1045 CONICET Digital CONICET |
url | http://hdl.handle.net/11336/74505 |
identifier_str_mv | Montemurro, Marcelo Alejandro; Zanette, Damian Horacio; Coherent oscillations in word-use data from 1700 to 2008; Palgrave Macmillan Ltd; Palgrave Communications; 2; 5-12-2016; 1-9 2055-1045 CONICET Digital CONICET |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/https://www.nature.com/articles/palcomms201684 info:eu-repo/semantics/altIdentifier/doi/10.1057/palcomms.2016.84 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by/2.5/ar/ |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | https://creativecommons.org/licenses/by/2.5/ar/ |
dc.format.none.fl_str_mv | application/pdf application/pdf |
dc.publisher.none.fl_str_mv | Palgrave Macmillan Ltd |
publisher.none.fl_str_mv | Palgrave Macmillan Ltd |
dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str | CONICET Digital (CONICET) |
collection | CONICET Digital (CONICET) |
instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |