Prediction of psychosis across protocols and risk cohorts using automated language analysis

Authors
Corcoran, Cheryl M.; Carrillo, Facundo; Fernandez Slezak, Diego; Bedi, Gillinder; Klim, Casimir; Javitt, Daniel C.; Bearden, Carrie E.; Cecchi, Guillermo Alberto
Publication year
2018
Language
English
Resource type
article
Status
published version
Description
Language and speech are the primary source of data for psychiatrists to diagnose and treat mental disorders. In psychosis, the very structure of language can be disturbed, including semantic coherence (e.g., derailment and tangentiality) and syntactic complexity (e.g., concreteness). Subtle disturbances in language are evident in schizophrenia even prior to first psychosis onset, during prodromal stages. Using computer-based natural language processing analyses, we previously showed that, among English-speaking clinical (e.g., ultra) high-risk youths, baseline reduction in semantic coherence (the flow of meaning in speech) and in syntactic complexity could predict subsequent psychosis onset with high accuracy. Herein, we aimed to cross-validate these automated linguistic analytic methods in a second larger risk cohort, also English-speaking, and to discriminate speech in psychosis from normal speech. We identified an automated machine-learning speech classifier – comprising decreased semantic coherence, greater variance in that coherence, and reduced usage of possessive pronouns – that had an 83% accuracy in predicting psychosis onset (intra-protocol), a cross-validated accuracy of 79% of psychosis onset prediction in the original risk cohort (cross-protocol), and a 72% accuracy in discriminating the speech of recent-onset psychosis patients from that of healthy individuals. The classifier was highly correlated with previously identified manual linguistic predictors. Our findings support the utility and validity of automated natural language processing methods to characterize disturbances in semantics and syntax across stages of psychotic disorder. The next steps will be to apply these methods in larger risk cohorts to further test reproducibility, also in languages other than English, and identify sources of variability. This technology has the potential to improve prediction of psychosis outcome among at-risk youths and identify linguistic targets for remediation and preventive intervention. More broadly, automated linguistic analysis can be a powerful tool for diagnosis and treatment across neuropsychiatry.
Affiliation: Corcoran, Cheryl M. Icahn School of Medicine at Mount Sinai; United States. New York State Psychiatric Institute; United States
Affiliation: Carrillo, Facundo. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina
Affiliation: Fernandez Slezak, Diego. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina
Affiliation: Bedi, Gillinder. New York State Psychiatric Institute; United States. Columbia University; United States. University of Melbourne; Australia
Affiliation: Klim, Casimir. Columbia University; United States. New York State Psychiatric Institute; United States
Affiliation: Javitt, Daniel C. Columbia University; United States. New York State Psychiatric Institute; United States
Affiliation: Bearden, Carrie E. University of California at Los Angeles; United States
Affiliation: Cecchi, Guillermo Alberto. IBM T.J. Watson Research Center; United States
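The abstract above names three classifier inputs: semantic coherence, the variance of that coherence, and the rate of possessive pronoun usage. This record does not say how they were computed, so the following is only a minimal illustrative sketch. It assumes that coherence can be approximated as the cosine similarity between bag-of-words vectors of consecutive sentences, which is a stand-in chosen for simplicity and not the authors' method. The function name `speech_features`, the pronoun list, and the sentence splitter are all hypothetical.

```python
import math
import re
from collections import Counter
from statistics import mean, pvariance

# Assumed word list; "his" and "her" are ambiguous in English and are
# counted as possessive here purely for simplicity.
POSSESSIVE_PRONOUNS = {
    "my", "mine", "your", "yours", "his", "her", "hers",
    "its", "our", "ours", "their", "theirs",
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def speech_features(transcript):
    """Return three toy features for one transcript (illustrative only)."""
    # Naive sentence split on terminal punctuation (an assumption).
    sentences = [s for s in re.split(r"[.!?]+", transcript) if s.strip()]
    vectors = [Counter(tokenize(s)) for s in sentences]
    # Coherence proxy: similarity of each sentence to the next one.
    sims = [cosine(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]
    tokens = tokenize(transcript)
    possessive = sum(t in POSSESSIVE_PRONOUNS for t in tokens)
    return {
        "mean_coherence": mean(sims) if sims else 0.0,
        "coherence_variance": pvariance(sims) if sims else 0.0,
        "possessive_rate": possessive / len(tokens) if tokens else 0.0,
    }


if __name__ == "__main__":
    sample = "My dog likes the park. My dog likes the beach. Taxes are due."
    print(speech_features(sample))
```

In a real pipeline these three numbers would be computed per participant and passed to a classifier; a bag-of-words vector captures only word overlap, so a learned semantic embedding would likely be needed to approximate "flow of meaning" in any clinically meaningful way.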
Subjects
AUTOMATED LANGUAGE ANALYSIS
HIGH-RISK YOUTHS
MACHINE LEARNING
PREDICTION OF PSYCHOSIS
SEMANTIC COHERENCE
SYNTACTIC COMPLEXITY
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/97054

id CONICETDig_532d8e6a240a2445787cf7dd5ac73510
oai_identifier_str oai:ri.conicet.gov.ar:11336/97054
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
https://purl.org/becyt/ford/3.2
https://purl.org/becyt/ford/3
dc.date.none.fl_str_mv 2018-02
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/97054
Corcoran, Cheryl M.; Carrillo, Facundo; Fernandez Slezak, Diego; Bedi, Gillinder; Klim, Casimir; et al.; Prediction of psychosis across protocols and risk cohorts using automated language analysis; Wiley; World Psychiatry; 17; 1; 2-2018; 67-75
1723-8617
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/doi/10.1002/wps.20491
info:eu-repo/semantics/altIdentifier/url/https://onlinelibrary.wiley.com/doi/full/10.1002/wps.20491
info:eu-repo/semantics/altIdentifier/url/https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5775133/
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Wiley
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar