Capturing coevolutionary signals in repeat proteins

Authors
Espada, Rocío; Parra, Rodrigo Gonzalo; Mora, Thierry; Walczak, Aleksandra M.; Ferreiro, Diego
Year of publication
2015
Language
English
Resource type
article
Status
published version
Description
Background: The analysis of correlations of amino acid occurrences in globular domains has led to the development of statistical tools that can identify native contacts - portions of the chains that come to close distance in folded structural ensembles. Here we introduce a direct coupling analysis for repeat proteins - natural systems for which the identification of folding domains remains challenging. Results: We show that the inherent translational symmetry of repeat protein sequences introduces a strong bias in the pair correlations at precisely the length scale of the repeat-unit. Equalizing for this bias in an objective way reveals true co-evolutionary signals from which local native contacts can be identified. Importantly, parameter values obtained for all other interactions are not significantly affected by the equalization. We quantify the robustness of the procedure and assign confidence levels to the interactions, identifying the minimum number of sequences needed to extract evolutionary information in several repeat protein families. Conclusions: The overall procedure can be used to reconstruct the interactions at distances larger than repeat-pairs, identifying the characteristics of the strongest couplings in each family, and can be applied to any system that appears translationally symmetric.
Affiliation: Espada, Rocío. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales; Argentina
Affiliation: Parra, Rodrigo Gonzalo. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales; Argentina
Affiliation: Mora, Thierry. Centre de Recherches de Biochimie Macromoléculaire; France
Affiliation: Walczak, Aleksandra M. Centre de Recherches de Biochimie Macromoléculaire; France
Affiliation: Ferreiro, Diego. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales; Argentina
Subject
CO-EVOLUTION
DIRECT COUPLING ANALYSIS
DIRECT INFORMATION
REPEAT PROTEINS
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI Identifier
oai:ri.conicet.gov.ar:11336/48743

id CONICETDig_9bf4ab86cdde250eee07dd4f840cc018
oai_identifier_str oai:ri.conicet.gov.ar:11336/48743
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv Capturing coevolutionary signals in repeat proteins
dc.creator.none.fl_str_mv Espada, Rocío
Parra, Rodrigo Gonzalo
Mora, Thierry
Walczak, Aleksandra M.
Ferreiro, Diego
dc.subject.none.fl_str_mv CO-EVOLUTION
DIRECT COUPLING ANALYSIS
DIRECT INFORMATION
REPEAT PROTEINS
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
publishDate 2015
dc.date.none.fl_str_mv 2015-12
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/48743
Espada, Rocío; Parra, Rodrigo Gonzalo; Mora, Thierry; Walczak, Aleksandra M.; Ferreiro, Diego; Capturing coevolutionary signals in repeat proteins; BioMed Central; BMC Bioinformatics; 16; 1; 12-2015; 207-217
1471-2105
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/http://www.biomedcentral.com/1471-2105/16/207
info:eu-repo/semantics/altIdentifier/doi/10.1186/s12859-015-0648-3
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
application/pdf
application/pdf
dc.publisher.none.fl_str_mv BioMed Central
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar