Codon usage clusters correlation: Towards protein solubility prediction in heterologous expression systems in E. coli

Authors
Pellizza Pena, Leonardo Agustín; Smal, Clara; Rodrigo, Guido; Aran, Martin
Publication year
2018
Language
English
Resource type
article
Status
published version
Description
Production of soluble recombinant proteins is crucial to the development of industry and basic research. However, aggregation due to incorrect folding of nascent polypeptides is still a major bottleneck. Understanding the factors governing protein solubility is important to grasp the underlying mechanisms and improve the design of recombinant proteins. Here we present a quantitative study of the expression and solubility of a set of proteins from Bizionia argentinensis. Through the analysis of different features known to modulate protein production, we defined two parameters based on the %MinMax algorithm to compare codon usage clusters between the host and the target genes. We demonstrate that the absolute difference between all %MinMax frequencies of the host and the target gene is significantly negatively correlated with protein expression levels. Most importantly, a strong positive correlation between solubility and the degree of conservation of codon usage clusters is observed for two independent datasets. Moreover, we show that this correlation is higher in codon usage clusters involved in less compact protein secondary structure regions. Our results provide important tools for protein design and support the notion that codon usage may dictate translation rate and modulate co-translational folding.
Affiliation: Pellizza Pena, Leonardo Agustín. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Smal, Clara. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Rodrigo, Guido. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Aran, Martin. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Subject
Production of soluble recombinant proteins
%MinMax algorithm
Bizionia argentinensis
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/91344

id CONICETDig_2d0b4f7b3ea86853feb34ce1673a22db
oai_identifier_str oai:ri.conicet.gov.ar:11336/91344
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv Codon usage clusters correlation: Towards protein solubility prediction in heterologous expression systems in E. coli
dc.creator.none.fl_str_mv Pellizza Pena, Leonardo Agustín
Smal, Clara
Rodrigo, Guido
Aran, Martin
dc.subject.none.fl_str_mv Production of soluble recombinant proteins
%MinMax algorithm
Bizionia argentinensis
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.6
https://purl.org/becyt/ford/1
dc.description.none.fl_txt_mv Production of soluble recombinant proteins is crucial to the development of industry and basic research. However, aggregation due to incorrect folding of nascent polypeptides is still a major bottleneck. Understanding the factors governing protein solubility is important to grasp the underlying mechanisms and improve the design of recombinant proteins. Here we present a quantitative study of the expression and solubility of a set of proteins from Bizionia argentinensis. Through the analysis of different features known to modulate protein production, we defined two parameters based on the %MinMax algorithm to compare codon usage clusters between the host and the target genes. We demonstrate that the absolute difference between all %MinMax frequencies of the host and the target gene is significantly negatively correlated with protein expression levels. Most importantly, a strong positive correlation between solubility and the degree of conservation of codon usage clusters is observed for two independent datasets. Moreover, we show that this correlation is higher in codon usage clusters involved in less compact protein secondary structure regions. Our results provide important tools for protein design and support the notion that codon usage may dictate translation rate and modulate co-translational folding.
Affiliation: Pellizza Pena, Leonardo Agustín. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Smal, Clara. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Rodrigo, Guido. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Aran, Martin. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
publishDate 2018
dc.date.none.fl_str_mv 2018-12
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/91344
Pellizza Pena, Leonardo Agustín; Smal, Clara; Rodrigo, Guido; Aran, Martin; Codon usage clusters correlation: Towards protein solubility prediction in heterologous expression systems in E. coli; Nature Publishing Group; Scientific Reports; 8; 1; 12-2018; 1-12
2045-2322
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/http://www.nature.com/articles/s41598-018-29035-z
info:eu-repo/semantics/altIdentifier/doi/10.1038/s41598-018-29035-z
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv application/pdf
application/pdf
application/pdf
application/pdf
dc.publisher.none.fl_str_mv Nature Publishing Group
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar