Conserved and divergent signals in 5’ splice site sequences across fungi, metazoa and plants

Authors
Beckel, Maximiliano Sebastián; Kaufman, Bruno; Yanovsky, Marcelo Javier; Chernomoretz, Ariel
Publication year
2023
Language
English
Resource type
article
Status
published version
Description
In eukaryotic organisms, the ensemble of 5’ splice site sequences reflects the balance between natural nucleotide variability and the minimal molecular constraints necessary to ensure splicing fidelity. This compromise shapes the underlying statistical patterns in the composition of donor splice site sequences. The scope of this study was to mine conserved and divergent signals in the composition of 5’ splice site sequences. Because 5’ donor sequences are a major cue for proper recognition of splice sites, we reasoned that statistical regularities in their composition could reflect the biological functionality and evolutionary history associated with splicing mechanisms. Results: We considered a regularized maximum entropy modeling framework to mine for non-trivial two-site correlations in donor sequence datasets corresponding to 30 different eukaryotes. For each analyzed species, we identified minimal sets of two-site coupling patterns that were able to replicate, at a given regularization level, the observed one-site and two-site frequencies in donor sequences. By performing a systematic and comparative analysis of 5’ splice sites, we showed that lineage information could be traced from joint di-nucleotide probabilities. We were able to identify characteristic two-site coupling patterns for plants and animals, and propose that they may echo differences in splicing regulation previously reported between these groups.
Affiliation: Beckel, Maximiliano Sebastián. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Kaufman, Bruno. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Yanovsky, Marcelo Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Chernomoretz, Ariel. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física del Plasma. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física del Plasma; Argentina
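The description above refers to the observed one-site and two-site frequencies that a maximum entropy model is asked to reproduce. As a rough illustration only, the following sketch computes those empirical statistics, plus the connected correlation f_ij(a,b) - f_i(a) f_j(b), for a set of aligned donor sequences. It is not the authors' implementation and does not fit or regularize any model; the function names and the toy 9-mer sequences are hypothetical and were made up for this example.

```python
from collections import Counter
from itertools import combinations


def site_frequencies(seqs):
    """Empirical one-site and two-site frequencies of equal-length sequences.

    Hypothetical helper for illustration; not taken from the published work.
    """
    n = len(seqs)
    length = len(seqs[0])
    one = [Counter() for _ in range(length)]
    two = {pair: Counter() for pair in combinations(range(length), 2)}
    for s in seqs:
        for i, a in enumerate(s):
            one[i][a] += 1
        for i, j in two:
            two[(i, j)][(s[i], s[j])] += 1
    # Normalize raw counts into frequencies.
    f1 = [{a: c / n for a, c in col.items()} for col in one]
    f2 = {p: {ab: c / n for ab, c in cnt.items()} for p, cnt in two.items()}
    return f1, f2


def connected_correlation(f1, f2, i, j, a, b):
    """f_ij(a,b) - f_i(a) * f_j(b): zero when sites i and j look independent."""
    return f2[(i, j)].get((a, b), 0.0) - f1[i].get(a, 0.0) * f1[j].get(b, 0.0)


# Toy, made-up sequences (not real data): assumed layout of 3 exonic
# positions followed by 6 intronic positions, with GT at indices 3 and 4.
toy = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TGGGTATGT"]
f1, f2 = site_frequencies(toy)
print(f1[5])
print(connected_correlation(f1, f2, 5, 8, "A", "T"))
```

In a maximum entropy setting these frequencies would act as the constraints the fitted distribution has to match; how the regularization selects a minimal set of couplings is described in the article itself and is not reproduced here.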
Subject
splicing
donor sequences
regularized maximum entropy model
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/228396

Subject classification (FORD)
https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
Publication date
2023-10
Publisher
Public Library of Science
Citation
Beckel, Maximiliano Sebastián; Kaufman, Bruno; Yanovsky, Marcelo Javier; Chernomoretz, Ariel; Conserved and divergent signals in 5’ splice site sequences across fungi, metazoa and plants; Public Library of Science; PLOS Computational Biology; 19; 10; 10-2023; 1-18
ISSN
1553-734X
DOI
10.1371/journal.pcbi.1011540
Handle
http://hdl.handle.net/11336/228396
Format
application/pdf