Conserved and divergent signals in 5’ splice site sequences across fungi, metazoa and plants
- Authors
- Beckel, Maximiliano Sebastián; Kaufman, Bruno; Yanovsky, Marcelo Javier; Chernomoretz, Ariel
- Publication year
- 2023
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- In eukaryotic organisms, the ensemble of 5’ splice site sequences reflects the balance between natural nucleotide variability and the minimal molecular constraints necessary to ensure splicing fidelity. This compromise shapes the underlying statistical patterns in the composition of donor splice site sequences. The scope of this study was to mine conserved and divergent signals in the composition of 5’ splice site sequences. Because 5’ donor sequences are a major cue for proper recognition of splice sites, we reasoned that statistical regularities in their composition could reflect the biological functionality and evolutionary history associated with splicing mechanisms. Results: We considered a regularized maximum entropy modeling framework to mine for non-trivial two-site correlations in donor sequence datasets corresponding to 30 different eukaryotes. For each analyzed species, we identified minimal sets of two-site coupling patterns that were able to replicate, at a given regularization level, the observed one-site and two-site frequencies in donor sequences. By performing a systematic and comparative analysis of 5’ splice sites, we showed that lineage information could be traced from joint di-nucleotide probabilities. We were able to identify characteristic two-site coupling patterns for plants and animals, and propose that they may echo differences in splicing regulation previously reported between these groups.
Affiliation: Beckel, Maximiliano Sebastián. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Kaufman, Bruno. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Yanovsky, Marcelo Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
Affiliation: Chernomoretz, Ariel. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Física del Plasma. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Física del Plasma; Argentina
- Subject
- splicing; donor sequences; regularized maximum entropy model
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- CONICET Digital (CONICET)
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/228396
Full record metadata
| id | CONICETDig_481e0b7ffa112daa90d894c023ea9bd0 |
|---|---|
| oai_identifier_str | oai:ri.conicet.gov.ar:11336/228396 |
| network_acronym_str | CONICETDig |
| repository_id_str | 3498 |
| network_name_str | CONICET Digital (CONICET) |
| dc.title.none.fl_str_mv | Conserved and divergent signals in 5’ splice site sequences across fungi, metazoa and plants |
| dc.creator.none.fl_str_mv | Beckel, Maximiliano Sebastián; Kaufman, Bruno; Yanovsky, Marcelo Javier; Chernomoretz, Ariel |
| author | Beckel, Maximiliano Sebastián |
| author_facet | Beckel, Maximiliano Sebastián; Kaufman, Bruno; Yanovsky, Marcelo Javier; Chernomoretz, Ariel |
| author_role | author |
| author2 | Kaufman, Bruno; Yanovsky, Marcelo Javier; Chernomoretz, Ariel |
| author2_role | author; author; author |
| dc.subject.none.fl_str_mv | splicing; donor sequences; regularized maximum entropy model |
| topic | splicing; donor sequences; regularized maximum entropy model |
| purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.2; https://purl.org/becyt/ford/1 |
| publishDate | 2023 |
| dc.date.none.fl_str_mv | 2023-10 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
| format | article |
| status_str | publishedVersion |
| dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/228396 Beckel, Maximiliano Sebastián; Kaufman, Bruno; Yanovsky, Marcelo Javier; Chernomoretz, Ariel; Conserved and divergent signals in 5’ splice site sequences across fungi, metazoa and plants; Public Library of Science; Plos Computational Biology; 19; 10; 10-2023; 1-18 1553-734X CONICET Digital CONICET |
| url | http://hdl.handle.net/11336/228396 |
| identifier_str_mv | Beckel, Maximiliano Sebastián; Kaufman, Bruno; Yanovsky, Marcelo Javier; Chernomoretz, Ariel; Conserved and divergent signals in 5’ splice site sequences across fungi, metazoa and plants; Public Library of Science; Plos Computational Biology; 19; 10; 10-2023; 1-18 1553-734X CONICET Digital CONICET |
| dc.language.none.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/doi/10.1371/journal.pcbi.1011540 |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
| eu_rights_str_mv | openAccess |
| rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
| dc.format.none.fl_str_mv | application/pdf application/pdf |
| dc.publisher.none.fl_str_mv | Public Library of Science |
| publisher.none.fl_str_mv | Public Library of Science |
| dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
| reponame_str | CONICET Digital (CONICET) |
| collection | CONICET Digital (CONICET) |
| instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |