Pipeline for transferring annotations between proteins beyond globular domains
- Authors
- Martinez Perez, Elizabeth; Pajkos, Mátyás; Tosatto, Silvio C. E.; Gibson, Toby James; Dosztanyi, Zsuzsanna; Marino, Cristina Ester
- Publication year
- 2023
- Language
- English
- Resource type
- article
- Status
- published version
- Description
Background: DisProt is the primary repository of intrinsically disordered proteins (IDPs). The database is manually curated, and its annotations have strong experimental support. DisProt currently contains a relatively small number of proteins, which highlights the importance of transferring annotations of verified disorder state and the corresponding functions to homologous proteins in other species, providing them with valuable information for understanding their biological roles. While the principles and practicalities of homology transfer are well established for globular proteins, they are largely lacking for disordered proteins. Methods: We used DisProt to evaluate the transferability of annotation terms to orthologous proteins. For each protein, we searched for its orthologs, on the assumption that they have a similar function. Then, for each protein and its orthologs, we built multiple sequence alignments (MSAs). Disordered sequences evolve fast and can be hard to align; we therefore implemented alignment quality control steps to ensure robust alignments before mapping the annotations. Results: We designed a pipeline to obtain good-quality MSAs and to transfer annotations from any protein to its orthologs. Applying the pipeline to DisProt proteins, from the 1,731 entries with 5,623 annotations we reach 97,555 orthologs and transfer a total of 301,190 terms by homology. We also provide a web server for consulting the results for DisProt proteins and for executing the pipeline on any other protein. The server, Homology Transfer IDP (HoTIDP), is accessible at http://hotidp.leloir.org.ar.
Affiliation: Martinez Perez, Elizabeth. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina. Fundación Instituto Leloir; Argentina
Affiliation: Pajkos, Mátyás. Eötvös University; Hungary
Affiliation: Tosatto, Silvio C. E. Università di Padova; Italy
Affiliation: Gibson, Toby James. European Molecular Biology Laboratory Heidelberg; Germany
Affiliation: Dosztanyi, Zsuzsanna. Eötvös University; Hungary
Affiliation: Marino, Cristina Ester. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina. Fundación Instituto Leloir; Argentina
- Subject
- ANNOTATION; DISPROT; HOMOLOGY TRANSFER; INTRINSICALLY DISORDERED PROTEINS; MULTIPLE SEQUENCE ALIGNMENT; ONTOLOGY TERMS; ORTHOLOGOUS PROTEINS
- Accessibility level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/228717
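The mapping step summarized in the Description (transferring an annotated region to an ortholog through an alignment, subject to a quality check) could look roughly like the following Python sketch. It is not the HoTIDP implementation: the function name `transfer_region`, the coverage criterion, and the 0.8 threshold are assumptions made for illustration.

```python
from typing import Optional, Tuple

GAP = "-"


def transfer_region(ref_row: str, ortho_row: str, start: int, end: int,
                    min_coverage: float = 0.8) -> Optional[Tuple[int, int]]:
    """Map the 1-based inclusive region [start, end] of the reference
    sequence onto an ortholog through two rows of the same MSA.

    Returns the 1-based inclusive ortholog region, or None when too few
    reference residues of the region are aligned to ortholog residues.
    (Hypothetical helper; the coverage rule is an assumed quality check.)
    """
    if len(ref_row) != len(ortho_row):
        raise ValueError("rows must come from the same alignment")
    if start < 1 or end < start:
        raise ValueError("invalid region")

    ref_pos = 0    # reference residues seen so far
    ortho_pos = 0  # ortholog residues seen so far
    mapped = []    # ortholog positions aligned to residues of the region

    for ref_char, ortho_char in zip(ref_row, ortho_row):
        if ref_char != GAP:
            ref_pos += 1
        if ortho_char != GAP:
            ortho_pos += 1
        in_region = ref_char != GAP and start <= ref_pos <= end
        if in_region and ortho_char != GAP:
            mapped.append(ortho_pos)

    if ref_pos < end:
        raise ValueError("region extends past the reference sequence")

    # Fraction of region residues that have an aligned ortholog residue.
    coverage = len(mapped) / (end - start + 1)
    if not mapped or coverage < min_coverage:
        return None
    return mapped[0], mapped[-1]


# Toy alignment rows (made-up sequences, not DisProt data).
print(transfer_region("MKT-AEL", "MRTSA-L", 2, 4))  # (2, 5)
print(transfer_region("MKT-AEL", "MRTSA-L", 4, 5))  # None
```

In the second call only one of the two annotated residues is aligned to an ortholog residue, so the region is rejected rather than transferred; a real pipeline would apply its own, likely stricter, alignment quality criteria.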
View the full record metadata
id | CONICETDig_983df3ce7df1d2c6cd1d2934144e1b4f |
---|---|
oai_identifier_str | oai:ri.conicet.gov.ar:11336/228717 |
network_acronym_str | CONICETDig |
repository_id_str | 3498 |
network_name_str | CONICET Digital (CONICET) |
dc.title.none.fl_str_mv | Pipeline for transferring annotations between proteins beyond globular domains |
title | Pipeline for transferring annotations between proteins beyond globular domains |
dc.creator.none.fl_str_mv | Martinez Perez, Elizabeth; Pajkos, Mátyás; Tosatto, Silvio C. E.; Gibson, Toby James; Dosztanyi, Zsuzsanna; Marino, Cristina Ester |
author | Martinez Perez, Elizabeth |
author_facet | Martinez Perez, Elizabeth; Pajkos, Mátyás; Tosatto, Silvio C. E.; Gibson, Toby James; Dosztanyi, Zsuzsanna; Marino, Cristina Ester |
author_role | author |
author2 | Pajkos, Mátyás; Tosatto, Silvio C. E.; Gibson, Toby James; Dosztanyi, Zsuzsanna; Marino, Cristina Ester |
author2_role | author; author; author; author; author |
dc.subject.none.fl_str_mv | ANNOTATION; DISPROT; HOMOLOGY TRANSFER; INTRINSICALLY DISORDERED PROTEINS; MULTIPLE SEQUENCE ALIGNMENT; ONTOLOGY TERMS; ORTHOLOGOUS PROTEINS |
topic | ANNOTATION; DISPROT; HOMOLOGY TRANSFER; INTRINSICALLY DISORDERED PROTEINS; MULTIPLE SEQUENCE ALIGNMENT; ONTOLOGY TERMS; ORTHOLOGOUS PROTEINS |
purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.7; https://purl.org/becyt/ford/1 |
publishDate | 2023 |
dc.date.none.fl_str_mv | 2023-05 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
format | article |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/228717 Martinez Perez, Elizabeth; Pajkos, Mátyás; Tosatto, Silvio C. E.; Gibson, Toby James; Dosztanyi, Zsuzsanna; et al.; Pipeline for transferring annotations between proteins beyond globular domains; John Wiley & Sons; Protein Science; 32; 7; 5-2023; 1-21 0961-8368 CONICET Digital CONICET |
url | http://hdl.handle.net/11336/228717 |
identifier_str_mv | Martinez Perez, Elizabeth; Pajkos, Mátyás; Tosatto, Silvio C. E.; Gibson, Toby James; Dosztanyi, Zsuzsanna; et al.; Pipeline for transferring annotations between proteins beyond globular domains; John Wiley & Sons; Protein Science; 32; 7; 5-2023; 1-21 0961-8368 CONICET Digital CONICET |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/https://onlinelibrary.wiley.com/doi/10.1002/pro.4655; info:eu-repo/semantics/altIdentifier/doi/10.1002/pro.4655 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv | application/pdf; application/pdf |
dc.publisher.none.fl_str_mv | John Wiley & Sons |
publisher.none.fl_str_mv | John Wiley & Sons |
dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET); instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str | CONICET Digital (CONICET) |
collection | CONICET Digital (CONICET) |
instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |