Hierarchical deep learning for predicting GO annotations by integrating protein knowledge
- Authors
- Merino, Gabriela Alejandra; Saidi, Rabie; Milone, Diego Humberto; Stegmayer, Georgina; Martin, Maria J.
- Publication year
- 2022
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Motivation: Experimental testing and manual curation are the most precise ways for assigning Gene Ontology (GO) terms describing protein functions. However, they are expensive, time-consuming and cannot cope with the exponential growth of data generated by high-throughput sequencing methods. Hence, researchers need reliable computational systems to help fill the gap with automatic function prediction. The results of the last Critical Assessment of Function Annotation challenge revealed that GO-terms prediction remains a very challenging task. Recent developments on deep learning are significantly breaking out the frontiers leading to new knowledge in protein research thanks to the integration of data from multiple sources. However, deep models hitherto developed for functional prediction are mainly focused on sequence data and have not achieved breakthrough performances yet. Results: We propose DeeProtGO, a novel deep-learning model for predicting GO annotations by integrating protein knowledge. DeeProtGO was trained for solving 18 different prediction problems, defined by the three GO sub-ontologies, the type of proteins, and the taxonomic kingdom. Our experiments reported higher prediction quality when more protein knowledge is integrated. We also benchmarked DeeProtGO against state-of-the-art methods on public datasets, and showed it can effectively improve the prediction of GO annotations.
Affiliation: Merino, Gabriela Alejandra. European Molecular Biology Laboratory. European Bioinformatics Institute; United Kingdom. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
Affiliation: Saidi, Rabie. European Molecular Biology Laboratory. European Bioinformatics Institute; United Kingdom
Affiliation: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
Affiliation: Stegmayer, Georgina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
Affiliation: Martin, Maria J. European Molecular Biology Laboratory. European Bioinformatics Institute; United Kingdom
- Subject
-
AUTOMATIC FUNCTION PREDICTION
PROTEIN ANNOTATION
DEEP LEARNING
KNOWLEDGE INTEGRATION
GO TERMS PREDICTION
- Accessibility level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/210988
id | CONICETDig_d8217e66ef8bd74ac5b6c232cb774621 |
---|---|
oai_identifier_str | oai:ri.conicet.gov.ar:11336/210988 |
network_acronym_str | CONICETDig |
repository_id_str | 3498 |
network_name_str | CONICET Digital (CONICET) |
dc.title.none.fl_str_mv | Hierarchical deep learning for predicting GO annotations by integrating protein knowledge |
dc.creator.none.fl_str_mv | Merino, Gabriela Alejandra; Saidi, Rabie; Milone, Diego Humberto; Stegmayer, Georgina; Martin, Maria J. |
author | Merino, Gabriela Alejandra |
author_facet | Merino, Gabriela Alejandra; Saidi, Rabie; Milone, Diego Humberto; Stegmayer, Georgina; Martin, Maria J. |
author_role | author |
author2 | Saidi, Rabie; Milone, Diego Humberto; Stegmayer, Georgina; Martin, Maria J. |
author2_role | author; author; author; author |
dc.subject.none.fl_str_mv | AUTOMATIC FUNCTION PREDICTION; PROTEIN ANNOTATION; DEEP LEARNING; KNOWLEDGE INTEGRATION; GO TERMS PREDICTION |
topic | AUTOMATIC FUNCTION PREDICTION; PROTEIN ANNOTATION; DEEP LEARNING; KNOWLEDGE INTEGRATION; GO TERMS PREDICTION |
purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.2; https://purl.org/becyt/ford/1 |
publishDate | 2022 |
dc.date.none.fl_str_mv | 2022-08 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
format | article |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/210988 Merino, Gabriela Alejandra; Saidi, Rabie; Milone, Diego Humberto; Stegmayer, Georgina; Martin, Maria J.; Hierarchical deep learning for predicting GO annotations by integrating protein knowledge; Oxford University Press; Bioinformatics (Oxford, England); 38; 19; 8-2022; 4488-4496 1367-4803 CONICET Digital CONICET |
url | http://hdl.handle.net/11336/210988 |
identifier_str_mv | Merino, Gabriela Alejandra; Saidi, Rabie; Milone, Diego Humberto; Stegmayer, Georgina; Martin, Maria J.; Hierarchical deep learning for predicting GO annotations by integrating protein knowledge; Oxford University Press; Bioinformatics (Oxford, England); 38; 19; 8-2022; 4488-4496 1367-4803 CONICET Digital CONICET |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/url/https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btac536/6656346; info:eu-repo/semantics/altIdentifier/doi/10.1093/bioinformatics/btac536 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv | application/pdf; application/pdf; application/pdf; application/pdf |
dc.publisher.none.fl_str_mv | Oxford University Press |
publisher.none.fl_str_mv | Oxford University Press |
dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET); instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str | CONICET Digital (CONICET) |
collection | CONICET Digital (CONICET) |
instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |