An introduction to deep learning on biological sequence data: Examples and solutions
- Authors
- Jurtz, Vanessa Isabell; Johansen, Alexander Rosenberg; Nielsen, Morten; Almagro Armenteros, Jose Juan; Nielsen, Henrik; Sønderby, Casper Kaae; Winther, Ole; Sønderby, Søren Kaae
- Publication year
- 2017
- Language
- English
- Resource type
- article
- Status
- published version
- Description
Motivation: Deep neural network architectures such as convolutional and long short-term memory networks have become increasingly popular machine learning tools in recent years. This development is driven by greater computational resources, more data, new algorithms for training deep models, and easy-to-use libraries for implementing and training neural networks. Deep learning has been especially successful in image recognition, and the development of tools, applications and code examples is in most cases centered on that field rather than on biology. Results: Here, we aim to further the development of deep learning methods within biology by providing application examples and code templates that are ready to apply and adapt. Using these examples, we illustrate how architectures consisting of convolutional and long short-term memory neural networks can be designed and trained relatively easily to reach state-of-the-art performance on three biological sequence problems: prediction of subcellular localization, prediction of protein secondary structure, and prediction of peptide binding to MHC Class II molecules. Availability and implementation: All implementations and datasets are available online to the scientific community at https://github.com/vanessajurtz/lasagne4bio. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation: Jurtz, Vanessa Isabell. Technical University of Denmark; Denmark
Affiliation: Johansen, Alexander Rosenberg. Technical University of Denmark; Denmark
Affiliation: Nielsen, Morten. Technical University of Denmark; Denmark. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús). Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. Instituto de Investigaciones Biotecnológicas "Dr. Raúl Alfonsín" (sede Chascomús); Argentina
Affiliation: Almagro Armenteros, Jose Juan. Technical University of Denmark; Denmark
Affiliation: Nielsen, Henrik. Technical University of Denmark; Denmark
Affiliation: Sønderby, Casper Kaae. University of Copenhagen; Denmark
Affiliation: Winther, Ole. University of Copenhagen; Denmark
Affiliation: Sønderby, Søren Kaae. University of Copenhagen; Denmark
- Subject
- Machine Learning; Biology; Sequence
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/66355
Full record metadata
id | CONICETDig_cac38300eac75f56bb2cbaed14414efe |
---|---|
oai_identifier_str | oai:ri.conicet.gov.ar:11336/66355 |
network_acronym_str | CONICETDig |
repository_id_str | 3498 |
network_name_str | CONICET Digital (CONICET) |
dc.title.none.fl_str_mv | An introduction to deep learning on biological sequence data: Examples and solutions |
title | An introduction to deep learning on biological sequence data: Examples and solutions |
dc.creator.none.fl_str_mv | Jurtz, Vanessa Isabell; Johansen, Alexander Rosenberg; Nielsen, Morten; Almagro Armenteros, Jose Juan; Nielsen, Henrik; Sønderby, Casper Kaae; Winther, Ole; Sønderby, Søren Kaae |
author | Jurtz, Vanessa Isabell |
author_facet | Jurtz, Vanessa Isabell; Johansen, Alexander Rosenberg; Nielsen, Morten; Almagro Armenteros, Jose Juan; Nielsen, Henrik; Sønderby, Casper Kaae; Winther, Ole; Sønderby, Søren Kaae |
author_role | author |
author2 | Johansen, Alexander Rosenberg; Nielsen, Morten; Almagro Armenteros, Jose Juan; Nielsen, Henrik; Sønderby, Casper Kaae; Winther, Ole; Sønderby, Søren Kaae |
author2_role | author; author; author; author; author; author; author |
dc.subject.none.fl_str_mv | Machine Learning; Biology; Sequence |
topic | Machine Learning; Biology; Sequence |
purl_subject.fl_str_mv | https://purl.org/becyt/ford/3.3 https://purl.org/becyt/ford/3 |
publishDate | 2017 |
dc.date.none.fl_str_mv | 2017-11 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format | article |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/66355 Jurtz, Vanessa Isabell; Johansen, Alexander Rosenberg; Nielsen, Morten; Almagro Armenteros, Jose Juan; Nielsen, Henrik; et al.; An introduction to deep learning on biological sequence data: Examples and solutions; Oxford University Press; Bioinformatics (Oxford, England); 33; 22; 11-2017; 3685-3690 1367-4803 1460-2059 CONICET Digital CONICET |
url | http://hdl.handle.net/11336/66355 |
identifier_str_mv | Jurtz, Vanessa Isabell; Johansen, Alexander Rosenberg; Nielsen, Morten; Almagro Armenteros, Jose Juan; Nielsen, Henrik; et al.; An introduction to deep learning on biological sequence data: Examples and solutions; Oxford University Press; Bioinformatics (Oxford, England); 33; 22; 11-2017; 3685-3690 1367-4803 1460-2059 CONICET Digital CONICET |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/doi/10.1093/bioinformatics/btx531 info:eu-repo/semantics/altIdentifier/url/https://academic.oup.com/bioinformatics/article-abstract/33/22/3685/4092933 info:eu-repo/semantics/altIdentifier/url/https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5870575/ |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv | application/pdf application/pdf application/pdf |
dc.publisher.none.fl_str_mv | Oxford University Press |
publisher.none.fl_str_mv | Oxford University Press |
dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str | CONICET Digital (CONICET) |
collection | CONICET Digital (CONICET) |
instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |