Improved pan-specific prediction of MHC class I peptide binding using a novel receptor clustering data partitioning strategy

Authors
Mattsson, Andreas Holm; Kringelum, J.V.; Garde, C.; Nielsen, Morten
Year of publication
2016
Language
English
Resource type
article
Status
published version
Description
Pan-specific prediction of receptor–ligand interaction is conventionally done using machine-learning methods that integrate information about both receptor and ligand primary sequences. To achieve optimal performance using machine learning, dealing with overfitting and data redundancy is critical. Most often, so-called ligand clustering methods have been used to deal with these issues in the context of pan-specific receptor–ligand predictions, and for the MHC system the approach has proven highly effective for extrapolating information from a limited set of receptors with well-characterized binding motifs to others with no or very limited experimental characterization. The success of this approach has, however, proven to depend strongly on the similarity of the query molecule to the molecules with characterized specificity used in the machine-learning process. Here, we outline an alternative strategy with the aim of altering this dependence and constructing data sets optimal for training pan-specific receptor–ligand predictors, focusing on receptor similarity rather than ligand similarity. We show that this receptor clustering method consistently performs better than the conventional ligand clustering method on alleles with remote similarity to the training set, in benchmarks covering affinity prediction, MHC ligand identification and MHC epitope identification.
Affiliation: Mattsson, Andreas Holm. Technical University of Denmark; Denmark. Evaxion Biotech; Denmark
Affiliation: Kringelum, J.V. Evaxion Biotech; Denmark
Affiliation: Garde, C. University of Copenhagen; Denmark
Affiliation: Nielsen, Morten. Technical University of Denmark; Denmark. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Biotecnológicas. Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas; Argentina
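The description contrasts partitioning training data by ligand similarity with partitioning by receptor similarity. This record does not describe the clustering procedure itself, so the Python sketch below is only an illustration under stated assumptions: receptors are represented by equal-length toy sequences, similarity is plain fraction identity, clustering is a simple greedy scheme, and clusters are assigned to folds round-robin. The function names, the 0.8 threshold, and the toy data are all hypothetical and are not taken from the article.

```python
from __future__ import annotations


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_receptors(receptors: dict[str, str], threshold: float) -> list[list[str]]:
    """Greedy clustering (an assumed scheme): a receptor joins the first
    cluster whose representative (its first member) is at least
    `threshold` identical; otherwise it starts a new cluster."""
    clusters: list[list[str]] = []
    for name, seq in receptors.items():
        for members in clusters:
            if identity(seq, receptors[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def partition_by_receptor(data, receptors, n_folds, threshold):
    """Place every data point of a receptor cluster in the same fold, so that
    similar receptors never sit on both sides of a train/test split."""
    clusters = cluster_receptors(receptors, threshold)
    fold_of = {}
    for i, members in enumerate(clusters):
        for name in members:
            fold_of[name] = i % n_folds  # round-robin assignment of clusters
    folds = [[] for _ in range(n_folds)]
    for receptor, peptide, value in data:
        folds[fold_of[receptor]].append((receptor, peptide, value))
    return folds


# Hypothetical toy receptors (not real MHC sequences) and binding records.
receptors = {
    "R1": "AAAAAAAA",
    "R2": "AAAAAAAC",
    "R3": "DDDDDDDD",
    "R4": "DDDDDDDE",
}
data = [
    ("R1", "PEPTIDEAA", 0.9),
    ("R2", "PEPTIDEAA", 0.8),
    ("R3", "PEPTIDEAA", 0.1),
    ("R4", "KLMNPQRST", 0.7),
]

clusters = cluster_receptors(receptors, threshold=0.8)
folds = partition_by_receptor(data, receptors, n_folds=2, threshold=0.8)
```

Note that the same peptide (`PEPTIDEAA`) appears in both folds here. A ligand-based partition would instead keep identical or similar peptides together and could split the closely related receptors `R1` and `R2` across folds, which is the situation the receptor-based strategy is meant to avoid.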
Subject
Artificial Neural Networks
Clustering
MHC Binding Specificity
MHC Class I
Peptide–MHC Binding
T-Cell Epitope
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/48877

purl_subject.fl_str_mv https://purl.org/becyt/ford/3.3
https://purl.org/becyt/ford/3
publishDate 2016
dc.date.none.fl_str_mv 2016-12
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/48877
Mattsson, Andreas Holm; Kringelum, J.V.; Garde, C.; Nielsen, Morten; Improved pan-specific prediction of MHC class I peptide binding using a novel receptor clustering data partitioning strategy; Wiley Blackwell Publishing, Inc; HLA; 88; 6; 12-2016; 287-292
2059-2310
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/doi/10.1111/tan.12911
info:eu-repo/semantics/altIdentifier/url/https://onlinelibrary.wiley.com/doi/abs/10.1111/tan.12911
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Wiley Blackwell Publishing, Inc