MetaPhinder - Identifying bacteriophage sequences in metagenomic data sets

Authors
Jurtz, Vanessa Isabell; Villarroel, Julia; Lund, Ole; Voldby Larsen, Mette; Nielsen, Morten
Publication year
2016
Language
English
Resource type
article
Status
published version
Description
Bacteriophages are the most abundant biological entities on the planet, but at the same time they do not account for much of the genetic material isolated from most environments because of their small genome sizes. They also show great genetic diversity and mosaic genomes, which makes them challenging to analyze and understand. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e., contigs) of phage origin in metagenomic data sets. The method is based on a comparison to a database of whole-genome bacteriophage sequences, integrating hits to multiple genomes to accommodate the mosaic genome structure of many bacteriophages. The method is demonstrated to outperform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology, https://cge.cbs.dtu.dk/services/MetaPhinder/, while the source code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder.
Affiliation: Jurtz, Vanessa Isabell. Technical University of Denmark; Denmark
Affiliation: Villarroel, Julia. Technical University of Denmark; Denmark
Affiliation: Lund, Ole. Technical University of Denmark; Denmark
Affiliation: Voldby Larsen, Mette. Technical University of Denmark; Denmark
Affiliation: Nielsen, Morten. Technical University of Denmark; Denmark. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
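The description above says that hits to multiple phage genomes are integrated rather than scored one genome at a time. The following is a minimal Python sketch of that general idea, not the authors' implementation: it assumes hits are already available as (genome, query start, query end) tuples with 1-based inclusive coordinates, and it measures only the fraction of a contig covered. The names `Hit`, `merge_intervals`, `merged_coverage`, and `best_single_genome_coverage` are hypothetical and chosen for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical hit record: (genome_id, query_start, query_end),
# coordinates on the contig, 1-based and inclusive.
Hit = Tuple[str, int, int]


def merge_intervals(intervals: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping or directly adjacent inclusive intervals."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def merged_coverage(hits: Iterable[Hit], contig_length: int) -> float:
    """Fraction of the contig covered by hits, ignoring which genome each hit came from."""
    # Normalize coordinates so start <= end (reverse-strand hits may be flipped).
    intervals = [(min(s, e), max(s, e)) for _, s, e in hits]
    covered = sum(end - start + 1 for start, end in merge_intervals(intervals))
    return covered / contig_length


def best_single_genome_coverage(hits: Iterable[Hit], contig_length: int) -> float:
    """Coverage obtained when only the best-covering single genome is considered."""
    by_genome: Dict[str, List[Hit]] = {}
    for hit in hits:
        by_genome.setdefault(hit[0], []).append(hit)
    return max(
        (merged_coverage(group, contig_length) for group in by_genome.values()),
        default=0.0,
    )


if __name__ == "__main__":
    # Made-up example: a 1000 bp contig whose parts match three different phages.
    example_hits = [("phageA", 1, 400), ("phageB", 350, 800), ("phageC", 900, 1000)]
    print(merged_coverage(example_hits, 1000))
    print(best_single_genome_coverage(example_hits, 1000))
```

In this made-up example the pooled hits cover about 90% of the contig, while the best single genome covers about 45%, which illustrates why pooling hits can help with mosaic genomes. How such a coverage value is combined with sequence identity or thresholded is not stated in this record and is left out of the sketch.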
Subject
Phages
Host finder
Machine learning
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/48629

Handle
http://hdl.handle.net/11336/48629
Citation
Jurtz, Vanessa Isabell; Villarroel, Julia; Lund, Ole; Voldby Larsen, Mette; Nielsen, Morten; MetaPhinder - Identifying bacteriophage sequences in metagenomic data sets; Public Library of Science; Plos One; 11; 9; 9-2016; 1-14; e0163111
ISSN
1932-6203
DOI
10.1371/journal.pone.0163111
Article URL
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0163111
Publisher
Public Library of Science