A Quantitative Profiling Tool for Diverse Genomic Data Types Reveals Potential Associations between Chromatin and Pre-mRNA Processing

Authors: Kremsky, Isaac; Bellora, Nicolás; Eyras, Eduardo
Publication year: 2015
Language: English
Resource type: article
Status: published version
Description: High-throughput sequencing data, and genome-based datasets in general, are often represented as profiles centered at reference points to study the association of protein binding and other signals with particular regulatory mechanisms. Although these profiles often provide compelling evidence of these associations, they do not provide a quantitative assessment of the enrichment, which makes comparison between signals and conditions difficult. In addition, a number of biases can confound profiles but are rarely accounted for in currently available tools. We present a novel computational method, ProfileSeq, for the quantitative assessment of biological profiles; it provides an exact, nonparametric test of whether specific regions of the test profile have higher or lower signal densities than a control set. The method is applicable to high-throughput sequencing data (ChIP-Seq, GRO-Seq, CLIP-Seq, etc.) and to genome-based datasets (motifs, etc.). We validate ProfileSeq by recovering, and quantitatively assessing, several previously reported results using independent datasets. We show that input signal and mappability have confounding effects on the profile results, but that normalizing the signal by input reads can eliminate these biases while preserving the biological signal. Moreover, we apply ProfileSeq to ChIP-Seq data for transcription factors, as well as to motif and CLIP-Seq data for splicing factors. In all examples considered, the profiles were robust to biases in mappability of sequencing reads. Furthermore, analyses performed with ProfileSeq reveal a number of putative relationships between transcription factor binding to DNA and splicing factor binding to pre-mRNA, adding to the growing body of evidence relating chromatin and pre-mRNA processing. ProfileSeq provides a robust way to quantify genome-wide coordinate-based signal. Software and documentation are freely available for academic use at https://bitbucket.org/regulatorygenomicsupf/profileseq/.
Affiliation: Kremsky, Isaac. Universitat Pompeu Fabra; Spain
Affiliation: Bellora, Nicolás. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Patagonia Norte. Instituto de Investigación En Biodiversidad y Medioambiente; Argentina. Universidad Nacional del Comahue. Centro Regional Universidad de Bariloche. Departamento de Biologia. Laboratorio de Microbiologia Aplicada y Biotecnologia; Argentina
Affiliation: Eyras, Eduardo. Institució Catalana de Recerca I Estudis Avancats; Spain
Subject: High-throughput sequencing
genomics
profiling
bioinformatics
Access level: open access
Terms of use: https://creativecommons.org/licenses/by/2.5/ar/
Repository
Institution: Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier: oai:ri.conicet.gov.ar:11336/12050

id	CONICETDig_44ef56b7db344e5f10625c7cccc0b724
oai_identifier_str	oai:ri.conicet.gov.ar:11336/12050
network_acronym_str	CONICETDig
repository_id_str	3498
network_name_str	CONICET Digital (CONICET)
dc.creator.none.fl_str_mv	Kremsky, Isaac Bellora, Nicolás Eyras, Eduardo
author	Kremsky, Isaac
author_facet	Kremsky, Isaac Bellora, Nicolás Eyras, Eduardo
author_role	author
author2	Bellora, Nicolás Eyras, Eduardo
author2_role	author author
dc.subject.none.fl_str_mv	High-throughput sequencing genomics profiling bioinformatics
topic	High-throughput sequencing genomics profiling bioinformatics
purl_subject.fl_str_mv	https://purl.org/becyt/ford/1.6 https://purl.org/becyt/ford/1
publishDate	2015
dc.date.none.fl_str_mv	2015-07
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://hdl.handle.net/11336/12050 Kremsky, Isaac; Bellora, Nicolás; Eyras, Eduardo; A Quantitative Profiling Tool for Diverse Genomic Data Types Reveals Potential Associations between Chromatin and Pre-mRNA Processing; Public Library Of Science; Plos One; 10; 7; 7-2015; 1-29 1932-6203
url	http://hdl.handle.net/11336/12050
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/url/http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0132448 info:eu-repo/semantics/altIdentifier/doi/10.1371/journal.pone.0132448
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by/2.5/ar/
eu_rights_str_mv	openAccess
rights_invalid_str_mv	https://creativecommons.org/licenses/by/2.5/ar/
dc.format.none.fl_str_mv	application/pdf application/pdf
dc.publisher.none.fl_str_mv	Public Library Of Science
publisher.none.fl_str_mv	Public Library Of Science
dc.source.none.fl_str_mv	reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str	CONICET Digital (CONICET)
collection	CONICET Digital (CONICET)
instname_str	Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv	CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv	dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar
