Discovering network relations in big time series with application to bioinformatics

Authors
Rubiolo, Mariano; Milone, Diego H.; Stegmayer, Georgina
Publication year
2015
Language
English
Resource type
conference paper
Status
published version
Description
Big Data concerns large-volume, complex and growing data sets drawn from multiple autonomous sources, and it is expanding rapidly across all science and engineering domains. Time series are an important class of big data that arise in many applications, such as medicine (electrocardiograms), environmental monitoring (daily temperature) and finance (weekly sales totals, and prices of mutual funds and stocks), as well as in areas such as social networks and biology. Bioinformatics seeks to provide tools and analyses that facilitate the understanding of living systems by analyzing and correlating biological information. In particular, as increasingly large amounts of gene information have become available in recent years, more efficient algorithms are required for dealing with such big data in genomics. There is growing interest in this field in discovering the network of regulations among a group of genes, known as a Gene Regulation Network (GRN), by analyzing gene expression profiles represented as time series. The previously proposed GRNNminer method discovers the underlying GRN among a group of genes by modeling the temporal dynamics of the gene expression profiles with artificial neural networks. However, it requires building and training a pool of neural models for each possible gene-to-gene relationship, which results in a very large set of experiments of order O(n^2), where n is the total number of genes involved. This work presents a proposal for dramatically reducing the number of experiments to O((n/k)^2) when reconstructing a GRN from big time series, by first clustering the gene profiles into k groups using self-organizing maps (SOM). In this way, GRNNminer can be applied to smaller sets of time series, namely only those that appear in the same cluster.
Sociedad Argentina de Informática e Investigación Operativa (SADIO)
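The reduction described in the abstract can be illustrated with a minimal counting sketch. It does not implement GRNNminer or a SOM: the cluster labels are a hypothetical stand-in for a SOM assignment, the gene names are invented, and treating each candidate relationship as an ordered (regulator, target) pair is an assumption made here for illustration.

```python
from collections import defaultdict
from itertools import permutations


def all_candidate_pairs(genes):
    """Every ordered (regulator, target) pair: n * (n - 1) candidates."""
    return list(permutations(genes, 2))


def within_cluster_pairs(genes, labels):
    """Ordered pairs restricted to genes that share a cluster label."""
    clusters = defaultdict(list)
    for gene, label in zip(genes, labels):
        clusters[label].append(gene)
    pairs = []
    for members in clusters.values():
        pairs.extend(permutations(members, 2))
    return pairs


# Hypothetical data: 12 genes assigned to k = 3 equal-sized clusters.
genes = [f"g{i}" for i in range(12)]
labels = [i % 3 for i in range(12)]  # stand-in for a SOM cluster assignment

full = all_candidate_pairs(genes)
reduced = within_cluster_pairs(genes, labels)
print(len(full), len(reduced))  # prints "132 36"
```

With equal-sized clusters, each cluster contributes on the order of (n/k)^2 candidate pairs, which matches the per-cluster bound quoted in the abstract; summed over the k clusters, the total is roughly n^2/k.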
Subject
Computer Science
big data
Genes
Neural nets
bioinformatics
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-sa/3.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/51977

id SEDICI_4743cc8b9bcf52060999d322bf4af3d9
oai_identifier_str oai:sedici.unlp.edu.ar:10915/51977
network_acronym_str SEDICI
repository_id_str 1329
network_name_str SEDICI (UNLP)
dc.title.none.fl_str_mv Discovering network relations in big time series with application to bioinformatics
dc.creator.none.fl_str_mv Rubiolo, Mariano
Milone, Diego H.
Stegmayer, Georgina
dc.subject.none.fl_str_mv Computer Science
big data
Genes
Neural nets
bioinformatics
publishDate 2015
dc.date.none.fl_str_mv 2015-09
dc.type.none.fl_str_mv info:eu-repo/semantics/conferenceObject
info:eu-repo/semantics/publishedVersion
Conference object
http://purl.org/coar/resource_type/c_5794
info:ar-repo/semantics/documentoDeConferencia
format conferenceObject
status_str publishedVersion
dc.identifier.none.fl_str_mv http://sedici.unlp.edu.ar/handle/10915/51977
url http://sedici.unlp.edu.ar/handle/10915/51977
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/http://44jaiio.sadio.org.ar/sites/default/files/agranda41-42.pdf
info:eu-repo/semantics/altIdentifier/issn/2451-7569
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by-sa/3.0/
Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
eu_rights_str_mv openAccess
rights_invalid_str_mv http://creativecommons.org/licenses/by-sa/3.0/
Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
dc.format.none.fl_str_mv application/pdf
41-42
dc.source.none.fl_str_mv reponame:SEDICI (UNLP)
instname:Universidad Nacional de La Plata
instacron:UNLP
reponame_str SEDICI (UNLP)
collection SEDICI (UNLP)
instname_str Universidad Nacional de La Plata
instacron_str UNLP
institution UNLP
repository.name.fl_str_mv SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv alira@sedici.unlp.edu.ar