Identification of biological properties in organisms using Machine Learning techniques on whole genome sequences

Authors: Ferella, Nicolas; Pizio, Pablo
Publication year: 2023
Language: Spanish
Resource type: conference document
Status: published version
Description: Advances in technology and in genome sequencing over recent decades have made large volumes of biological data available to researchers all over the world. Because of their scale, these data are difficult to analyze in their entirety, so it is natural to turn to Artificial Intelligence to work with them. To narrow the gap between researchers and Artificial Intelligence tools, software was developed that lets the user create a workspace for a biological organism, process its corresponding genomes, and create and train Machine Learning models, all through a simple (yet powerful) graphical interface. The trained models are then analyzed to find which patterns determine the outcome of the biological property under investigation, identifying in the process the genes with the greatest impact on the model's predictions. The researcher can then analyze the desired genes in the laboratory, saving time and resources.
Subject: Computer and Information Sciences
Artificial Intelligence
Genetics
Big Data
DNA
Machine Learning
Access level: open access
Terms of use: http://creativecommons.org/licenses/by-nc-sa/4.0/
Repository
Institution: Comisión de Investigaciones Científicas de la Provincia de Buenos Aires
OAI identifier: oai:digital.cic.gba.gob.ar:11746/12460

id	CICBA_2fb1b2b728ab3ccff35d707cd6a06b37
oai_identifier_str	oai:digital.cic.gba.gob.ar:11746/12460
network_acronym_str	CICBA
repository_id_str	9441
network_name_str	CIC Digital (CICBA)
dc.title.none.fl_str_mv	Identification of biological properties in organisms using Machine Learning techniques on whole genome sequences
title	Identification of biological properties in organisms using Machine Learning techniques on whole genome sequences
dc.creator.none.fl_str_mv	Ferella, Nicolas; Pizio, Pablo
author	Ferella, Nicolas
author_facet	Ferella, Nicolas; Pizio, Pablo
author_role	author
author2	Pizio, Pablo
author2_role	author
dc.subject.none.fl_str_mv	Computer and Information Sciences; Artificial Intelligence; Genetics; Big Data; DNA; Machine Learning
topic	Computer and Information Sciences; Artificial Intelligence; Genetics; Big Data; DNA; Machine Learning
publishDate	2023
dc.date.none.fl_str_mv	2023
dc.type.none.fl_str_mv	info:eu-repo/semantics/conferenceObject info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_5794 info:ar-repo/semantics/documentoDeConferencia
format	conferenceObject
status_str	publishedVersion
dc.identifier.none.fl_str_mv	https://digital.cic.gba.gob.ar/handle/11746/12460
url	https://digital.cic.gba.gob.ar/handle/11746/12460
dc.language.none.fl_str_mv	spa
language	spa
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/issn/2451-7496
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by-nc-sa/4.0/
eu_rights_str_mv	openAccess
rights_invalid_str_mv	http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.format.none.fl_str_mv	application/pdf
dc.source.none.fl_str_mv	reponame:CIC Digital (CICBA) instname:Comisión de Investigaciones Científicas de la Provincia de Buenos Aires instacron:CICBA
reponame_str	CIC Digital (CICBA)
collection	CIC Digital (CICBA)
instname_str	Comisión de Investigaciones Científicas de la Provincia de Buenos Aires
instacron_str	CICBA
institution	CICBA
repository.name.fl_str_mv	CIC Digital (CICBA) - Comisión de Investigaciones Científicas de la Provincia de Buenos Aires
repository.mail.fl_str_mv	marisa.degiusti@sedici.unlp.edu.ar
