TreeSpark: A Distributed Tool for Progeny Analysis based on Spark
- Authors
- López, Paula; Hasperué, Waldo; Quiroga, Facundo Manuel; Ronchetti, Franco
- Year of publication
- 2021
- Language
- English
- Resource type
- conference paper
- Status
- published version
- Description
- Progeny analyses are useful in the biological sciences for various purposes, such as improving individuals in new generations or carrying out molecular analysis of the transmission of genetic characteristics. Analyzing these data by comparing the individuals of a generation with their offspring is not a trivial task, and it becomes more complex as more generations are incorporated. In this article, we present TreeSpark, an open source tool that carries out progeny analysis and provides simple access to information about individuals and their relations, both as progenitors and as descendants. The tool is developed as a Python module that inherits the distributed processing features of Spark, allowing it to process large volumes of progeny information. Compared with other similar tools, TreeSpark is found to be much simpler to use.
Workshop: WBDMD - Base de Datos y Minería de Datos
Red de Universidades con Carreras en Informática
- Subject
- Computer Science; Spark; Big data; Progeny analysis; Genealogy; Analytics
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by-nc-sa/4.0/
- Repository
- SEDICI (UNLP)
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/130340
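The description above mentions access to each individual's relations both as progenitors and as descendants. The following is a minimal, single-machine sketch of that idea in plain Python. It is not TreeSpark's API and does not use Spark; the pedigree data, the individual IDs, and the function names `ancestors` and `descendants` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque

# Hypothetical pedigree: child -> (parent, parent). All IDs are invented.
PEDIGREE = {
    "C1": ("A1", "A2"),
    "C2": ("A1", "A3"),
    "D1": ("C1", "C2"),
    "D2": ("C1", "B9"),
}


def build_children_index(pedigree):
    """Invert the child -> parents mapping into parent -> children."""
    children = defaultdict(set)
    for child, parents in pedigree.items():
        for parent in parents:
            children[parent].add(child)
    return children


def _reachable(start_nodes, neighbours):
    """Breadth-first walk that collects every node reachable from start_nodes."""
    found, queue = set(), deque(start_nodes)
    while queue:
        current = queue.popleft()
        if current not in found:
            found.add(current)
            queue.extend(neighbours.get(current, ()))
    return found


def ancestors(pedigree, individual):
    """All progenitors of `individual`, across every recorded generation."""
    return _reachable(pedigree.get(individual, ()), pedigree)


def descendants(pedigree, individual):
    """All descendants of `individual`, across every recorded generation."""
    children = build_children_index(pedigree)
    return _reachable(children.get(individual, ()), children)


if __name__ == "__main__":
    print(sorted(ancestors(PEDIGREE, "D1")))
    print(sorted(descendants(PEDIGREE, "A1")))
```

Both lookups are the same graph traversal run in opposite directions, which is why a single helper serves for progenitors and descendants. A distributed tool would presumably partition this work across a cluster, but how TreeSpark does so is not stated in this record.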
Full record metadata
| id | SEDICI_ddd658f6baa4e93ac3861391555aa800 |
|---|---|
| oai_identifier_str | oai:sedici.unlp.edu.ar:10915/130340 |
| network_acronym_str | SEDICI |
| repository_id_str | 1329 |
| network_name_str | SEDICI (UNLP) |
| spelling | TreeSpark: A Distributed Tool for Progeny Analysis based on Spark; López, Paula; Hasperué, Waldo; Quiroga, Facundo Manuel; Ronchetti, Franco; Computer Science; Spark; Big data; Progeny analysis; Genealogy; Analytics; Progeny analyses are useful in the biological sciences for various purposes, such as improving individuals in new generations or carrying out molecular analysis of the transmission of genetic characteristics. Analyzing these data by comparing the individuals of a generation with their offspring is not a trivial task, and it becomes more complex as more generations are incorporated. In this article, we present TreeSpark, an open source tool that carries out progeny analysis and provides simple access to information about individuals and their relations, both as progenitors and as descendants. The tool is developed as a Python module that inherits the distributed processing features of Spark, allowing it to process large volumes of progeny information. Compared with other similar tools, TreeSpark is found to be much simpler to use.; Workshop: WBDMD - Base de Datos y Minería de Datos; Red de Universidades con Carreras en Informática; 2021-10; info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia; application/pdf; 251-260; http://sedici.unlp.edu.ar/handle/10915/130340; eng; info:eu-repo/semantics/altIdentifier/isbn/978-987-633-574-4; info:eu-repo/semantics/reference/hdl/10915/129809; info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP; 2025-11-12T10:57:02Z; oai:sedici.unlp.edu.ar:10915/130340; Institutional; http://sedici.unlp.edu.ar/; Public university; Not applicable; http://sedici.unlp.edu.ar/oai/snrd; alira@sedici.unlp.edu.ar; Argentina; Not applicable; Not applicable; Not applicable; opendoar:1329; 2025-11-12 10:57:02.977; SEDICI (UNLP) - Universidad Nacional de La Plata; false |
| dc.title.none.fl_str_mv | TreeSpark: A Distributed Tool for Progeny Analysis based on Spark |
| title | TreeSpark: A Distributed Tool for Progeny Analysis based on Spark |
| spellingShingle | TreeSpark: A Distributed Tool for Progeny Analysis based on Spark; López, Paula; Computer Science; Spark; Big data; Progeny analysis; Genealogy; Analytics |
| title_short | TreeSpark: A Distributed Tool for Progeny Analysis based on Spark |
| title_full | TreeSpark: A Distributed Tool for Progeny Analysis based on Spark |
| title_fullStr | TreeSpark: A Distributed Tool for Progeny Analysis based on Spark |
| title_full_unstemmed | TreeSpark: A Distributed Tool for Progeny Analysis based on Spark |
| title_sort | TreeSpark: A Distributed Tool for Progeny Analysis based on Spark |
| dc.creator.none.fl_str_mv | López, Paula; Hasperué, Waldo; Quiroga, Facundo Manuel; Ronchetti, Franco |
| author | López, Paula |
| author_facet | López, Paula; Hasperué, Waldo; Quiroga, Facundo Manuel; Ronchetti, Franco |
| author_role | author |
| author2 | Hasperué, Waldo; Quiroga, Facundo Manuel; Ronchetti, Franco |
| author2_role | author; author; author |
| dc.subject.none.fl_str_mv | Computer Science; Spark; Big data; Progeny analysis; Genealogy; Analytics |
| topic | Computer Science; Spark; Big data; Progeny analysis; Genealogy; Analytics |
| dc.description.none.fl_txt_mv | Progeny analyses are useful in the biological sciences for various purposes, such as improving individuals in new generations or carrying out molecular analysis of the transmission of genetic characteristics. Analyzing these data by comparing the individuals of a generation with their offspring is not a trivial task, and it becomes more complex as more generations are incorporated. In this article, we present TreeSpark, an open source tool that carries out progeny analysis and provides simple access to information about individuals and their relations, both as progenitors and as descendants. The tool is developed as a Python module that inherits the distributed processing features of Spark, allowing it to process large volumes of progeny information. Compared with other similar tools, TreeSpark is found to be much simpler to use. Workshop: WBDMD - Base de Datos y Minería de Datos. Red de Universidades con Carreras en Informática. |
| description | Progeny analyses are useful in the biological sciences for various purposes, such as improving individuals in new generations or carrying out molecular analysis of the transmission of genetic characteristics. Analyzing these data by comparing the individuals of a generation with their offspring is not a trivial task, and it becomes more complex as more generations are incorporated. In this article, we present TreeSpark, an open source tool that carries out progeny analysis and provides simple access to information about individuals and their relations, both as progenitors and as descendants. The tool is developed as a Python module that inherits the distributed processing features of Spark, allowing it to process large volumes of progeny information. Compared with other similar tools, TreeSpark is found to be much simpler to use. |
| publishDate | 2021 |
| dc.date.none.fl_str_mv | 2021-10 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/conferenceObject; info:eu-repo/semantics/publishedVersion; Conference object; http://purl.org/coar/resource_type/c_5794; info:ar-repo/semantics/documentoDeConferencia |
| format | conferenceObject |
| status_str | publishedVersion |
| dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/130340 |
| url | http://sedici.unlp.edu.ar/handle/10915/130340 |
| dc.language.none.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/isbn/978-987-633-574-4; info:eu-repo/semantics/reference/hdl/10915/129809 |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
| eu_rights_str_mv | openAccess |
| rights_invalid_str_mv | http://creativecommons.org/licenses/by-nc-sa/4.0/; Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
| dc.format.none.fl_str_mv | application/pdf; 251-260 |
| dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
| reponame_str | SEDICI (UNLP) |
| collection | SEDICI (UNLP) |
| instname_str | Universidad Nacional de La Plata |
| instacron_str | UNLP |
| institution | UNLP |
| repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
| repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |
| _version_ | 1848605678774517760 |
| score | 13.24909 |