Comparison of algorithms to infer genetic population structure from unlinked molecular markers
- Authors
- Peña Malavera, Andrea Natalia; Fernandez, Elmer Andres; Bruno, Cecilia Ines; Balzarini, Monica Graciela
- Publication year
- 2014
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Identifying population genetic structure (PGS) is crucial for breeding and conservation. Several clustering algorithms are available to identify the underlying PGS in genetic data of maize genotypes. In this work, six methods to identify PGS from unlinked molecular marker data were compared using simulated and experimental data consisting of multilocus-biallelic genotypes. Datasets were delineated under different biological scenarios characterized by three levels of genetic divergence among populations (low, medium, and high FST) and two numbers of sub-populations (K=3 and K=5). The relative performance of hierarchical and non-hierarchical clustering, as well as model-based clustering (STRUCTURE) and clustering from neural networks (SOM-RP-Q), was evaluated. The clustering error rate of genotypes into discrete sub-populations was used as the comparison criterion. In scenarios with a high level of divergence among genotype groups, all methods performed well. With a moderate level of genetic divergence (FST=0.2), the algorithms SOM-RP-Q and STRUCTURE performed better than hierarchical and non-hierarchical clustering. In all simulated scenarios with low genetic divergence and in the experimental SNP maize panel (largely unlinked), SOM-RP-Q achieved the lowest clustering error rate. The SOM algorithm used here is more effective than the other evaluated methods for sparse unlinked genetic data.
Affiliation: Peña Malavera, Andrea Natalia. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Córdoba. Facultad de Ciencias Agropecuarias; Argentina
Affiliation: Fernandez, Elmer Andres. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Córdoba. Facultad de Ciencias Agropecuarias; Argentina
Affiliation: Bruno, Cecilia Ines. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Catolica de Córdoba. Facultad de Ingeniería; Argentina
Affiliation: Balzarini, Monica Graciela. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Córdoba. Facultad de Ciencias Agropecuarias; Argentina
- Subject
- Cluster Analysis; Multilocus-Biallelic Genotypes; Plant Breeding; Self-Organizing Maps
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/34261
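The description names the clustering error rate of genotypes as the comparison criterion but does not define it in this record. A minimal sketch follows, assuming the error rate is the fraction of genotypes assigned to the wrong sub-population under the best one-to-one matching between inferred clusters and true sub-populations. The function name `clustering_error_rate` and the toy labels are hypothetical and are not taken from the article.

```python
from itertools import permutations


def clustering_error_rate(true_labels, predicted_labels):
    """Fraction of genotypes misassigned under the best one-to-one mapping
    of inferred clusters to true sub-populations (an assumed definition)."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("label sequences must not be empty")

    # Distinct labels in order of first appearance (no sorting needed).
    populations = list(dict.fromkeys(true_labels))
    clusters = list(dict.fromkeys(predicted_labels))

    # Pad with None so every cluster can be mapped to a distinct target,
    # even when more clusters were inferred than populations exist.
    extra = max(0, len(clusters) - len(populations))
    targets = populations + [None] * extra

    best_correct = 0
    # Exhaustive search is cheap for small K (e.g. K=3 or K=5).
    for assignment in permutations(targets, len(clusters)):
        mapping = dict(zip(clusters, assignment))
        correct = sum(
            mapping[pred] == true
            for true, pred in zip(true_labels, predicted_labels)
        )
        best_correct = max(best_correct, correct)

    return 1 - best_correct / len(true_labels)


if __name__ == "__main__":
    # Toy example: 9 genotypes from K=3 sub-populations, arbitrary cluster names.
    truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    inferred = ["b", "b", "a", "a", "a", "a", "c", "c", "c"]
    print(clustering_error_rate(truth, inferred))
```

Matching clusters to sub-populations before counting errors matters because clustering algorithms return arbitrary cluster names; without it, a perfect partition with permuted labels would be scored as entirely wrong.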
Full record metadata
id |
CONICETDig_597cd8e2f5fae6db6203b027713682d0 |
---|---|
oai_identifier_str |
oai:ri.conicet.gov.ar:11336/34261 |
network_acronym_str |
CONICETDig |
repository_id_str |
3498 |
network_name_str |
CONICET Digital (CONICET) |
dc.title.none.fl_str_mv |
Comparison of algorithms to infer genetic population structure from unlinked molecular markers |
dc.creator.none.fl_str_mv |
Peña Malavera, Andrea Natalia; Fernandez, Elmer Andres; Bruno, Cecilia Ines; Balzarini, Monica Graciela |
dc.subject.none.fl_str_mv |
Cluster Analysis; Multilocus-Biallelic Genotypes; Plant Breeding; Self-Organizing Maps |
purl_subject.fl_str_mv |
https://purl.org/becyt/ford/1.2 https://purl.org/becyt/ford/1 |
publishDate |
2014 |
dc.date.none.fl_str_mv |
2014-06 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format |
article |
status_str |
publishedVersion |
dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/34261 Peña Malavera, Andrea Natalia; Fernandez, Elmer Andres; Bruno, Cecilia Ines; Balzarini, Monica Graciela; Comparison of algorithms to infer genetic population structure from unlinked molecular markers; Berkeley Electronic Press; Statistical Applications In Genetics And Molecular Biology; 13; 4; 6-2014; 391-402 1544-6115 CONICET Digital CONICET |
dc.language.none.fl_str_mv |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/doi/10.1515/sagmb-2013-0006 info:eu-repo/semantics/altIdentifier/url/https://www.degruyter.com/view/j/sagmb.2014.13.issue-4/sagmb-2013-0006/sagmb-2013-0006.xml |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Berkeley Electronic Press |
dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |