To clean or not to clean phenotypic datasets for outlier plants in genetic analyses?
- Authors
- Alvarez Prado, Santiago; Sanchez, Isabelle; Cabrera Bosquet, Llorenç; Grau, Antonin; Welcker, Claude; Tardieu, François; Hilgert, Nadine
- Year of publication
- 2019
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Based on case studies, we discuss the extent to which genome-wide association studies (GWAS) are affected by outlier plants, i.e. those deviating from the expected distribution on a multi-criteria basis. Using a raw dataset consisting of daily measurements of leaf area, biomass, and plant height for thousands of plants, we tested three different cleaning methods for their effects on genetic analyses. No-cleaning resulted in the highest number of dubious quantitative trait loci, especially at loci with highly unbalanced allelic frequencies. A trade-off was identified between the risk of false-positives (with no-cleaning and/or a low threshold for minor allele frequency) and the risk of missing interesting rare alleles. Cleaning can lower the risk of the latter by making it possible to choose a higher threshold in GWAS.
Affiliation: Alvarez Prado, Santiago. Université Montpellier II; France. Institut National de la Recherche Agronomique; France. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad de Buenos Aires. Facultad de Agronomía. Departamento de Producción Vegetal. Cátedra de Cerealicultura; Argentina
Affiliation: Sanchez, Isabelle. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Cabrera Bosquet, Llorenç. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Grau, Antonin. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Welcker, Claude. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Tardieu, François. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Hilgert, Nadine. Institut National de la Recherche Agronomique; France. Université Montpellier II; France
- Subject
- ALLELE FREQUENCY; GENETIC ANALYSIS; OUTLIERS; PHENOMICS; QUANTITATIVE TRAIT LOCI; STATISTICAL ANALYSIS
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/160536
Full record metadata
id | CONICETDig_400d6131448523130dd2df50a2d3b86b |
---|---|
oai_identifier_str | oai:ri.conicet.gov.ar:11336/160536 |
network_acronym_str | CONICETDig |
repository_id_str | 3498 |
network_name_str | CONICET Digital (CONICET) |
purl_subject.fl_str_mv | https://purl.org/becyt/ford/4.1 https://purl.org/becyt/ford/4 |
publishDate | 2019 |
dc.date.none.fl_str_mv | 2019-04 |
dc.type.none.fl_str_mv | info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
format | article |
status_str | publishedVersion |
dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/160536 Alvarez Prado, Santiago; Sanchez, Isabelle; Cabrera Bosquet, Llorenç; Grau, Antonin; Welcker, Claude; et al.; To clean or not to clean phenotypic datasets for outlier plants in genetic analyses?; Oxford University Press; Journal of Experimental Botany; 70; 15; 4-2019; 3693-3698 0022-0957 CONICET Digital CONICET |
url | http://hdl.handle.net/11336/160536 |
identifier_str_mv | Alvarez Prado, Santiago; Sanchez, Isabelle; Cabrera Bosquet, Llorenç; Grau, Antonin; Welcker, Claude; et al.; To clean or not to clean phenotypic datasets for outlier plants in genetic analyses?; Oxford University Press; Journal of Experimental Botany; 70; 15; 4-2019; 3693-3698 0022-0957 CONICET Digital CONICET |
dc.language.none.fl_str_mv | eng |
language | eng |
dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/doi/10.1093/jxb/erz191 info:eu-repo/semantics/altIdentifier/url/https://academic.oup.com/jxb/article/70/15/3693/5479455 |
dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
eu_rights_str_mv | openAccess |
rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv | application/pdf application/pdf application/pdf |
dc.publisher.none.fl_str_mv | Oxford University Press |
publisher.none.fl_str_mv | Oxford University Press |
dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
reponame_str | CONICET Digital (CONICET) |
collection | CONICET Digital (CONICET) |
instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |
_version_ | 1844613059496116224 |
score | 13.070432 |