To clean or not to clean phenotypic datasets for outlier plants in genetic analyses?

Authors
Alvarez Prado, Santiago; Sanchez, Isabelle; Cabrera Bosquet, Llorenç; Grau, Antonin; Welcker, Claude; Tardieu, François; Hilgert, Nadine
Year of publication
2019
Language
English
Resource type
article
Status
published version
Description
Based on case studies, we discuss the extent to which genome-wide association studies (GWAS) are affected by outlier plants, i.e. those deviating from the expected distribution on a multi-criteria basis. Using a raw dataset consisting of daily measurements of leaf area, biomass, and plant height for thousands of plants, we tested three different cleaning methods for their effects on genetic analyses. No-cleaning resulted in the highest number of dubious quantitative trait loci, especially at loci with highly unbalanced allelic frequencies. A trade-off was identified between the risk of false-positives (with no-cleaning and/or a low threshold for minor allele frequency) and the risk of missing interesting rare alleles. Cleaning can lower the risk of the latter by making it possible to choose a higher threshold in GWAS.
Affiliation: Alvarez Prado, Santiago. Université Montpellier II; France. Institut National de la Recherche Agronomique; France. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad de Buenos Aires. Facultad de Agronomía. Departamento de Producción Vegetal. Cátedra de Cerealicultura; Argentina
Affiliation: Sanchez, Isabelle. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Cabrera Bosquet, Llorenç. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Grau, Antonin. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Welcker, Claude. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Tardieu, François. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Hilgert, Nadine. Institut National de la Recherche Agronomique; France. Université Montpellier II; France
Subject
ALLELE FREQUENCY
GENETIC ANALYSIS
OUTLIERS
PHENOMICS
QUANTITATIVE TRAIT LOCI
STATISTICAL ANALYSIS
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/160536

id CONICETDig_400d6131448523130dd2df50a2d3b86b
network_acronym_str CONICETDig
repository_id_str 3498
purl_subject.fl_str_mv https://purl.org/becyt/ford/4.1
https://purl.org/becyt/ford/4
dc.date.none.fl_str_mv 2019-04
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/160536
Alvarez Prado, Santiago; Sanchez, Isabelle; Cabrera Bosquet, Llorenç; Grau, Antonin; Welcker, Claude; et al.; To clean or not to clean phenotypic datasets for outlier plants in genetic analyses?; Oxford University Press; Journal of Experimental Botany; 70; 15; 4-2019; 3693-3698
0022-0957
CONICET Digital
CONICET
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/doi/10.1093/jxb/erz191
info:eu-repo/semantics/altIdentifier/url/https://academic.oup.com/jxb/article/70/15/3693/5479455
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Oxford University Press
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar