To clean or not to clean phenotypic datasets for outlier plants in genetic analyses?
- Authors
- Alvarez Prado, Santiago; Sanchez, Isabelle; Cabrera Bosquet, Llorenç; Grau, Antonin; Welcker, Claude; Tardieu, François; Hilgert, Nadine
- Publication year
- 2019
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Based on case studies, we discuss the extent to which genome-wide association studies (GWAS) are affected by outlier plants, i.e. those deviating from the expected distribution on a multi-criteria basis. Using a raw dataset consisting of daily measurements of leaf area, biomass, and plant height for thousands of plants, we tested three different cleaning methods for their effects on genetic analyses. No-cleaning resulted in the highest number of dubious quantitative trait loci, especially at loci with highly unbalanced allelic frequencies. A trade-off was identified between the risk of false-positives (with no-cleaning and/or a low threshold for minor allele frequency) and the risk of missing interesting rare alleles. Cleaning can lower the risk of the latter by making it possible to choose a higher threshold in GWAS.
Affiliation: Alvarez Prado, Santiago. Université Montpellier II; France. Institut National de la Recherche Agronomique; France. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad de Buenos Aires. Facultad de Agronomía. Departamento de Producción Vegetal. Cátedra de Cerealicultura; Argentina
Affiliation: Sanchez, Isabelle. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Cabrera Bosquet, Llorenç. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Grau, Antonin. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Welcker, Claude. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Tardieu, François. Université Montpellier II; France. Institut National de la Recherche Agronomique; France
Affiliation: Hilgert, Nadine. Institut National de la Recherche Agronomique; France. Université Montpellier II; France
- Subject
- ALLELE FREQUENCY; GENETIC ANALYSIS; OUTLIERS; PHENOMICS; QUANTITATIVE TRAIT LOCI; STATISTICAL ANALYSIS
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- CONICET Digital (CONICET)
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/160536
| id | CONICETDig_400d6131448523130dd2df50a2d3b86b |
|---|---|
| oai_identifier_str | oai:ri.conicet.gov.ar:11336/160536 |
| network_acronym_str | CONICETDig |
| repository_id_str | 3498 |
| network_name_str | CONICET Digital (CONICET) |
| dc.title.none.fl_str_mv | To clean or not to clean phenotypic datasets for outlier plants in genetic analyses? |
| title | To clean or not to clean phenotypic datasets for outlier plants in genetic analyses? |
| title_short | To clean or not to clean phenotypic datasets for outlier plants in genetic analyses? |
| title_full | To clean or not to clean phenotypic datasets for outlier plants in genetic analyses? |
| title_fullStr | To clean or not to clean phenotypic datasets for outlier plants in genetic analyses? |
| title_full_unstemmed | To clean or not to clean phenotypic datasets for outlier plants in genetic analyses? |
| title_sort | To clean or not to clean phenotypic datasets for outlier plants in genetic analyses? |
| dc.creator.none.fl_str_mv | Alvarez Prado, Santiago; Sanchez, Isabelle; Cabrera Bosquet, Llorenç; Grau, Antonin; Welcker, Claude; Tardieu, François; Hilgert, Nadine |
| author | Alvarez Prado, Santiago |
| author_facet | Alvarez Prado, Santiago; Sanchez, Isabelle; Cabrera Bosquet, Llorenç; Grau, Antonin; Welcker, Claude; Tardieu, François; Hilgert, Nadine |
| author_role | author |
| author2 | Sanchez, Isabelle; Cabrera Bosquet, Llorenç; Grau, Antonin; Welcker, Claude; Tardieu, François; Hilgert, Nadine |
| author2_role | author; author; author; author; author; author |
| dc.subject.none.fl_str_mv | ALLELE FREQUENCY; GENETIC ANALYSIS; OUTLIERS; PHENOMICS; QUANTITATIVE TRAIT LOCI; STATISTICAL ANALYSIS |
| topic | ALLELE FREQUENCY; GENETIC ANALYSIS; OUTLIERS; PHENOMICS; QUANTITATIVE TRAIT LOCI; STATISTICAL ANALYSIS |
| purl_subject.fl_str_mv | https://purl.org/becyt/ford/4.1; https://purl.org/becyt/ford/4 |
| publishDate | 2019 |
| dc.date.none.fl_str_mv | 2019-04 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
| format | article |
| status_str | publishedVersion |
| dc.identifier.none.fl_str_mv | http://hdl.handle.net/11336/160536 Alvarez Prado, Santiago; Sanchez, Isabelle; Cabrera Bosquet, Llorenç; Grau, Antonin; Welcker, Claude; et al.; To clean or not to clean phenotypic datasets for outlier plants in genetic analyses?; Oxford University Press; Journal of Experimental Botany; 70; 15; 4-2019; 3693-3698 0022-0957 CONICET Digital CONICET |
| url | http://hdl.handle.net/11336/160536 |
| identifier_str_mv | Alvarez Prado, Santiago; Sanchez, Isabelle; Cabrera Bosquet, Llorenç; Grau, Antonin; Welcker, Claude; et al.; To clean or not to clean phenotypic datasets for outlier plants in genetic analyses?; Oxford University Press; Journal of Experimental Botany; 70; 15; 4-2019; 3693-3698 0022-0957 CONICET Digital CONICET |
| dc.language.none.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/doi/10.1093/jxb/erz191; info:eu-repo/semantics/altIdentifier/url/https://academic.oup.com/jxb/article/70/15/3693/5479455 |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
| eu_rights_str_mv | openAccess |
| rights_invalid_str_mv | https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
| dc.format.none.fl_str_mv | application/pdf; application/pdf; application/pdf |
| dc.publisher.none.fl_str_mv | Oxford University Press |
| publisher.none.fl_str_mv | Oxford University Press |
| dc.source.none.fl_str_mv | reponame:CONICET Digital (CONICET); instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
| reponame_str | CONICET Digital (CONICET) |
| collection | CONICET Digital (CONICET) |
| instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |