Random forest in plant genetics and breeding: an application in tomato as a model crop

Authors
Faviere, G.; Vitelleschi, María Susana; Pratta, Guillermo Raúl
Publication year
2024
Language
English
Resource type
article
Status
published version
Description
Random Forest approaches have been used in phenotyping at both morphological and metabolic levels and in genomics studies, but direct applications in practical situations of plant genetics and breeding are scarce. Random Forest was compared with Discriminant Analysis for its ability to classify tomato individuals belonging to different breeding populations, based exclusively on phenotypic fruit quality traits. To account for different steps in breeding programs, two populations were assayed. One was composed of a set of recombinant inbred lines (RILs) derived from an interspecific tomato cross, and the other was composed of two of these RILs and the corresponding F1, F2 and backcross generations. Because tomato is an autogamous species, the first population was considered a final step in breeding programs, in which promising genotypes are evaluated for putative commercial release as new cultivars. The second population, in which new variation is being generated, was considered an initial step. Both Random Forest and Discriminant Analysis were able to classify populations with the aim of evaluating general variability and identifying the traits that contribute most to this variability. However, overall classification errors were lower for Random Forest. When the adequacy of classification was compared between populations, errors of both statistical analyses were greater in the second population than in the first, though Random Forest was more precise than Discriminant Analysis even in this initial step of plant breeding programs. Random Forest allowed breeders to obtain a reliable classification of tomato individuals belonging to different breeding populations.
Affiliation: Faviere, G. Universidad Nacional de Rosario. Facultad de Ciencias Económicas y Estadística. Escuela de Estadística; Argentina
Affiliation: Vitelleschi, María Susana. Universidad Nacional de Rosario. Facultad de Ciencias Económicas y Estadística. Escuela de Estadística; Argentina. Universidad Nacional de Rosario. Consejo de Investigaciones de la Universidad de Rosario; Argentina
Affiliation: Pratta, Guillermo Raúl. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Instituto de Investigaciones en Ciencias Agrarias de Rosario. Universidad Nacional de Rosario. Facultad de Ciencias Agrarias. Instituto de Investigaciones en Ciencias Agrarias de Rosario; Argentina
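The comparison summarized in the description can be sketched in code. The record does not state the software, parameters, or data layout the authors used, so everything below is an assumption made for illustration: scikit-learn as the library, linear discriminant analysis as the form of Discriminant Analysis, 5-fold cross-validation as the error estimate, and a synthetic data set standing in for the fruit quality traits. It shows only the general shape of such a comparison (overall classification error per method, then a trait ranking from the forest) and does not reproduce the article's results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for a phenotypic data set (NOT the tomato data from the
# article): 300 individuals, 8 numeric "traits", 3 hypothetical genotype classes.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_redundant=1,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Same folds for both methods so the error estimates are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "discriminant_analysis": LinearDiscriminantAnalysis(),
}

# Overall classification error = 1 - mean cross-validated accuracy.
errors = {}
for name, model in models.items():
    accuracy = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    errors[name] = 1.0 - accuracy.mean()
    print(f"{name}: overall error = {errors[name]:.3f}")

# Rank the traits by their contribution to the random forest classification
# (impurity-based importances; column indices stand in for trait names).
forest = models["random_forest"].fit(X, y)
ranking = np.argsort(forest.feature_importances_)[::-1]
print("traits ranked by importance:", ranking.tolist())
```

With real data, `X` would hold one row per individual and one column per measured trait, and `y` the genotype or generation label. Which method gives the lower error depends on the data, so the sketch makes no claim about the outcome.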
Subjects
Discriminant Analysis
Machine Learning
Parametric and non-parametric classification techniques
Phenotype identification
Traits categorization
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/262678

id CONICETDig_df6c9049c4ff2517cef382f9081d6e0d
oai_identifier_str oai:ri.conicet.gov.ar:11336/262678
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv Random forest in plant genetics and breeding: an application in tomato as a model crop
Random forest en genética y mejoramiento genético de plantas: una aplicación en tomate como cultivo modelo
dc.creator.none.fl_str_mv Faviere, G.
Vitelleschi, María Susana
Pratta, Guillermo Raúl
dc.subject.none.fl_str_mv Discriminant Analysis
Machine Learning
Parametric and non-parametric classification techniques
Phenotype identification
Traits categorization
purl_subject.fl_str_mv https://purl.org/becyt/ford/4.1
https://purl.org/becyt/ford/4
https://purl.org/becyt/ford/1.1
https://purl.org/becyt/ford/1
https://purl.org/becyt/ford/1.2
https://purl.org/becyt/ford/1
publishDate 2024
dc.date.none.fl_str_mv 2024-07
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/262678
Faviere, G.; Vitelleschi, María Susana; Pratta, Guillermo Raúl; Random forest in plant genetics and breeding: an application in tomato as a model crop; Sociedad Argentina de Genética; Basic and Applied Genetics; 35; 1; 7-2024; 39-51
1666-0390
1852-6233
CONICET Digital
CONICET
url http://hdl.handle.net/11336/262678
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/https://sag.org.ar/jbag/en/project/vol-xxxv-issue-1-2/
info:eu-repo/semantics/altIdentifier/doi/10.35407/bag.2024.35.01.03
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv application/pdf
application/pdf
dc.publisher.none.fl_str_mv Sociedad Argentina de Genética
publisher.none.fl_str_mv Sociedad Argentina de Genética
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
reponame_str CONICET Digital (CONICET)
collection CONICET Digital (CONICET)
instname_str Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar