A Machine Learning-Based Approach for the Discovery of Climate Smart Peaches

Authors
Aballay, Maximiliano Martín; Chirino, Julián Santiago; Valentini, Gabriel Hugo; Sánchez, Gerardo
Publication year
2022
Language
English
Resource type
conference document
Status
published version
Description
Poster
Temperate fruit trees are expected to be among the crops most affected by climate change. This is because these species alternate a dormancy period, governed by a chilling requirement, with a period of vegetative growth that is promoted by warm conditions. In addition, being plants of considerable size, they are cultivated in open fields exposed to weather conditions. Besides its economic importance, peach is a model for Rosaceae species because of its small, diploid and sequenced genome (2n=2x=16, 256 Mb). In this work we present a novel machine learning-based approach for forecasting traits from genomic and climate variables. A data set was obtained from the EEA San Pedro peach collection, composed of 237 accessions genotyped (13,584 genetic variants) with an in-house ddRAD-seq platform and phenotyped for flowering date (FD), harvest date (HD), fruit weight (FW) and soluble solids content (SSC) over 5 seasons (2017-2021). Climate variables (chilling hours, chilling units, global radiation, relative humidity, precipitation and daily mean temperature) obtained from an automatic weather station were added, resulting in a combined (genomic, phenotypic and climate) data set of 3.8 M data points. A Random Forest model was trained for each trait using 80% of the data, resulting in performance (R2) higher than 90%. Cross-validation indicated that the models forecast FD and HD with higher R2 (84% and 79%) than FW and SSC (50% and 41%). To test the ability of the models to forecast extreme seasons, the season with the lowest chill accumulation in our region in the last 65 years was predicted. The R2 values agreed with the cross-validation results, indicating that the models performed well for climate scenarios that were not used in training. Hypothetical future weather scenarios were simulated by decreasing the daily chill hours and chill units by 20%, 40% and 60%. The traits were forecast for each scenario, finding that when daily chill hours and chill units decrease, 16% of genotypes will flower earlier (2-20 days) and 31% will flower later (2-23 days). For harvest date, 28% of genotypes showed an earlier harvest (2-40 days) and 42% a later harvest (2-21 days). Furthermore, 114 genotypes were stable for flowering date and 58 genotypes were stable for harvest date under these conditions. Among these, a set of 35 genotypes that keep both traits stable at the same time was identified. The final model, trained with all seasons, allowed us to identify the importance of the variables (genetic and climatic) for making the predictions. Unexpectedly, some climate variables had a high level of importance in the trained model, suggesting novel phenomena. Here we present a proof of concept that encourages the use of machine learning tools for the identification and selection of peach genotypes able to face climate change. Currently, we need to collect more data to improve the performance of our predictions and to extend these machine learning tools to other traits.
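The scenario step described in the abstract (reduce the chill variables by 20%, 40% and 60%, re-predict, and sort genotypes into early, late and stable) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the feature names `chill_hours`, `chill_units` and `sensitivity`, the 2-day stability threshold, and the `toy_predict` function standing in for a trained Random Forest are all hypothetical.

```python
from typing import Callable, Dict, Sequence

# Hypothetical feature names; the poster does not give the real column names.
CHILL_FEATURES = ("chill_hours", "chill_units")

Features = Dict[str, float]
Predictor = Callable[[Features], float]


def reduce_chill(features: Features, reduction: float) -> Features:
    """Return a copy of the features with the chill variables scaled down."""
    scaled = dict(features)
    for name in CHILL_FEATURES:
        scaled[name] = features[name] * (1.0 - reduction)
    return scaled


def classify_shift(shift_days: float, threshold_days: float = 2.0) -> str:
    """Label a predicted date shift. The 2-day threshold is an assumption."""
    if shift_days <= -threshold_days:
        return "early"
    if shift_days >= threshold_days:
        return "late"
    return "stable"


def simulate(
    predict: Predictor,
    genotypes: Dict[str, Features],
    reductions: Sequence[float] = (0.2, 0.4, 0.6),
) -> Dict[str, Dict[float, str]]:
    """Compare each reduced-chill prediction with the baseline prediction."""
    results: Dict[str, Dict[float, str]] = {}
    for name, feats in genotypes.items():
        baseline = predict(feats)
        results[name] = {
            r: classify_shift(predict(reduce_chill(feats, r)) - baseline)
            for r in reductions
        }
    return results


def toy_predict(features: Features) -> float:
    """Stand-in for a trained model (NOT the authors' Random Forest).

    Returns a made-up flowering day of year that moves with lost chill
    in proportion to a per-genotype 'sensitivity' value.
    """
    lost_chill = 500.0 - features["chill_hours"]
    return 240.0 + features["sensitivity"] * lost_chill / 10.0


# Three made-up genotypes that differ only in their sensitivity to chill.
genotypes = {
    "A": {"chill_hours": 500.0, "chill_units": 800.0, "sensitivity": 1.0},
    "B": {"chill_hours": 500.0, "chill_units": 800.0, "sensitivity": -0.5},
    "C": {"chill_hours": 500.0, "chill_units": 800.0, "sensitivity": 0.05},
}

results = simulate(toy_predict, genotypes)

# Genotypes whose predicted date stays within the threshold in every scenario.
stable = [g for g, by_scenario in results.items()
          if all(label == "stable" for label in by_scenario.values())]
```

In a real analysis `toy_predict` would be replaced by the prediction call of the fitted model for one trait, and the intersection of the `stable` lists for flowering date and harvest date would give the genotypes that are stable for both.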
EEA San Pedro
Affiliation: Aballay, Maximiliano Martín. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria San Pedro; Argentina
Affiliation: Chirino, Julián Santiago. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria San Pedro; Argentina
Affiliation: Valentini, Gabriel Hugo. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria San Pedro; Argentina
Affiliation: Sánchez, Gerardo. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria San Pedro; Argentina
Source
PAG Asia 2022. Plant and Animal Genome Conference. June 22-24, 2022
Subject
Temperate Fruits
Peaches
Prunus persica
Climate Change
Forecasting
Machine Learning
Biotechnology
Plant Breeding
Electronic Learning
Accessibility level
restricted access
Repository
INTA Digital (INTA)
Institution
Instituto Nacional de Tecnología Agropecuaria
OAI identifier
oai:localhost:20.500.12123/12807

Identifier
http://hdl.handle.net/20.500.12123/12807
Format
application/pdf
Related project
INTA 2019-PE-E6-I114-001. Caracterización de la diversidad genética de plantas, animales y microorganismos mediante herramientas de genómica aplicada.
Record date
2022-09-07T12:19:17Z