Discovery of food identity markers by metabolomics and machine learning technology

Authors
Erban, Alexander; Fehrle, Ines; Martinez-Seidel, Federico; Brigante, Federico Iván; Lucini Mas, Agustín; Baroni, María Verónica; Wunderlin, Daniel Alberto; Kopka, Joachim
Publication year
2019
Language
English
Resource type
article
Status
published version
Description
Verification of food authenticity establishes consumer trust in food ingredients and in the components of processed food. Alongside genetic or protein markers, chemicals are unique identifiers of food components. Non-targeted metabolomics is ideally suited to screening for food markers when coupled to efficient data analysis. This study explored the feasibility of random forest (RF) machine learning, specifically its inherent feature extraction, for non-targeted metabolic marker discovery. The distinction of chia, linseed, and sesame, which have gained attention as “superfoods”, served as the test case. Chemical fractions of non-processed seeds and of wheat cookies with seed ingredients were profiled. RF technology classified the original seeds unambiguously but appeared overdesigned for material with unique secondary metabolites, such as sesamol in sesame or rosmarinic acid in chia, a member of the Lamiaceae. Most unique metabolites were diluted or lost during cookie production, but RF technology classified the presence of the seed ingredients in cookies with 6.7% overall error and revealed food processing markers, such as 4-hydroxybenzaldehyde for chia and succinic acid monomethylester for linseed additions. RF-based feature extraction was adequate for difficult classifications, but marker selection should not proceed without human supervision. Combination with alternative data analysis technologies is advised, as is further testing across a wide range of seeds and food processing methods.
Affiliation: Erban, Alexander. Max-Planck-Institute of Molecular Plant Physiology. Department of Molecular Physiology; Germany
Affiliation: Fehrle, Ines. Max-Planck-Institute of Molecular Plant Physiology. Department of Molecular Physiology; Germany
Affiliation: Martinez-Seidel, Federico. Max-Planck-Institute of Molecular Plant Physiology. Department of Molecular Physiology; Germany
Affiliation: Brigante, Federico Iván. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Ciencia y Tecnología de Alimentos Córdoba. Universidad Nacional de Córdoba. Facultad de Ciencias Químicas. Instituto de Ciencia y Tecnología de Alimentos Córdoba; Argentina
Affiliation: Lucini Mas, Agustín. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Ciencia y Tecnología de Alimentos Córdoba. Universidad Nacional de Córdoba. Facultad de Ciencias Químicas. Instituto de Ciencia y Tecnología de Alimentos Córdoba; Argentina
Affiliation: Baroni, María Verónica. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Ciencia y Tecnología de Alimentos Córdoba. Universidad Nacional de Córdoba. Facultad de Ciencias Químicas. Instituto de Ciencia y Tecnología de Alimentos Córdoba; Argentina
Affiliation: Wunderlin, Daniel Alberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Ciencia y Tecnología de Alimentos Córdoba. Universidad Nacional de Córdoba. Facultad de Ciencias Químicas. Instituto de Ciencia y Tecnología de Alimentos Córdoba; Argentina
Affiliation: Kopka, Joachim. Max-Planck-Institute of Molecular Plant Physiology. Department of Molecular Physiology; Germany
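Illustrative sketch
The description above refers to random forest classification and its built-in feature ranking. The following Python sketch illustrates that idea under loud assumptions: it is not the authors' software or data, it uses a simplified stand-in (bagged single-split trees, or stumps, with random feature subsets) rather than a full random forest, and the sample labels ("plain", "chia"), the synthetic intensities, and the function names `fit_stump` and `stump_forest` are all hypothetical. Feature 0 plays the role of a marker metabolite; the other four features are noise.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def fit_stump(X, y, features):
    """Best single split among `features`:
    returns (gain, feature, threshold, left_label, right_label)."""
    n, parent = len(y), gini(y)
    best = (0.0, None, None, None, None)
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if gain > best[0]:
                best = (gain, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    return best

def stump_forest(X, y, n_trees=200, mtry=2, seed=0):
    """Bagged stumps: returns (mean Gini gain per feature, out-of-bag error)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    importance = [0.0] * p
    votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feats = rng.sample(range(p), mtry)           # random feature subset
        gain, f, t, left_label, right_label = fit_stump(
            [X[i] for i in bag], [y[i] for i in bag], feats)
        if f is None:                                # no useful split found
            continue
        importance[f] += gain / n_trees
        for i in set(range(n)) - set(bag):           # out-of-bag samples vote
            votes[i][left_label if X[i][f] <= t else right_label] += 1
    scored = [i for i in range(n) if votes[i]]
    wrong = sum(votes[i].most_common(1)[0][0] != y[i] for i in scored)
    return importance, wrong / len(scored)

# Hypothetical data: 20 samples per class, 5 features, feature 0 is the marker.
data_rng = random.Random(1)
X, y = [], []
for label, marker_mean in (("plain", 0.0), ("chia", 6.0)):
    for _ in range(20):
        X.append([data_rng.gauss(marker_mean, 1.0)]
                 + [data_rng.gauss(0.0, 1.0) for _ in range(4)])
        y.append(label)

importance, oob_error = stump_forest(X, y)
ranked = sorted(range(len(importance)), key=lambda f: importance[f], reverse=True)
print("out-of-bag error:", round(oob_error, 3))
print("features ranked by importance:", ranked)
```

In this toy setting the marker feature should rank first because it is the only feature that separates the two classes. A ranking like this is only a candidate list, which is consistent with the description's point that marker selection should stay under human supervision.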
Subject
METABOLOMICS
GCMS
MACHINE LEARNING
SEEDS
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/125719

Subject classification
https://purl.org/becyt/ford/1.4
https://purl.org/becyt/ford/1
Publication date
2019-12
Publisher
Nature Publishing Group
Journal
Scientific Reports
ISSN
2045-2322
Citation
Erban, Alexander; Fehrle, Ines; Martinez-Seidel, Federico; Brigante, Federico Iván; Lucini Mas, Agustín; et al.; Discovery of food identity markers by metabolomics and machine learning technology; Nature Publishing Group; Scientific Reports; 9; 9697; 12-2019
Handle
http://hdl.handle.net/11336/125719
DOI
10.1038/s41598-019-46113-y
Publisher URL
https://www.nature.com/articles/s41598-019-46113-y
Format
application/pdf
Repository contact
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar