A Python-based pipeline for preprocessing LC–MS data for untargeted metabolomics workflows

Authors
Riquelme, Gabriel; Zabalegui, Nicolás; Marchi, Pablo Gabriel; Jones, Christina M.; Monge, Maria Eugenia
Publication year
2020
Language
English
Resource type
article
Status
published version
Description
Preprocessing data in a reproducible and robust way is one of the current challenges in untargeted metabolomics workflows. Data curation in liquid chromatography–mass spectrometry (LC–MS) involves the removal of biologically non-relevant features (retention time, m/z pairs) to retain only high-quality data for subsequent analysis and interpretation. The present work introduces TidyMS, a package for the Python programming language for preprocessing LC–MS data for quality control (QC) procedures in untargeted metabolomics workflows. It is a versatile strategy that can be customized or fit for purpose according to the specific metabolomics application. It allows performing quality control procedures to ensure accuracy and reliability in LC–MS measurements, and it allows preprocessing metabolomics data to obtain cleaned matrices for subsequent statistical analysis. The capabilities of the package are shown with pipelines for an LC–MS system suitability check, system conditioning, signal drift evaluation, and data curation. These applications were implemented to preprocess data corresponding to a new suite of candidate plasma reference materials developed by the National Institute of Standards and Technology (NIST; hypertriglyceridemic, diabetic, and African-American plasma pools) to be used in untargeted metabolomics studies in addition to NIST SRM 1950 Metabolites in Frozen Human Plasma. The package offers a rapid and reproducible workflow that can be used in an automated or semi-automated fashion, and it is an open and free tool available to all users.
Affiliation: Riquelme, Gabriel. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Centro de Investigaciones en Bionanociencias "Elizabeth Jares Erijman"; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Química Inorgánica, Analítica y Química Física; Argentina
Affiliation: Zabalegui, Nicolás. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Centro de Investigaciones en Bionanociencias "Elizabeth Jares Erijman"; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Química Inorgánica, Analítica y Química Física; Argentina
Affiliation: Marchi, Pablo Gabriel. Universidad de Buenos Aires. Facultad de Ingeniería; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Jones, Christina M. National Institute of Standards and Technology; United States
Affiliation: Monge, Maria Eugenia. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Centro de Investigaciones en Bionanociencias "Elizabeth Jares Erijman"; Argentina
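
The description above characterizes data curation as removing features (retention time, m/z pairs) so that only high-quality data remain. As a rough illustration of that idea, the sketch below drops features whose coefficient of variation (CV) across QC samples exceeds a threshold. The CV criterion, the 20 % cutoff, the function name `filter_features_by_qc_cv`, and the toy feature names are all assumptions made for this example. The code uses only the Python standard library and does not reproduce the TidyMS API.

```python
from statistics import mean, stdev


def filter_features_by_qc_cv(matrix, sample_class, qc_label="QC", max_cv=0.2):
    """Keep features whose CV across QC samples is at most max_cv.

    matrix: dict mapping feature name -> {sample name: intensity}
    sample_class: dict mapping sample name -> class label
    Hypothetical helper for illustration; not part of TidyMS.
    """
    qc_samples = [s for s, label in sample_class.items() if label == qc_label]
    kept = {}
    for feature, intensities in matrix.items():
        qc_values = [intensities[s] for s in qc_samples]
        # A sample standard deviation needs at least two QC injections.
        if len(qc_values) < 2:
            continue
        qc_mean = mean(qc_values)
        # Skip features with no usable QC signal to avoid dividing by zero.
        if qc_mean <= 0:
            continue
        if stdev(qc_values) / qc_mean <= max_cv:
            kept[feature] = intensities
    return kept


# Toy data: two features measured in three QC injections and two study samples.
sample_class = {
    "qc_1": "QC", "qc_2": "QC", "qc_3": "QC",
    "sample_1": "study", "sample_2": "study",
}
matrix = {
    # Stable in QC samples (CV = 0.02), so it is retained.
    "rt1.20_mz180.063": {
        "qc_1": 100.0, "qc_2": 102.0, "qc_3": 98.0,
        "sample_1": 150.0, "sample_2": 60.0,
    },
    # Unstable in QC samples (CV = 0.8), so it is removed.
    "rt3.45_mz255.232": {
        "qc_1": 10.0, "qc_2": 50.0, "qc_3": 90.0,
        "sample_1": 40.0, "sample_2": 45.0,
    },
}

curated = filter_features_by_qc_cv(matrix, sample_class)
print(sorted(curated))
```

Filtering on QC samples only is a deliberate choice in this sketch: repeated injections of the same pooled material should give near-identical intensities, so a large spread there points to analytical noise rather than biological variation.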
Subject
DATA CLEANING
DATA CURATION
PREPROCESSING
PYTHON
QUALITY CONTROL
REFERENCE MATERIALS
SIGNAL DRIFT
SYSTEM SUITABILITY
UNTARGETED METABOLOMICS
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/129950

id CONICETDig_4cd215ea61b9499b6ad032cb7516ae6e
network_acronym_str CONICETDig
repository_id_str 3498
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.4
https://purl.org/becyt/ford/1
dc.date.none.fl_str_mv 2020-10
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/129950
Riquelme, Gabriel; Zabalegui, Nicolás; Marchi, Pablo Gabriel; Jones, Christina M.; Monge, Maria Eugenia; A Python-based pipeline for preprocessing LC–MS data for untargeted metabolomics workflows; Molecular Diversity Preservation International; Metabolites; 10; 10; 10-2020; 1-14
2218-1989
CONICET Digital
CONICET
dc.language.none.fl_str_mv eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/doi/10.3390/metabo10100416
info:eu-repo/semantics/altIdentifier/url/https://www.mdpi.com/2218-1989/10/10/416
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
dc.format.none.fl_str_mv application/pdf
application/pdf
dc.publisher.none.fl_str_mv Molecular Diversity Preservation International
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar