A Python-Based Pipeline for Preprocessing LC–MS Data for Untargeted Metabolomics Workflows
- Authors
- Riquelme, Gabriel; Zabalegui, Nicolás; Marchi, Pablo Gabriel; Jones, Christina M.; Monge, Maria Eugenia
- Publication year
- 2020
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Preprocessing data in a reproducible and robust way is one of the current challenges in untargeted metabolomics workflows. Data curation in liquid chromatography–mass spectrometry (LC–MS) involves the removal of biologically non-relevant features (retention time, m/z pairs) to retain only high-quality data for subsequent analysis and interpretation. The present work introduces TidyMS, a package for the Python programming language for preprocessing LC–MS data for quality control (QC) procedures in untargeted metabolomics workflows. It is a versatile strategy that can be customized or fit for purpose according to the specific metabolomics application. It allows performing quality control procedures to ensure accuracy and reliability in LC–MS measurements, and it allows preprocessing metabolomics data to obtain cleaned matrices for subsequent statistical analysis. The capabilities of the package are shown with pipelines for an LC–MS system suitability check, system conditioning, signal drift evaluation, and data curation. These applications were implemented to preprocess data corresponding to a new suite of candidate plasma reference materials developed by the National Institute of Standards and Technology (NIST; hypertriglyceridemic, diabetic, and African-American plasma pools) to be used in untargeted metabolomics studies in addition to NIST SRM 1950 Metabolites in Frozen Human Plasma. The package offers a rapid and reproducible workflow that can be used in an automated or semi-automated fashion, and it is an open and free tool available to all users.
Affiliation: Riquelme, Gabriel. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Centro de Investigaciones en Bionanociencias "Elizabeth Jares Erijman"; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Química Inorgánica, Analítica y Química Física; Argentina
Affiliation: Zabalegui, Nicolás. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Centro de Investigaciones en Bionanociencias "Elizabeth Jares Erijman"; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Química Inorgánica, Analítica y Química Física; Argentina
Affiliation: Marchi, Pablo Gabriel. Universidad de Buenos Aires. Facultad de Ingeniería; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Jones, Christina M. National Institute of Standards and Technology; United States
Affiliation: Monge, Maria Eugenia. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Centro de Investigaciones en Bionanociencias "Elizabeth Jares Erijman"; Argentina
- Subjects
- DATA CLEANING; DATA CURATION; PREPROCESSING; PYTHON; QUALITY CONTROL; REFERENCE MATERIALS; SIGNAL DRIFT; SYSTEM SUITABILITY; UNTARGETED METABOLOMICS
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- CONICET Digital (CONICET)
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/129950
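The data curation step described in the abstract (removing biologically non-relevant features so that only high-quality data remain) can be illustrated with a minimal sketch. This is not the TidyMS API: it uses plain pandas, and the function name `curate_features`, the class labels `"QC"` and `"blank"`, and both thresholds are hypothetical choices made only for illustration. It assumes a data matrix with samples as rows and features as columns, and applies two filters commonly used in QC-based curation: the coefficient of variation across QC injections, and the ratio of study-sample signal to blank signal.

```python
import pandas as pd


def curate_features(data, sample_class, qc_label="QC", blank_label="blank",
                    max_qc_cv=0.2, min_sample_blank_ratio=5.0):
    """Keep only the features that pass two illustrative QC filters.

    data: DataFrame with samples as rows and features as columns.
    sample_class: Series aligned with data.index holding each sample's class.
    Labels and thresholds are assumptions, not values taken from the paper.
    """
    qc = data[sample_class == qc_label]
    blanks = data[sample_class == blank_label]
    study = data[~sample_class.isin([qc_label, blank_label])]

    # Coefficient of variation of each feature across the QC injections.
    qc_cv = qc.std(ddof=1) / qc.mean()
    # The mean study-sample signal must clearly exceed the mean blank signal.
    above_blank = study.mean() >= min_sample_blank_ratio * blanks.mean()

    keep = (qc_cv <= max_qc_cv) & above_blank
    return data.loc[:, keep]


# Toy matrix: three QC injections, one blank, two study samples.
data = pd.DataFrame(
    {
        "F1": [100.0, 102.0, 98.0, 1.0, 120.0, 80.0],   # stable in QC, low in blank
        "F2": [100.0, 200.0, 50.0, 1.0, 120.0, 80.0],   # unstable in QC
        "F3": [100.0, 101.0, 99.0, 90.0, 110.0, 95.0],  # also present in the blank
    },
    index=["qc1", "qc2", "qc3", "blank1", "s1", "s2"],
)
sample_class = pd.Series(
    ["QC", "QC", "QC", "blank", "sample", "sample"], index=data.index
)

curated = curate_features(data, sample_class)
print(list(curated.columns))  # ['F1']
```

In this toy matrix only `F1` survives: `F2` varies too much across the QC injections and `F3` is nearly as intense in the blank as in the study samples. Suitable thresholds depend on the application and would need to be chosen for each study.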
- Subject classification (FORD)
- https://purl.org/becyt/ford/1.4; https://purl.org/becyt/ford/1
- Publication date
- 2020-10
- Publisher
- Molecular Diversity Preservation International
- Citation
- Riquelme, Gabriel; Zabalegui, Nicolás; Marchi, Pablo Gabriel; Jones, Christina M.; Monge, Maria Eugenia; A Python-Based Pipeline for Preprocessing LC–MS Data for Untargeted Metabolomics Workflows; Molecular Diversity Preservation International; Metabolites; 10; 10; 10-2020; 1-14
- ISSN
- 2218-1989
- Handle
- http://hdl.handle.net/11336/129950
- DOI
- 10.3390/metabo10100416
- Publisher URL
- https://www.mdpi.com/2218-1989/10/10/416
- Format
- application/pdf