Application of k-means clustering, linear discriminant analysis and multivariate linear regression for the development of a predictive QSAR model on 5-lipoxygenase inhibitors

Authors: Andrada, Matias Fernando; Vega Hissi, Esteban Gabriel; Estrada, Mario Rinaldo; Garro Martinez, Juan Ceferino
Publication year: 2015
Language: English
Resource type: article
Status: published version
Description: In this work, we developed a quantitative structure-activity relationship (QSAR) model for a family of 5-lipoxygenase (5-LOX) inhibitors, using k-means clustering and linear discriminant analysis (LDA) to select the training and test sets and multivariate linear regression (MLR) to select the independent variables. With the k-means clustering method, the total set of compounds (58 derivatives of 5-benzylidene-2-phenylthiazolinones) was divided into two clusters according to a simple discriminant function. We found that the molecular descriptor piID (conventional bond order ID number) correctly classifies 100% of the compounds in each cluster. Thirty different models, divided into three series, were analyzed, and the series with representative training and test sets (series 3) had the most predictive models. The statistical parameters of the best model are Rtrain=0.811 and Rtest=0.801. We found that a rational selection of the training and test sets yields the most predictive models, and that random selection is sometimes unsuitable, especially when the total set of compounds can be classified into different clusters according to structural features.
Affiliation: Andrada, Matias Fernando. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Vega Hissi, Esteban Gabriel. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Estrada, Mario Rinaldo. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia; Argentina
Affiliation: Garro Martinez, Juan Ceferino. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Subject: 5-Lipoxygenase Inhibitors
K-Means Clustering
Linear Discriminant Analysis
Multivariate Linear Regression
QSAR
Access level: open access
Terms of use: https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
Repository
Institution: Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier: oai:ri.conicet.gov.ar:11336/60452
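
The cluster-based selection of training and test sets described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the data are synthetic (58 values of a single hypothetical descriptor drawn from two groups), the regression uses one descriptor instead of the multivariate model the authors built, the LDA step is omitted, and the helper names (kmeans_two_clusters, stratified_split, fit_line, pearson_r) are invented here. The point is only to show how drawing test compounds from every cluster keeps both sets representative.

```python
import random
import statistics

def kmeans_two_clusters(values, max_iter=100):
    """Two-cluster k-means on one descriptor; returns a 0/1 label per value."""
    centers = [min(values), max(values)]  # deterministic start at the extremes
    labels = []
    for _ in range(max_iter):
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        updated = []
        for k in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == k]
            updated.append(statistics.fmean(members) if members else centers[k])
        if updated == centers:  # assignments no longer move the centers
            break
        centers = updated
    return labels

def stratified_split(labels, test_fraction, rng):
    """Draw the same fraction of test items from every cluster."""
    train, test = [], []
    for k in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == k]
        rng.shuffle(idx)
        n_test = max(1, round(test_fraction * len(idx)))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / (sum((a - mu) ** 2 for a in u)
                  * sum((b - mv) ** 2 for b in v)) ** 0.5

# Synthetic stand-in data (NOT the article's compounds or descriptor values).
rng = random.Random(0)
descriptor = ([rng.gauss(10, 1) for _ in range(30)]
              + [rng.gauss(20, 1) for _ in range(28)])
activity = [0.3 * x + rng.gauss(0, 0.5) for x in descriptor]

labels = kmeans_two_clusters(descriptor)
train_idx, test_idx = stratified_split(labels, 0.2, rng)
intercept, slope = fit_line([descriptor[i] for i in train_idx],
                            [activity[i] for i in train_idx])

def predict(i):
    return intercept + slope * descriptor[i]

r_train = pearson_r([predict(i) for i in train_idx],
                    [activity[i] for i in train_idx])
r_test = pearson_r([predict(i) for i in test_idx],
                   [activity[i] for i in test_idx])
print(len(train_idx), len(test_idx), round(r_train, 3), round(r_test, 3))
```

A purely random split of the same data could, by chance, place most of one cluster in the test set; the per-cluster draw above rules that out, which is the design choice the abstract argues for.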

id	CONICETDig_4172bd4169f9ea96d71c4ecd75f6db5e
oai_identifier_str	oai:ri.conicet.gov.ar:11336/60452
network_acronym_str	CONICETDig
repository_id_str	3498
network_name_str	CONICET Digital (CONICET)
purl_subject.fl_str_mv	https://purl.org/becyt/ford/1.4 https://purl.org/becyt/ford/1
publishDate	2015
dc.date.none.fl_str_mv	2015-04
dc.type.none.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo
format	article
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://hdl.handle.net/11336/60452 Andrada, Matias Fernando; Vega Hissi, Esteban Gabriel; Estrada, Mario Rinaldo; Garro Martinez, Juan Ceferino; Application of k-means clustering, linear discriminant analysis and multivariate linear regression for the development of a predictive QSAR model on 5-lipoxygenase inhibitors; Elsevier Science; Chemometrics and Intelligent Laboratory Systems; 143; 4-2015; 122-129 0169-7439 CONICET Digital CONICET
url	http://hdl.handle.net/11336/60452
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/doi/10.1016/j.chemolab.2015.03.001 info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0169743915000593
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
eu_rights_str_mv	openAccess
rights_invalid_str_mv	https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
dc.format.none.fl_str_mv	application/pdf application/pdf application/pdf application/pdf application/pdf
dc.publisher.none.fl_str_mv	Elsevier Science
dc.source.none.fl_str_mv	reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv	dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar
