Application of k-means clustering, linear discriminant analysis and multivariate linear regression for the development of a predictive QSAR model on 5-lipoxygenase inhibitors
- Authors
- Andrada, Matias Fernando; Vega Hissi, Esteban Gabriel; Estrada, Mario Rinaldo; Garro Martinez, Juan Ceferino
- Year of publication
- 2015
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- In this work, we developed a quantitative structure-activity relationship (QSAR) model for a family of 5-lipoxygenase (5-LOX) inhibitors, using k-means clustering and linear discriminant analysis (LDA) to select the training and test sets and multivariate linear regression (MLR) to select the independent variables. With the k-means clustering method, the total set of compounds (58 derivatives of 5-benzylidene-2-phenylthiazolinones) was divided into two clusters according to a simple discriminant function. We found that the piID (conventional bond order ID number) molecular descriptor correctly discriminates 100% of the compounds in each cluster. Thirty different models, divided into three series, were analyzed, and the series with representative training and test sets (series 3) yielded the most predictive models. The statistical parameters of the best model are Rtrain = 0.811 and Rtest = 0.801. We found that a rational selection of the training and test sets yields the most predictive models, and that random selection is sometimes unsuitable, especially when the total set of compounds can be classified into different clusters according to structural features.
Affiliation: Andrada, Matias Fernando. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Vega Hissi, Esteban Gabriel. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Estrada, Mario Rinaldo. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia; Argentina
Affiliation: Garro Martinez, Juan Ceferino. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
- Subject
- 5-Lipoxygenase Inhibitors; K-Means Clustering; Linear Discriminant Analysis; Multivariate Linear Regression; QSAR
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-nd/2.5/ar/
- Repository
- CONICET Digital (CONICET)
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/60452
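
The description above outlines a cluster-aware way of building training and test sets before fitting a regression model. The following is only a minimal sketch of that general idea, not the authors' procedure: it uses synthetic data, a single hypothetical descriptor in place of piID, a one-dimensional k-means, and a one-variable least-squares fit in place of MLR. The LDA step is omitted, and every function and variable name is an assumption made for illustration.

```python
import random
from statistics import mean


def kmeans_1d(values, k=2, iters=50):
    """Lloyd's algorithm on one descriptor; returns a cluster label per value."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:  # assignments are stable
            break
        centers = new_centers
    return labels


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Draw the test set cluster by cluster so each cluster is represented."""
    rng = random.Random(seed)
    train, test = [], []
    for c in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den


# Synthetic stand-in for 58 compounds falling into two structural groups.
rng = random.Random(42)
descriptor = [rng.gauss(10.0, 1.0) for _ in range(30)]
descriptor += [rng.gauss(20.0, 1.0) for _ in range(28)]
activity = [0.3 * d + rng.gauss(0.0, 0.5) for d in descriptor]

labels = kmeans_1d(descriptor, k=2)
train_idx, test_idx = stratified_split(labels)

# Fit on the training compounds only, then score both sets.
slope, intercept = fit_line([descriptor[i] for i in train_idx],
                            [activity[i] for i in train_idx])
predicted = [slope * d + intercept for d in descriptor]
r_train = pearson_r([predicted[i] for i in train_idx], [activity[i] for i in train_idx])
r_test = pearson_r([predicted[i] for i in test_idx], [activity[i] for i in test_idx])
print(f"R_train={r_train:.3f}  R_test={r_test:.3f}")
```

Sampling the test set inside each cluster, rather than from the pooled compounds, is what keeps both sets representative of every structural group; a purely random draw offers no such guarantee when one cluster is small.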
Full record metadata
| id | CONICETDig_4172bd4169f9ea96d71c4ecd75f6db5e |
|---|---|
| oai_identifier_str | oai:ri.conicet.gov.ar:11336/60452 |
| network_acronym_str | CONICETDig |
| repository_id_str | 3498 |
| network_name_str | CONICET Digital (CONICET) |
| title | Application of k-means clustering, linear discriminant analysis and multivariate linear regression for the development of a predictive QSAR model on 5-lipoxygenase inhibitors |
| author | Andrada, Matias Fernando |
| author_role | author |
| author2 | Vega Hissi, Esteban Gabriel; Estrada, Mario Rinaldo; Garro Martinez, Juan Ceferino |
| author2_role | author; author; author |
| topic | 5-Lipoxygenase Inhibitors; K-Means Clustering; Linear Discriminant Analysis; Multivariate Linear Regression; Qsar |
| purl_subject.fl_str_mv | https://purl.org/becyt/ford/1.4; https://purl.org/becyt/ford/1 |
| publishDate | 2015 |
| dc.date.none.fl_str_mv | 2015-04 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
| format | article |
| status_str | publishedVersion |
| url | http://hdl.handle.net/11336/60452 |
| identifier_str_mv | Andrada, Matias Fernando; Vega Hissi, Esteban Gabriel; Estrada, Mario Rinaldo; Garro Martinez, Juan Ceferino; Application of k-means clustering, linear discriminant analysis and multivariate linear regression for the development of a predictive QSAR model on 5-lipoxygenase inhibitors; Elsevier Science; Chemometrics and Intelligent Laboratory Systems; 143; 4-2015; 122-129 0169-7439 CONICET Digital CONICET |
| language | eng |
| dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/doi/10.1016/j.chemolab.2015.03.001; info:eu-repo/semantics/altIdentifier/url/https://www.sciencedirect.com/science/article/pii/S0169743915000593 |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; https://creativecommons.org/licenses/by-nc-nd/2.5/ar/ |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | application/pdf |
| dc.publisher.none.fl_str_mv | Elsevier Science |
| reponame_str | CONICET Digital (CONICET) |
| instname_str | Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.name.fl_str_mv | CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas |
| repository.mail.fl_str_mv | dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |
| _version_ | 1848598392163270656 |
| score | 13.24909 |