Optimal Partition of Datasets of QSPR Studies: A Sampling Problem

Authors
Talevi, Alan; Bellera, Carolina Leticia; Castro, Eduardo Alberto; Bruno Blanch, Luis Enrique
Publication year
2010
Language
English
Resource type
article
Status
published version
Description
Starting from different partitions of a 160-compound dataset into training and test sets, we developed discriminant functions to classify drugs into different categories of human intestinal absorption rate. For each partition of the dataset, models that included up to ten Dragon descriptors were built, and the performance of each discriminant function in the classification of the training and test sets was assessed and explored graphically through divergence diagrams. Results suggest that external validation tends to underestimate the predictive capability of QSAR models and that the more reliable results from external validation are obtained with even partitions of small and medium-size datasets.
Affiliation: Talevi, Alan. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata; Argentina. Universidad Nacional de La Plata. Facultad de Ciencias Exactas; Argentina
Affiliation: Bellera, Carolina Leticia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata; Argentina. Universidad Nacional de La Plata. Facultad de Ciencias Exactas; Argentina
Affiliation: Castro, Eduardo Alberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas. Universidad Nacional de La Plata. Facultad de Ciencias Exactas. Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas; Argentina
Affiliation: Bruno Blanch, Luis Enrique. Universidad Nacional de La Plata. Facultad de Ciencias Exactas; Argentina
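The partitioning step mentioned in the description can be illustrated with a minimal sketch. It assumes a simple random split and uses illustrative test fractions; the record does not state which sampling scheme or split sizes the authors actually used, and the function name `partition` and the integer compound IDs are hypothetical.

```python
import random


def partition(compound_ids, test_fraction, seed=0):
    """Randomly split compound IDs into (training, test) lists.

    Assumption: plain random sampling; the record does not say how
    the authors assigned compounds to each set.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)      # seeded so a split can be reproduced
    shuffled = list(compound_ids)  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# 160 placeholder IDs standing in for the 160 compounds in the dataset.
compounds = list(range(160))

# Illustrative fractions only; 0.5 corresponds to an even partition.
for fraction in (0.125, 0.25, 0.5):
    train, test = partition(compounds, fraction)
    print(f"test fraction {fraction}: {len(train)} training, {len(test)} test")
```

A model would then be fitted on each training list and scored on both lists, so that training and test performance can be compared across partitions.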
Subject
QSAR
MODELS
Access level
open access
Terms of use
https://creativecommons.org/licenses/by/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/127000

id CONICETDig_70199401fe090d50d728eb80549b3842
oai_identifier_str oai:ri.conicet.gov.ar:11336/127000
network_acronym_str CONICETDig
repository_id_str 3498
network_name_str CONICET Digital (CONICET)
dc.title.none.fl_str_mv Optimal Partition of Datasets of QSPR Studies: A Sampling Problem
dc.creator.none.fl_str_mv Talevi, Alan
Bellera, Carolina Leticia
Castro, Eduardo Alberto
Bruno Blanch, Luis Enrique
dc.subject.none.fl_str_mv QSAR
MODELS
purl_subject.fl_str_mv https://purl.org/becyt/ford/1.4
https://purl.org/becyt/ford/1
publishDate 2010
dc.date.none.fl_str_mv 2010-04
dc.type.none.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
http://purl.org/coar/resource_type/c_6501
info:ar-repo/semantics/articulo
format article
status_str publishedVersion
dc.identifier.none.fl_str_mv http://hdl.handle.net/11336/127000
Talevi, Alan; Bellera, Carolina Leticia; Castro, Eduardo Alberto; Bruno Blanch, Luis Enrique; Optimal Partition of Datasets of QSPR Studies: A Sampling Problem; Univ Kragujevac; MATCH Communications in Mathematical and in Computer Chemistry; 63; 3; 4-2010; 585-599
0340-6253
CONICET Digital
CONICET
url http://hdl.handle.net/11336/127000
dc.language.none.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv info:eu-repo/semantics/altIdentifier/url/https://match.pmf.kg.ac.rs/content63n3.htm
info:eu-repo/semantics/altIdentifier/url/https://match.pmf.kg.ac.rs/
dc.rights.none.fl_str_mv info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by/2.5/ar/
eu_rights_str_mv openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by/2.5/ar/
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Univ Kragujevac
dc.source.none.fl_str_mv reponame:CONICET Digital (CONICET)
instname:Consejo Nacional de Investigaciones Científicas y Técnicas
repository.name.fl_str_mv CONICET Digital (CONICET) - Consejo Nacional de Investigaciones Científicas y Técnicas
repository.mail.fl_str_mv dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar