Application of machine learning to predict unbound drug bioavailability in the brain
- Authors
- Morales, Juan Francisco; Ruiz, María Esperanza; Stratford, Robert E.; Talevi, Alan
- Publication year
- 2024
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- Purpose: Optimizing brain bioavailability is highly relevant for the development of drugs targeting the central nervous system. Several pharmacokinetic parameters have been used for measuring drug bioavailability in the brain. The most biorelevant among them is possibly the unbound brain-to-plasma partition coefficient, Kpuu,brain,ss, which relates unbound brain and plasma drug concentrations under steady-state conditions. In this study, we developed new in silico models to predict Kpuu,brain,ss. Methods: A manually curated 157-compound dataset was compiled from literature and split into training and test sets using a clustering approach. Additional models were trained with a refined dataset generated by removing known P-gp and/or Breast Cancer Resistance Protein substrates from the original dataset. Different supervised machine learning algorithms have been tested, including Support Vector Machine, Gradient Boosting Machine, k-nearest neighbors, classificatory Partial Least Squares, Random Forest, Extreme Gradient Boosting, Deep Learning and Linear Discriminant Analysis. Good practices of predictive Quantitative Structure-Activity Relationships modeling were followed for the development of the models. Results: The best performance in the complete dataset was achieved by extreme gradient boosting, with an accuracy in the test set of 85.1%. A similar estimation of accuracy was observed in a prospective validation experiment, using a small sample of compounds and comparing predicted unbound brain bioavailability with observed experimental data. Conclusion: New in silico models were developed to predict the Kpuu,brain,ss of drug candidates. The dataset used in this study is publicly disclosed, so that the models may be reproduced, refined, or expanded, as a useful tool to assist drug discovery processes.
- Laboratorio de Investigación y Desarrollo de Bioactivos
- Subject
- Biology; ADME properties; blood-brain barrier; brain bioavailability; central nervous system; machine learning; pharmacokinetics modeling; artificial intelligence; unbound partition coefficient
- Access level
- open access
- Terms of use
- http://creativecommons.org/licenses/by/4.0/
- Repository
- SEDICI (UNLP)
- Institution
- Universidad Nacional de La Plata
- OAI identifier
- oai:sedici.unlp.edu.ar:10915/167341
Full record metadata
| id | SEDICI_593916e55d9a717dad0ca27bfe226c79 |
|---|---|
| oai_identifier_str | oai:sedici.unlp.edu.ar:10915/167341 |
| network_acronym_str | SEDICI |
| repository_id_str | 1329 |
| network_name_str | SEDICI (UNLP) |
| dc.title.none.fl_str_mv | Application of machine learning to predict unbound drug bioavailability in the brain |
| dc.creator.none.fl_str_mv | Morales, Juan Francisco; Ruiz, María Esperanza; Stratford, Robert E.; Talevi, Alan |
| dc.subject.none.fl_str_mv | Biología; ADME properties; blood-brain barrier; brain bioavailability; central nervous system; machine learning; pharmacokinetics modeling; artificial intelligence; unbound partition coefficient |
| publishDate | 2024 |
| dc.date.none.fl_str_mv | 2024 |
| dc.type.none.fl_str_mv | info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; Articulo; http://purl.org/coar/resource_type/c_6501; info:ar-repo/semantics/articulo |
| format | article |
| status_str | publishedVersion |
| dc.identifier.none.fl_str_mv | http://sedici.unlp.edu.ar/handle/10915/167341 |
| url | http://sedici.unlp.edu.ar/handle/10915/167341 |
| dc.language.none.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | info:eu-repo/semantics/altIdentifier/issn/2674-0338; info:eu-repo/semantics/altIdentifier/doi/10.3389/fddsv.2024.1360732 |
| dc.rights.none.fl_str_mv | info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/4.0/; Creative Commons Attribution 4.0 International (CC BY 4.0) |
| eu_rights_str_mv | openAccess |
| rights_invalid_str_mv | http://creativecommons.org/licenses/by/4.0/; Creative Commons Attribution 4.0 International (CC BY 4.0) |
| dc.format.none.fl_str_mv | application/pdf |
| dc.source.none.fl_str_mv | reponame:SEDICI (UNLP); instname:Universidad Nacional de La Plata; instacron:UNLP |
| reponame_str | SEDICI (UNLP) |
| collection | SEDICI (UNLP) |
| instname_str | Universidad Nacional de La Plata |
| instacron_str | UNLP |
| institution | UNLP |
| repository.name.fl_str_mv | SEDICI (UNLP) - Universidad Nacional de La Plata |
| repository.mail.fl_str_mv | alira@sedici.unlp.edu.ar |
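
The abstract in this record mentions splitting the dataset into training and test sets using a clustering approach, and reports test-set accuracy for the classifiers. The record does not describe the authors' actual procedure, so the Python sketch below is only a rough illustration of the general idea: it holds out a fraction of each cluster for testing and computes accuracy as the share of correct class predictions. The names `cluster_split` and `accuracy`, the 25% test fraction, the fixed seed, and the rule of keeping single-member clusters in the training set are all assumptions made for this example.

```python
import random
from collections import defaultdict


def cluster_split(cluster_ids, test_fraction=0.25, seed=0):
    """Split item indices into train/test, holding out part of every cluster.

    cluster_ids[i] is the (hypothetical) cluster label of compound i.
    Clusters with a single member stay entirely in the training set.
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        by_cluster[cid].append(idx)

    train, test = [], []
    for cid in sorted(by_cluster):
        members = by_cluster[cid][:]
        rng.shuffle(members)
        if len(members) < 2:
            n_test = 0
        else:
            # Hold out at least one member, but never the whole cluster.
            n_test = max(1, round(len(members) * test_fraction))
            n_test = min(n_test, len(members) - 1)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the observed class labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy usage: seven compounds in three made-up clusters.
clusters = [0, 0, 0, 0, 1, 1, 2]
train_idx, test_idx = cluster_split(clusters)
toy_accuracy = accuracy([1, 0, 1, 1], [1, 0, 0, 1])  # 3 of 4 correct
```

Sampling the test compounds from within every cluster keeps the test set structurally representative of the training set; a stricter alternative would be to hold out whole clusters, which gives a harder estimate of how the model generalizes to new chemotypes.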