INTERPRETABLE BINARY CLASSIFICATION MODELS USING XAI AND FEW DESCRIPTORS FOR PREDICTING BLOOD-BRAIN BARRIER PERMEABILITY OF PHARMACEUTICAL COMPOUNDS BASED ON RESAMPLING, CLUSTERING, AND MACHINE LEARNING METHODS
Keywords:
blood-brain barrier permeability, curse of dimensionality, explainable AI, logBB, machine learning, QSAR
Abstract
Background: Designing pharmaceutical compounds that treat brain diseases, or drugs that act on biological targets in peripheral organs without penetrating the blood-brain barrier, remains a very difficult task. Animal models are costly and unproductive; the pharmaceutical industry and regulatory bodies therefore need reliable, accurate, and interpretable predictive tools to assess the permeability of pharmaceutical compounds across the blood-brain barrier.
Method: This study develops artificial intelligence models with greater accuracy and enhanced explanatory capacity for the binary classification of blood-brain barrier permeability of drug candidate compounds. By applying a resampling approach and a clustering technique, we developed five distinct artificial intelligence models (support vector machine, k-nearest neighbor, classification and regression decision tree, random forest, and gradient boosting machine) using only 10 molecular descriptors and a dataset of 1,726 molecular observations (1,000 original and 726 synthetic compounds).
Results: Of all the models evaluated, the gradient boosting machine had the best 10-fold cross-validation statistics and achieved a prediction accuracy (Q), Matthews correlation coefficient (MCC), and area under the ROC curve (AUC) of 91.04%, 0.82, and 1.0, respectively, on the external test set. The gradient boosting machine outputs are explained using the Shapley additive explanations (SHAP) approach, which ranks the main modelling descriptors involved in predicting blood-brain barrier permeability in order of importance.
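For readers unfamiliar with the three statistics reported above, the following is a minimal sketch of how Q, MCC, and AUC can be computed for a binary classifier. It is not the authors' code: the labels, scores, 0.5 decision threshold, and function names are all hypothetical and chosen only for illustration.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for labels coded 1 (permeable) and 0 (non-permeable)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Prediction accuracy Q: fraction of compounds classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0 when a marginal count is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: true classes and predicted probabilities of permeability.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("Q   =", accuracy(tp, tn, fp, fn))   # 0.75
print("MCC =", mcc(tp, tn, fp, fn))        # 0.5
print("AUC =", roc_auc(y_true, scores))    # 0.9375
```

Q and MCC depend on the chosen decision threshold, whereas AUC depends only on how the scores rank the compounds, so the three values need not move together.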
Conclusion: Non-animal predictive models were designed to determine whether pharmaceutical compounds can penetrate the blood-brain barrier. The proposed model reached a level of accuracy sufficient to make it useful for virtual screening of large pharmaceutical compound libraries. It revealed two key indicators for prediction: the spatial distribution of atomic charges and electronegativity.
Peer Review History:
Received 3 August 2025; Reviewed 11 September 2025; Accepted 17 October 2025; Available online 15 November 2025
Academic Editor: Dr. Amany Mohamed Alboghdadly, Ibn Sina National College for Medical Studies in Jeddah, Saudi Arabia, amanyalboghdadly@gmail.com
Reviewers:
Prof. Hassan A.H. Al-Shamahy, Sana'a University, Yemen, shmahe@yemen.net.ye
Dr. Adebayo Gege Grace Iyabo, University of Ibadan, Nigeria, funbimbola@gmail.com
Copyright (c) 2025 Universal Journal of Pharmaceutical Research

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



