EXPLAINABILITY IN BLACK-BOX AI: BRIDGING THE GAP BETWEEN ACCURACY AND TRUST
DOI: https://doi.org/10.46121/pspc.52.2.29

Keywords:
Explainable AI, Black-Box Models, Model Interpretability, Trust in AI, XAI, Transparency, SHAP, Attention Mechanisms, Responsible AI

Abstract
Artificial intelligence systems have achieved remarkable accuracy across diverse domains, including healthcare diagnosis, financial decision-making, criminal justice risk assessment, and autonomous systems. However, many high-performing AI models operate as "black boxes," producing accurate predictions through complex internal processes that remain opaque to users, developers, and affected individuals. This opacity creates a fundamental tension between predictive performance and user trust, raising critical concerns about accountability, fairness, bias detection, and regulatory compliance. This paper examines the explainability challenge in black-box AI systems, exploring techniques for making opaque models interpretable while maintaining predictive accuracy. Through a comprehensive analysis of explainability methods, including post-hoc interpretation techniques, model-agnostic approaches, attention mechanisms, and inherently interpretable architectures, this research evaluates their effectiveness across different application domains and stakeholder needs. The study demonstrates that explainability and accuracy need not be mutually exclusive: careful architectural choices and interpretation methods can provide meaningful transparency without substantially sacrificing performance. Using case studies from healthcare, finance, and criminal justice, we show that the appropriate explainability approach varies with domain requirements, with some contexts demanding complete transparency while others accept partial interpretability in exchange for critical performance gains. Evaluation of explainability techniques reveals that SHAP values and attention mechanisms provide robust explanations for complex models, while simplified proxy models offer interpretability at some cost in accuracy. User studies with 280 domain experts and 340 general users demonstrate that explanations significantly increase trust and adoption, with explanation quality mattering more than the mere presence of an explanation. The research identifies key principles for balancing accuracy and explainability, including matching explanation depth to stakeholder expertise, focusing explanations on decision-relevant features, validating explanations against domain knowledge, and acknowledging explanation limitations honestly. This work contributes to responsible AI development, human-AI interaction design, and regulatory frameworks requiring algorithmic transparency, providing practical guidance for deploying AI systems that combine high performance with meaningful explainability.
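
To give a concrete sense of the post-hoc, model-agnostic explanations referenced above, the short sketch below computes SHAP values for an opaque tree-ensemble classifier. It is an illustrative example only, assuming the open-source shap and scikit-learn packages; the dataset, model, and parameters are stand-ins rather than the configurations evaluated in this paper.

# Minimal sketch: post-hoc SHAP explanation of a black-box classifier.
# Assumes the open-source `shap` and `scikit-learn` packages; the dataset
# and model are illustrative stand-ins, not the paper's experimental setup.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A high-accuracy but opaque model
model = GradientBoostingClassifier().fit(X_train, y_train)

# TreeExplainer computes SHAP feature attributions for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Local explanation: per-feature contributions for a single prediction
print(dict(zip(X_test.columns, shap_values[0].round(3))))

# Global view: mean absolute SHAP value per feature
shap.summary_plot(shap_values, X_test, plot_type="bar")

In this pattern, the per-prediction attributions support case-level justification of individual decisions, while the aggregated absolute SHAP values give stakeholders a global picture of which features drive the model overall.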

