Category: Safety & Ethics
Definition
Model interpretability is the degree to which humans can understand how an AI system makes its decisions and which factors influence its outputs.
How It Works
Interpretability tools reveal which inputs mattered most to a model's decision. They might highlight the words that drove a sentiment prediction or the image regions that influenced an object recognition result.
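As an illustration, here is a minimal occlusion-style attribution sketch in Python: remove each word in turn and measure how much the model's score drops. The `score_sentiment` function below is a hypothetical toy stand-in for a real sentiment model; words whose removal changes the score most are the ones the model leaned on.

```python
# Minimal sketch of occlusion-based word attribution.
# score_sentiment is a toy stand-in, not a real model.

def score_sentiment(text: str) -> float:
    """Toy scorer: counts signal words. A real model would go here."""
    positive = {"great", "love", "excellent"}
    negative = {"awful", "boring", "hate"}
    words = text.lower().split()
    return sum(w in positive for w in words) - sum(w in negative for w in words)

def word_attributions(text: str) -> list[tuple[str, float]]:
    """Importance of each word = score drop when that word is removed."""
    words = text.split()
    base = score_sentiment(text)
    attributions = []
    for i in range(len(words)):
        occluded = " ".join(words[:i] + words[i + 1:])
        attributions.append((words[i], base - score_sentiment(occluded)))
    return attributions

# "great" gets +1, "awful" gets -1, filler words get 0.
print(word_attributions("The plot was great but the acting was awful"))
```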
Different techniques work for different types of models, from simple decision trees to complex neural networks.
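For example, a decision tree exposes feature importances directly from its splits, while model-agnostic methods such as permutation importance work on any fitted model. A minimal sketch of both, assuming scikit-learn is installed:

```python
# Contrast an intrinsic explanation (a tree's built-in importances)
# with a model-agnostic one (permutation importance).
from sklearn.datasets import load_iris
from sklearn.inspection import permutation_importance
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Intrinsic: decision trees report importances derived from their splits.
print("built-in:   ", tree.feature_importances_)

# Model-agnostic: shuffle one feature at a time and measure the score drop.
# The same call works for any fitted estimator, including a neural network.
result = permutation_importance(tree, X, y, n_repeats=10, random_state=0)
print("permutation:", result.importances_mean)
```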
Why It Matters
When AI makes important decisions about loans, medical diagnoses, or hiring, people need to understand why. Interpretability builds trust and helps catch biases or errors.
Regulators increasingly require explainable AI for high-stakes applications.