- The Ultimate Guide on Interpretability: Interpretable Machine Learning (book)
- The Building Blocks of Interpretability (distill)
- Explaining a Black-box Using Deep Variational Information Bottleneck Approach
- Paper Dissected: Understanding Black Box Predictions via Influence Functions Explained (blog)
- Interpretability via attentional and memory-based interfaces, using TensorFlow (blog)
- Explainable Artificial Intelligence – Model Interpretation Strategies (blog)
- Black-box vs. white-box models - Interpretability Techniques (blog)
- Ideas on Machine Learning Interpretability (H2O video)
Interpretable Models:
- Linearity: A model is linear if the association between the features and the target is modeled linearly, i.e. the prediction is a weighted sum of the features.
- Monotonicity: A monotonic model guarantees that the relationship between a feature and the target outcome always moves in one consistent direction (increasing or decreasing) over the feature's entire range of values.
- Interactions: You can always add interaction features and non-linearities to a model through manual feature engineering; some models also create them automatically.
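As a quick numerical sketch of manual interaction engineering (plain NumPy, not from any of the sources above; the data and coefficients are made up for illustration): a linear model on raw features cannot capture a target driven by a product of two features, but adding `x1 * x2` as an engineered column lets ordinary least squares recover it.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, 200)
x2 = rng.uniform(-1, 1, 200)
y = 3 * x1 * x2 + 0.5 * x1          # target driven mostly by an interaction

# Linear model on raw features: the interaction term is invisible to it.
X_raw = np.column_stack([np.ones_like(x1), x1, x2])
coef_raw, *_ = np.linalg.lstsq(X_raw, y, rcond=None)

# Adding the product x1*x2 as an engineered feature recovers the true model.
X_int = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
coef_int, *_ = np.linalg.lstsq(X_int, y, rcond=None)

print(round(coef_int[3], 2))        # interaction coefficient, ≈ 3.0
```

The augmented model stays interpretable in the linear-model sense: the interaction's effect is a single readable coefficient.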
- Complexity of Learned Functions (increasing):
- Linear, Monotonic
- Non-linear, Monotonic
- Non-linear, Non-monotonic
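The three classes above can be told apart numerically; this is a minimal sketch (my own helper, not from the sources above) that tests monotonicity via first differences and linearity via vanishing second differences on a uniform grid.

```python
import numpy as np

def classify(f, grid):
    """Label f as (non-)linear and (non-)monotonic on a uniform grid."""
    y = f(grid)
    d = np.diff(y)
    monotonic = np.all(d >= 0) or np.all(d <= 0)
    # On a uniform grid, a linear function has zero second differences.
    linear = np.allclose(np.diff(y, n=2), 0)
    return ("linear" if linear else "non-linear",
            "monotonic" if monotonic else "non-monotonic")

grid = np.linspace(-3, 3, 101)
print(classify(lambda x: 2 * x + 1, grid))   # ('linear', 'monotonic')
print(classify(lambda x: x ** 3, grid))      # ('non-linear', 'monotonic')
print(classify(np.sin, grid))                # ('non-linear', 'non-monotonic')
```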
- Approaches to interpretability of ML models based on model-agnosticism: