- The Ultimate Guide on Interpretability: Interpretable Machine Learning (book)
- The Building Blocks of Interpretability (distill)
- Explaining a Black-box Using Deep Variational Information Bottleneck Approach
- Paper Dissected: Understanding Black Box Predictions via Influence Functions Explained (blog)
- Interpretability via attentional and memory-based interfaces, using TensorFlow (blog)
- Explainable Artificial Intelligence – Model Interpretation Strategies (blog)
- Black-box vs. white-box models - Interpretability Techniques (blog)
- Ideas on Machine Learning Interpretability (H2O video)
Interpretable Models:
- Linearity: A model is linear if the association between the features and the target is modeled linearly, i.e. the prediction is a weighted sum of the features.
- Monotonicity: A monotonic model guarantees that the relationship between a feature and the target outcome always moves in one consistent direction (increasing or decreasing) over the feature's entire range of values.
- Interactions: You can always add interaction features and non-linearities to a model through manual feature engineering; some models also create them automatically.
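As a quick numerical sketch of manual interaction engineering (plain NumPy, not from any of the sources above; the data and coefficients are made up for illustration): a linear model on raw features cannot capture a target driven by a product of two features, but adding `x1 * x2` as an engineered column lets ordinary least squares recover it.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, 200)
x2 = rng.uniform(-1, 1, 200)
y = 3 * x1 * x2 + 0.5 * x1          # target driven mostly by an interaction

# Linear model on raw features: the interaction term is invisible to it.
X_raw = np.column_stack([np.ones_like(x1), x1, x2])
coef_raw, *_ = np.linalg.lstsq(X_raw, y, rcond=None)

# Adding the product x1*x2 as an engineered feature recovers the true model.
X_int = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
coef_int, *_ = np.linalg.lstsq(X_int, y, rcond=None)

print(round(coef_int[3], 2))        # interaction coefficient, ≈ 3.0
```

The augmented model stays interpretable in the linear-model sense: the interaction's effect is a single readable coefficient.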
- Complexity of Learned Functions (increasing):
- Linear, Monotonic
- Non-linear, Monotonic
- Non-linear, Non-monotonic
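The three classes above can be told apart numerically; this is a minimal sketch (my own helper, not from the sources above) that tests monotonicity via first differences and linearity via vanishing second differences on a uniform grid.

```python
import numpy as np

def classify(f, grid):
    """Label f as (non-)linear and (non-)monotonic on a uniform grid."""
    y = f(grid)
    d = np.diff(y)
    monotonic = np.all(d >= 0) or np.all(d <= 0)
    # On a uniform grid, a linear function has zero second differences.
    linear = np.allclose(np.diff(y, n=2), 0)
    return ("linear" if linear else "non-linear",
            "monotonic" if monotonic else "non-monotonic")

grid = np.linspace(-3, 3, 101)
print(classify(lambda x: 2 * x + 1, grid))   # ('linear', 'monotonic')
print(classify(lambda x: x ** 3, grid))      # ('non-linear', 'monotonic')
print(classify(np.sin, grid))                # ('non-linear', 'non-monotonic')
```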
- Approaches to interpretability of ML models based on model-agnosticism: