Explain Yourself. Current Trends in Model Explainability
As we build bigger and more powerful machine learning (ML) models, we also create increasingly opaque black box models. If you were to ask the scientists behind ChatGPT and Dall-E 2 how a certain response or image was created, you’d receive some long explanation about how the model is trained on a massive dataset and has billions of tunable parameters. However, this leaves much to be desired. What does each parameter mean? If the model returns something inaccurate, why did it do that and how can we go about fixing it?
These questions become even more pressing as ML models become more pervasive in our lives. In healthcare, ML models are increasingly being used to assist doctors with decision making, diagnostics, and resource optimization. In the financial services industry, ML models are already being used to assess credit risk and financial crime risk and to improve pricing. In criminal justice, models are being used to analyze DNA and forecast crime.
It is clear that, even in this small subset of fields, ML models sit at a crucial decision-making junction. These models significantly improve the efficiency and effectiveness of decision making, which in turn provides significant value to all stakeholders. However, without the ability to explain these decisions and how we arrived at them, trust in these ML models is significantly hindered. Thus, as we develop bigger and better models, we also need to develop better mechanisms to map out and explain the decision-making process of a model.
By advancing mechanisms to explain ML models, we can build more trust in the models we use and lower the risk of the significant costs involved with a bad decision. These explanation mechanisms will also encourage the use of models by addressing the regulatory requirements involved with deploying black box models. They also have significant implications for improved monitoring and auditing for the right results. Through these mechanisms, we can also promote fair, high-quality models and mitigate model drift, especially as these models are scaled.
One Quick Note: Difference between Explainability and Interpretability
When we talk about developing mechanisms to better understand ML models, there are two dimensions on which we can base these mechanisms. The first is the interpretability of an ML model. Interpretability is defined as a human’s ability to intuitively understand a model. Specifically, it refers to the ability to observe cause and effect in a model, which in turn shows how the model works and makes a decision. Put another way, interpretability is the why and how of a model when generating predictions. Examples of interpretable models include decision trees and linear regression models, where features and their weights are specifically highlighted. Finally, interpretability exists on a spectrum where highly interpretable models are usually less powerful than less interpretable ones (e.g. neural networks).
The other dimension is explainability, which is the extent to which the behavior of an ML model can be explained in human terms. It is useful to note that explainability and interpretability go hand in hand but are not the same thing: a model can be explainable without being interpretable. For instance, neural networks lack interpretability as we don’t know how each neuron and its weight contribute to the final decision. However, by plugging in inputs and observing the outputs, neural networks can be explained. For instance, assume we have a neural network that distinguishes between cats and dogs. Then assume that, by using different subsets of pictures, we determine that the model categorizes animals with whiskers as cats and all others as dogs. In this way, we have explained how the model categorizes cats and dogs without needing to understand its inner workings. In summary, interpretability refers to the intuitive understanding of the why and how of an ML model’s decision, whereas explainability refers to the ability to explain ML behavior in human terms.
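To make the cats-and-dogs example concrete, here is a minimal sketch of explaining a model purely from its inputs and outputs. Everything in it is an assumption made for illustration: `black_box_classifier` is a toy stand-in for the neural network, `probe_feature` is a hypothetical helper, and the "pictures" are reduced to two made-up attributes.

```python
def black_box_classifier(animal):
    """Hypothetical opaque model; pretend we cannot read this function body."""
    return "cat" if animal["whiskers"] else "dog"


def probe_feature(model, samples, feature):
    """Collect the set of predicted labels seen for each value of one feature."""
    outcomes = {}
    for sample in samples:
        outcomes.setdefault(sample[feature], set()).add(model(sample))
    return outcomes


# Hypothetical stand-ins for pictures, described by two made-up attributes.
samples = [
    {"whiskers": w, "pointy_ears": e}
    for w in (True, False)
    for e in (True, False)
]

for feature in ("whiskers", "pointy_ears"):
    print(feature, probe_feature(black_box_classifier, samples, feature))
```

If every value of a feature maps to exactly one label, as happens for whiskers here, that feature alone explains the observed predictions. A feature whose values each map to a mix of labels, like pointy ears, does not.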
In this article, we will focus only on the current trends in explainability. Specifically, we will look at the growing number of XAI techniques, followed by an exploration of the growing players in the space. XAI techniques will be quickly introduced and described. Then we look at the various open source tools available for XAI. Next, a section is devoted to shining a light on the XAI work at Fiddler AI through an interview with Lior Berry, Director of Engineering, and Krishnaram Kenthapadi, Chief AI Officer & Scientist. We end with a quick overview of the other industry players in the space.
There are three overall levels of XAI. The first is local model explainability, which asks specific questions about the model’s individual decisions. Then there’s cohort model explainability, which uses subsets of data to test accuracy with new and unseen data. This can weed out potential bias in the model. Finally, there’s global model explainability, which focuses on the features with the highest impact on model outcomes and decisions. Below I’ve outlined several techniques that tackle each of these levels to varying degrees.
- SHapley Additive exPlanations (SHAP) is a game theory based approach to model explainability. Specifically, it uses Shapley values from cooperative game theory to determine how much each individual feature contributes to the model’s output for a given prediction. A feature’s Shapley value is its average marginal contribution to the prediction across all possible subsets of the other features. For a simple additive model, this can be read directly from a partial dependence plot: the Shapley value is the difference between the plot’s value at the feature’s observed value and the expected model output.
- Local Interpretable Model-Agnostic Explanations (LIME) focuses on the local region around a single prediction and uses a linear approximation to reflect the behavior of the model there. We begin by sampling points around an input value X. Then we use the original model to predict each point and weight each sampled point by its proximity to X. Using these weighted samples, we fit a linear model and then use that linear model to explain the local behavior around X.
- Explainable Boosting Machines (EBMs) are tree-based, cyclic gradient boosting Generalized Additive Models with automatic interaction detection. We build one by training small bagged and gradient boosted trees on one feature at a time: we fit a tree on a single feature, update the residual, move on to the next feature, and repeat this cycle over all features for many rounds. Because each feature gets its own learned function, we can read off the contribution and importance of each feature.
- Saliency Maps visualize the importance of each pixel in an image by taking the gradient of the model’s output (for example, a class score) with respect to each input pixel.
- Testing with Concept Activation Vectors (TCAV) quantifies the degree of importance of a user-defined concept with respect to a classification result. It does this by first learning a concept activation vector (CAV) aligned with the user-defined concept, using example data that represents the concept alongside random examples. Then we take the directional derivative of the model’s prediction along the CAV to get a TCAV score per concept. Finally, we validate each CAV by comparing the TCAV score distribution of random data versus the concept data.
- Distillation encapsulates the transfer of knowledge from a larger teacher model to a smaller explainer model. In essence, the explainer model attempts to mimic the behavior of the teacher. This can be done through the use of a decision tree as the smaller explainer model.
- Counterfactual Explanations find the smallest change to the input features that changes the model prediction, showing what would have needed to be different to get another outcome.
- Partial Dependence Plots show the effect of an input feature on an ML model outcome. They home in on a single feature by marginalizing out all other features.
- Permutation Feature Importance reveals the importance of a feature by permuting its values and seeing how these permutations change the prediction error.
- Accumulated Local Effects Plots are an unbiased alternative to partial dependence plots. They measure how the model prediction changes within a small window of the provided feature and accumulate those changes across windows.
- Individual Conditional Expectation explores how the model prediction for a single data point changes as the feature varies, and plots one such curve for each data point.
- Surrogate Models are additional models trained to approximate and explain existing black box models.
- Leave One Column Out describes a simple technique where one column is left out and the model is retrained. The difference in model prediction score is used to evaluate the importance and provide some explainability of the column.
- Anchors generate IF-THEN rules that support the predictions made.
- Deep Learning Important Features (DeepLIFT) breaks down a neural network’s prediction by comparing each neuron’s activation with its reference activation and assigning contribution scores based on these differences.
- Layer-wise Relevance Propagation is similar to DeepLIFT except with special purpose propagation rules.
- Contrastive Explanations Method (CEM) identifies pertinent positives (features that must be minimally present to justify a prediction) and pertinent negatives (features that must be absent, since adding them would change the prediction). This reveals the necessary features for a prediction.
- ProfWeight is similar to distillation in that it transfers information from a bigger pre-trained model to a smaller model. It does this by inserting probes into the neural network to identify easier samples. These samples are then weighted more heavily when training the simple model.
- A general NLP technique is to train a neural network, freeze its weights, and then examine its internal representations to gain some useful information about the data.
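To ground two of the techniques above, here is a minimal sketch in plain Python: exact Shapley values for a single prediction (a local explanation) and permutation feature importance over a dataset (a global explanation). It is an illustration under stated assumptions, not how the SHAP library or any production tool implements these ideas. `black_box`, the synthetic data, and both helper functions are hypothetical, and the exact Shapley computation enumerates every feature subset, so it is only practical for a handful of features.

```python
import itertools
import math
import random


def black_box(x):
    """Hypothetical model under explanation; it deliberately ignores x[3]."""
    return 3.0 * x[0] + 2.0 * x[1] * x[2]


def shapley_values(model, x, background):
    """Exact Shapley values for one instance, enumerating every feature subset.

    Features outside a subset are replaced by values from each background row,
    and the model output is averaged over those rows.
    """
    n = len(x)

    def value(subset):
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += model(mixed)
        return total / len(background)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size: |S|! (n - |S| - 1)! / n!
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean increase in squared error after shuffling one feature column."""
    rng = random.Random(seed)

    def mse(rows):
        return sum((model(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    importances = []
    for i in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[i] for row in X]
            rng.shuffle(column)
            shuffled = [row[:i] + [v] + row[i + 1:] for row, v in zip(X, column)]
            increases.append(mse(shuffled) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data: 4 features, with targets produced by the model itself.
rng = random.Random(0)
X = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(50)]
y = [black_box(row) for row in X]

phis = shapley_values(black_box, X[0], X)              # local: one prediction
importances = permutation_importance(black_box, X, y)  # global: whole dataset
print("Shapley values:", [round(p, 3) for p in phis])
print("Permutation importances:", [round(v, 3) for v in importances])
```

Two properties are worth checking on the output. The Shapley values sum to the gap between this prediction and the average prediction over the background data, and the unused fourth feature receives zero under both methods.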
Open Source Tools
The growth of XAI has led to a proliferation of open source tools that enable developers to easily incorporate explainability into their models. These tools include:
- InterpretML is a library which provides many of the above-mentioned XAI techniques as pre-built functions.
- AI Explainability 360 Toolkit by IBM is another library that provides some of the same explainability functions as InterpretML along with many others.
- ELI5 is a Python package that makes it easier to debug machine learning classifiers and to provide further information about the predictions they make. It has some support for examining black box models.
- Activation Atlases is a technique (with code open sourced online) that visualizes neural networks and the interactions of neurons as they mature.
- Alibi is a Python library with a large collection of high-quality explanation methods for black-box and white-box models, covering both classification and regression.
- TensorFlow’s What-If Tool provides a nice visual interface for developers to edit and rerun examples to gain a deeper understanding of black-box classification and regression ML models.
- moDel Agnostic Language for Exploration and eXplanation (DALEX) provides a wrapper for models that can then be used to examine and explain the behavior of the model.
Fiddler AI: Explainability to Make Better and Fairer Models
Fiddler AI is a rapidly growing startup looking to build greater trust into AI. To build this greater trust, Fiddler incorporates various tools in analytics, model monitoring, fairness, and explainable AI. In this way, Fiddler tackles the entire ML lifecycle and works to build trust at every step of the way. Explainable AI plays a crucial role in building this trust, and Fiddler has done a lot of remarkable work on state of the art XAI.
I was fortunate enough to have the opportunity to talk with Lior Berry, head of engineering at Fiddler AI, about XAI. We talked about how Fiddler AI incorporates XAI into their product offerings. XAI at Fiddler takes a five-layered approach. The first layer consists of model monitoring to protect against extensive drift and data violations. The second layer explores the features of the presented data and further examines the dominance of each feature. The third layer provides point explanations while also exploring what drove a certain model decision. The fourth layer allows the developer to do a What-If analysis of the model by testing different values. Finally, the fifth layer tackles fairness and ensures that sub-populations are treated fairly and that the model does not perform poorly on certain populations. We also talked about Lior’s perspective on the XAI sector and its challenges. He noted that an expansion of regulation in the use of AI has encouraged companies to adopt more XAI techniques. Among the challenges that Lior mentioned was generating explanations for image ML models. Fiddler is currently exploring the use of a k-means and binary search based technique to further build explainability for these models.
In addition to talking with Lior Berry, I also talked to the brilliant Krishnaram Kenthapadi. Krishnaram is the chief data scientist and chief AI officer at Fiddler. These two roles encompass managing the data science team, working with product as well as engineering, talking with customers, and exploring AI value adds for customers. During our conversation, Krishnaram noted that XAI is an extremely young yet rapidly growing field. This means that many of the problems that arise are new problems that Krishnaram often discovers through conversations with his customers. With this insight into the problems, Krishnaram is able to steer his team and researchers towards solving them. From there, our conversation drifted to how explainability will continue to lag behind as long as models become exponentially more complex. Explainability is made even more difficult by the fact that it cannot take a one-size-fits-all approach due to the differing business and legal needs of customers. We then began a more technical discussion about some of the new XAI approaches that Fiddler was tackling. With regard to the challenges of monitoring and understanding large language models (LLMs), Krishnaram mentioned that they were exploring a distribution-based approach. For image-based models, they were looking into the use of random noise to stress test models.
These conversations have revealed not only the speed of development and deployment of XAI but also the growing challenges in the space. Companies like Fiddler are working alongside researchers to tackle these challenges head on as they work to make AI more explainable and in turn more responsible.
Other Growing Players in XAI
In addition to Fiddler, there is a growing list of well funded startups and existing Big Tech firms looking to tackle the XAI space. These players include:
- IBM offers AI Factsheets, a collection of information about the creation and deployment of a model. This includes but is not limited to the model’s purpose, criticality, and characteristics.
- Google offers AutoML Tables, BigQuery ML, and Vertex AI, which incorporate aspects of Google’s built-in feature attribution, example based explanations, and model analysis.
- Arthur AI is an AI performance company working to track model performance to counteract data drift and improve model performance. Additionally, they have APIs for explainability and transparency while also allowing for the active monitoring of bias and fairness.
- Arize AI is a startup focused on ML Observability with integrations for unstructured data monitoring, drift detection, and explainability and fairness.
- Fairly is a startup focused on making AI governance easily accessible. It accomplishes this goal through a risk monitoring component that focuses on explainability and a bias detection mechanism for models.
- QuantPi is a startup that provides XAI to companies looking for reproducible explanations for documentation and compliance reports. This in turn reduces risks for enterprises and eliminates uncertainty in AI use cases.
- Dataiku provides a series of XAI techniques like What-if Analysis, bias detection, and individual prediction explanations in light of its broader goal of Everyday AI.
- Kyndi provides AI solutions that enable an explainable way of analyzing long-form unstructured text.
- Truera utilizes XAI to drive model quality improvement while also allowing developers to demonstrate model quality and fairness.
As models become more complex, we will start to see massive shortcomings in understanding the decisions they make. These shortcomings will come to clash with the growing regulations about the use of AI, which will start to require more deterministic explanations for decision making by AI. These two opposing forces, if unresolved, will lead to bottlenecks in the development of AI and the deployment of AI to improve our daily lives. Explainable AI provides one promising solution to this conflict. By making opaque black box models more transparent, explainable AI encourages greater trust in our models by providing insights into their decisions. A lot of work has already been done in explainable AI, but much more still remains. One day we will be able to ask an AI to make a decision for us and then follow up by asking it to “explain yourself.”