Model Accuracy and Interpretability
A big part of deciding which supervised machine learning model to use for your project is determining whether you value how accurately the model predicts future events, or whether you care mostly about how easy the model and its outcomes are to interpret. It seems easy enough: the most desirable model is one that has the highest level of accuracy and is easy to interpret. However, as we will soon learn, it is almost impossible to have such a model. Model complexity and interpretability are inversely related, whereas complexity and accuracy are directly related. That is, the more complex the model, the more accurate it generally is, and vice versa. Equally, the more complex a model is, the less interpretable it is, and vice versa. It follows that the more accurate a model is, the less likely it is to be interpretable. In machine learning, this is known as the accuracy-interpretability tradeoff. In this post, we will discuss the tradeoff and how to strike the right balance in your next supervised learning project.
Model Complexity
There isn't really one accepted definition of model complexity, because so many things go into deciding whether or not a model is complex. For me, model complexity is determined by how well a model is able to pick up interactions between variables in the data. More complex models are those that do a good job of picking up the underlying relationships and interactions within the data and use this information to make more accurate predictions on unseen data.
Several factors affect the complexity of a model. The first is the number of features in the data: the more features the dataset has, the more complex the model is likely to be, because more parameters have to be estimated. A simple linear regression model with one feature and one target variable is less complex than a model with multiple features, for example. Furthermore, the more features in the dataset, the higher the possibility of complex relationships and interactions between the features. A model can also be more or less complex based on its structure: a linear model is far less complex than a nonlinear model such as a quadratic or cubic model, as the sketch below illustrates.
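To make this concrete, here is a minimal sketch (using scikit-learn on made-up data, so the numbers are purely illustrative) that fits the same one-feature dataset with a plain linear model and with a cubic model. The cubic pipeline has to estimate more parameters, which is exactly what makes it the more complex model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))          # one feature
y = 2.0 * X.ravel() + 0.5 * X.ravel() ** 2 + rng.normal(0, 3.0, 200)

# Simple linear model: 2 parameters (slope + intercept).
linear = LinearRegression().fit(X, y)

# Cubic model of the same feature: 4 parameters (x, x^2, x^3 + intercept).
cubic = make_pipeline(PolynomialFeatures(degree=3, include_bias=False),
                      LinearRegression()).fit(X, y)

print("linear parameters:", linear.coef_.size + 1)    # -> 2
print("cubic parameters:", cubic[-1].coef_.size + 1)  # -> 4
```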
Model complexity may also refer to the complexity of the model's learning process, or to how computationally complex a model is. Nonetheless, overly complex models usually require a lot of computational capacity, are susceptible to overfitting, and are not easily interpreted. Deep learning models are examples of complex models.
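That overfitting risk is easy to demonstrate. In this hedged sketch (again scikit-learn on synthetic data; the depths chosen are arbitrary), a decision tree that is allowed to grow deeper fits the training set better and better while its performance on held-out data stalls or drops:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification problem.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Deeper trees = more complex models. Watch the train/test gap widen.
for depth in (2, 5, 20):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    print(f"depth={depth:2d}  train={tree.score(X_tr, y_tr):.2f}  "
          f"test={tree.score(X_te, y_te):.2f}")
```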
Predictive Accuracy vs. Interpretability
Complex models, like most deep learning models, have a strong capacity to find hidden relationships in data and make more accurate predictions. The process by which they learn and discover these hidden relationships often results in reduced interpretability.
This is because these models tend to involve many features with complex relationships that are difficult to interpret, and that sometimes may not even be immediately visible (especially in the case of latent influences), intuitive to a person, or aligned with business sense.
This tension between interpretability and accuracy is a fundamental trade-off when using Machine Learning algorithms for prediction. The specific use case should determine the complexity of the model used.
Deciding whether you care more about accuracy or interpretability ultimately depends on the goal of your project. If your goal is inferential in nature, you will be better off using models that are easily interpreted. If your goal is prediction, then you want models with high predictive power.
Prediction vs. Inference
In statistics and data science, inference and prediction are often confused with each other, in large part because they are neither mutually exclusive nor all that different. In fact, in most analyses, inferences arise from prediction. Asking how predictor variables (e.g. X and Z) predict the outcome variable (e.g. Y) results in inferences about how important X and Z are to Y. Still, the two goals are distinct, and any researcher or analyst should establish the goal of their analysis up front: to make predictions or to draw inferences.
The goal of prediction is to create robust models that use all available predictors to predict the outcome variable with a high degree of accuracy and low error. The goal of inference, on the other hand, is to estimate the relationships between the outcome and the predictor variables.
For example, in a marketing model whose goal is prediction, you might ask: what sales level does the model predict given a marketing budget across TV, radio, Facebook, and newspapers? Inference, on the other hand, cares about how each advertising platform influences sales. The sketch below shows both goals side by side.
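Here is a minimal sketch of the two goals on synthetic advertising data (the column names, coefficients, and noise are all made up for illustration). For inference, an ordinary least squares model from statsmodels exposes coefficients, p-values, and confidence intervals; for prediction, a more flexible random forest is judged purely on held-out accuracy:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
ads = pd.DataFrame({
    "tv":        rng.uniform(0, 300, n),
    "radio":     rng.uniform(0, 50, n),
    "facebook":  rng.uniform(0, 100, n),
    "newspaper": rng.uniform(0, 100, n),
})
# Made-up ground truth: newspaper deliberately has no effect on sales.
sales = (5 + 0.05 * ads["tv"] + 0.20 * ads["radio"]
         + 0.10 * ads["facebook"] + rng.normal(0, 2.0, n))

# Inference: an interpretable linear model with estimated effects + uncertainty.
ols = sm.OLS(sales, sm.add_constant(ads)).fit()
print(ols.summary())   # coefficients, p-values, confidence intervals

# Prediction: a flexible model judged purely on held-out accuracy.
X_tr, X_te, y_tr, y_te = train_test_split(ads, sales, random_state=0)
rf = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
print("held-out R^2:", round(rf.score(X_te, y_te), 2))
```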
If inference is the goal, you want to use models that are easily interpretable. In this case, you care about explaining the underlying relationships between the variables. Complex models with high accuracy don’t give you enough room to present and explain those relationships.
If the goal is prediction, then you will need to use more complex models that predict with a high degree of accuracy (the only caveat here is that you must be careful of the problem of overfitting). Using models that can be easily interpreted but perform terribly at prediction will hardly be useful in this case.
Final Thoughts
When deciding what I value most between accuracy and interpretability, I do so with the model's or product's end user in mind. I try to understand what their business values and needs are, and what they care about most.
If I am building a model that predicts whether a patient has cancer, I will tend to use more complex models (e.g. neural networks instead of logistic regression), because the patient will mostly care about getting the prediction right, not about being able to explain how the answer was reached.
When building a model that predicts whether or not a potential client will default on a loan, I will probably use interpretable models, because a client is likely going to want to know why their application was declined, for example. The sketch below shows what such an explanation can look like.
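As a hedged sketch of that loan scenario (the features, data, and decision rule are all invented for illustration), a logistic regression's coefficients give per-feature reasons, with sign as direction and size as strength of effect, that can be translated into an explanation for the client:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 1000
features = ["credit_score", "debt_to_income", "history_years"]
X = np.column_stack([
    rng.normal(650, 50, n),     # credit score
    rng.uniform(0, 1, n),       # debt-to-income ratio
    rng.integers(0, 30, n),     # years of credit history
])
# Made-up rule: low score and high debt-to-income raise default risk.
logit = -0.02 * (X[:, 0] - 650) + 3.0 * X[:, 1] - 1.5
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

model = make_pipeline(StandardScaler(), LogisticRegression()).fit(X, y)

# Standardized coefficients: each one is a reason a declined client
# can be given ("your debt-to-income ratio raised your default risk").
for name, coef in zip(features, model[-1].coef_[0]):
    print(f"{name:>15}: {coef:+.2f}")
```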
Basically, have people in mind when you are building models and systems.
Below is a graph that shows some common models and their levels of interpretability and accuracy.
Note: there is so much more that goes into deciding what model to use besides interpretability and prediction accuracy. We will explore those factors in future posts.
Happy building!
References
Rane, S. (2018, December 3). The balance: Accuracy vs. Interpretability. Retrieved from Towards Data Science: https://towardsdatascience.com/the-balance-accuracy-vs-interpretability-1b3861408062
Jain, I. (2022, June 12). What is Complexity of a Machine Learning Model? Retrieved from Medium: https://ishanjain-ai.medium.com/model-complexity-explained-intuitively-e179e38866b6
Goodrum, W. (2016, November 4). Balance: Accuracy vs. Interpretability in Regulated Environments. Retrieved from Elder Research: https://www.elderresearch.com/blog/balance-accuracy-vs-interpretability-in-regulated-environments/