Question regarding SHAP values

Hi! I have a question regarding SHAP values. I trained a neural network that predicts a score (regression) and I want to derive the variable importance using SHAP values. My model was trained on scaled (normalized) input features. Now my question is: when I calculate the SHAP values, do my input features also have to be normalized, or is it allowed to use the original inputs? I tried both approaches, once with normalized inputs and once with the original inputs. Interestingly, from a logical and practical standpoint, the importances make much more sense when I use the non-normalized variables.

The code I used is:
explainer = shap.Explainer(model.predict, X_t_scaled)
shap_values = explainer(X_t_scaled)

Looking forward to any advice! Thanks and best regards!

model.predict expects scaled data, because that is what the network was trained on, so please use X_t_scaled when computing the SHAP values.
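
For illustration, a minimal sketch of that setup, assuming scaler is the fitted sklearn scaler from training and X_t holds the original (unscaled) features (both names are my assumptions, not from your post):

import pandas as pd
import shap

# Apply the *same* fitted scaler that was used for training, then wrap the
# result in a DataFrame so the original column names are preserved.
X_t_scaled = pd.DataFrame(scaler.transform(X_t), columns=X_t.columns)

# model.predict consumes scaled data, so both the background data and the
# data being explained are passed in scaled form.
explainer = shap.Explainer(model.predict, X_t_scaled)
shap_values = explainer(X_t_scaled)

Keeping the original column names makes the plots readable even though the feature axis is in scaled units; the SHAP values themselves are in the units of the predicted score either way.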

See this example as well.

For deep learning models, you can use shap.DeepExplainer like this:

explainer = shap.DeepExplainer(model, X_train_scaled)
shap_values = explainer.shap_values(X_test_scaled)
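
A practical note, sketched under the assumption that X_train_scaled and X_test_scaled are NumPy arrays: DeepExplainer takes the model object itself (not model.predict) plus background data, and a small random background sample is usually sufficient and much faster than passing the full training set.

import numpy as np
import shap

# Draw a modest background sample (here ~100 rows) for DeepExplainer.
rng = np.random.default_rng(0)
idx = rng.choice(len(X_train_scaled), size=100, replace=False)
background = X_train_scaled[idx]

explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(X_test_scaled)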

Thanks for the help! When using your provided code I get the following warning:

UserWarning: Your TensorFlow version is newer than 2.4.0 and so graph support has been removed in eager mode and some static graphs may not be supported. See PR #1483 for discussion.

Is there a solution to that?

Did you see this?

Yes, I tried it, but unfortunately it only led to further error messages. Are there any other solutions to this problem?

Have you tried using a TensorFlow version < 2.4?

Yes, that works, but somehow when I try to plot it, the plot is "empty": it says 0 features.

For plotting I used:

shap.summary_plot(shap_values)

The second parameter of shap.summary_plot is features, which should be the data for which you computed the SHAP values; passing it gives the plot the feature values and names.
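
Untested against your exact setup, but roughly like this, assuming shap_values came from the DeepExplainer snippet above and X_test_scaled is the data you explained:

import shap

# Older DeepExplainer versions return a list with one array per model
# output; for a single-output regression take the first element.
vals = shap_values[0] if isinstance(shap_values, list) else shap_values

# Pass the explained data as the second argument so the plot has the
# feature values (for coloring) and the feature names; if X_test_scaled is
# a plain array, also pass feature_names=<your column names>.
shap.summary_plot(vals, X_test_scaled)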