C5W2A2 - a better confusion matrix

Not so much a question as a suggested improvement to the confusion matrix that lets you graph your Emoji results.

I think the original approach, where the plotted grid includes the totals at the right and bottom, is super unhelpful: the colour scale is dominated by the bottom-right grand total, so you can't really see the relative balance of on- and off-diagonal elements for the predicted versus actual labels 0-4.

But if I’ve missed the point of the original approach please call this out.

A much better approach is below. The changes are:

  1. Set margins=False to remove the totals at the right and bottom.
  2. Actually graph the df_conf_norm values rather than the absolute counts (the original passed df_confusion to the matshow() call).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def plot_confusion_matrix(y_actu, y_pred, title='Confusion matrix', cmap=plt.cm.gray_r):

    # Cross-tabulate actual vs. predicted labels; margins=False drops the total row/column
    df_confusion = pd.crosstab(y_actu, y_pred.reshape(y_pred.shape[0],),
                               rownames=['Actual'], colnames=['Predicted'],
                               margins=False)

    # Normalise each row by its total, so each cell is the fraction of that actual class
    # (note the .div(..., axis=0): a plain division by sum(axis=1) aligns on the columns instead)
    df_conf_norm = df_confusion.div(df_confusion.sum(axis=1), axis=0)
    plt.matshow(df_conf_norm, cmap=cmap)  # imshow also works
    #plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(df_confusion.columns))
    plt.xticks(tick_marks, df_confusion.columns, rotation=45)
    plt.yticks(tick_marks, df_confusion.index)
    #plt.tight_layout()
    plt.ylabel(df_confusion.index.name)
    plt.xlabel(df_confusion.columns.name)
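
For reference, here is a minimal sketch of how the function can be called. The arrays below are synthetic stand-ins for the test labels and model predictions, and the predictions are shaped (m, 1) to match the reshape inside the function; the names are purely illustrative, not the assignment's actual variables:

# Synthetic stand-in labels (0-4) and predictions shaped (m, 1)
y_actu = np.array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1])
y_pred = np.array([0, 1, 2, 3, 4, 0, 2, 2, 3, 0, 0, 1]).reshape(-1, 1)

plot_confusion_matrix(y_actu, y_pred)
plt.show()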

The confusion matrix uses gray-scale shading (based on the values in the table) to show how well each label is predicted.

There are many ways to construct such a graphic.
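
For example, if you're happy to depend on scikit-learn (1.0 or later), something along these lines should give a similar row-normalised plot (the variable names are carried over from the sketch above):

from sklearn.metrics import ConfusionMatrixDisplay

# normalize='true' divides each row by the number of actual examples of that class
ConfusionMatrixDisplay.from_predictions(y_actu, y_pred.ravel(),
                                        normalize='true', cmap=plt.cm.gray_r)
plt.show()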