Need Help: Adding Confusion Matrix to Mobilenet V2 Assignment

Greetings, fellow coders. I am trying to add a confusion matrix to the output of my MobileNet-based classifier, but I haven't been able to figure out how. Any help?

My code:

{moderator edit - solution code removed}

Hi, Ankit.

It looks like the only changes you have made so far are to convert the output from 1 unit (binary classification) to multi-class and to add softmax as the output activation. There is some confusion about the loss functions: you import categorical_crossentropy, reference binary_crossentropy, and then appear to actually use sparse_categorical_crossentropy, which is a slightly different (but related) thing. Note that you’ve also changed from from_logits = True mode to from_logits = False mode, which is the default.
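To make that distinction concrete, here’s a minimal sketch (with invented values, not your code) showing that the two categorical losses compute the same thing and differ only in the label format they expect:

import tensorflow as tf

# Invented 3-class example: softmax outputs (so from_logits = False)
probs = tf.constant([[0.7, 0.2, 0.1],
                     [0.1, 0.8, 0.1]])

sparse_labels  = tf.constant([0, 1])                  # integer class indices
one_hot_labels = tf.one_hot(sparse_labels, depth = 3) # same labels, one hot

# sparse_categorical_crossentropy takes integer labels ...
loss_sparse = tf.keras.losses.sparse_categorical_crossentropy(sparse_labels, probs)
# ... while categorical_crossentropy takes one hot labels
loss_onehot = tf.keras.losses.categorical_crossentropy(one_hot_labels, probs)
# Both yield the same per-sample loss values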

So where is the “confusion layer”?

The other bigger issue here is that I am uncomfortable with the fact that you have basically published your solution to the original problem in the exercise. I don’t think it is a good idea to do that. So if you want to talk about extending it, we need to figure out a way to talk about that without revealing your source code for the actual solution. I will edit your post to remove the source code.

Sorry if that’s not very much help, but we have to work within the rules here. Please take another shot at this, but let’s start by discussing what a confusion layer is and why you want to add it for your purposes here, and then talk in more general terms about how to approach that. Note for starters that Prof Ng has not discussed “confusion layers” in Courses 1 through 4 here. I have not finished Course 5, so I don’t know if he discusses that there. So this question is beyond the scope of DLS Course 4, meaning that we need to start from something closer to “first principles” here. You can’t just assume the rest of us know what you are talking about. :nerd_face:

Hi Paul! Apologies for posting the code, and thanks a lot for removing it. Coming back to my trouble:
I am doing multiclass classification and therefore the loss is defined as sparse_categorical_crossentropy instead of binary_crossentropy. The latter is in the code but not being used.

Coming back to the first principles as you suggested: I want to see how well the model can classify the respective classes. Some of my classes are visually more similar to each other, and I want to find out if these classes are causing the drop in the overall training and validation accuracy. From my limited Google search, I found out that a confusion matrix is the way to go, but I haven’t been able to integrate anything into my code. I am a biologist who’s new to programming, so any help would be much appreciated!
Eager to hear some thoughts on this matter.

I have never looked at confusion layers before, so this will be a learning experience for both of us. :nerd_face: Well, I assume you found some articles about confusion layers. I’ll do my own google search in a second, but what do the articles say about how a confusion layer functions? They must have given some hints on this: e.g. does it act as a preprocessing layer, modifying the input data in some way like augmentation? Or is it an internal layer in the network that performs like another hidden layer? Or is it a post-processing layer that you apply to the output of your softmax layer? And what are the hyperparameters that you use to configure your confusion layer?

Or to put the question at a higher level: from the reading you have done so far, what is the point of a confusion layer and why do you think it will be useful in your application?


From my current understanding, there are no confusion layers. The confusion matrix interprets the training output, and just like we plot training and validation accuracy, we can plot/print the confusion matrix, which shows per-class performance (from which you can read off metrics like precision and recall).
Relevant links: 1, 2

Ok, thanks for the links. So a “confusion matrix” is just a technique for evaluating the output of your model compared to the labels. It just gives you a convenient way to look at the metrics like precision and recall that are used to evaluate the performance of your model in a unified fashion.

So what you need to do is to convert the output of your model (softmax values) into the appropriate form to compute the confusion matrix. That is, convert from the probability form of the softmax output to actual predictions. That’s pretty easy:

predictions = tf.math.argmax(activations, axis = -1)

That just computes the index that has the largest softmax output value for each input sample. It’s the multiclass equivalent of saying:

predictions = (activations > 0.5)

in the case of a binary classifier. Then there’s a TF function to compute the confusion matrix. Here’s the docpage. You feed both the predictions and the labels into it as 1D arrays of index values (not “one hot” vectors).
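If your labels happen to be stored as “one hot” vectors, the same argmax trick converts them to 1D index values. A tiny sketch with invented values:

# Hypothetical one hot labels converted to 1D index values
one_hot_labels = tf.constant([[1., 0., 0.],
                              [0., 0., 1.],
                              [0., 1., 0.]])
label_indices = tf.math.argmax(one_hot_labels, axis = -1)  # -> [0, 2, 1]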

So that will give you the confusion matrix. Now the question is how to interpret the results and whether it actually tells you anything “actionable”.

Let’s try a little toy example to see what it looks like:

labels = tf.constant([0, 2, 1, 1, 2, 3, 3, 3, 2, 2, 0, 1])
preds  = tf.constant([0, 0, 0, 1, 2, 1, 1, 3, 3, 2, 0, 0])

confusion = tf.math.confusion_matrix(labels, preds, num_classes = 4)
print(f"confusion matrix:\n {confusion}")

Running that gives this:

confusion matrix:
[[2 0 0 0]
 [2 1 0 0]
 [1 0 2 1]
 [0 2 0 1]]

As the documentation describes, the columns represent the predictions and the rows represent the true values. So the numbers on the main diagonal are the number of correct predictions for each class, and anything off the diagonal represents incorrect predictions.

So for label 0, we can see that every 0 sample was correctly predicted, but there were a total of 3 “false positives” for 0. We can see that samples of type 1 and 2 are quite likely to be falsely predicted as 0.

For label 2, we can see that those are likely to be “false negatives” and can get predicted as either 0 or 3.

And so forth for labels 1 and 3 …
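Note also that per-class precision and recall can be read straight off the matrix: precision for a class is its diagonal entry divided by its column sum, and recall is the diagonal entry divided by its row sum. A quick sketch, reusing the confusion tensor from the toy example above:

# Per-class precision and recall from the confusion matrix
conf_float = tf.cast(confusion, tf.float32)
diag = tf.linalg.diag_part(conf_float)
precision = diag / tf.reduce_sum(conf_float, axis = 0)  # diagonal / column sums
recall    = diag / tf.reduce_sum(conf_float, axis = 1)  # diagonal / row sums
print(f"precision per class: {precision}")
print(f"recall per class: {recall}")

For the toy example, that gives a recall of 1.0 for label 0 but only 0.5 for label 2, which matches what we just read off visually.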

So that gives you a quick visual way to figure out which of the labels are the most problematic for your model. Then you can do further analysis to figure out if the inputs are incorrectly labelled or if perhaps you need more data for those difficult classes. This sort of topic (what to do when your model doesn’t predict as well as you require) was addressed more in Course 3. A lot of people skip that one, because there are no programming assignments, but there are lots of interesting ideas taught there that may help in this sort of situation.

Do the activations need to be defined separately? I got the error:
NameError: name 'activations' is not defined

I didn’t mean that is necessarily the variable name to be used in your particular case. I was just writing some sample code to demonstrate what a confusion matrix looks like.

The “activations” are whatever the softmax output is for your model. How do you compute that?

I see. I’m trying to figure out a way to get the activations of the last layer.

Let’s see what we can learn by looking at the existing code in the notebook. At first they just plot the accuracy, which doesn’t show you how to get the actual activation output. Also note that their network is set up differently: it does not include the output layer activation (sigmoid in that case) and uses from_logits = True mode on the loss function. Ok, so not much help there. Sorry.

I created my own test block in the MobileNet exercise to process a single image. Here’s how I did it:

# Try running predict on a single image

image_var = tf.Variable(augmented_image)
print(f"image_var shape {tf.shape(image_var)}")
plt.imshow(augmented_image[0])

# Freeze the model so this is pure inference
model2.trainable = False

# model2 outputs logits here, so apply sigmoid to get a probability
pred_logit = model2(image_var)
pred = tf.math.sigmoid(pred_logit)

Of course the thing to note there is that I was using the original model as they defined it here, meaning that the outputs are logits as opposed to activations, so I had to manually apply sigmoid. You wouldn’t need to do that extra step, since you included softmax in the output layer of the model.
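In your case it would just be something like this (a sketch, assuming your multiclass model is also called model2 and ends in softmax):

# Softmax is already in the model, so the output is probabilities
pred_probs = model2(image_var)
pred_class = tf.math.argmax(pred_probs, axis = -1)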

The other way to approach this is to use the “predict” method of the Keras Model class, but the documentation says to reserve that for dealing with large batches. Just directly invoking the model, as I showed in the previous example, is fine if the amount of data used as input is relatively small.
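Putting the pieces together, here’s a rough sketch of how you might compute the confusion matrix over your whole validation set. Note that validation_dataset is a hypothetical name here; substitute whatever your batched tf.data.Dataset is called:

# Collect predictions and labels batch by batch, then compute the matrix.
# Assumes validation_dataset yields (images, labels) batches with integer
# labels and that the model's output layer is softmax.
all_preds  = []
all_labels = []
for images, labels in validation_dataset:
    probs = model2(images, training = False)  # softmax activations
    all_preds.append(tf.math.argmax(probs, axis = -1))
    all_labels.append(labels)

preds  = tf.concat(all_preds, axis = 0)
labels = tf.concat(all_labels, axis = 0)
confusion = tf.math.confusion_matrix(labels, preds)
print(f"confusion matrix:\n {confusion}")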

Note that the Emojify assignment (in Course 5 Week 2) includes a confusion matrix function and plot. Perhaps come back to this topic after you complete that assignment.
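In the meantime, if you want a visual version rather than the raw printout, here’s a minimal matplotlib sketch (using the confusion tensor from the toy example above):

import matplotlib.pyplot as plt

# Simple heatmap of the confusion matrix with the counts overlaid
fig, ax = plt.subplots()
ax.matshow(confusion, cmap = 'Blues')
ax.set_xlabel('predicted class')
ax.set_ylabel('true class')
for i in range(confusion.shape[0]):
    for j in range(confusion.shape[1]):
        ax.text(j, i, int(confusion[i, j]), ha = 'center', va = 'center')
plt.show()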


Hey, friend. Have you solved your problem? How did you do it?