C2_W2_Multiclass_TF - Output layer explanation

Hello @JJaassoonn

-50 to 5 is a good approximation. To get a more precise range (which is actually not quite necessary at this point), we can substitute a0 = a1 = 0 to get 3.18, and substitute a0 = 8 and a1 = 10 to get -41.2. Therefore, the range is -41.2 to 3.18.
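If you want to check this numerically, here is a minimal sketch. The weights and bias below are made up to match the endpoints quoted above (the actual values come from the trained model in your lab), and the input ranges a0 ∈ [0, 8] and a1 ∈ [0, 10] are assumed:

```python
import numpy as np
from itertools import product

# Hypothetical weights/bias consistent with the endpoints quoted above;
# substitute the actual trained values from your own lab output.
w = np.array([-2.3, -2.6])
b = 3.18

# Assumed input ranges: a0 in [0, 8], a1 in [0, 10] (ReLU outputs are >= 0).
ranges = [(0, 8), (0, 10)]

# z = w . a + b is linear, so its extremes occur at corners of the input box.
corners = np.array(list(product(*ranges)))
z = corners @ w + b
print(f"range of z: {z.min():.2f} to {z.max():.2f}")  # ~ -41.22 to 3.18
```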

Again, it is not necessary to be that precise at this point… We only need to be precise when the ranges are too close to each other.

Your statement about unit 0 is correct and important!

We want unit 0’s output to be the largest for samples of class 0. For unit 0’s output to be the largest, just seeing that the blue dots are at the bottom corner is not enough; instead, we also want to see that unit 1’s, 2’s, and 3’s outputs are smaller in value than unit 0’s output.

For unit 3, it is quite clear that class 0 samples are located in the whitest corner, which looks to be smaller than 0. If that is the case, then unit 0’s output for samples of class 0 is larger than unit 3’s (which I will leave to you to verify :wink: )

For units 1 and 2, it is less clear just from the graph, but you can verify them from their values, as in the sketch below.
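If the picture alone does not convince you, a quick numeric check works too. Everything below (the weights, biases, and class-0 sample points) is made up for illustration; substitute the actual trained values from your lab:

```python
import numpy as np

# Hypothetical trained weights (4 units x 2 inputs) and biases.
W = np.array([[-2.3, -2.6],   # unit 0
              [ 2.0, -3.0],   # unit 1
              [-3.0,  2.0],   # unit 2
              [ 0.5,  0.5]])  # unit 3
b = np.array([3.18, -2.0, -2.0, -5.0])

# A few made-up samples of class 0, near the bottom corner of the plot.
A = np.array([[0.5, 0.3],
              [1.0, 1.2],
              [0.8, 0.5]])

Z = A @ W.T + b              # pre-softmax outputs, one row per sample
print(np.argmax(Z, axis=1))  # should be all 0s: unit 0 wins on every sample
```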

I have explained this in the example above. If you want unit 0’s output to be the largest of all the units, you not only need unit 0’s output to be large, you also need units 1’s, 2’s, and 3’s to be small.

Remember, you need unit 0’s output to be the largest among all units, not just large by itself.

There is no simple example for that, but there is an explanation.

The set of weights and biases that you copied into your post was the result of training the model, right? And they were already the “coordinated” result. The forces behind that coordination are our friends - the softmax and the cost function.

The softmax and the cost function together make sure that when a sample of class 0 comes in during the training process, unit 0’s output will be maximized while units 1’s, 2’s, and 3’s outputs are minimized. To see this, you can compute the gradients with respect to all 4 units’ outputs (which I have shown here).
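As a sketch of that computation: for softmax combined with the cross-entropy cost, the gradient of the cost with respect to the logits works out to softmax(z) - y, where y is the one-hot label. The logits below are made up for illustration:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

z = np.array([1.0, 0.5, -0.2, 0.3])  # made-up logits for one sample
y = np.array([1.0, 0.0, 0.0, 0.0])   # one-hot label: class 0

# For softmax + cross-entropy, dL/dz = softmax(z) - y.
grad = softmax(z) - y
print(grad)  # entry 0 is negative, entries 1-3 are positive
```

Since the gradient for unit 0 is negative and the gradients for the other three units are positive, a gradient descent step pushes unit 0’s output up and the other three units’ outputs down - exactly the coordination described above.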

Cheers,
Raymond