Confused on C2_W2_SoftMax Lab

Hi team,

I am quite confused about the objective of this lab. I would greatly appreciate it if someone could provide a detailed explanation for the following questions.

Q1: What are we trying to predict here? Is it four different classifications (0, 1, 2, 3)?

Q2: X_train is (2000 by 2), so why is the output 2 by 4?

Q3: The content below says "the outputs are not probabilities, but can range from large negative numbers to large positive numbers." Shouldn't -log(e^z / sum(e^z)) always be greater than 0, so that the outputs range from 0 to large positive numbers?

Q4: The content below says: "To select the most likely category, the softmax is not required. One can find the index of the largest output using [np.argmax()]." What does this mean? Does it mean we can just use np.argmax() to make predictions instead of using softmax? (numpy.argmax — NumPy v1.26 Manual)

Many thanks for your help! I look forward to your responses.

Christina

A1) The training dataset consists of 2000 examples, where each example has two coordinates, and the values are clustered around four different coordinate pairs:
centers = [[-5, 2], [-2, -2], [1, 2], [5, -2]]
These clusters are numbered 0, 1, 2, and 3 by the ‘y_train’ values.
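Here is a minimal sketch of how such a dataset might be generated (this assumes sklearn's make_blobs; the lab's exact call and random seed may differ):

```python
# Sketch only: generate 2000 two-coordinate examples clustered around four centers.
from sklearn.datasets import make_blobs

centers = [[-5, 2], [-2, -2], [1, 2], [5, -2]]
X_train, y_train = make_blobs(
    n_samples=2000, centers=centers, cluster_std=1.0, random_state=30
)
print(X_train.shape)  # (2000, 2) -- two coordinates per example
print(y_train[:10])   # integer cluster labels 0..3
```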

However, since we want the model to assign the best class to each example, we treat this as a classification task rather than directly trying to predict the integer values 0 through 3 (which would be a regression approach).

This is why the model is compiled with SparseCategoricalCrossentropy and either a softmax output or "from_logits = True". Either way, the loss treats the integer y_train values as one-hot targets over the four outputs (one for each class), as seen in the model definition here:
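A rough sketch of that kind of model definition (the layer sizes here are illustrative, not necessarily the lab's exact ones):

```python
# Linear (logit) output layer plus from_logits=True in the loss:
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation='relu'),
    tf.keras.layers.Dense(15, activation='relu'),
    tf.keras.layers.Dense(4, activation='linear'),   # 4 outputs, one per class
])
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.Adam(0.001),
)
model.fit(X_train, y_train, epochs=10)
```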

A2) The size (2 x 4) output gives the four output values for each of two selected examples.
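For example, predicting on just two examples produces one row per example and one column per class (using the sketch model above):

```python
# Two input examples in, a (2, 4) array of raw outputs (logits) out.
p = model.predict(X_train[:2])
print(p.shape)  # (2, 4)
print(p)        # values are logits, not probabilities
```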

A3) When the model uses a linear output, the values can range from -Inf to +Inf. The softmax function re-scales these values to between 0.0 and 1.0.
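A small numeric illustration with made-up logits, showing that the softmax values lie between 0 and 1 while their negative log (the loss term from Q3) is always non-negative:

```python
import numpy as np

logits = np.array([-3.2, 0.5, 4.1, -0.7])        # unbounded linear outputs
probs = np.exp(logits) / np.sum(np.exp(logits))  # softmax
print(probs)           # each value lies in (0, 1)
print(probs.sum())     # ~1.0
print(-np.log(probs))  # >= 0, as Q3 notes, even though the logits themselves are unbounded
```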

I created a plot of the X_train data, color-coded by the y_train values:

Thank you, Tom, for your detailed responses. Much appreciated.

I could not see a response for Q4; could you please provide an answer to it as well?

Q4: The content below says: "To select the most likely category, the softmax is not required. One can find the index of the largest output using [np.argmax()]." I don't get what this means. Does it mean we can just use np.argmax() to make predictions instead of using softmax? (numpy.argmax — NumPy v1.26 Manual)

Since the softmax function is monotonic, it doesn't change which output has the highest value.

So if you just need to find the output that has the highest value, you don’t need softmax at all.
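A quick check with made-up logits, showing that argmax picks the same index whether or not softmax is applied:

```python
import numpy as np

logits = np.array([-3.2, 0.5, 4.1, -0.7])        # made-up raw model outputs
probs = np.exp(logits) / np.sum(np.exp(logits))  # softmax of those outputs

print(np.argmax(logits))  # 2
print(np.argmax(probs))   # 2 -- same winner, because softmax preserves the ordering
```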

Brilliant, thanks Tom.