You are right that the outputs are Z3, the “logits”, meaning the linear activation outputs of the last layer, as opposed to the full activation outputs (A3). But that is not the reason for the transpose: the activation functions are always applied “elementwise”, so they don’t change the dimensions or orientation of the data. It just turns out that the network was defined to take data oriented as n_x x m, where n_x is the number of features and m is the number of samples. That’s the way Prof Ng chose to orient the data in Course 1 and earlier in Course 2. But the TensorFlow functions we are now switching to assume that the “samples” dimension is the first dimension. That is why you have to do the transpose to get m as the first dimension. They mention this in the instructions for the compute_cost section of the assignment.
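Here’s a minimal sketch of what that looks like, assuming (as in Course 1) that both Z3 and Y come out of forward prop oriented features-first, shape (n_y, m). The shapes and variable names here are just illustrative, not the assignment’s exact code:

```python
import tensorflow as tf

# Sketch only: assume Z3 (logits) and Y (labels) are oriented features-first,
# i.e. shape (n_y, m), the Course 1 convention.
n_y, m = 6, 32
Z3 = tf.random.normal((n_y, m))                                   # logits, (n_y, m)
Y = tf.transpose(
    tf.one_hot(tf.random.uniform((m,), maxval=n_y, dtype=tf.int32), n_y)
)                                                                 # labels, (n_y, m)

# The TF loss expects the samples dimension first, so transpose both to
# (m, n_y) before computing the cross entropy. from_logits=True means Z3
# stays un-activated (no softmax applied in the last layer).
cost = tf.reduce_mean(
    tf.keras.losses.categorical_crossentropy(
        tf.transpose(Y), tf.transpose(Z3), from_logits=True
    )
)
print(cost.numpy())
```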
Note that the reason the network outputs the logits instead of the activation outputs is that Prof Ng chose to use the from_logits = True mode of the various cross entropy loss functions here. This is the way it will be whenever we’re using TF. Here’s another recent thread which discusses that.
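If it helps, here’s a small hedged example (the numbers are made up) showing that passing logits with from_logits=True gives the same loss as applying the softmax yourself and passing the resulting probabilities, while being more numerically stable:

```python
import tensorflow as tf

# Hypothetical logits for 3 samples and 4 classes (samples-first orientation).
logits = tf.constant([[2.0, 1.0, 0.1, -1.0],
                      [0.5, 2.5, 0.3,  0.0],
                      [1.2, 0.7, 3.0,  0.4]])
labels = tf.one_hot([0, 1, 2], depth=4)

# With from_logits=True the loss applies the softmax internally,
# so the last layer of the network stays linear (outputs Z3, not A3).
loss_from_logits = tf.keras.losses.CategoricalCrossentropy(from_logits=True)(labels, logits)

# Equivalent in value, but you have to apply the softmax yourself first.
loss_from_probs = tf.keras.losses.CategoricalCrossentropy(from_logits=False)(
    labels, tf.nn.softmax(logits)
)

print(loss_from_logits.numpy(), loss_from_probs.numpy())
```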