TensorFlow question

Where is the ground truth label Y in this code example of forward prop using TensorFlow?

Two points:
The truth Y would not be used in forward propagation, since forward propagation only makes a prediction of the Y value. For that you only need the weights and features.

However, the code example says it’s for training and optimization, and both of those tasks require computing the difference between the predicted Y and the true Y. So as written it’s a defective example.

So you are correct that the example is broken.
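To illustrate the first point, here is a minimal forward-prop sketch (the shapes and the sigmoid activation are invented for this example):

```python
import tensorflow as tf

X = tf.random.normal((4, 3))               # features: 4 examples, 3 features
W = tf.Variable(tf.random.normal((3, 1)))  # weights
b = tf.Variable(tf.zeros((1,)))            # bias

# Forward propagation produces a prediction of Y from W, b, and X alone;
# the ground truth Y is never needed at this stage.
y_hat = tf.sigmoid(tf.matmul(X, W) + b)
```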

First, I would like to know: where did you get this code example screenshot?

Is this the complete code? As shown here, it is only training the model on the x features.

Please provide a link to where you got this screenshot.

This is very similar to this one:

https://community.deeplearning.ai/t/question-about-tensorflow-code-example/889344?u=balaji.ambresh


This is provided by Prof Ng in one of his course videos.

Is it incorrect then?

Already answered in my previous reply.

I can’t provide an http link in this post.

Deep Learning Specialization→Improving Deep Neural Networks…→TensorFlow

So why does the code work as expected when Prof. Ng runs it in his Jupyter Notebook?

I just checked the lecture notes for all 3 weeks of Course 2 and couldn’t find this code example, except for a slideshow about the TensorFlow programming framework.

Are you referring to the same one?

It’s in his video lesson, not the lecture notes. He shows it working correctly in a Jupyter notebook. It does forward prop and back prop in TensorFlow, and he shows how it reduces the cost function to 0 for the correct value of w = 5.

@jjbarnes

Lecture notes are nothing but video slides.

In the screenshot, he is mainly explaining how trainable weights are declared using tf.Variable (which stores them in tensor form).

These trainable variables are then watched by TensorFlow’s GradientTape, which records the operations applied to them during training so that their gradients can be computed.
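As a rough sketch of what the tape actually does (a toy example, not the code from the lecture):

```python
import tensorflow as tf

w = tf.Variable(3.0)  # a trainable variable

# The tape records every operation applied to w in the forward pass...
with tf.GradientTape() as tape:
    loss = w ** 2

# ...so the gradient can be computed automatically afterwards.
grad = tape.gradient(loss, w)  # d(loss)/dw = 2 * w = 6.0
```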

Then, ground truth labels are used to calculate the loss by comparing model predictions to the actual targets.

This loss indicates how far the model is from the truth, allowing an optimizer to update the model weights. The labels are essential for training supervised models and evaluating performance.
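For instance, a tiny sketch of where the labels enter (the values here are invented):

```python
import tensorflow as tf

y_true = tf.constant([[1.0], [0.0], [1.0]])  # ground truth labels
y_pred = tf.constant([[0.9], [0.2], [0.7]])  # model predictions

# The loss compares the predictions against the ground truth targets.
loss = tf.keras.losses.BinaryCrossentropy()(y_true, y_pred)
```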

In object detection, ground truth labels come from bounding boxes drawn around the marked, labelled objects; these then act as the actual targets that the model’s predictions are compared against.

This topic is actually covered in the TensorFlow: Advanced Techniques Specialization, where you are taught how to calculate ground truth labels with bounding boxes. (It is an advanced course.)

Ground truth bounding boxes are hand-labeled or automated annotations that define the tightest possible rectangle around an object, typically stored as [x_min, y_min, x_max, y_max]. They are calculated by finding the outermost pixels of an object, which minimizes Intersection over Union (IoU) discrepancies during model training.
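For concreteness, here is a small IoU sketch under that [x_min, y_min, x_max, y_max] convention (illustrative only, not code from any course):

```python
def iou(box_a, box_b):
    """Intersection over Union for boxes given as [x_min, y_min, x_max, y_max]."""
    # Coordinates of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# e.g. iou([0, 0, 2, 2], [1, 1, 3, 3]) == 1 / 7
```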

With that being said, I know my explanation may have given you only partial understanding. This part of TensorFlow that Andrew is explaining requires more background topics, and even when I was doing the course I was confused until I completed the TensorFlow: Advanced Techniques Specialization. Andrew probably did not explain all of those topics here to avoid confusing learners further, as that would have caused more frustration.

So @jjbarnes, as a previous learner who has been in the same place, I approached this one topic at a time, dug for answers just as you are asking here, and checked the TensorFlow documentation. But until I did TensorFlow: Advanced Techniques, my doubts about how ground truth labels are used in model training to get the respective class outputs weren’t cleared, and I ended up with even more questions while doing Course 2 of the DLS.

So for now, understand that ground truth labels are the true class labels determined by annotation.

So what are the ground truth labels for this example code and where are they in the code?

Prof Ng demonstrates that his code works to minimise the cost function J for an optimal value of w=5

This is the lecture in DLS C2 W3 titled TensorFlow.

This is just an artificial example. He’s not training a neural network here: he just defines an example cost function that turns out not to depend on a ground truth value. He’s just defining the cost function to be a parabola and showing that TF can minimize it without you having to specify the gradients. He says all this in the lecture.


It’s not a real cost function - it’s just a very poor “for example”.


@jjbarnes

I think, as I mentioned in my previous reply, Andrew is mainly explaining here how a cost function is implemented over TensorFlow variables.

As @paulinpaloalto said, this example is more of an illustration of TensorFlow itself than of working with real data.

As I said, even I was confused when I did this topic. I agree with Tom that it is a poor choice of example for explaining any functional implementation with data.

But then, I also understand that if learners had been told at that point about GradientTape or ground truth labels, it would have pulled focus from neural network training using TensorFlow. Andrew must have chosen to explain only this part because the course we are talking about is Improving Deep Neural Networks.

Some might agree and some might not agree with this approach. Remember, this example solely explains handling your x variables with TensorFlow functions and calculating the cost function; it is not a complete explanation of inputs and ground truth labels being used to predict output classes.

Regards

Dr. Deepti

I’m not really understanding what you mean.

So how is tensorflow minimizing the cost function without gradients?

No, it uses gradients to do the minimization. The point is that you don’t need to create the gradient functions, as we did in Course 1. That’s the part that TF does for you. All you have to do is write the forward propagation logic, select the cost function and the optimizer that you want to use and TF does the rest.

Remember all the logic we had to write to compute dW^{[l]} and db^{[l]} in DLS C1.
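A minimal sketch of that division of labor, using a toy one-variable cost (not the actual notebook code; the SGD optimizer and learning rate are my choices):

```python
import tensorflow as tf

w = tf.Variable(0.0)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

# We only write the forward computation of the cost...
with tf.GradientTape() as tape:
    cost = w ** 2 - 10.0 * w + 25.0

# ...and TF computes dJ/dw for us: no hand-written backprop functions.
grads = tape.gradient(cost, [w])
optimizer.apply_gradients(zip(grads, [w]))
```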

So how can this code minimize the cost function without the ground truth labels being specified, as the gradient descent algorithm requires? Unless, that is, this code is not using gradient descent to minimize the cost function at all.

The cost function is defined by Professor Ng as a quadratic polynomial in one variable w. Here is the mathematical notation for the function, given the code that you showed in the first post on this thread:

J(w, x) = x_0 w^2 + x_1 w + x_2

That function is differentiable. TF can compute the gradients of J w.r.t. w given the value of x (constant for his purposes) and the current value of w.

He defines x = [1.0, -10.0, 25.0], so it’s pretty easy to see that J is a parabola which is concave upwards, so it will have a minimum. TF uses Gradient Descent to find that minimum.
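Putting that together, here is a sketch reconstructed from the description in this thread (not a verbatim copy of the lecture code; the optimizer, learning rate, and iteration count are assumptions):

```python
import numpy as np
import tensorflow as tf

w = tf.Variable(0.0, dtype=tf.float32)               # the single trainable parameter
x = np.array([1.0, -10.0, 25.0], dtype=np.float32)   # fixed coefficients, not data
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

def cost_fn():
    # J(w, x) = x_0 w^2 + x_1 w + x_2 = (w - 5)^2, a parabola with minimum at w = 5
    return x[0] * w ** 2 + x[1] * w + x[2]

for _ in range(500):
    optimizer.minimize(cost_fn, [w])

print(w.numpy())  # converges toward 5.0; the cost approaches 0
```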

You only need the ground truth values if the cost function is defined to involve the ground truth values.

As I said earlier, this is just a simple artificial example that he’s using to show that TF can do Gradient Descent without us having to do any differentiation ourselves in our logic.