Hi everyone! If I wanted to train a feed-forward model for classification from scratch, what values of y am I supposed to use in (ŷ - y) during gradient descent, for each neuron and layer?

Let’s say we have a 2-layer model [3 and 1 neurons] and an example where y = 1. I can see that we should train the last neuron with y = 1 during gradient descent, since that’s the final output we want, but what are the y values for the neurons in the first layer? Do we use y = 1 as well?

I hope I made myself clear haha, and sorry if this is explained later in the course or has already been explained. Thanks in advance!

For the output layer, the ‘y’ values are from the training set.

For the hidden layers, no ‘y’ values are known. So you have to use a mathematical technique called “backpropagation of errors”: each hidden layer’s error is computed from the next layer’s error via the chain rule, which requires working out the partial derivatives of the cost function. This uses calculus.
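To make that concrete, here is a minimal sketch (not course code) of backprop for the 2-layer network described above: 3 sigmoid neurons in the hidden layer and 1 sigmoid output neuron. The input size, initial weights, learning rate, and step count are all illustrative assumptions; the point is that only the output layer uses the label y directly, while the hidden layer’s error is derived from it via the chain rule.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros((3, 1))  # hidden layer: 3 neurons, 2 inputs (assumed)
W2, b2 = rng.normal(size=(1, 3)), np.zeros((1, 1))  # output layer: 1 neuron

x = np.array([[0.5], [-1.0]])  # one training example (made up)
y = 1.0                        # its label, taken from the training set

lr = 0.5
for step in range(1000):
    # forward pass
    a1 = sigmoid(W1 @ x + b1)
    y_hat = sigmoid(W2 @ a1 + b2)

    # output layer: the error uses the known label y
    delta2 = y_hat - y                        # dJ/dz2 for sigmoid + log loss

    # hidden layer: no label exists, so propagate the output error backwards
    delta1 = (W2.T @ delta2) * a1 * (1 - a1)  # chain rule: dJ/dz1

    # gradient descent updates
    W2 -= lr * delta2 @ a1.T; b2 -= lr * delta2
    W1 -= lr * delta1 @ x.T;  b1 -= lr * delta1
```

Notice that `delta1` never sees y directly; it is built entirely from `delta2` and the hidden layer’s own activations, which is exactly the “trick” backpropagation provides.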

This is a non-trivial bit of work and (edit) was just recently added to the course.

Thank you very much! It seems that later in week two there is a module called backpropagation; if it’s not covered there, I’ll be sure to look into it somewhere else.