Maybe Backprop optional videos should not be optional

I did not think about cost functions and how derivatives for each unit are carried out until I watched the “Larger neural network” video from the “Back Propagation Optional” videos. Until then I thought every unit would have its own cost function and gradient descent, especially because each layer can have a unique activation function.

For example, let’s consider an NN model with (Layer 1: Sigmoid), (Layer 2: ReLU), and (Output layer: Linear). If we build this model in TensorFlow, we only specify a single cost function, and it is based on the output layer; since the output activation is linear, the cost function can be MSE. I had assumed every unit, or every layer (since they have unique activation functions), would have its own cost function and gradient descent algorithm. Now that I think about it, it does not work like that, but I had never given it serious thought.
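To make that concrete, here is a minimal sketch of such a model in TensorFlow/Keras. The unit counts (25, 15, 1) are placeholders I picked, not from the course:

```python
import tensorflow as tf

# Sketch of the model described above; the layer sizes are arbitrary.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation='sigmoid'),  # Layer 1: Sigmoid
    tf.keras.layers.Dense(15, activation='relu'),     # Layer 2: ReLU
    tf.keras.layers.Dense(1, activation='linear'),    # Output layer: Linear
])

# One cost function for the whole network, chosen to match the linear
# output layer; gradients of this single loss flow back through every layer.
model.compile(loss=tf.keras.losses.MeanSquaredError(),
              optimizer=tf.keras.optimizers.Adam())
```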

So basically, without backprop we would have to substitute the Layer 1 function into the Layer 2 function, then substitute that result into the output layer’s activation function, and then take the derivative of the cost with respect to all the parameters before we could finally run gradient descent for every input value.
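In symbols (using generic weight and bias names of my own, not the course’s exact notation), the forward pass is one big composed function:

$$
\hat{y} = g_3\!\left(W^{[3]}\, g_2\!\left(W^{[2]}\, g_1\!\left(W^{[1]} x + b^{[1]}\right) + b^{[2]}\right) + b^{[3]}\right)
$$

with $g_1$ = sigmoid, $g_2$ = ReLU, $g_3$ linear, and a single cost $J = \frac{1}{2m}\sum_{i=1}^{m}\left(\hat{y}^{(i)} - y^{(i)}\right)^2$. Differentiating $J$ with respect to, say, $W^{[1]}$ means applying the chain rule through the entire composition; backprop is just a way of organizing that chain rule so intermediate derivatives are computed once and reused.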

I don’t know if I am slow to understand this or what, but I feel this is important for getting a sense of how neural networks work, even if we do not need it to use neural networks, and that is why I think these videos should not be optional.


Thanks for your recommendation.


Hello @tinted,

Thank you for sharing your thoughts with us. I believe the course sets it to optional because it is more mathematically demanding, and, in practice, deep learning frameworks such as TensorFlow and PyTorch work out the backprop and all those derivatives for us. Even if we skip those optional materials, we will still be able to build and train a neural network.
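For instance, here is a tiny sketch (the variables and values are made-up placeholders) of TensorFlow’s automatic differentiation doing that work for us:

```python
import tensorflow as tf

# tf.GradientTape records the forward pass and derives the gradients,
# so we never write the chain rule by hand.
w = tf.Variable(2.0)
b = tf.Variable(0.5)
x, y = 3.0, 7.0

with tf.GradientTape() as tape:
    y_hat = w * x + b        # forward pass
    loss = (y_hat - y) ** 2  # squared-error cost

# One call gives dloss/dw and dloss/db.
dw, db = tape.gradient(loss, [w, b])
```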

However, I do agree with you that these are details that shouldn’t be ignored as we progress toward becoming professional machine learning practitioners. But they are also details we can go into later, when we are ready, and skipping them for now won’t be harmful. Once we are more confident in using a neural network with those frameworks, we can always turn around and look into the behind-the-scenes again - it is about where you put it in your own learning roadmap.

Raymond
