In the first course we derived the explicit formulas for gradient descent in the case of regression. I am wondering if anyone has a reference that carries out all of those computations, but for a simple two-layer network with, for example, ReLU activations and mean squared error loss.

Hello @fragodec,

Welcome to our community! Course 1 of the Deep Learning Specialization will teach you about gradient descent for a multi-layer neural network. Although it focuses on classification, if you understand the material in that course together with what you have learnt in this specialization, you should be able to change the cost function to MSE and make it work for a regression problem.
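For reference, the only step that changes at the output layer is the derivative of the cost. A sketch, using the course-style notation and assuming a linear output $\hat{y} = Z^{[2]}$ with MSE cost:

$$J = \frac{1}{2m}\sum_{i=1}^{m}\left(\hat{y}^{(i)} - y^{(i)}\right)^2, \qquad dZ^{[2]} = \frac{\partial J}{\partial Z^{[2]}} = \frac{1}{m}\left(\hat{y} - y\right)$$

The remaining backprop steps ($dW^{[2]} = dZ^{[2]} A^{[1]T}$, $dZ^{[1]} = W^{[2]T} dZ^{[2]} \ast g'(Z^{[1]})$, and so on) are exactly the same as in the classification case.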

Raymond

The computations for backpropagation were covered in Andrew’s original Machine Learning course, but they’re not covered here.

This article covers the basics, though it’s not a simple read.

Note that the examples you find online usually use an NN with a linear output. This is because the backpropagation calculations for a linear output are a lot simpler than for a logistic output (which you would use for classification). The cost function for a linear output is the simple sum-of-squared-errors variety.
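To make that concrete, here is a minimal NumPy sketch (not from the course; all sizes, the learning rate, and the random data are made-up for illustration) of gradient descent on a two-layer network with a ReLU hidden layer, a linear output, and MSE cost:

```python
import numpy as np

# One hidden ReLU layer, linear output, MSE cost J = (1/2m) * sum((yhat - y)^2).
# Shapes follow the course convention: X is (n_x, m), Y is (1, m).
rng = np.random.default_rng(0)
n_x, n_h, m = 3, 4, 8
X = rng.normal(size=(n_x, m))
Y = rng.normal(size=(1, m))

# Hypothetical small random initialization
W1 = rng.normal(size=(n_h, n_x)) * 0.1
b1 = np.zeros((n_h, 1))
W2 = rng.normal(size=(1, n_h)) * 0.1
b2 = np.zeros((1, 1))

lr = 0.1
costs = []
for _ in range(200):
    # Forward pass
    Z1 = W1 @ X + b1
    A1 = np.maximum(0, Z1)                 # ReLU activation
    Z2 = W2 @ A1 + b2                      # linear output (yhat)
    costs.append(np.mean((Z2 - Y) ** 2) / 2)

    # Backward pass (chain rule)
    dZ2 = (Z2 - Y) / m                     # dJ/dZ2 for MSE + linear output
    dW2 = dZ2 @ A1.T
    db2 = dZ2.sum(axis=1, keepdims=True)
    dA1 = W2.T @ dZ2
    dZ1 = dA1 * (Z1 > 0)                   # ReLU derivative is 1 where Z1 > 0
    dW1 = dZ1 @ X.T
    db1 = dZ1.sum(axis=1, keepdims=True)

    # Gradient descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```

Running this, the cost in `costs` decreases steadily, which is a handy sanity check when you derive the gradients by hand.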