I have also taken the older Machine Learning course (Stanford).
In that course, for layer j the weight matrix Theta^(j) has dimensions s_(j+1) × (s_j + 1), where s_j is the number of neurons in layer j. Its first column holds the bias terms for each neuron in layer j+1.
But in this course the weight matrix W^(j) has dimensions s_j × s_(j+1), and the biases are kept in a separate vector.
Can’t we follow the older course representation of weight matrix?
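To add some context to the question: the two representations hold the same information, just arranged differently. A minimal NumPy sketch (layer sizes and random values are illustrative assumptions) showing that a forward pass with the old-course matrix Theta, whose first column is the bias, gives the same pre-activations as the new-course separate W and b:

```python
import numpy as np

rng = np.random.default_rng(0)
s_j, s_j1 = 3, 4                 # neurons in layer j and layer j+1 (example sizes)
a = rng.standard_normal(s_j)     # activations of layer j

# Old course: Theta is s_(j+1) x (s_j + 1); first column = bias terms
Theta = rng.standard_normal((s_j1, s_j + 1))
z_old = Theta @ np.concatenate(([1.0], a))   # prepend the bias unit a_0 = 1

# New course: W is s_j x s_(j+1), bias b kept as a separate vector
W = Theta[:, 1:].T    # same weights, transposed
b = Theta[:, 0]       # same biases, pulled out of the first column
z_new = a @ W + b

# Both conventions produce identical pre-activations
print(np.allclose(z_old, z_new))   # True
```

So either layout can be used; the new course's s_j × s_(j+1) shape with a separate bias simply matches how frameworks like TensorFlow store a dense layer's kernel and bias.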