I have a question about whether a part of Andrew’s lecture is really correct. In the Week 1 “CNN Example” video, at around minute 11:00, the table states that the number of parameters for FC3, FC4, and softmax are 48,001, 10,081, and 841 respectively.
This is my first course in the Deep Learning Specialization; I have not taken courses 1–3. I have, however, completed Andrew’s “Machine Learning” course, where it was said that the number of parameters between layers j and j+1 is given by a weight matrix of dimensions s_{j+1} × (s_j + 1) (as can be seen at the bottom of the screenshot).
Shouldn’t that mean the number of parameters given for the fully connected layers should read 120 × 401 = 48,120, 84 × 121 = 10,164, and lastly 10 × 85 = 850?
The reasoning in the “Machine Learning” course was that a bias unit is added to the previous layer and connects to every node of the next layer. So, for example, if your last layer consists of 10 nodes, each of those nodes has its own bias weight, giving 10 × 84 + 10 = 850 parameters (and not the 841 presented in the table in the lecture).
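To make the discrepancy concrete, here is a small sketch comparing the two counting conventions (layer sizes are taken from the lecture’s table; the function names are my own):

```python
def params_bias_per_unit(n_in, n_out):
    # one bias weight per output unit, as taught in the "Machine Learning" course
    return n_out * n_in + n_out  # equivalently n_out * (n_in + 1)

def params_single_bias(n_in, n_out):
    # a single bias weight for the whole layer, which the lecture's table seems to assume
    return n_out * n_in + 1

# (name, inputs, units) for each fully connected layer in the example
layers = [("FC3", 400, 120), ("FC4", 120, 84), ("softmax", 84, 10)]
for name, n_in, n_out in layers:
    print(name, params_bias_per_unit(n_in, n_out), params_single_bias(n_in, n_out))
```

This prints 48120/48001, 10164/10081, and 850/841 — the lecture’s numbers match the single-bias formula exactly, which is what prompted my question.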
It may well be that something was said about this in the previous three courses of the specialization. I would just like someone to explain why we would have only one bias weight per layer, or to confirm that this is indeed a mistake in the lecture.