I really don’t understand the use of the additional side branches in Inception networks. Prof. Andrew said they help reduce overfitting, but I don’t get how that works.
Hi Mustafa!
Hope you are doing well.
These are called auxiliary classifiers. We try to do the classification using the hidden layers too: the loss incurred there is added to the main loss, so the total loss increases → analogous to adding extra terms for regularization.
Hope you get what I’m trying to say. If not, feel free to post your queries and I’ll explain in more detail.
Regards,
Nithin
But how will the additional loss affect the parameters of the main network? The additional loss comes from the parameters of another network, so how can it affect them?
They all belong to the same “whole network”; it is just that these losses come from the predictions made by the intermediate layers.
For the sake of understanding, say we have 10 layers, and we classify using only the first 6 layers as well as using all 10 layers. In both cases the weights of the first 6 layers are the same, right?
So by minimizing the total loss (loss from the whole network + losses from the auxiliary classifiers), we update the whole network’s parameters. The auxiliary classifiers’ parameters belong to the main network; they are not a separate entity, just branches off the intermediate layers.
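To make the gradient flow concrete, here is a tiny toy sketch (not the actual GoogLeNet code; the single shared weight and the identity/scale heads are illustrative assumptions). One shared “early layer” weight feeds both a main head and an auxiliary head, and differentiating the combined loss shows that the shared weight receives gradient contributions from *both* heads. The 0.3 weighting on the auxiliary loss is the value the GoogLeNet paper uses:

```python
# Toy example: one shared weight feeds two heads; the combined
# loss sends gradient from both heads back to the shared weight.
w = 1.0   # shared "early layer" parameter (illustrative)
x = 2.0   # input
t = 3.0   # target for both heads

def main_head(h):       # main classifier: identity head (toy choice)
    return h

def aux_head(h):        # auxiliary classifier: its own scaling (toy choice)
    return 0.5 * h

h = w * x               # shared hidden activation

main_loss = (main_head(h) - t) ** 2
aux_loss = (aux_head(h) - t) ** 2
total_loss = main_loss + 0.3 * aux_loss   # 0.3 = GoogLeNet's aux weight

# Gradient of the total loss w.r.t. the SHARED weight w:
# both terms depend on h = w * x, so both contribute.
grad_main = 2 * (main_head(h) - t) * x          # d(main_loss)/dw
grad_aux = 2 * (aux_head(h) - t) * 0.5 * x      # d(aux_loss)/dw
grad_total = grad_main + 0.3 * grad_aux

print(grad_main, grad_aux, grad_total)   # -4.0 -4.0 -5.2
```

The key point: `grad_total` is not just `grad_main`; the auxiliary head adds its own term, so the early shared layers get an extra, earlier training signal, which is exactly why the side branches act like a regularizer on those layers.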