Regarding week 2's coursework.
Why do we apply the identity block several times?
I also don’t understand why we have different numbers of identity blocks.
E.g. step 3 has 3 identity blocks, step 4 has 5 identity blocks. Please help.
The way network architectures in general are determined is by experimentation: you try various combinations until you arrive at a combination that gives good results on the particular problem or problems you are trying to solve. This is just a specific case of the general question of hyperparameter tuning, which Prof Ng spent quite a bit of time on in Course 2 of this series.
So I guess you could say that the answer is “because”. Because that’s what the researchers who came up with this architecture found worked pretty well.
But there are two edges to that sword, of course. You are welcome to try to come up with versions of this architecture that work better. Try using 3 identity blocks in step 4 and see how that works compared to the setup they have. If you do, maybe by next year everyone will be using Eger Nets instead of Residual Nets. Seriously. This kind of thing happens all the time. Do the work, write the paper and it’ll be your name in lights! Or at the very least, your experimentation will give you some insight into why the choices were made the way they were.
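To make the "try 3 identity blocks in step 4" idea concrete, here is a minimal sketch in plain Python that treats the number of identity blocks per stage as a hyperparameter and counts the resulting depth. The names `RESNET50_IDENTITY_BLOCKS`, `weighted_depth` and `with_override` are made up for illustration and are not from the assignment. The counts for stages 3 and 4 are the ones quoted in this thread; the counts for stages 2 and 5, and the 3 convolutions per block, are assumed from the usual ResNet-50 layout.

```python
# Identity blocks per stage. Stages 3 and 4 match the counts quoted in this
# thread; stages 2 and 5 are assumed from the usual ResNet-50 layout.
RESNET50_IDENTITY_BLOCKS = {2: 2, 3: 3, 4: 5, 5: 2}

# Assumption: each block has 3 conv layers on its main path.
CONVS_PER_BLOCK = 3


def weighted_depth(identity_blocks):
    """Count weighted layers: first conv + main-path convs in all blocks + final dense layer."""
    # Every stage starts with one convolutional block, followed by its identity blocks.
    blocks = sum(1 + n for n in identity_blocks.values())
    return 1 + CONVS_PER_BLOCK * blocks + 1


def with_override(identity_blocks, stage, count):
    """Return a copy of the configuration with one stage's identity-block count changed."""
    variant = dict(identity_blocks)
    variant[stage] = count
    return variant


baseline = weighted_depth(RESNET50_IDENTITY_BLOCKS)
variant = weighted_depth(with_override(RESNET50_IDENTITY_BLOCKS, stage=4, count=3))
print(baseline, variant)  # prints "50 44"
```

The baseline comes out at 50 layers, which is where the "50" in ResNet-50 comes from; dropping step 4 to 3 identity blocks gives a 44-layer variant. In the actual notebook the same change would amount to running the identity block fewer times in the step 4 loop and then retraining to compare accuracy.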
Thanks for that. I feel I lack intuition for how the techniques we learnt can be glued together for new problems.
I suppose I should try things out myself.
Hehe, looking forward to an Eger Net.
The best ways to gain intuition are 1) to observe what other people have successfully used in the past on similar problems and 2) to try things yourself to see how changing and tuning networks affects the behavior. I think Prof Ng says this somewhere in the hyperparameter tuning section of Course 2. Of course, a lot of what we are seeing in these courses is Prof Ng showing us solutions that work for different types of problems. There is a lot of “prior art” out there by this point, but there are still plenty of new problems which may or may not be well handled by the existing techniques. So there is still plenty of space for creativity and experimentation.