Hi, could someone help me with some explanations of why we would freeze the weights of BatchNorm layers before fine-tuning? Thanks!
The reference notebook is assignment 2 - Trigger word detection
There are only 2 assignments in the 3rd week of Course 5. Please fix your post.
Thanks for the alert, I have updated the post
Freezing the BatchNorm layers makes sense when the distribution of your fine-tuning data matches the data the base model was trained on: the layers' learned statistics (moving mean and variance) already describe that distribution, and updating them on small fine-tuning batches can destabilize training.
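For a concrete picture, here is a minimal Keras sketch of freezing the BatchNorm layers before fine-tuning. The model file name, optimizer, and loss are hypothetical placeholders, not taken from the notebook:

```python
import tensorflow as tf
from tensorflow.keras.layers import BatchNormalization

# Hypothetical pretrained model loaded before fine-tuning.
base_model = tf.keras.models.load_model("pretrained_model.h5")

# Freeze every BatchNorm layer so its learned gamma/beta and its
# moving mean/variance stay fixed during fine-tuning.
for layer in base_model.layers:
    if isinstance(layer, BatchNormalization):
        layer.trainable = False

# Recompile so the trainable flags take effect, then fine-tune
# on the new data with a small learning rate.
base_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="binary_crossentropy",
)
```

Note that in TensorFlow 2.x, setting `trainable = False` on a BatchNormalization layer also makes it run in inference mode, i.e. it normalizes with its stored moving statistics instead of the current batch's statistics.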
Does this post help?