In which cases does He initialization not work?
I am asking because when I applied He initialization to the cat classification problem (Course 1, Week 4, Assignment 2), I got an accuracy of 65% on the training set and 34% on the test set. This was after training for 10K iterations.
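
For reference, here is a minimal sketch of He initialization in NumPy, assuming a fully connected network with ReLU hidden layers. The function name `initialize_parameters_he`, the `seed` argument, and the layer sizes are illustrative assumptions and are not taken from the assignment code.

```python
import numpy as np

def initialize_parameters_he(layer_dims, seed=0):
    """Return He-initialized weights and zero biases for a dense network.

    layer_dims: list of layer sizes, input layer first (hypothetical helper).
    """
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layer_dims)):
        fan_in = layer_dims[l - 1]  # number of units feeding into layer l
        # He initialization: standard normal samples scaled by sqrt(2 / fan_in)
        params[f"W{l}"] = rng.standard_normal((layer_dims[l], fan_in)) * np.sqrt(2.0 / fan_in)
        # Biases start at zero
        params[f"b{l}"] = np.zeros((layer_dims[l], 1))
    return params

# Illustrative layer sizes (assumed, for a flattened 64x64x3 image input)
params = initialize_parameters_he([12288, 20, 7, 5, 1])
for name, value in params.items():
    if name.startswith("W"):
        print(name, value.shape, "std:", float(value.std()))
```

When comparing against an existing implementation, it may be worth checking that the scale uses the size of the previous layer (`layer_dims[l - 1]`) and that the square root is actually applied, since either slip changes the weight variance considerably.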