Plot of Cost Function vs. Number of Iterations:
I have written my own code for computing a deep NN bit by bit. For verification, I trained and tested on the same data (catvsnoncat) as the programming assignment. It produced the same result: the cost after 2500 iterations exactly matched the assignment's value of 0.08843…. It took me a while to figure out that the weights are initialized using Xavier initialization (the function was saved separately in the folder). When I plotted cost vs. iteration, the curve appears noisy. Is it okay for the cost function to converge in this way?
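For reference, here is a minimal sketch of Xavier-style initialization as I understand it (the function name `initialize_parameters` and the `layer_dims` argument are just my illustration, not the assignment's exact helper):

```python
import numpy as np

def initialize_parameters(layer_dims, seed=1):
    """Xavier-style init: scale each weight matrix by 1/sqrt(n_prev)
    so the variance of activations stays roughly constant across layers."""
    np.random.seed(seed)  # fixed seed, as in the assignment
    parameters = {}
    for l in range(1, len(layer_dims)):
        parameters["W" + str(l)] = (
            np.random.randn(layer_dims[l], layer_dims[l - 1])
            / np.sqrt(layer_dims[l - 1])  # Xavier scaling factor
        )
        parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))
    return parameters
```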
Also, if we don’t use np.random.seed(1), i.e. don’t fix the random values, convergence seems better: I reached a cost of 0.04 after 2500 iterations. (I understand that the seed is fixed for grading purposes.)
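To show what I mean, here is a sketch of the comparison I ran (`train_model` is a hypothetical stand-in for my full training loop, which runs 2500 iterations and returns the final cost):

```python
import numpy as np

def compare_seeding(train_model, num_runs=3):
    # train_model is a hypothetical placeholder for the full training
    # loop; it draws its initial weights from np.random internally.
    np.random.seed(1)  # fixed seed: reproducible result (cost ~0.08843)
    print("fixed seed:", train_model())

    for _ in range(num_runs):
        np.random.seed()  # reseed from OS entropy, so each run differs
        print("unfixed seed:", train_model())
```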