In earlier courses we were told to initialize the W matrix with np.random.randn(size) rather than np.zeros(). But when I initialize the dW matrix with np.random.randn() I get the wrong answer, and when I use np.zeros() I get the correct one. Why is that?

The grader expects you to use zeros, because it can't grade an assignment that has random data.
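
To illustrate the point about random data, here is a minimal sketch. It assumes the grader compares your array against fixed expected values; the shape is hypothetical and chosen only for the example.

```python
import numpy as np

shape = (2, 3)  # hypothetical shape, for illustration only

# np.zeros is deterministic: every call returns the same values,
# so the result can be compared against a fixed expected array.
zeros_a = np.zeros(shape)
zeros_b = np.zeros(shape)
print(np.array_equal(zeros_a, zeros_b))

# Unseeded random draws change from call to call, so there is
# no single "expected" array to compare them against.
rand_a = np.random.randn(*shape)
rand_b = np.random.randn(*shape)
print(np.array_equal(rand_a, rand_b))
```

The two zero arrays compare equal, while the two random draws will differ in practice.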

For grading purposes I understand that we use np.zeros(), since our output has to match the grader's. But if we were to run this code in the real world, we should use np.random.randn(), right?

Most likely yes, due to the need for breaking symmetry during training.
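
A rough sketch of what "breaking symmetry" means for the weights W, under assumptions of my own: a tiny two-layer network with a tanh hidden layer, a sigmoid output, no biases, and made-up data. The function `train` and all sizes here are hypothetical and are not taken from the assignment.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(W1, W2, X, Y, steps=20, lr=0.5):
    """Run a few gradient-descent steps on a 2-layer network (biases omitted)."""
    W1, W2 = W1.copy(), W2.copy()
    m = X.shape[1]
    for _ in range(steps):
        A1 = np.tanh(W1 @ X)                # hidden activations, shape (n_h, m)
        A2 = sigmoid(W2 @ A1)               # output probabilities, shape (1, m)
        dZ2 = A2 - Y                        # cross-entropy gradient w.r.t. Z2
        dW2 = dZ2 @ A1.T / m
        dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)  # backprop through tanh
        dW1 = dZ1 @ X.T / m
        W1 -= lr * dW1
        W2 -= lr * dW2
    return W1, W2

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 8))             # made-up inputs: 3 features, 8 examples
Y = (X[0:1] > 0).astype(float)              # made-up binary labels

# Zero initialization: every hidden unit starts identical and stays identical.
W1_zero, W2_zero = train(np.zeros((4, 3)), np.zeros((1, 4)), X, Y)

# Small random initialization: hidden units start different and stay different.
W1_rand, W2_rand = train(rng.standard_normal((4, 3)) * 0.01,
                         rng.standard_normal((1, 4)) * 0.01, X, Y)

print(np.allclose(W1_zero[0], W1_zero[1]))  # are two hidden units the same?
print(np.allclose(W1_rand[0], W1_rand[1]))
```

With zeros, all hidden units receive the same gradient at every step (here it is exactly zero, so W1 never moves), which means the layer behaves like a single unit. With random values each unit gets its own gradient. This applies to the weights W; arrays that are meant to start as an empty accumulator are a different case.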