When calculating db by summing up dZ (optional assignment), why don't we need to divide by the number of training examples m?
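Probably because the code computes dJ/db and not dL/db as in the lecture videos. If dZ in that code is the gradient of the cost J (the average of the per-example losses L), then the 1/m factor is already included in dZ, and summing dZ gives dJ/db directly. I like the former notation better because it is cleaner than introducing dL/db and dL/dW. We have a recent topic where you can read more: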
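
To make the difference between the two conventions concrete, here is a minimal NumPy sketch. It is not the assignment's code: it assumes a sigmoid output with binary cross-entropy loss (so that dL/dZ = A - Y), and the data and the names `dZ_loss`, `dZ_cost`, `db_lecture`, and `db_cost` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5  # number of training examples (made-up value)

# Hypothetical sigmoid activations and binary labels, shape (1, m).
A = rng.uniform(0.1, 0.9, size=(1, m))
Y = rng.integers(0, 2, size=(1, m)).astype(float)

# Lecture-style convention: dZ holds the per-example gradient dL/dZ,
# so db must be averaged over the m examples.
dZ_loss = A - Y
db_lecture = (1.0 / m) * np.sum(dZ_loss, axis=1, keepdims=True)

# Cost-style convention: dZ holds dJ/dZ and already carries the 1/m factor,
# so db is a plain sum with no further division.
dZ_cost = (A - Y) / m
db_cost = np.sum(dZ_cost, axis=1, keepdims=True)

# Both conventions should give the same gradient for b.
print(np.allclose(db_lecture, db_cost))
```

Either way the 1/m is applied exactly once. What changes is whether it is applied when dZ is formed or when db is summed.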