I want to check if I am getting this right for gradient descent: when we minimize the cost function, we minimize it over all m examples, and we repeat that update for n iterations (where n is a number that we choose), right?
Hi, @Cesla_Pardo_Araujo. Yes, from what you have stated your understanding is correct. I will only draw some emphasis to the difference between the “cost” and the “loss”: the cost function is defined over all m examples as the average of the losses, while the loss is the contribution to the cost from a single example.
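To make the distinction concrete, here is a minimal NumPy sketch (the toy data, squared-error loss, and learning rate are all made-up illustrations, not taken from the course): the cost is the average of the per-example losses, and batch gradient descent runs for a chosen number of iterations n, with each iteration using all m examples.

```python
import numpy as np

# Hypothetical toy data: m = 4 examples, 1 feature, generated from y = 2x + 1.
X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])
m = X.shape[0]

def loss(y_hat_i, y_i):
    # Loss: the contribution of a single example (squared error here).
    return 0.5 * (y_hat_i - y_i) ** 2

def cost(w, b):
    # Cost: the average of the per-example losses over all m examples.
    y_hat = w * X + b
    return np.mean([loss(y_hat[i], y[i]) for i in range(m)])

# Batch gradient descent: each of the n iterations uses all m examples.
w, b = 0.0, 0.0
alpha, n = 0.1, 5000  # learning rate and iteration count are our choice
for _ in range(n):
    y_hat = w * X + b
    dw = np.mean((y_hat - y) * X)  # dJ/dw, averaged over all m examples
    db = np.mean(y_hat - y)        # dJ/db, averaged over all m examples
    w -= alpha * dw
    b -= alpha * db

print(w, b)  # converges toward the true values 2 and 1
```

Note that n (how many gradient steps to take) is a hyperparameter we pick, while m is fixed by the dataset; every one of the n iterations averages gradients over all m examples.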
Thanks for the quick reply! Yes, that’s true, I always mix up the loss and the cost.