Hello fellow learners,
I’m struggling to understand the math here…
My question is: if I generate my training data so that every y value is produced from its x values by one fixed set of w1, w2, w3 and b, will the cost ever go down to zero?
In other words, can an equation
y = w1x1 + w2x2 + w3x3 + b ever be fitted on a straight line?
Hi @arpan_banerjee
Welcome to the community!
Yes, this (w1x1 + w2x2 + w3x3 + b) can fit the data, but the fit would not look like a straight line. With three features it is more like a surface or plane that fits the data. However, arbitrary fixed values of w1, w2, w3 and b won't fit the data correctly. They have to be adjusted, or tuned, by an optimization algorithm such as gradient descent.
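To make this concrete, here is a minimal sketch in plain Python. It assumes a squared-error cost and batch gradient descent, and the "true" parameter values, sample size, learning rate and iteration count are made up for illustration only. The labels are generated exactly from one fixed set of w1, w2, w3 and b, with no noise, so the cost can be driven arbitrarily close to zero.

```python
import random

random.seed(0)

# Hypothetical "true" parameters, used only to generate noise-free labels.
TRUE_W = [2.0, -1.0, 0.5]
TRUE_B = 3.0

def predict(w, b, x):
    """Linear model: w1*x1 + w2*x2 + w3*x3 + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# 50 examples with three features each; every label lies exactly on the plane.
X = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
y = [predict(TRUE_W, TRUE_B, x) for x in X]

def cost(w, b):
    """Squared-error cost: (1 / 2m) * sum of squared prediction errors."""
    m = len(X)
    return sum((predict(w, b, x) - t) ** 2 for x, t in zip(X, y)) / (2 * m)

def gradient_step(w, b, alpha):
    """One batch gradient descent update with learning rate alpha."""
    m = len(X)
    errors = [predict(w, b, x) - t for x, t in zip(X, y)]
    grad_w = [sum(e * x[j] for e, x in zip(errors, X)) / m for j in range(3)]
    grad_b = sum(errors) / m
    new_w = [wi - alpha * g for wi, g in zip(w, grad_w)]
    return new_w, b - alpha * grad_b

# Start from zeros (not from the true values) and let gradient descent tune them.
w, b = [0.0, 0.0, 0.0], 0.0
initial_cost = cost(w, b)
for _ in range(2000):
    w, b = gradient_step(w, b, alpha=0.1)
final_cost = cost(w, b)

print("initial cost:", initial_cost)
print("final cost:  ", final_cost)
print("learned w, b:", w, b)
```

If the labels contained noise, the cost would level off at some value above zero instead, because no single plane could pass through every point.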
Best Regards,
Happy Ramadan 
Abdelrahman
Hi @AbdElRhaman_Fakhry happy Ramadan 
thanks for the reply. yes, I also found the same.
also, for anyone reading this in future I realized:
the cost eventually does go down to zero (I think… I followed till 0.005), but after 0.05, the gradient descent gets really slow, and the cost doesn’t change much in each iteration
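That suggests you may be getting close to a minimum. If this is linear regression with the usual squared-error cost, the cost function is convex, so it would be the global minimum rather than a local one. Try different learning rates to see how they affect convergence.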
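The slowdown is expected with a fixed learning rate: the gradient shrinks as the cost approaches its minimum, so each update changes the cost less. Here is a rough sketch of the learning-rate experiment, again assuming a squared-error cost and made-up noise-free data. The function name `run` and the three learning rates are illustrative choices, not values from the course.

```python
import random

random.seed(1)

# Hypothetical noise-free data: y = 2*x1 - 1*x2 + 0.5*x3 + 3
X = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
y = [2.0 * x[0] - 1.0 * x[1] + 0.5 * x[2] + 3.0 for x in X]
m = len(X)

def run(alpha, iterations=200):
    """Batch gradient descent from zeros.

    Returns the cost measured at the start of each iteration,
    i.e. before that iteration's parameter update.
    """
    w, b = [0.0, 0.0, 0.0], 0.0
    history = []
    for _ in range(iterations):
        errors = [sum(wi * xi for wi, xi in zip(w, x)) + b - t
                  for x, t in zip(X, y)]
        history.append(sum(e * e for e in errors) / (2 * m))
        # Update all parameters from the same set of errors.
        w = [wi - alpha * sum(e * x[j] for e, x in zip(errors, X)) / m
             for j, wi in enumerate(w)]
        b = b - alpha * sum(errors) / m
    return history

histories = {alpha: run(alpha) for alpha in (0.01, 0.1, 0.5)}

for alpha, history in histories.items():
    first_drop = history[0] - history[1]    # cost change in the first step
    last_drop = history[-2] - history[-1]   # cost change in the last step
    print(f"alpha={alpha}: final cost={history[-1]:.3e}, "
          f"first drop={first_drop:.3e}, last drop={last_drop:.3e}")
```

On this data the larger learning rates reach a lower cost in the same number of iterations, and for every rate the per-iteration drop in cost is much smaller at the end than at the start. A learning rate that is too large would make the cost grow instead, so it is worth increasing it gradually.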