Regarding setting initial parameters and learning rate in Regression

A question, it may sound a bit noob, but I just wanted to ask: if I don't choose specific initial weights, bias, and learning rate, am I sometimes going to get garbage results?


In libraries like TensorFlow, weights and biases are initialized with small random normalized values (between 0 and 1) to make the computation easier. If you initialize the weights with big numbers, gradient descent will still tune their values automatically, but it may take more time.

As for the learning rate, there is no rule that fixes its value, but in many applications (models) it is 0.1 or 0.01; it may be less or more depending on your application. You can initialize it to 0.1 and, if the model converges slowly, adjust it and so on. There are also techniques that let you start with a fairly large value which, after a specific number of iterations, starts to decrease at a rate you set; this is called learning rate decay.
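As a minimal sketch of the learning rate decay idea mentioned above (the function name and the time-based schedule are illustrative, not from the thread; other schedules like step or exponential decay exist too):

```python
import numpy as np

def decayed_lr(initial_lr, decay_rate, step, decay_steps):
    """Time-based decay: the learning rate shrinks as training progresses."""
    return initial_lr / (1.0 + decay_rate * (step / decay_steps))

# Start fairly large, then let the rate fall over the iterations
for step in [0, 100, 500, 1000]:
    print(step, decayed_lr(0.1, 1.0, step, 100))
```

At step 0 this returns the initial 0.1 unchanged, and each multiple of `decay_steps` pushes the rate lower, which is the "start large, decrease later" behaviour described above.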


Thanks @AbdElRhaman_Fakhry
Actually, I was implementing linear and logistic regression from scratch using only numpy and pandas. Can you answer the question with respect to this, please? I totally forgot to mention it. Nevertheless, thank you for your help.

When you implement linear and logistic regression from scratch, you can initialize the weights with zeros using np.zeros(…), giving it a shape that matches the number of features (not the number of training examples): if your data has 2 columns (features) and 100 training examples, then w = np.zeros((2, 1)) and b = 0, so that the predictions X @ w + b have shape (100, 1). Or you can initialize randomly using np.random.randn(…) with the same shape. Personally, I prefer the random choice, as it is also what is used in neural networks, where it keeps the layers from being symmetric.




In the case of NNs, we randomly initialize with different weights to break the symmetry. However, this issue does not exist with linear regression and logistic regression: you are fine starting with any value, including zeros. Furthermore, since the loss function is convex, there is no issue of getting stuck in a local minimum either.
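A quick sketch illustrating the point above: gradient descent on a toy logistic regression, started from all-zero parameters, still drives the binary cross-entropy loss down, because the loss is convex and there is no symmetry to break. The data and hyperparameters here are made up for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))
y = (X[:, :1] + X[:, 1:] > 0).astype(float)   # labels from a linear rule

w, b, lr = np.zeros((2, 1)), 0.0, 0.1          # zero initialization

def loss(w, b):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    return -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

start = loss(w, b)
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))
    w -= lr * X.T @ (p - y) / len(y)           # gradient of cross-entropy
    b -= lr * np.mean(p - y)

print(start, loss(w, b))   # the loss drops well below its starting value
```

Starting from zeros, every example still contributes a distinct gradient through its own features, so the parameters move away from zero immediately.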

Hi @shanup
Actually, I am having a problem with my code: when I use large weights I get garbage results, but when I use weights between 0 and 1 it converges and fits the data well. I am not seeing where I am going wrong. Can you please help with the code part? I wrote them from scratch.

By garbage, do you mean NaN?


No, I mean the linear regression line, or the sigmoid curve, does not fit the data well.

Are you running a fixed number of epochs or are you setting any exit condition?

Have you tried reducing the learning rate?

The termination condition I am using is: stop when the difference between the old weights and the new weights is less than 10^-6.

Can you please review my code? I am stuck and cannot understand the issue.

Okay. Send me a DM with your notebook and I will take a look.


I started the learning algorithm off with high initial values, but also made 2 changes:

  1. I added the epoch loop and increased the number of epochs (instead of using the weight update as the exit parameter). You will see that even with very small update to w, the cost is still dropping.

  2. Increased the learning rate

The fit is as good as if we had started out with low initial values.
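The two changes above can be sketched roughly like this: a fixed epoch count (instead of exiting on a small weight update) and a healthy learning rate let gradient descent recover even from a deliberately high initial weight. The data, true parameters, and epoch count here are illustrative, not from the actual notebook:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 1))
y = 3 * X + 2 + 0.1 * rng.standard_normal((100, 1))   # true w=3, b=2

w, b, lr = np.array([[50.0]]), 0.0, 0.1   # deliberately high initial weight

for epoch in range(2000):                  # fixed epoch count, no early exit
    err = X @ w + b - y
    w -= lr * X.T @ err / len(y)
    b -= lr * np.mean(err)

print(w.item(), float(b))                  # close to the true values 3 and 2
```

Even when the per-epoch change in `w` becomes tiny, the cost keeps creeping down, which is why cutting the loop short on a small update can leave the model visibly underfit.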

Can you please send the updated code in the DM, sir? It is an earnest request.

Sure, I will send it.

And according to you, what was wrong with the code and the weights? Can you summarise what was going wrong in my code?

You were doing ((w_new - w_old)**2 + …). The squared term was making the “w” update look smaller than it actually was.

So, the loop was exiting quickly while the cost was still dropping. The algorithm had not converged yet. Put a Sqrt() around the if condition and now try again with a high initial w. Also, increase the learning rate. It will take more time to exit the loop, but once it exits you will see the model fitting the data well.
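A minimal sketch of that fix (the function name and tolerance are illustrative): taking the square root turns the sum of squared updates into the Euclidean norm of the update, so a change of, say, 1e-4 is compared against the tolerance as 1e-4, not as its square 1e-8.

```python
import numpy as np

def converged(w_new, w_old, tol=1e-6):
    """Exit when the Euclidean norm of the weight update drops below tol.

    Without the sqrt, (w_new - w_old)**2 understates the update: a change
    of 1e-4 squares to 1e-8, which is already below a 1e-6 tolerance,
    so the loop exits while the cost is still dropping.
    """
    return np.sqrt(np.sum((w_new - w_old) ** 2)) < tol

w_old, w_new = np.array([0.0]), np.array([1e-4])
print(converged(w_new, w_old))                 # False: update is 1e-4, above tol
print(np.sum((w_new - w_old) ** 2) < 1e-6)     # True: the squared check exits too early
```

The same check could also be written with `np.linalg.norm(w_new - w_old) < tol`.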

Ok, I got it now. Have you done both the Linear and the Logistic? As of now I am facing the problem of heavy weights and bias with the Logistic one after doing this:

The Linear Regression Model is now ok

No, I have checked only the Linear Regression case.


For the logistic regression case, I had to modify your grad function. But otherwise it works out fine even when you start out with a high initial value for w.

Hello @shanup

Nilesh here,

Sir, the grad function you defined the second time was the gradient for the squared-error loss, not for the binary cross-entropy one.
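To make the distinction above concrete, here is a hedged sketch of the two gradients being confused (the function names `grad_bce` and `grad_squared_error` are hypothetical, not from the actual notebook). For logistic regression, the cross-entropy gradient is simply `X.T @ (p - y) / m`; pushing a squared-error loss through the sigmoid instead introduces an extra `p * (1 - p)` factor:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def grad_bce(X, y, w, b):
    """Gradient of binary cross-entropy for logistic regression."""
    p = sigmoid(X @ w + b)
    return X.T @ (p - y) / len(y), np.mean(p - y)

def grad_squared_error(X, y, w, b):
    """Gradient of squared error through a sigmoid: note the p*(1-p) factor."""
    p = sigmoid(X @ w + b)
    d = (p - y) * p * (1 - p)
    return X.T @ d / len(y), np.mean(d)
```

The extra `p * (1 - p)` factor vanishes when predictions saturate near 0 or 1, which is one reason the squared-error gradient behaves badly for classification with large weights, while the cross-entropy gradient does not.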