About the multivariate linear regression algorithm

Hi,
I have tried to implement the multivariate linear regression algorithm, but I am not getting the expected weights and bias.

I am expecting the weights (W) to be [2, 3] and the bias (b) to be [4], because I generated the dataset in such a way that I can set the weights and bias myself.

Are you working on one of the course assignments, or are you working on your own experiment?

It is my own experiment. I am using the algorithm explained in the course and implementing it on my own generated dataset.

That’s neat; you will learn a lot by doing your own experiment.

Thanks, but could you please review my code? I am getting unexpected weights and bias.

May I ask how you know what your expected weights and bias should be?

Yeah, sure.
First, I want to tell you that I am generating the dataset the following way:

```python
import numpy as np

np.random.seed(0)

x1_train = np.linspace(-10, 10, 100)
x2_train = np.linspace(130, 450, 100)

# Combine x1_train and x2_train into x_train
x_train = np.column_stack((x1_train, x2_train))

# Generate the dependent variable using a linear equation
y_train = 2 * x1_train + 3 * x2_train + 4 + np.random.randn(*x1_train.shape) * 40
```

So, in this case, the linear equation used to generate the dependent variable is y = 2 * x1 + 3 * x2 + 4. The expected weights correspond to the coefficients of x1 and x2 in the equation, and the bias corresponds to the constant term.

Are you replicating an optional lab or a graded assignment? If so, which lab/assignment? Also, try tuning the hyperparameters, like the learning rate, the number of iterations, etc.

Yes, I have replicated the algorithm from “C1_W1_Lab04_Gradient_Descent_Soln”.
I tried tuning the hyperparameters, but the result does not change much.

I suggest trying a simpler case, like y_{\text{train}} = 2x_{\text{train}} + 4, and seeing whether you recover the expected weight and bias.
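For instance, a minimal sketch of that sanity check in plain NumPy (the variable names and hyperparameters here are illustrative, not your notebook's):

```python
import numpy as np

# Simple one-feature dataset: y = 2x + 4 with no noise,
# so gradient descent should recover w ≈ 2 and b ≈ 4.
x_train = np.linspace(-10, 10, 100)
y_train = 2 * x_train + 4

w, b = 0.0, 0.0
alpha = 0.01  # learning rate
for _ in range(10_000):
    err = (w * x_train + b) - y_train
    w -= alpha * np.mean(err * x_train)  # dJ/dw for the squared-error cost
    b -= alpha * np.mean(err)            # dJ/db

print(w, b)  # should be very close to 2 and 4
```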

It seems like you’re not passing the x and y arguments through to gradient_func and cost_func inside your gradient_descent function; instead, you’re using the outer x_train and y_train sets directly. This could be why you’re not getting the exact parameter values you expect.

Other factors, like the hyperparameters and training on normalized data, can also affect your parameter values.
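A sketch of the argument-passing pattern being described (assuming a signature similar to the lab’s gradient_descent; the exact names in your notebook may differ):

```python
def gradient_descent(x, y, w_init, b_init, alpha, num_iters, cost_func, gradient_func):
    """Batch gradient descent. The callbacks must receive the x and y
    passed into this function, not the global x_train and y_train."""
    w, b = w_init, b_init
    cost_history = []
    for _ in range(num_iters):
        dj_dw, dj_db = gradient_func(x, y, w, b)    # not gradient_func(x_train, y_train, w, b)
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
        cost_history.append(cost_func(x, y, w, b))  # not cost_func(x_train, y_train, w, b)
    return w, b, cost_history
```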

Also, why are you adding the extra noise term + np.random.randn(*x1_train.shape) * 40? This affects the fitted values too.

Hello @Smit_Shah1,

Nice work!

From your response above, did you consider the results wrong because the fitted weights and bias were not 2, 3, and 4? If so, that’s the problem.

If your model fitted the raw features, you could expect 2, 3, and 4. However, because the features are normalized, we can’t expect the fitted weights and bias to be 2, 3, and 4. Think about this: you have normalized the features so they lie between roughly -1.7 and 1.7, but your y_train is always larger than 300. How can the parameters still be 2, 3, and 4? They can’t be, because otherwise the prediction could not add up to more than 10, let alone 300.

Note that we have to do the normalization in order for gradient descent to converge well, so we can’t give it up just to recover 2, 3, and 4. The point is, you need to figure out a correct way to make the comparison while keeping the normalization. Good luck.
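To see the scale mismatch concretely, here is a minimal sketch (plain NumPy, z-score normalization as in the course) using the dataset from the top of the thread:

```python
import numpy as np

x1_train = np.linspace(-10, 10, 100)
x2_train = np.linspace(130, 450, 100)
x_train = np.column_stack((x1_train, x2_train))

# z-score normalization, applied to each feature independently
mu = x_train.mean(axis=0)
sigma = x_train.std(axis=0)
x_norm = (x_train - mu) / sigma

# Each normalized feature now lies in roughly [-1.7, 1.7],
# while y_train is still in the hundreds.
print(x_norm.min(axis=0), x_norm.max(axis=0))
```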

Btw, it is a pretty nice notebook!

Cheers,
Raymond

Also, in each of your last two graphs, we can’t expect the two lines to overlap, because the scatter is plotted against the true y_train, which has contributions from both features, whereas the solid line is contributed by only one feature.

I don’t mean that you were using those two graphs to judge whether your model was good; I would just like to point it out.

Sir, I understand and acknowledge that assuming the values of w and b to be 2, 3, and 4 was incorrect. However, considering that we have only scaled the dataset, would it be reasonable to expect the ratio of the weights and bias to be 2:3:4? I’m puzzled because I’m currently observing a ratio of 1:1:6.

Additionally, I would appreciate your guidance on how I should plot the predictions in order to assess the accuracy of my answer. Thank you for your assistance.

I don’t know about the ratio, but at the end of that notebook (“C1_W1_Lab04_Gradient_Descent_Soln”), I think there is a plot of cost vs. iterations. You can plot the same and check whether the cost decreases with the iterations.
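For example, a sketch of that plot (assuming your gradient_descent returns a per-iteration cost history, as the lab’s version does; the placeholder values below are only for illustration):

```python
import matplotlib.pyplot as plt

# Replace this placeholder with the cost history returned by your
# own gradient_descent implementation.
cost_history = [1000 * 0.99 ** i for i in range(500)]

plt.plot(cost_history)
plt.xlabel("Iteration")
plt.ylabel("Cost J(w, b)")
plt.title("Cost vs. Iterations")
plt.show()
```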

No, it would not be reasonable. The ratio of the weights and bias has no significance whatsoever.

Hello @Smit_Shah1,

Note that the features are normalized independently. In some scenarios, one feature might be scaled down by a factor of 1000 while another is scaled up by just a factor of 2; since the features are not always scaled by the same amount, we can’t expect the ratio to be maintained.
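You can check this on the dataset from this thread: each feature has its own mean and standard deviation, so each one is shifted and scaled by a different amount (a small sketch):

```python
import numpy as np

x1_train = np.linspace(-10, 10, 100)
x2_train = np.linspace(130, 450, 100)

# Each feature is normalized with its own statistics
print(x1_train.mean(), x1_train.std())  # mean 0.0,   std ≈ 5.8
print(x2_train.mean(), x2_train.std())  # mean 290.0, std ≈ 93.3
```

Since the two features here are scaled by very different factors, a 2:3:4 ratio cannot survive the transformation.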

Now, I need you to do the thinking to get the answer for yourself. I will give you the first steps, and the final step, but you will need to fill in the middle by yourself. Note that thinking takes time, and I won’t be able to respond to you any time soon, so take your time.

This is the original formula generating y_train:

y = w_1x_1 + w_2x_2 + b, where w_1=2, w_2=3, b=4 in your case, but we will stick with the symbols instead of the actual values for now.

Then you did the normalization by converting x_1 to x_{1, norm} and x_2 to x_{2, norm}. Each feature’s normalization has two relevant factors: a mean \mu and a standard deviation \sigma.

x_{1, norm} = \frac{x_1 - \mu_1}{\sigma_1}
x_{2, norm} = \frac{x_2 - \mu_2}{\sigma_2}

By now, you have 3 equations.

On the other hand, your model fits to normalized features, meaning that it is:

y = w_{1, norm}x_{1, norm} + w_{2, norm}x_{2, norm} + b_{norm}

Your task is to find the relation formula between w_{1, norm} and w_1, the relation formula between w_{2, norm} and w_2, and the relation formula between b_{norm} and b.

Please write down those formulae with the symbols but not their actual values.

With these formulae, you should be able to evaluate the expected weights and bias (which are w_{1, norm}, w_{2, norm}, and b_{norm}) from the known w_1, w_2, b (which are just 2, 3, and 4 respectively) and the known means and standard deviations.

I won’t respond to this thread for at least the next 2 days unless you are able to share the correct relation formulae and the correct expected weights and bias.

Cheers,
Raymond

We will get into that after the relation formulae.

I guess the answer is w1 = (w1_norm * x1_std_dev) + x1_mean, w2 = (w2_norm * x2_std_dev) + x2_mean, and b = b_norm.

Hello @Smit_Shah1,

You need to derive the formulae, even though I admit that your guesses for w_1 and w_2 are quite close and should have some reasoning behind them.

Look, whether you arrive at the right formulae depends on you, because I have no intention of deriving them myself; I do not need them.

Feel free to share your steps of derivation based on the equations that I laid out in my last post, and we can look at your steps together. Here is one example for your reference.
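As an illustration of the kind of substitution involved, here is a sketch for a single-feature model y = wx + b (not the answer to the two-feature case): since x_{norm} = \frac{x - \mu}{\sigma}, we have x = \sigma x_{norm} + \mu, and substituting gives

y = w(\sigma x_{norm} + \mu) + b = (w\sigma)x_{norm} + (w\mu + b)

so w_{norm} = w\sigma and b_{norm} = w\mu + b in the one-feature case. The two-feature case follows from the same substitution idea.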

Cheers,
Raymond