How to use SGDRegressor to get a prediction for a specific input?

When we implemented multiple regression in our code, we used a sample input of a house that is 1200 sqft, with 3 bedrooms, 1 floor, and 40 years old. (Code taken from Lab 3)
We implemented it like this and got the output $318709.09:

x_house = np.array([1200, 3, 1, 40])
x_house_norm = (x_house - X_mu) / X_sigma  # normalise with the training-set mean and standard deviation
print(x_house_norm)
x_house_predict = np.dot(x_house_norm, w_norm) + b_norm
print(f" predicted price of a house with 1200 sqft, 3 bedrooms, 1 floor, 40 years old = ${x_house_predict*1000:0.0f}")

Similarly, when using the linear regressor from scikit-learn we used very similar code (albeit without normalisation), and the solution we got was the same as above, $318709.09. (Code taken from Lab 6)

x_house = np.array([1200, 3, 1, 40]).reshape(-1, 4)  # scikit-learn expects a 2-D array
x_house_predict = linear_model.predict(x_house)
print(f" predicted price of a house with 1200 sqft, 3 bedrooms, 1 floor, 40 years old = ${x_house_predict*1000}")

How do we do this with SGDRegressor? The following is the code I tried, but it gives a very erroneous answer of $363139.3. (Trying to execute the code in Lab 5)

x_house = np.array([1200, 3, 1, 40]).reshape(-1, 4)
x_house_norm = scaler.fit_transform(x_house)
x_house_predict = sgdr.predict(x_house_norm)
print(x_house_predict*1000)

I believe I’m making an error in the normalisation, but I don’t know how to fix it.

Hi @SudebSarkar

First, to get the correct output, use inverse_transform(X…), which reverses the StandardScaler, since every standardization library has its own way of normalizing the data. Also, SGDRegressor uses Stochastic Gradient Descent (SGD), which is a different technique from batch gradient descent, and SGDRegressor applies a regularization term by default (its default L2 penalty is the one Ridge Regression uses), so the predicted value may be slightly different from that of other models.
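
To make the regularization point concrete, here is a minimal sketch (the training data below is made up purely to stand in for the lab's X_train / y_train) showing that SGDRegressor applies an L2 penalty by default; turning the penalty off usually brings the fit closer to plain LinearRegression:

import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# Hypothetical training data standing in for the lab's X_train / y_train.
X_train = np.array([[952, 2, 1, 65], [1244, 3, 2, 64], [1947, 3, 2, 17]], dtype=float)
y_train = np.array([271.5, 232.0, 509.8])

scaler = StandardScaler()
X_norm = scaler.fit_transform(X_train)

# Default SGDRegressor: squared-error loss plus an L2 (ridge) penalty with alpha=1e-4.
sgdr = SGDRegressor(max_iter=1000)
sgdr.fit(X_norm, y_train)

# Removing the penalty (penalty=None in recent scikit-learn, penalty='none' in
# older releases) moves the solution closer to plain LinearRegression.
sgdr_no_penalty = SGDRegressor(penalty=None, max_iter=1000)
sgdr_no_penalty.fit(X_norm, y_train)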

Cheers!
Abdelrahman

I don’t think I understand why we would need to do an inverse transform, given that the model is already trained and already holds the learned parameter values for the features.
However, I was able to get the solution by extracting the mean and variance calculated by the original StandardScaler:

scaler = StandardScaler()  # creating an instance of StandardScaler
X_norm = scaler.fit_transform(X_train)
means = scaler.mean_       # per-feature means learned from X_train
variance = scaler.var_     # per-feature variances learned from X_train

And then I used them to manually normalise the input I wanted a prediction for:

x_house = np.array([1200, 3, 1, 40]).reshape(-1, 4)
x_house_norm = (x_house - means) / (variance**0.5)
x_house_predict = sgdr.predict(x_house_norm)
print(x_house_predict*1000)

This gave me a much closer value of $318828.68.
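
A slightly simpler route, I think (assuming scaler here is the same StandardScaler instance that was fitted on X_train above), is to reuse the fitted scaler's transform method instead of pulling out mean_ and var_ by hand:

x_house = np.array([1200, 3, 1, 40]).reshape(-1, 4)
# transform() reuses the mean/variance already learned from X_train,
# whereas fit_transform() would re-fit the scaler on this single row.
x_house_norm = scaler.transform(x_house)
x_house_predict = sgdr.predict(x_house_norm)
print(x_house_predict*1000)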

Hi @SudebSarkar

I mean that if you used a StandardScaler on the y output values when you trained the model, I suggest using inverse_transform instead of x_house_predict*1000, but it is okay.
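
As a rough sketch of what that would look like (assuming a hypothetical second scaler, y_scaler, had been fitted on the training targets):

# Only applies if the targets were also standardized before training,
# e.g. y_scaler = StandardScaler() fitted on y_train.reshape(-1, 1).
y_pred_scaled = sgdr.predict(x_house_norm).reshape(-1, 1)
y_pred = y_scaler.inverse_transform(y_pred_scaled)  # back to the original price units
print(y_pred)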