How to evaluate the accuracy of a regression model

After completing Course 1 of the MLS, I developed a regression model using MATLAB. To test the model, I downloaded a cleaned dataset file from Kaggle. The file contained one dataset for training and another for testing.

I used the training dataset to fit my parameters, with z-score normalised features. However, with a learning rate of around 2e-7, the cost J(w,b) was still about 53,000 after roughly 50,000 iterations. The maximum target value was over 500,000.
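For context, the normalisation step I mean is roughly this; a minimal numpy sketch (my actual code is in MATLAB, and the array here is just a placeholder):

```python
import numpy as np

# Placeholder feature matrix: m examples, n features (my real data came from Kaggle).
X_train = np.random.rand(100, 4) * 500_000

# z-score normalisation: each feature gets mean 0 and standard deviation 1.
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)
X_norm = (X_train - mu) / sigma
```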

I then used the obtained parameters to predict y. The testing dataset doesn’t include the true y values, so I couldn’t evaluate MAE or MSE. As a result, I do not know how accurate my predicted values are. Any suggestions on how to determine the accuracy of my model?

Also, I do not know whether the model overfits or underfits. Well, it is a linear regression model with no polynomial features; I don’t think overfitting applies to that?

Hi @Basit_Kareem ,

Welcome to the community!

So the testing dataset doesn’t include ground truth?

How about splitting the training dataset, which does contain the outputs, and using part of it for dev and test sets? That way you’ll be able to determine accuracy.
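Something like this, as a hypothetical sketch in numpy (the 60/20/20 proportions are just an example; adjust to your dataset size):

```python
import numpy as np

# Placeholder data: in practice, X and y come from the Kaggle training file.
X = np.random.rand(1000, 4)
y = np.random.rand(1000) * 500_000

m = X.shape[0]
idx = np.random.permutation(m)        # shuffle before splitting

train_end = int(0.6 * m)              # 60% train
dev_end = int(0.8 * m)                # next 20% dev, last 20% test

X_train, y_train = X[idx[:train_end]], y[idx[:train_end]]
X_dev, y_dev = X[idx[train_end:dev_end]], y[idx[train_end:dev_end]]
X_test, y_test = X[idx[dev_end:]], y[idx[dev_end:]]
```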

Thoughts?

Juan


Okay. Thanks Juan. I am going to do just that. But then…

Imagine that after computing my MSE I get a value of 32 × 10⁸. How do I ascertain that this score translates to a good model?

This is how I would look at it:

We are assuming that the 2 datasets that you have, the one with ground truth and the one with no ground truth, both have the same distribution.

If this is not the case, we can certainly expect bad outputs.

But if they are of the same distribution, then: take the 1st dataset, split it into train, dev, and test sets, run the training, and validate with the dev and test sets. If that gives you a properly trained model (no overfitting, no underfitting, good accuracy, and so on), then when you use the 2nd dataset for inference you should expect the output to be good.
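For the validation step, a minimal numpy sketch of what evaluating on the dev split could look like (the data and parameters below are placeholders):

```python
import numpy as np

# Placeholder dev split and trained parameters.
X_dev = np.random.rand(200, 4)
y_dev = np.random.rand(200) * 500_000
w, b = np.random.rand(4), 0.0

y_hat = X_dev @ w + b                  # linear model prediction
mse = np.mean((y_dev - y_hat) ** 2)    # mean squared error
mae = np.mean(np.abs(y_dev - y_hat))   # mean absolute error
print(f"dev MSE = {mse:.1f}, dev MAE = {mae:.1f}")
```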


Okay, thanks. The problem I now have is how to make an informed inference as to what good performance is.

I have the formulas for evaluating MSE and MAE but these formulas will only give me a scalar value that might have a large or small magnitude. As a beginner, how do I conclude that this value of MSE I calculated indicates that my model is performing well?

One way to calculate accuracy for your regression model is to get the % of good answers vs. all the answers.

Since you have the ground truth for your training process, you know the answers for each and every sample, right?

While passing your training set (or dev or test set), keep track of the number of good (and bad) answers by comparing each y_hat to the ground truth y. At the end, divide the count of good predictions by the total (good_y_hat / total_y, so to speak). That’ll give you the accuracy of your model as a percentage.
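A minimal sketch of that counting, assuming a hypothetical rule that a prediction counts as “good” when it lands within 5% of the true value:

```python
import numpy as np

# Placeholder ground truth and predictions.
y = np.array([345_000, 210_000, 480_000, 150_000])
y_hat = np.array([305_000, 205_000, 470_000, 190_000])

tolerance = 0.05                            # hypothetical: within 5% of y
good = np.abs(y - y_hat) <= tolerance * y   # True where the answer is "good"
accuracy = good.mean()                      # good answers / all answers
print(f"accuracy = {accuracy:.0%}")         # prints 'accuracy = 50%' here
```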


Good. Let’s say I am predicting house prices, for instance: one of my y = 345,000 while its corresponding y_hat = 305,000, so the error is 40,000. Would I classify that y_hat as good or bad?

If I want my training to produce 95% accuracy, then my threshold for this sample would be 95% of 345,000 = 327,750, so this would not be a good result, because 305,000 < 327,750. I would keep training the model. Now, this is looking at just one sample, but I would evaluate the accuracy based on the results across all the samples.


Or, on another thought: let’s say I desire 90% accuracy. Should I just classify any y_hat that exceeds 90% of y as good, and otherwise as bad?

🙂 Exactly what I just wrote.


So in that case, I would need 95% of my predicted values to be above the 95% threshold of the corresponding ground truth for each i-th example?

Yes, that’s what I would do. And then I would add up all the good ones and, at the end, divide by the total number of samples.


Okay, thanks. I am very grateful.


Please share with me how it goes.


I will make sure I do so.

Well, I was already typing. Since the editor doesn’t let me see what’s going on in the background, I didn’t notice you had already sent that.

Thanks


@Basit_Kareem, another way to estimate the accuracy of a regression model is with what is called the R² score. I am not sure which specialization you are in or which course you are taking, but maybe you have seen this already, or you will.

The R² score will basically take:

1. The residual sum of squares, RSS = sum((y_i - y_i_hat)²), where y_i is the ground truth for sample i and y_i_hat is the prediction for sample i.
2. The total sum of squares, TSS = sum((y_i - y_mean)²), where y_mean is the mean of all the ground truths.
3. And then compute R² = 1 - (RSS / TSS).

The resulting R² is at most 1: any model that does at least as well as always predicting the mean lands between 0 and 1 (it can go negative for a worse model), and you can read it as the fraction of the variance in y that your model explains.
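A minimal numpy sketch of that computation, with placeholder arrays:

```python
import numpy as np

# Placeholder ground truth and predictions.
y = np.array([345_000.0, 210_000.0, 480_000.0, 150_000.0])
y_hat = np.array([305_000.0, 205_000.0, 470_000.0, 190_000.0])

rss = np.sum((y - y_hat) ** 2)      # residual sum of squares
tss = np.sum((y - y.mean()) ** 2)   # total sum of squares
r2 = 1 - rss / tss
print(f"R^2 = {r2:.3f}")
```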

The one that we discussed initially could be seen as a variant of a method called the MAE (Mean Absolute Error), which is basically the sum of all the absolute errors divided by the total number of samples.

Hope this sheds more light on your case 🙂

Juan

Okay. I will try this also and compare it with the previous suggestion.

Remember that Kaggle uses the test data to evaluate submitted models. You may not have the truth values, but they do; otherwise, how could they compare submitted models? The only use for running predictions on the Kaggle test data is to confirm that your model executes within any time constraints they set for the competition and that it doesn’t blow up and produce NaNs, etc. You can’t measure model quality with it.

While testing software is sometimes about non-functional requirements like scalability, throughput, and usability, in your case you are evaluating functional performance, and that requires truth values. Make a dev/val/test split on the Kaggle train data as suggested by @Juan_Olano above, and ignore the Kaggle test data unless you’re submitting to a competition.


Hello, I did implement the advice offered regarding model evaluation. I was able to determine at what percentage of the ground truth y my model predicts y_hat well.

Thanks all.