Optional Lab: Gradient Descent

Hello, I just finished week 1 of Supervised ML: Regression and Classification. I have two questions: one about the lab and one about gradient descent.

Question about the lab:
If you take a look at this image right below, this is how they computed the cost function.
[image: the lab's cost function computation]
If you take a look at the image right below this text, in the top right, the calculation looks different between the two of them.


Why, you might ask? Because, based on the order of operations (PEMDAS), I thought you had to subtract x[i] and y[i] first and then multiply that difference by the function f_{w,b}. But what the lab does is evaluate the function at x[i] and then subtract y[i] from the result. I could be wrong in my thinking, but I am curious what other people have to say about why I am wrong, or whether I am right.

Question about gradient descent:
Is gradient descent only used for regression algorithms like linear regression, or is it also used for classification algorithms?

Hi,

f_{w,b}(x^{(i)}) doesn’t mean that you multiply f_{w,b} by (x^{(i)}). It means the value of \hat{y} at the given values of w, b, and x^{(i)}. Note that \hat{y}^{(i)} and f_{w,b}(x^{(i)}) are the same thing, so the expression can be rewritten as \hat{y}^{(i)} - y^{(i)}.
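
If it helps, here is a minimal Python sketch of that computation, along the lines of the lab (the names compute_cost and f_wb_i are illustrative, not necessarily the lab's exact code). Notice that f_{w,b}(x^{(i)}) is obtained by plugging x[i] into the model, and only then is y[i] subtracted:

```python
import numpy as np

def compute_cost(x, y, w, b):
    """Squared-error cost for a single-feature linear model.
    A minimal sketch in the spirit of the lab; names are illustrative."""
    m = x.shape[0]
    cost = 0.0
    for i in range(m):
        f_wb_i = w * x[i] + b          # f_{w,b}(x^{(i)}): plug x[i] into the model
        cost += (f_wb_i - y[i]) ** 2   # only then subtract y[i] and square
    return cost / (2 * m)

# tiny usage example with made-up numbers
x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])
print(compute_cost(x_train, y_train, w=200.0, b=100.0))  # 0.0: a perfect fit
```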

Regarding gradient descent, yes we use it for regression as well as classification problems.

Best,
Saif.


Hi @Nathan_Angell ,

If you look carefully at the formula for the linear regression model, you can see that the model is built on a linear transformation that outputs a value, \hat{y}^{(i)}, for a single example. So when calculating the cost J with respect to w and b, it is this transformed value \hat{y}^{(i)} that is used for each example, not x[i]. Bear in mind that the loss, the difference between \hat{y}^{(i)} and y^{(i)}, tells us how far apart the predicted value \hat{y} is from the true value.
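
Written out in the notation used above, the cost being computed is (as I recall the Course 1 formula):

J(w,b) = \frac{1}{2m}\sum_{i=1}^{m}\left(f_{w,b}(x^{(i)}) - y^{(i)}\right)^2, where f_{w,b}(x^{(i)}) = w x^{(i)} + b = \hat{y}^{(i)}.

So x^{(i)} only ever appears inside the model; the subtraction happens between \hat{y}^{(i)} and y^{(i)}.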

Gradient descent is an optimizer for finding the values of the parameters w and b of a function at which the cost is at its minimum. So if that fits the bill, then there is no reason why gradient descent should be restricted to regression algorithms.
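
As a rough sketch of what that optimizer does for the linear-regression cost above (the learning rate alpha, iteration count, and function name are assumptions for illustration, not the lab's exact code):

```python
import numpy as np

def gradient_descent(x, y, w, b, alpha, num_iters):
    """Batch gradient descent for the squared-error cost.
    A sketch; alpha and num_iters are illustrative assumptions."""
    m = x.shape[0]
    for _ in range(num_iters):
        err = (w * x + b) - y        # f_{w,b}(x) - y for every example
        dj_dw = np.dot(err, x) / m   # dJ/dw
        dj_db = np.sum(err) / m      # dJ/db
        w = w - alpha * dj_dw        # simultaneous update of w and b
        b = b - alpha * dj_db
    return w, b
```

For classification, e.g. logistic regression later in this course, the update loop looks the same; only the model f_{w,b} and the cost whose gradients are taken change.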

Thank you for that information!

Thank you so much, that totally makes sense!