# Qs on Cost Function Lab

For when we are computing cost - C1_W1_Lab03_Cost_function_Soln

Q1 - why do we call compute_cost with the (x, y, w, b) parameters in this lab, but (y) was missing in the previous lab (model representation) when doing the model function compute_model_output(x, w, b)? I don’t get why the y parameter is not included. Also, is “parameter” the correct term for (y)?

Q2 - For computing cost, why do we need to start cost_sum = 0?

Q3 - For computing cost, why then do cost_sum = cost_sum + cost… why not just have total_cost = (1 / (2 *m)) * cost?

Q4 - why do you use m = x.shape[0] for computing cost in this lab … but in previous lab: model representation we used m = x_train.shape[0]. Why do we not use x_train[0] again?

Q5 - When doing the Cost Function Visualization - 3D with the larger data set, how did you actually find, visually or analytically, that the values are exactly w = 209 and b = 2.4? I know we place the mark on the contour plot where w and b have the lowest cost, but visually it only shows that w is close to 200 and b is close to 2. How do we find exactly w = 209 and b = 2.4? I tried to hover in other areas on the axis, but could not get the exact value.

Q6 - if in the “Implementing gradient descent” lesson we talk about “truth assertion”, where we can’t have a = a + 1, then why do we in the “cost function” lab use cost_sum = 0, then later cost_sum = cost_sum + cost? Does this not violate the truth assertion principle?


Welcome to the course!

Based on these questions, I recommend you work through a basic introduction to programming before continuing with future assignments. Most of these questions come down to basic programming skills, which are a prerequisite for this course. You can find resources online (for example, here is one on Coursera).

Q1 - why do we call compute_cost with the (x, y, w, b) parameters in this lab, but (y) was missing in the previous lab (model representation) when doing the model function compute_model_output(x, w, b)? I don’t get why the y parameter is not included. Also, is “parameter” the correct term for (y)?

There is a difference between computing the model output and computing the cost. The cost needs “y” (the target), whereas the model output does not.

Imagine you have an AI model that predicts the price of a house, based on the size of the house and the location of the house. Here, the size and location of the house will be provided in the variable “x”. You can then ask the AI model to make a prediction of the housing price based on the variable x. This is computing the model output.

Let’s say you now want to train the model (to improve it). What you now need is the real price of the house, which is provided by “y”. You then compare the real price, “y”, with the predicted price from the AI model. You can think of the “difference” between the real and predicted values as the cost. The goal of training is to change the AI model to minimize that difference.

The “w” and “b” variables are the parameters of the AI model. The “y” is used either for the target or the prediction (and is normally named y_target or y_prediction to differentiate between the two).
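A minimal sketch of the two functions (the names follow the lab, but the implementations here are simplified and vectorized for illustration) shows why only the cost needs y:

```python
import numpy as np

def compute_model_output(x, w, b):
    """Prediction f(x) = w*x + b: only needs inputs and parameters, not y."""
    return w * x + b

def compute_cost(x, y, w, b):
    """Squared-error cost: needs y to compare predictions against targets."""
    m = x.shape[0]
    f_wb = w * x + b                       # predictions for all m examples
    return np.sum((f_wb - y) ** 2) / (2 * m)

x = np.array([1.0, 2.0])          # input features
y = np.array([300.0, 500.0])      # targets (only the cost uses these)
print(compute_model_output(x, 200, 100))   # [300. 500.]
print(compute_cost(x, y, 200, 100))        # 0.0 (predictions match targets)
```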

Q2 - For computing cost, why do we need to start cost_sum = 0?

The cost_sum is called a variable in programming. The line “cost_sum = 0” stores the value 0 in the variable cost_sum (in programming jargon, the variable cost_sum is assigned the value 0). Starting at 0 matters because the loop that follows repeatedly adds each example’s cost onto cost_sum; without this initial assignment there would be nothing to add onto (and Python would raise a NameError the first time the loop reads cost_sum).

Q3 - For computing cost, why then do cost_sum = cost_sum + cost… why not just have total_cost = (1 / (2 *m)) * cost?

The line “cost_sum = cost_sum + cost” adds “cost” to the current value of cost_sum, and then stores the result of that addition back into the cost_sum variable. Repeating this inside the loop accumulates the cost over all m training examples. If you wrote total_cost = (1 / (2 * m)) * cost instead, you would scale only the last example’s cost, not the sum over all examples.
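A short sketch of the accumulation pattern (the data here is made up; the loop structure mirrors the lab):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # hypothetical inputs
y = np.array([2.0, 4.0, 6.0])   # hypothetical targets
w, b = 1.0, 0.0
m = x.shape[0]

cost_sum = 0  # starting point for the running total
for i in range(m):
    f_wb = w * x[i] + b                 # prediction for example i
    cost = (f_wb - y[i]) ** 2           # squared error for example i
    cost_sum = cost_sum + cost          # accumulate over all m examples
total_cost = (1 / (2 * m)) * cost_sum   # scale the SUM, not one term
print(total_cost)
```

After the loop, cost_sum holds 1 + 4 + 9 = 14, so total_cost is 14 / 6; using a single “cost” in the last line would have kept only the 9.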

Q4 - why do you use m = x.shape[0] for computing cost in this lab … but in previous lab: model representation we used m = x_train.shape[0]. Why do we not use x_train[0] again?

This is just a difference in variable names between the two labs. Inside a function, the input array goes by its parameter name (x), so the function uses x.shape[0]; the previous lab worked directly with an array named x_train. Whichever array the caller passes in is bound to the parameter name x inside the function.
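A tiny sketch of that name binding (show_m is a hypothetical helper, not from the lab):

```python
import numpy as np

def show_m(x):
    # inside the function the parameter is always called x,
    # regardless of what the caller named its array
    return x.shape[0]

x_train = np.array([1.0, 2.0, 3.0])
other_data = np.array([5.0, 6.0])
print(show_m(x_train))    # 3
print(show_m(other_data)) # 2
```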

Q5 - When doing the Cost Function Visualization - 3D with the larger data set, how did you actually find, visually or analytically, that the values are exactly w = 209 and b = 2.4? I know we place the mark on the contour plot where w and b have the lowest cost, but visually it only shows that w is close to 200 and b is close to 2. How do we find exactly w = 209 and b = 2.4? I tried to hover in other areas on the axis, but could not get the exact value.

There are different ways to compute the values of w and b that minimize the cost, and one way is gradient descent (which is described in the lectures). For simple models, there may also be mathematical equations that give you the exact result in closed form (depending on the model/equation), rather than reading it off a plot.
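For a one-feature linear model, the least-squares minimum can be computed in closed form. A sketch using NumPy (the data here is made up, not the lab’s housing set):

```python
import numpy as np

# hypothetical data, not the lab's dataset
x = np.array([1.0, 1.5, 2.0, 2.5])
y = np.array([250.0, 320.0, 475.0, 520.0])

# design matrix [x, 1]; solve the least-squares problem for [w, b]
A = np.column_stack([x, np.ones_like(x)])
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(w, b)  # the exact (w, b) minimizing the squared-error cost
```

This gives the precise minimizer directly, which is how you can know values like w = 209 and b = 2.4 exactly instead of eyeballing the contour plot.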

Q6 - if in the “Implementing gradient descent” lesson we talk about “truth assertion”, where we can’t have a = a + 1, then why do we in the “cost function” lab use cost_sum = 0, then later cost_sum = cost_sum + cost? Does this not violate the truth assertion principle?

Truth assertions and variable assignments are different concepts. In Python, a single equals sign (“=”) is an assignment, while a truth assertion (equality test) uses two equals signs, as in “variable_a == variable_b”. So “a = a + 1” is perfectly valid as an assignment; it would only be nonsense if read as a mathematical equation.
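A minimal illustration of the two meanings of the equals sign:

```python
a = 1          # assignment: store 1 in a
a = a + 1      # assignment: compute a + 1 (i.e. 2), store it back in a
print(a)       # 2
print(a == 2)  # equality test ("truth assertion"): True
print(a == 3)  # False
```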


@hackyon thx for the thorough answers. Very clear!