C1 Week 3 Assignment - 2.4 Ex 5 explanation

Hi, I’m looking to understand why the cost function improves in this exercise and get a better mental model of what these lines of code do. Let me share how I think it works; please correct me if I get anything wrong, and elaborate where you can.

1.) First, we are calling the NN function using our test data set defined in 1.3

2.) Then we are initializing our parameters to zero

3.) Then we are getting our predictions using the Y-hat formula. The Y-hat takes our dataset defined in 1.3 and uses our formula W*x+b. But I’m not sure what y-hat is at this point in the code because wouldn’t the prediction be both an X and Y point instead of a single value?

4.) Next we are computing our cost. The cost is like our performance / accuracy

5.) Finally we are updating our parameters using w3_tools.train_nn (I’m not sure what this does. Can you elaborate?)

So why is the cost function getting better and better over time? Is it because as we cycle through inputs, the system learns the pattern and the predictive performance improves through progressive elaboration?

Thank you in advance.


Replies:

  1. The first code cell in Exercise 5 defines the nn_model() function.
    The second code cell calls nn_model() using the training set that was defined in section 1.3.

  2. The first thing nn_model() does is initialize the weight parameters.

  3. Predictions are y_hat - that is, the output of the model for each example. For a given X and parameters, you get a value for y_hat. That’s what y_hat = w*x + b means.

  4. Cost is a measurement of how well the model (the y_hat values) fits the training set.

  5. The training function uses gradient descent to update the weight parameters. This will be explained more fully next week, when you’ll implement it yourself.
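To make steps 3 and 4 concrete, here is a minimal sketch in plain Python. The data, the function names `forward` and `compute_cost`, and the exact scaling of the cost (sum of squared errors divided by 2m) are assumptions for illustration, not the assignment's actual code. It shows that y_hat is a single predicted value per training example, and that the cost is one number summarizing how far those predictions are from the Y values.

```python
# Hypothetical toy data: one input feature, four training examples.
X = [1.0, 2.0, 3.0, 4.0]
Y = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1


def forward(X, w, b):
    """Return one prediction y_hat = w*x + b for each example in X."""
    return [w * x + b for x in X]


def compute_cost(Y_hat, Y):
    """Sum of squared errors divided by 2m (assumed form of the cost)."""
    m = len(Y)
    return sum((y_hat - y) ** 2 for y_hat, y in zip(Y_hat, Y)) / (2 * m)


# With parameters initialized to zero, every prediction is 0,
# so the cost is large.
Y_hat_zero = forward(X, 0.0, 0.0)
cost_zero = compute_cost(Y_hat_zero, Y)

# With the parameters that generated the data, the predictions
# match Y exactly and the cost is zero.
Y_hat_true = forward(X, 2.0, 1.0)
cost_true = compute_cost(Y_hat_true, Y)
```

Note that each prediction pairs with an x from the dataset, but the model only outputs the y part; the x is the input you already have.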

The cost is minimized because the cost function is convex with respect to w and b, and gradient descent allows us to follow the gradients “downhill” toward the values that give the minimum cost.
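As a rough illustration of why the cost keeps dropping, here is a small gradient descent loop in plain Python. I have not seen the internals of w3_tools.train_nn, so everything here (the toy data, the names `cost` and `gradient_step`, the learning rate of 0.05, and the 1000 iterations) is an assumption chosen for the sketch. Each step nudges w and b in the direction that lowers the cost, so the recorded cost values shrink from one iteration to the next.

```python
# Hypothetical toy data generated from y = 2x + 1.
X = [1.0, 2.0, 3.0, 4.0]
Y = [3.0, 5.0, 7.0, 9.0]


def cost(w, b):
    """Sum of squared errors divided by 2m (assumed form of the cost)."""
    m = len(X)
    return sum((w * x + b - y) ** 2 for x, y in zip(X, Y)) / (2 * m)


def gradient_step(w, b, learning_rate):
    """Take one gradient descent step on w and b."""
    m = len(X)
    errors = [w * x + b - y for x, y in zip(X, Y)]
    dw = sum(e * x for e, x in zip(errors, X)) / m  # d(cost)/dw
    db = sum(errors) / m                            # d(cost)/db
    return w - learning_rate * dw, b - learning_rate * db


w, b = 0.0, 0.0            # parameters start at zero
history = [cost(w, b)]     # cost before any update
for _ in range(1000):
    w, b = gradient_step(w, b, learning_rate=0.05)
    history.append(cost(w, b))

print(f"initial cost: {history[0]:.4f}, final cost: {history[-1]:.8f}")
print(f"learned w = {w:.3f}, b = {b:.3f}")
```

The model is not "cycling through inputs" and memorizing them; it sees the same training set on every iteration and simply adjusts w and b a little each time until the line fits the data.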
