Hi mentors.

I am taking a big step back at the moment to revise and improve my understanding, as opposed to just obtaining results.

Can you clear up something I didn’t take the time to understand properly right at the start?

Given an input matrix X[n,m], is there a J for each of the n rows, summed across the m examples x(i)?

My logic says that J must be an [n,2] array, but I haven’t seen this defined anywhere. I would expect there to be n lowercase j sums, with j[0] = sum(dW) and j[1] = sum(db).

Regards

Ian

There is one loss value for each sample, that is, for each column of X, and each of those loss values is a scalar. We then define the cost J as the average of the loss values across all m samples.

Yes, I should have written ‘average’; I understood that.

The implication of what you say is that J (capital) is a single number at each iteration of the calculation?

To express it in math formulas, we have the loss first:

L(\hat{y}_i, y_i) = - y_i \log(\hat{y}_i) - (1 - y_i) \log(1 - \hat{y}_i)

Then the cost is defined as the average of the loss values across the samples:

J = \displaystyle \frac {1}{m} \sum_{i = 1}^{m} L(\hat{y}_i, y_i)

Yes, J is a scalar value at every iteration.
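
If it helps to see the shapes, here is a minimal NumPy sketch of the two formulas above. The function names `per_sample_loss` and `cost` and the sample values are made up for illustration, and I am assuming the labels and predictions are stored as row vectors of shape (1, m), one column per sample, to match the columns of X.

```python
import numpy as np

def per_sample_loss(y_hat, y):
    """Cross-entropy loss for each sample; inputs and output have shape (1, m)."""
    return -y * np.log(y_hat) - (1 - y) * np.log(1 - y_hat)

def cost(y_hat, y):
    """Cost J: the average of the per-sample losses, returned as a plain float."""
    m = y.shape[1]  # number of samples (columns)
    return float(np.sum(per_sample_loss(y_hat, y)) / m)

# Illustrative values only: m = 4 samples
y = np.array([[1, 0, 1, 0]])              # true labels, shape (1, 4)
y_hat = np.array([[0.9, 0.2, 0.6, 0.4]])  # predicted probabilities, shape (1, 4)

losses = per_sample_loss(y_hat, y)
J = cost(y_hat, y)

print(losses.shape)  # one loss value per sample
print(J)             # a single number
```

The loss array has one entry per sample, while J collapses those into one number. The gradients dW and db are separate quantities with the shapes of W and b; they are not components of J.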

I need to research that a little further - my matrix math is rusty.

I defer to you, sensei.

Ian

Thanks, Paul. I have got my head around that now. Understanding that explains a lot that I was unsure of further into the second course.

Regards

Ian