Why introduce z when y-hat should be enough?

In a Coursera course (and elsewhere), the variable Z is introduced, but why not just say that Y-hat = wX + b? How does Z help?


@toontalk ,

Although I have not taken the M4ML courses, I’d like to share my understanding on this:

Typically, I use y_hat as the end result of the network. These are the predictions, which we get after the last layer is processed.

And then z represents the intermediate values, from layer to layer.

In the last layer, after the Z is calculated, I can say: ok, this z (or Z) is my prediction, so I assign it to y_hat (or Y_HAT if this is a matrix).
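
To make the naming concrete, here is a minimal sketch in plain Python. The network shape, the weights, and the helper names `dense` and `relu` are made up for illustration and are not taken from the course. Each layer produces its own z, and only the last layer's z is assigned to y_hat.

```python
def dense(inputs, weights, biases):
    """Pre-activation values z = W x + b for one layer (plain lists, no NumPy)."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]

# Hypothetical network: 2 inputs, one hidden layer with 2 units, 1 output unit.
x = [1.0, 2.0]
W1, b1 = [[0.5, -1.0], [1.0, 1.0]], [0.0, -1.0]
W2, b2 = [[2.0, 0.5]], [0.25]

z1 = dense(x, W1, b1)   # intermediate z of the hidden layer
a1 = relu(z1)           # activation that is passed on to the next layer
z2 = dense(a1, W2, b2)  # z of the last layer
y_hat = z2              # the last layer's z is taken as the prediction

print("z1 =", z1, "a1 =", a1, "y_hat =", y_hat)
```

Here z1 never leaves the network, while z2 happens to be the prediction because no activation is applied after the last layer.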

Does it make sense?



I think the problem is that for regression Z isn’t an intermediate value; it is the final value. But I see that in the logistic classification case y-hat is set to sigma of Z (the sigmoid function), so introducing Z is motivated in that case.
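
A small sketch of that difference, assuming a single scalar feature and made-up values for w, b, and x (the function names are mine, not the course's): the same z feeds both models, and only the last step from z to y-hat differs.

```python
import math

def linear(w, x, b):
    """The shared linear part: z = w*x + b."""
    return w * x + b

def sigmoid(z):
    """Logistic function, maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameter and input values, chosen so that z is exactly 0.
w, b, x = 2.0, -1.0, 0.5
z = linear(w, x, b)

y_hat_regression = z         # linear regression: y-hat is z itself
y_hat_logistic = sigmoid(z)  # logistic classification: y-hat is sigmoid(z)

print(z, y_hat_regression, y_hat_logistic)
```

So in the regression case Z is just another name for y-hat, and keeping the name mostly pays off once something is applied on top of it.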