1.) I’m assuming input layer is equivalent to inputs to the model and output layer is equivalent to outputs of the model. Any reason for referring to it as a layer from a mental model perspective?
2.) Secondly, if z = wx + b and y = z, what is the benefit of calling out y = z if it’s functionally equivalent to the first equation? Why isn’t the variable z good enough?
btw: already completed the assignment. I’m just asking these questions to help the learning gel.
The term “layer” is useful in neural networks, where you may have additional layers of activation units (hidden layers) between the input and the output. It’s a simple term that helps describe the complexity of the model.
y does not equal z.
y is the known-correct label.
z is the prediction based on the model.
The goal is to minimize the difference between y and z, but you cannot presume the error is always going to be zero.
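To make the distinction concrete, here is a minimal NumPy sketch. The data and the values of w and b are made up for illustration and are not taken from the assignment.

```python
import numpy as np

# Hypothetical toy data: 4 examples with one feature each.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])  # known-correct labels

# Hypothetical model parameters (assumed here, not learned).
w, b = 2.0, 1.0

z = w * x + b              # predictions from the model
error = z - y              # per-example difference, generally not zero
mse = np.mean(error ** 2)  # one common way to summarize the error

print("predictions:", z)
print("labels:     ", y)
print("mean squared error:", mse)
```

Training adjusts w and b to make this error smaller, which is why z and y need separate names.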
Also, I see that in exercise 2 we are asked to specify the size of the input layer and the output layer. What is that asking for in human terms? For the input layer, I’m assuming it’s equivalent to the dependent variables used to predict, right?
The size of the input layer is the number of features (the independent variables used to make the prediction) in each example in the data set.
The size of the output layer is the number of predictions being made. For linear regression (or for true/false classification), typically there is only one output (called ‘z’ or y_hat, depending on who wrote the exercise).
If you have multiple classes, then you might have more than one output. In that case the output with the highest value represents the predicted class.
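As a rough sketch of how those sizes show up in code, assuming a hypothetical data matrix X with one row per example and one column per feature (if your exercise stores examples in columns instead, the feature count is X.shape[0]). The variable names n_x and n_y and the score values are illustrative only.

```python
import numpy as np

# Hypothetical data set: 5 examples, 3 features each (rows = examples).
X = np.arange(15, dtype=float).reshape(5, 3)

n_x = X.shape[1]  # size of the input layer = number of features
n_y = 1           # size of the output layer for linear regression

# Multi-class case: one output per class (3 classes assumed here).
scores = np.array([0.2, 1.7, -0.4])       # made-up output values
predicted_class = int(np.argmax(scores))  # index of the highest output

print(n_x, n_y, predicted_class)
```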