C1W3 what is the input layer in the lecture notes?

Hello dear DeepLearning.ai community.
In the 3rd week lecture videos Dr. Ng used a column of x_1, x_2, x_3 as the input layer.
I was 100% sure that each of them (x_1, x_2, x_3) was a separate training example of size n_x, and that we had m=3 training examples.
But I had problems making the sizes of the weights and biases work out, which I decided was either a mistake or a deliberate teaching simplification by Dr. Ng.
But while doing the quiz I began to suspect that x_1, x_2, x_3 are the features of a single training example. In other words, that we have m=1 training example x = (x_1, x_2, x_3) of size n_x=3.
So, what’s going on in the week 3 lectures? Do those x_1, x_2, x_3 represent separate training examples or the features of one training example?
Thank you in advance.

Hello, @Ivan_Nepomnyashchikh. Yes, x_1, x_2, and x_3 do comprise the input layer in that example. Each is a “feature” of the dataset X. For example, if one were trying to predict house prices, they might represent square footage, number of bedrooms, and number of bathrooms. Each would have m training examples (or “observations” if you prefer). So the dataset X would have dimension (3, m), with each row containing the m examples for a particular feature.

Thank you @kenb .
Just to clarify: x_1 is a scalar, not a vector, correct? x_1, x_2, x_3 comprise an input vector x, correct? m input vectors x comprise a matrix X, correct?

Let’s be sure. For a single example (observation or data point), x_1 would be a scalar. To use the notation from the lectures, x_1^{(i)} is the i^{th} example from the m-dimensional vector x_1. That notation leaves no doubt that we are talking about a scalar. Furthermore, there are three inputs to the model, x_1, x_2, and x_3. These signify data “features”; let’s say height, weight, and eye color with regard to a person. Each of these is an m-dimensional row vector, because we have m people in the training set. If we “stack” these three row vectors vertically, we get the matrix X.
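To make the shapes concrete, here is a minimal NumPy sketch of that stacking step. The feature values below are invented purely for illustration; only the shapes matter:

```python
import numpy as np

# Hypothetical data for m = 4 people (values made up for illustration):
# each feature is an m-dimensional row vector.
x_1 = np.array([1.70, 1.82, 1.65, 1.90])  # heights
x_2 = np.array([68.0, 80.0, 54.0, 95.0])  # weights
x_3 = np.array([0.0, 1.0, 2.0, 1.0])      # eye-color codes

# Stack the three feature row vectors vertically: X has shape (3, m).
X = np.vstack([x_1, x_2, x_3])
print(X.shape)  # (3, 4)

# Column i of X is the complete feature vector for person i.
print(X[:, 0])  # height, weight, eye color of person 0
```

Note that a column of X, not a row, is what corresponds to one training example.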


Thank you @kenb .

What you’ve written is interesting, but there are several points of confusion for me.

The first is that, unfortunately, I don’t understand your examples with either people or houses. Dr. Ng did use the house-prices example briefly, either in week 1 or in the first lecture of week 2 (I don’t remember exactly). But the week 2 lectures and homework focused solely on cat images: for the homework we used logistic regression to recognize whether there was a cat in an image. Therefore, cat images are the only example I can understand at this point.

The second is the notation we used in week 2 when dealing with the cat images. We called one image a training example and said it could be represented as a column vector x^{(i)} of dimensions (n_x, 1). Every element of x^{(i)} was the R, G, or B numerical value of one pixel of the image. With m images, i.e. m training examples, each represented as a column vector x^{(i)} of size (n_x, 1), we could stack those m vectors horizontally to obtain the training-set matrix X of size (n_x, m).
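Under that week-2 convention, the flatten-and-stack step can be sketched in NumPy. The image sizes below are toy values chosen only to keep the arrays small; the real assignment uses 64x64x3 images:

```python
import numpy as np

# Toy stand-in for the cat data: m = 5 images of 4x4 RGB pixels.
m, h, w, c = 5, 4, 4, 3
images = np.random.rand(m, h, w, c)  # shape (m, 4, 4, 3)
n_x = h * w * c                      # 48 pixel values per image

# Flatten each image into a column vector of shape (n_x, 1) and stack
# the m columns horizontally: X has shape (n_x, m).
X = images.reshape(m, n_x).T
print(X.shape)  # (48, 5)
```

Here column i of X is image i flattened into a single column, matching the (n_x, m) layout described above.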

At this point, I’m wondering if you could be so kind as to use the cat-images example and the notation I described above to confirm that, in the week 3 lecture videos:

  1. x_1 is a column vector of size (n_x, 1) and can be thought of as an image of a cat (or non-cat),
  2. analogously, x_2.shape = (n_x, 1) and x_3.shape = (n_x, 1), and both of them are also cat/non-cat images,
  3. the column vectors x_1, x_2, x_3 can be stacked horizontally to form the matrix X of shape (n_x, m).

Thank you.


Hello @kenb,
I understood your latest reply when I got to Exercise 1 of the programming assignment. Honestly, it took me a while and a good deal of thinking.
I marked your latest reply as the solution, thank you.
I like that this course relies on examples rather than long verbal explanations to build understanding. But in this case, the example appears too late in the week.
I would suggest that Dr. Ng discuss what x_1, x_2, x_3 mean in greater detail right when he introduces them in the first video.
It’s extremely hard to grasp what “features” mean in week 3 after going through the week 2 lectures and doing the week 2 programming assignment.
Personally, I would have liked an explicit, clear explanation of what “features” are in the first video of week 3. I don’t need it now, though, because I’ve grasped the general idea.
And by the way, I don’t think I would have understood what “features” are from Exercise 1 alone. It was the combination of your replies with Exercise 1 that eventually led me to understand the concept.

Great! Wrestling with this sort of “dimension analysis” early on is well worth the effort. As Prof Ng states in the videos, it is a common source of error in executing an analysis. I thought that referring to a data feature as a commonly understood measurable quantity (e.g. height, weight, eye color of a person) might be easier to grasp than a pixel intensity from an image, which contains 64×64×3 of them: 12,288 features! Onward!
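That feature count is easy to verify directly by flattening a 64×64 RGB image, as in the week-2 assignment:

```python
import numpy as np

# A 64x64 image with 3 color channels, flattened into one column vector.
image = np.zeros((64, 64, 3))
x = image.reshape(-1, 1)
print(x.shape)  # (12288, 1): 64 * 64 * 3 = 12288 features
```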
