# Can anyone explain this?

How are images represented as input features for a neural network? I have understood how an image is represented inside a computer. Are the input features x1, x2, …, xm the matrix representations of each image?

Does the matrix X (of dimension n_x x m) contain the matrix representations of all m images?

Yes. The way the Neural Networks that we are learning about here in Course 1 work is that each input (“sample”) needs to be formatted as a vector. Prof Ng has chosen to use column vectors as opposed to row vectors, but that is just a choice he made. For efficiency, we want to handle multiple samples at once using vectorized instructions, so when we have multiple input samples, they are concatenated into a matrix usually called X. The standard convention is to use n_x as the number of features (the number of elements in each input vector) and the number of samples in the given set is called m. So given that each image is “unrolled” into a column vector, the dimensions of X end up being n_x x m. Each “sample” (image) is one column of X.
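To make the column-stacking concrete, here is a minimal NumPy sketch (the numbers n_x = 3 and m = 4 are just toy values for illustration): each sample is a column vector of n_x features, and X is built by concatenating those columns.

```python
import numpy as np

# Toy example: m = 4 samples, each a column vector of n_x = 3 features,
# following the column-vector convention described above.
n_x, m = 3, 4
samples = [np.random.rand(n_x, 1) for _ in range(m)]  # m column vectors

# Stack the column vectors side by side into the matrix X.
X = np.concatenate(samples, axis=1)

print(X.shape)  # (3, 4), i.e. (n_x, m): one column per sample
```

Stacking samples as columns is what lets a single matrix multiply like `np.dot(W, X)` process all m samples at once instead of looping over them.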

Of course, images are normally represented as 3D arrays with dimensions height x width x channels. The input images here in Course 1 Week 2 are 64 x 64 RGB images, so the arrays are 64 x 64 x 3, with the “channels” being the R, G and B color values for each pixel. 64 * 64 * 3 = 12288, so we have:

n_x = 12288

in this particular case, but the ideas generalize to images of any size.

Then we have to preprocess those images by “unrolling” or “flattening” them from 3D arrays to 1D vectors. They explain this and basically write the logic for you in the Week 2 assignment notebook, but here is a thread which explains in detail how that all works.
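As a sketch of that flattening step, suppose the dataset arrives as an array of shape (m, height, width, channels), which is how the assignment's images are stored (the random pixel data and m = 5 here are just placeholders):

```python
import numpy as np

# Hypothetical stand-in for the dataset: m images of shape 64 x 64 x 3.
m = 5
images = np.random.randint(0, 256, size=(m, 64, 64, 3))

# "Unroll" each 64 x 64 x 3 image into a 12288-element column vector.
# reshape(m, -1) flattens each image into one row; .T turns rows into
# columns so each image becomes one column of X.
X = images.reshape(m, -1).T

# Standardize pixel values from [0, 255] to [0, 1].
X = X / 255.0

print(X.shape)  # (12288, 5), i.e. (n_x, m)
```

The `-1` lets NumPy infer the 12288 (= 64 * 64 * 3) automatically, so the same two lines work for images of any size.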


Hello, thanks for the reply. I have a doubt in Week 3. While teaching about a neural network with one hidden layer, Prof. Andrew Ng takes x1, x2, x3 as inputs in the input layer. What exactly are these x1, x2, x3?

I initially thought that each x_i in the input layer represents the feature vector of an image. But later, when he explained multiple examples, I became very confused. What are those x_i's that he showed in the slides?
Please illustrate with a concrete example, like the same logistic regression used in the course.

I have understood everything that happens inside the neural network, like all the calculations. It is only the input layer that I wasn't able to understand.