So I have a doubt about this equation [Logistic regression]:
z^{(i)} = w^T * x^{(i)} + b
This is what I understood: in Python, X was flattened so that it contains m columns of images, with each image represented in the form of pixel values.
Now, if x^{(i)} were a row vector, then taking the transpose of w and performing the dot product would give us a scalar value in return. But my doubt is: why is x^{(i)} a row vector? Since X contains m columns of images (each image is stored in a column), it doesn't make sense that x^{(i)} would be a row vector.
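For reference, here is roughly the flattening step I mean (a sketch with made-up sizes and variable names, not the actual assignment code):

```python
import numpy as np

# Illustrative only: suppose we have m = 5 images,
# each 64 x 64 pixels with 3 color channels.
m = 5
images = np.random.rand(m, 64, 64, 3)

# Flatten each image into one COLUMN of X, so X ends up with shape
# (n_x, m), where n_x = 64 * 64 * 3 = 12288 features per image.
X = images.reshape(m, -1).T
print(X.shape)  # (12288, 5): each column is one flattened image
```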
And the notational convention that Prof Ng uses is that if he does not write an explicit operator between the w^T and the x_i, then the operation is a dot product style matrix multiplication. If he means it to be "elementwise", then he will use the explicit operator "*".
You're right that the "samples" matrix X is oriented so that each column represents one sample. That means the number of rows of X is the number of features or input values in each sample vector. So the full vectorized equation to handle all the samples at once is:
Z = w^T \cdot X + b
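To make the shapes concrete, here is a minimal NumPy sketch of that vectorized equation (the dimensions here are made up purely for illustration):

```python
import numpy as np

n_x, m = 12288, 5            # features per sample, number of samples
X = np.random.rand(n_x, m)   # each column of X is one sample x_i
w = np.random.rand(n_x, 1)   # weights as a column vector, shape (n_x, 1)
b = 0.5                      # bias: a scalar, broadcast across all samples

# (1, n_x) dot (n_x, m) -> (1, m): one z value per sample/column
Z = np.dot(w.T, X) + b
print(Z.shape)  # (1, 5)
```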
Notice that in the vectorized equation I've used the explicit dot product operator to be a bit more clear. So the bottom line is that in the first equation shown above, which handles just one sample x_i, x_i is a column vector. Since w is also a column vector, w^T is a row vector, and that dot product is 1 x n_x dot n_x x 1, which gives a 1 x 1 or scalar result, which is exactly what you want for one input sample vector x_i. Here n_x is the number of features or elements in each column of X and in the weight vector w.
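And the single-sample case, just to confirm the 1 x 1 result (again a sketch with made-up dimensions):

```python
import numpy as np

n_x = 12288
w = np.random.rand(n_x, 1)    # weight column vector, shape (n_x, 1)
x_i = np.random.rand(n_x, 1)  # one sample x_i as a COLUMN vector
b = 0.5

# (1, n_x) dot (n_x, 1) -> (1, 1), effectively a scalar
z_i = np.dot(w.T, x_i) + b
print(z_i.shape)  # (1, 1)
```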