Why does the multiple-feature gradient iterate the model prediction f_wb by rows instead of columns?

Why does the multiple-feature gradient not iterate through all training examples of one feature before moving on to the next one? Instead, if I understand it correctly, it takes one example from each feature, then a second example from each feature, and so on…

The difference, I think, is that the result would be a precise prediction for each feature, combined. Instead it is a one-example blind shot, combined across features… It makes no sense to me.

The function is in Week 3 of the ML course.

(moderator edit: code removed)

Specifically, I mean f_wb_i = sigmoid(np.dot(X[i],w) + b).
If I understand it correctly, X[i] is one data example (data point) containing x0 and x1, i.e. a two-element array, a row. I would like it to be a column instead: all data examples for feature x0, then in the next iteration all data examples for feature x1, and so on.

Why is my approach incorrect?

Hi @neural_ghost ,

The main goal is to determine whether a sample (defined by all its attributes) belongs to a given class, let's say 1 or 0.

Let's pretend that we have a dataset of 10,000 objects with the following attributes: size, color, shape.

To determine the type of an object, the algorithm will actually need to learn all the attributes of each object, one object at a time.

In fact, I would venture to say that even humans work the same way: we need to see the entire object to learn what it is. If I were shown only the colors of 10,000 objects, then the sizes of 10,000 objects, and then the shapes of 10,000 objects, I don't think I would end up learning what each object is. But if I am shown each of the 10,000 objects and I see its color, size, and shape together, then I will be able to learn about each object and classify or identify it.

For this reason, in the algorithm, the outer loop iterates over each one of the samples and, for each sample, it computes the formula using all of that sample's attributes at once.
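As a minimal sketch of that idea (hypothetical toy data and variable names, not the graded function): each pass through the loop takes one whole sample, i.e. one row of X with all of its attributes, and produces one prediction.

```python
import numpy as np

def sigmoid(z):
    # Logistic function, squashes any real number into (0, 1)
    return 1 / (1 + np.exp(-z))

# Toy dataset: 4 samples (rows), 3 attributes each (columns),
# e.g. size, color, shape encoded as numbers
X = np.array([[1.0, 0.5, 2.0],
              [0.3, 1.2, 0.7],
              [2.1, 0.1, 1.5],
              [0.9, 0.9, 0.9]])
w = np.array([0.2, -0.1, 0.4])   # one weight per attribute
b = 0.1

m = X.shape[0]                   # number of samples
f_wb = np.zeros(m)
for i in range(m):
    # X[i] is the i-th SAMPLE: all of its attributes together.
    # One sample in, one prediction out.
    f_wb[i] = sigmoid(np.dot(X[i], w) + b)
```

The key point is that the dot product inside the loop combines every attribute of one sample, mirroring how you'd look at a whole object at once.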

Hope this makes sense.


Please do not post your code on the forum. That breaks the course Honor Code. That function is identical to a graded function in the assignment.

If a mentor needs to see your code, we’ll ask you to send it via a private message.

A more natural way to implement this code is to use matrix math operators. There is no real reason for two for-loops in this function. For-loops are slow, complicated, and difficult to debug.

I believe both methods would be mathematically identical.
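To illustrate that equivalence, here is a small sketch (toy data and hypothetical names, not the assignment code) comparing the per-sample loop against a single matrix-vector product:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Toy dataset: 3 samples, 2 features
X = np.array([[1.0, 0.5],
              [0.3, 1.2],
              [2.1, 0.1]])
w = np.array([0.2, -0.1])
b = 0.1

# Loop version: one prediction per sample
f_loop = np.array([sigmoid(np.dot(X[i], w) + b) for i in range(X.shape[0])])

# Vectorized version: X @ w computes all the per-sample dot products at once
f_vec = sigmoid(X @ w + b)

# The two are numerically identical (up to floating-point rounding)
assert np.allclose(f_loop, f_vec)
```

The vectorized form is both shorter and faster, since NumPy pushes the loop down into optimized C code.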

Yes, thank you very much. It exactly answers my question.

Noted! Sorry. (it was copied from an optional lab which must have misled me)

No, because the way I had understood it, it would've been f_wb_i = sigmoid(np.dot(X[:,i],w) + b) instead of f_wb_i = sigmoid(np.dot(X[i],w) + b), i.e. taking the dot product with the i-th column instead of the i-th row.

You’ve got to be very careful with Python notation and what it does when you have a 2D matrix that you reference with only one index.
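A quick demonstration of that indexing behavior (toy matrix, just to show the slicing rules):

```python
import numpy as np

X = np.array([[1, 2],
              [3, 4],
              [5, 6]])

# A single index selects a ROW: one sample with all of its features
row = X[1]
print(row)       # [3 4]

# A ':' in the first axis selects a COLUMN: one feature across all samples
col = X[:, 1]
print(col)       # [2 4 6]
```

So `X[i]` and `X[:, i]` are genuinely different objects, and `np.dot` will happily compute either one against `w` as long as the shapes line up, which is why the bug can be silent.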