Confusion regarding input matrix shape

Is it true that in ML frameworks like PyTorch and TensorFlow the shape of the input matrix is (number of examples x number of features), whereas in Andrew Ng’s implementation it is (features x examples)?

Frameworks typically follow (num examples, features per example). Please see this link to learn about the notation used in the courses.
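For concreteness, here is a minimal PyTorch sketch of the “examples first” convention; the sizes (32 examples, 10 features, 4 output units) are made up for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration: 32 examples, 10 features each.
m, n_features = 32, 10
X = torch.randn(m, n_features)   # framework convention: (num examples, num features)

layer = nn.Linear(in_features=n_features, out_features=4)
out = layer(X)                   # batch dimension stays first
print(out.shape)                 # torch.Size([32, 4])
```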


Yes, but why the distinction? Is there a reason why Andrew Ng’s implementation is different?


I don’t know. Adding @paulinpaloalto and @TMosh, who might know.


It is an arbitrary choice. In DLS, Professor Ng uses the features x samples representation for Courses 1 and 2, where he is dealing with samples that are vectors. But once he gets to Course 4 (ConvNets), where the inputs are images of shape height x width x channels, he switches to the “samples first” orientation for the data, because that is how everyone does it when the input batches are 4-dimensional tensors. You then still have the choice of whether the channel dimension comes before or after the height and width dimensions: in TensorFlow it is m x h x w x c, while in PyTorch it is usually m x c x h x w. But even in PyTorch you have a choice, if memory serves.
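As a rough illustration of the two image layouts (the batch size and image dimensions below are made up), converting a channels-last tensor into the channels-first shape that PyTorch convolutions expect looks something like this:

```python
import torch

# Hypothetical batch: 8 RGB images of size 64 x 64.
m, h, w, c = 8, 64, 64, 3

# TensorFlow's default layout is "channels last": (m, h, w, c).
x_tf_style = torch.randn(m, h, w, c)

# PyTorch convolutions expect "channels first": (m, c, h, w).
x_torch_style = x_tf_style.permute(0, 3, 1, 2)
print(x_torch_style.shape)  # torch.Size([8, 3, 64, 64])

# PyTorch also lets you keep a channels-last memory layout while the
# logical shape stays (m, c, h, w).
x_cl = x_torch_style.contiguous(memory_format=torch.channels_last)
```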

Of course, the choice of the n_x x m orientation in DLS C1 and C2 has a big effect on how the rest of the formulas are written in terms of the weight matrices and how the dot products are done.
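A small NumPy sketch of that effect, with made-up layer sizes: in the features-by-examples orientation the forward step is Z = W X + b, while in the examples-first orientation the same computation becomes Z = X Wᵀ + bᵀ.

```python
import numpy as np

np.random.seed(0)
n_x, n_h, m = 3, 4, 5            # hypothetical sizes: 3 features, 4 hidden units, 5 examples

# DLS C1/C2 convention: columns are examples.
X_cols = np.random.randn(n_x, m)     # (n_x, m)
W = np.random.randn(n_h, n_x)        # (n_h, n_x)
b = np.random.randn(n_h, 1)          # broadcasts across the m columns
Z_cols = W @ X_cols + b              # (n_h, m)

# "Samples first" convention used by most frameworks: rows are examples.
X_rows = X_cols.T                    # (m, n_x)
Z_rows = X_rows @ W.T + b.T          # (m, n_h)

assert np.allclose(Z_cols, Z_rows.T)  # same numbers, just transposed
```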
