Question about data in Numpy array

Could anyone tell me the difference between np.array([1, 2, 3]) and np.([ [1, 2 ,3] ]).
Why in the lecture, the prof uses the later one rather than the first one?
Thank you so much!

Check this:

arr1 = np.array([1, 2, 3])
print("arr1:", arr1)
print("Shape of arr1:", arr1.shape)  

arr2 = np.array([[1, 2, 3]])
print("arr2:", arr2)
print("Shape of arr2:", arr2.shape)  

And this is the result:

arr1: [1 2 3]
Shape of arr1: (3,)

arr2: [[1 2 3]]
Shape of arr2: (1, 3)

So, do you see the difference?

Oh, I see. But why would we need to use vector with shape of (1, 3) rather than (3, 1)?

Or is it because of multiple features?

It depends on how you shape your input data.


Hello @Nguyen_Thien_An,

The shape of np.array([1, 2, 3]) is (3, ) instead of (3, 1).

(3, ) is a 1-dimensional array, whereas (3, 1) is a 2-dimensional array. This might be another subtle difference that we want to notice.

np.array([ [1, ], [2, ], [3, ] ]) is a (3, 1) array.

In layman language, (3, 1) represents 3 rows and 1 column. (1, 3) represents 1 row and 3 columns. In the MLS convention, if it is a dataset array, we are used to represent 1 sample in a row and its features by columns. However, sometimes, if we have only one feature, we might want to use (3, ) instead. This is what @saifkhanengr meant.

Therefore, it is a convention. In our specialization (MLS), we have the above convention, but in other places, the convention might be different, and we need to pay attention to this details.

Btw, you didn’t share with us in which lecture you found that notation, so here it is just some very general explanation.


1 Like

Thanks a lot!

I know this is a bit irrelevant, but why we use row vector for multiple features whereas column vector when passing in value to the neural network. Is this the convention?

Generally, when we pass data into a neural network, we use a 2-D array, where one row represents a sample, and one column represents a feature. This is the MLS convention.

However, if you see 1-D array, this is some special case which can either means “one sample of multiple features” or “multiple samples of one feature”, and we will need to read the context to determine what it is.

2-D array is the standard representation, and it works for multiple samples with multiple features. 1-D is for the special cases.

Again, I don’t know which slide you are asking, so it is just very general discussion :wink:

1 Like