Question about data in Numpy array

Nguyen_Thien_An · July 31, 2023, 6:49am

Could anyone tell me the difference between np.array([1, 2, 3]) and np.array([[1, 2, 3]])?
Why does the prof use the latter rather than the former in the lecture?
Thank you so much!

saifkhanengr · July 31, 2023, 6:57am

Check this:

import numpy as np

arr1 = np.array([1, 2, 3])
print("arr1:", arr1)
print("Shape of arr1:", arr1.shape)

arr2 = np.array([[1, 2, 3]])
print("arr2:", arr2)
print("Shape of arr2:", arr2.shape)

And this is the result:

arr1: [1 2 3]
Shape of arr1: (3,)

arr2: [[1 2 3]]
Shape of arr2: (1, 3)

So, do you see the difference?

Nguyen_Thien_An · July 31, 2023, 7:45am

Oh, I see. But why would we need to use a vector with a shape of (1, 3) rather than (3, 1)?

Nguyen_Thien_An · July 31, 2023, 7:46am

Or is it because of multiple features?

saifkhanengr · July 31, 2023, 7:51am

It depends on how you shape your input data.

rmwkwok · July 31, 2023, 7:58am

Hello @Nguyen_Thien_An,

The shape of np.array([1, 2, 3]) is (3, ) instead of (3, 1).

(3, ) is a 1-dimensional array, whereas (3, 1) is a 2-dimensional array. This might be another subtle difference that we want to notice.

np.array([ [1, ], [2, ], [3, ] ]) is a (3, 1) array.
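
To make the three cases concrete, here is a small sketch (the variable names are just for illustration) that prints the number of dimensions and the shape of each array, and shows that a 1-D array takes one index while a 2-D array takes two:

```python
import numpy as np

a = np.array([1, 2, 3])          # 1-D array, shape (3,)
row = np.array([[1, 2, 3]])      # 2-D array, 1 row and 3 columns
col = np.array([[1], [2], [3]])  # 2-D array, 3 rows and 1 column

for name, arr in [("a", a), ("row", row), ("col", col)]:
    print(name, "ndim:", arr.ndim, "shape:", arr.shape)

# The same value 2 is reached with one index in 1-D and two indices in 2-D.
second_a = a[1]
second_row = row[0, 1]
second_col = col[1, 0]
print(second_a, second_row, second_col)
```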

In layman's terms, (3, 1) represents 3 rows and 1 column, and (1, 3) represents 1 row and 3 columns. In the MLS convention, if it is a dataset array, we usually put one sample in each row and its features in the columns. However, sometimes, if we have only one feature, we might want to use (3, ) instead. This is what @saifkhanengr meant.

Therefore, it is a convention. In our specialization (MLS), we have the above convention, but in other places, the convention might be different, and we need to pay attention to these details.
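
As a rough illustration of the row-per-sample convention, the sketch below builds a made-up dataset (the values and the names X, m, n are assumptions for the example, not taken from any particular lab) and shows what you get when you pick out one sample or one feature:

```python
import numpy as np

# Hypothetical dataset: 3 samples (rows), 2 features (columns).
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

m, n = X.shape  # m = number of samples, n = number of features
print("samples:", m, "features:", n)

first_sample = X[0]      # all features of sample 0, a 1-D array of length n
first_feature = X[:, 0]  # feature 0 of every sample, a 1-D array of length m
print(first_sample.shape, first_feature.shape)
```

Note that both slices come back as 1-D arrays, which is one way the (3, ) style of array shows up in practice.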

Btw, you didn’t share with us in which lecture you found that notation, so this is just a very general explanation.

Cheers,
Raymond

Nguyen_Thien_An · July 31, 2023, 8:11am

Thanks a lot!

Nguyen_Thien_An · July 31, 2023, 8:19am

I know this is a bit off-topic, but why do we use a row vector for multiple features, but a column vector when passing values into the neural network? Is this the convention?
Thanks!

rmwkwok · July 31, 2023, 8:23am

Generally, when we pass data into a neural network, we use a 2-D array, where one row represents a sample, and one column represents a feature. This is the MLS convention.

However, if you see a 1-D array, it is a special case that can mean either “one sample of multiple features” or “multiple samples of one feature”, and we will need to read the context to determine which it is.

A 2-D array is the standard representation, and it works for multiple samples with multiple features. A 1-D array is for the special cases.
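
If it helps, here is a minimal sketch of how the same 1-D array can be turned into either 2-D form with reshape, depending on which meaning you intend (the variable names are only illustrative):

```python
import numpy as np

v = np.array([1, 2, 3])  # 1-D, shape (3,): ambiguous on its own

# Read as "one sample of three features": 1 row, 3 columns.
one_sample = v.reshape(1, -1)

# Read as "three samples of one feature": 3 rows, 1 column.
three_samples = v.reshape(-1, 1)

print(one_sample.shape)     # (1, 3)
print(three_samples.shape)  # (3, 1)
```

The -1 asks NumPy to work out that dimension from the array's size, so the same lines work for a 1-D array of any length.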

Again, I don’t know which slide you are asking about, so this is just a very general discussion.
