Thank you for your answer. But how can the second dimension be 100?
The first dimension agrees between u and v, while the second dimension does not exist in u and is 1 in v.
How can this be broadcast to 100 in the second dimension?
Yes, Mubsi’s guess must be what’s happening. The type coercion rules turn the 1D vector with 100 elements into a 1 x 100 2D row vector, and then broadcasting kicks in to make the addition work, so you end up with 100 x 100.
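For instance (just a quick check with ones, where only the shapes matter), you can see the coercion and the resulting shape directly:

import numpy as np

u = np.ones(100)         # 1D vector, shape (100,)
v = np.ones((100, 1))    # 2D column vector, shape (100, 1)

# u gets treated as a 1 x 100 row vector, then both operands are
# stretched against each other, so the sum has shape (100, 100)
print((u + v).shape)     # (100, 100)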
Maybe it’s better just to avoid nasty surprises by not playing that game of mixing 1D and 2D objects.
Here’s a better experiment to really see what happens:
import numpy as np

u = np.ones(10)            # 1D vector, shape (10,)
v = np.ones((10, 1)) * 2   # 2D column vector, shape (10, 1)
uplusv = u + v             # u is treated as a 1 x 10 row vector and broadcast against v

print(f"u.shape = {u.shape}")
print(f"u = {u}")
print(f"v.shape = {v.shape}")
print(f"v = {v}")
print(f"uplusv.shape = {uplusv.shape}")
print(f"uplusv = {uplusv}")
Yes, that makes sense. Still, I feel this is rather counter-intuitive broadcasting behavior from NumPy.
I remember that somewhere in this specialization Andrew Ng warned of stranger things happening when working with 1D objects.
I should have listened to him.
I don’t disagree that it seems like strange behavior. In the “type coercion” step where they convert the 1D vector to a 2D object, it would have made more sense to match the shape of the other object, but I’ll bet the algorithm only knows that it needs a 2D object at that point and doesn’t actually compare against the other operand’s shape. But once you have a row vector and a column vector and you do an “elementwise” operation between them, what we’re seeing is the definition of “broadcasting”.
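Here’s a tiny example of that last point (nothing to do with the course code, just arange for visibility): adding a row vector to a column vector gives you the full grid of pairwise sums.

import numpy as np

row = np.arange(3).reshape(1, 3)   # shape (1, 3)
col = np.arange(3).reshape(3, 1)   # shape (3, 1)

# Each size-1 dimension is stretched to match the other operand,
# so the "elementwise" sum is really a 3 x 3 grid of all pairs.
print(row + col)
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]]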
My background is in math, and I found the whole idea of broadcasting offensive when I first ran into it. I thought it should simply throw an error if you did an elementwise operation on objects of different shapes, meaning that it was the programmer’s job to convert the vector to a matrix first, e.g. by manually doing something like this:
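(Toy shapes here, and np.tile is just one way to do the expansion by hand.)

import numpy as np

M = np.arange(12).reshape(3, 4)    # a (3, 4) matrix
v = np.array([10, 20, 30, 40])     # a (4,) vector we want to add to every row of M

v_matrix = np.tile(v, (3, 1))      # explicitly copy v into a (3, 4) matrix
result = M + v_matrix              # shapes now match exactly, no broadcasting needed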
But once you get used to the idea, it does save you a lot of work. E.g. it would be a pain to have to do that expansion manually every time we do forward propagation:
$A = W \cdot X + b$
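With some made-up layer sizes (these numbers are just for illustration), broadcasting takes care of adding b to every column of the product for us:

import numpy as np

# Made-up sizes: 4 units in this layer, 3 input features, 5 training examples.
W = np.random.randn(4, 3)    # weights, shape (4, 3)
X = np.random.randn(3, 5)    # inputs, one example per column, shape (3, 5)
b = np.random.randn(4, 1)    # bias, shape (4, 1)

A = np.dot(W, X) + b         # b is broadcast across all 5 columns
print(A.shape)               # (4, 5)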
So we can all agree on the bottom line that we should follow Prof Ng’s advice when he recommended against mixing 1D objects into our computations here.