Week 1 Dinosaur Island Sample Method

rtabrizi · July 28, 2022, 1:56am

This is my first time posting on discourse so I apologize in advance if I’m not requesting help as expected!

For some reason, the shape of my a vector isn’t as expected. I’m getting a 100 x 100 vector for a when I presume it should be (27,1) or something of that sort. Since I can’t share my code, I’ll describe the shapes of each operand below:
Wax @ x → (100,)
Waa @ a_prev → (100,)
(Wax @ x + Waa @ a_prev) → (100,)
(Wax @ x + Waa @ a_prev + b) → (100,100)
b → (100, 1)

a → (100,100)

Why is it that b is increasing the shape? I thought if we’re broadcasting and adding b, a column vector, it shouldn’t be augmenting the shape. Some help would be appreciated. Thanks!

anon57530071 · July 28, 2022, 3:30am

Welcome to the community.

I think this is a good question for understanding how numpy handles vectors and matrices (and ndarrays in general), which is, I think, rather unique.

It is also related to the question of how you control dimensions throughout your project.

Let’s start with numpy basics.

import numpy as np

a = np.ones((3,))     # 1D array (a "vector")
print(type(a), a.ndim, a.shape)
b = np.ones((3,1))    # 2D array (a single-column matrix)
print(type(b), b.ndim, b.shape)
c = a+b               # broadcasting: (3,) + (3,1)
print(type(c), c.ndim, c.shape)

The results are:

a: <class 'numpy.ndarray'>, dim=1, shape=(3,)
b: <class 'numpy.ndarray'>, dim=2, shape=(3,1)
c: <class 'numpy.ndarray'>, dim=2, shape=(3,3)

“c” does not become (3,1); it becomes (3,3). This is consistent with what you see here:

(Wax @ x + Waa @ a_prev + b) → (100,100)
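As a quick check of the shape rule, here is a minimal sketch using np.broadcast_shapes (available in NumPy 1.20 and later, as far as I know), which computes the broadcast result shape without building any arrays. Shapes are aligned from the right, so (3,) is treated as (1,3) before it is combined with (3,1).

```python
import numpy as np

# (3,) is aligned from the right and treated as (1, 3);
# combining (1, 3) with (3, 1) stretches both size-1 axes.
small = np.broadcast_shapes((3,), (3, 1))
print(small)   # (3, 3)

# The same rule with the shapes from the question.
big = np.broadcast_shapes((100,), (100, 1))
print(big)     # (100, 100)
```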

Unlike other major tools such as Matlab, numpy handles a “vector” differently, as I said. (The others are simpler: a vector is handled as a single-column matrix like (m,1).)

The vector (m,) is a 1D array of size “m”. In broadcasting it behaves like a row vector, (1,m), but it is not a matrix, since it cannot be “transposed”: if we try to transpose it with a.T, the result has the same shape.
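A small sketch of that point, with illustrative variable names of my own: .T does nothing to a 1D array, so to get a real column or row you have to add the second axis explicitly.

```python
import numpy as np

v = np.ones((3,))            # 1D array
print(v.T.shape)             # (3,)  transposing a 1D array changes nothing

col = v.reshape(-1, 1)       # explicit column: 2D array
print(col.shape)             # (3, 1)
print(col.T.shape)           # (1, 3)  now the transpose is meaningful

row = v[np.newaxis, :]       # explicit row: 2D array
print(row.shape)             # (1, 3)
```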

In your case, (Wax@x).shape = (100,).
This depends on how x is defined. If you define x with shape (vocab_size,), then that is the result. If you explicitly define x with shape (vocab_size,1), then (Wax@x).shape = (100,1).

(Wax @ x + Waa @ a_prev + b) → (100,100)

This is not what you expected, but it is what should happen. Both “Wax @ x” and “Waa @ a_prev” in your case are (100,), while “b” is a matrix, (100,1). The result is then just like what I showed: (100,100), since it is like (1,100) + (100,1). Numpy's broadcasting expands both operands, so the sum becomes (100,100).
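Here is a rough sketch that reproduces those shapes. The names follow the thread, the sizes (n_a = 100, vocab_size = 27) are my assumption from the numbers mentioned above, and the random weights are placeholders, not the assignment's actual parameters.

```python
import numpy as np

n_a, vocab_size = 100, 27                 # sizes assumed from the thread
rng = np.random.default_rng(0)
Wax = rng.standard_normal((n_a, vocab_size))   # placeholder weights
Waa = rng.standard_normal((n_a, n_a))
b = np.zeros((n_a, 1))                    # 2D column, as in "parameters"

# 1D inputs: the matrix products come out 1D as well.
x_1d = np.zeros((vocab_size,))
a_prev_1d = np.zeros((n_a,))
pre_1d = Wax @ x_1d + Waa @ a_prev_1d
print(pre_1d.shape)                       # (100,)
print((pre_1d + b).shape)                 # (100, 100)  the surprise

# 2D inputs: everything stays a column.
x_2d = np.zeros((vocab_size, 1))
a_prev_2d = np.zeros((n_a, 1))
pre_2d = Wax @ x_2d + Waa @ a_prev_2d + b
print(pre_2d.shape)                       # (100, 1)
```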

So it all depends on how you want to control the dimensions.
Since all entries in the “parameters” dictionary (Wax, Waa, Wya, b, by) are matrices (2D arrays), I prefer to handle everything as 2D arrays rather than vectors. But it’s up to you.

In either case, we need to reshape either b and by, or y, from a 2D array into a vector, using ravel() or flatten().

If you want to use vectors for local variables like “x” and “a_prev”, then you may want to convert “b” and “by” from 2D arrays to vectors early on.
If you want to keep everything as 2D arrays, then you only need to convert “y” when you pass it to np.random.choice().
Of course, depending on your choice, the shape used to initialize “x” and “a_prev” will differ ((vocab_size,) and (n_a,), or (vocab_size,1) and (n_a,1)).
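A minimal sketch of the two routes for the output step, under some assumptions: the softmax helper here is my own stand-in (not the course's utility), the weights are random placeholders, and the sizes are taken from the thread. The point is only where ravel() goes, since np.random.choice expects a 1D probability array for p.

```python
import numpy as np

def softmax(z):
    # Hypothetical helper: normalizes along axis 0 for 1D or (n, 1) input.
    e = np.exp(z - np.max(z))
    return e / e.sum(axis=0)

n_a, vocab_size = 100, 27                 # sizes assumed from the thread
rng = np.random.default_rng(0)
Wya = rng.standard_normal((vocab_size, n_a)) * 0.01   # placeholder weights
by = np.zeros((vocab_size, 1))            # 2D, as in "parameters"

# Route 1: vectors everywhere, so flatten by up front.
a_vec = np.zeros((n_a,))
y_vec = softmax(Wya @ a_vec + by.ravel())
print(y_vec.shape)                        # (27,)
idx_vec = np.random.choice(vocab_size, p=y_vec)

# Route 2: 2D arrays everywhere, so flatten y only when sampling.
a_col = np.zeros((n_a, 1))
y_col = softmax(Wya @ a_col + by)
print(y_col.shape)                        # (27, 1)
idx_col = np.random.choice(vocab_size, p=y_col.ravel())
```

Either way the sampled index is an integer in range(vocab_size); what matters is staying with one route from initialization to sampling.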

Hope this clarifies.

rtabrizi · July 28, 2022, 1:51pm

Thank you so much for this!

To clarify, explicitly saying x = np.zeros((vocab_size, 1)) would be going down the 2D array route, whereas simply np.zeros((vocab_size)) is the vector route? Also, I just need to remain consistent between the two to avoid the row vector added to a column array?

Thanks

anon57530071 · July 28, 2022, 2:18pm

To clarify, explicitly saying x = np.zeros((vocab_size, 1)) would be going down the 2D array route, whereas simply np.zeros((vocab_size)) is the vector route?

Yes, that’s right.

Also, I just need to remain consistent between the two to avoid the row vector added to a column array?

Right. The fact is, all variables in the “parameters” dictionary are 2D arrays. You need to choose one way or the other.

And your dimension analysis is the right way to diagnose the problem. Keep going!
