Hi, I notice that in the hint of Step 3:
- Also notice that $\hat{y}^{\langle t+1 \rangle}$, which is `y` in the code, is a 2D array.
I cannot figure out why `y` is a 2D array. Shouldn't it be a 1D probability vector over the vocabulary?
Hi aquilaxc, I think y should be a 27x1 ndarray. Not initialising vectors with two dimensions leads to broadcasting errors.
You should just have to flatten it.
If y has a different shape, check the shapes of the intermediate results to find out what went wrong.
I had my x set to x = np.zeros(vocab_size) when it should have been set to x = np.zeros((vocab_size,1))! The former made my y shape 27x100 instead of 27x1.
I tried a million things and thought I was going crazy, but I got it to work after realizing y is supposed to be 27x1. Thank you!
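For anyone hitting the same thing, here is a minimal sketch of how a 1D x can turn y into a 27x100 array through broadcasting. The sizes (vocab_size = 27, n_a = 100) are taken from the shapes mentioned in this thread; the random weights, the softmax helper, and the forward_step function are stand-ins I made up for illustration, not the assignment's actual code.

```python
import numpy as np

# Sizes assumed from the shapes mentioned in this thread.
vocab_size, n_a = 27, 100

# Hypothetical parameters with the usual (rows, cols) and (rows, 1) shapes.
rng = np.random.default_rng(0)
Wax = rng.standard_normal((n_a, vocab_size))
Waa = rng.standard_normal((n_a, n_a))
Wya = rng.standard_normal((vocab_size, n_a))
b = np.zeros((n_a, 1))
by = np.zeros((vocab_size, 1))


def softmax(z):
    # Column-wise softmax, shifted by the column max for numerical stability.
    e = np.exp(z - np.max(z, axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)


def forward_step(x, a_prev):
    # One RNN step: hidden state, then a probability column over the vocabulary.
    a = np.tanh(Wax @ x + Waa @ a_prev + b)
    y = softmax(Wya @ a + by)
    return a, y


a_prev = np.zeros((n_a, 1))

# 2D column vector: every intermediate result stays (rows, 1).
x_good = np.zeros((vocab_size, 1))
_, y_good = forward_step(x_good, a_prev)

# 1D vector: Wax @ x has shape (100,), and adding the (100, 1) terms
# broadcasts the sum to (100, 100), so y ends up as (27, 100).
x_bad = np.zeros(vocab_size)
_, y_bad = forward_step(x_bad, a_prev)

print(y_good.shape, y_bad.shape)
```

Printing the shape of each intermediate term inside forward_step is usually the quickest way to see where the extra dimension appears.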
It’s because b and by are (…,1) ndarrays. I did b.ravel() and by.ravel().
I also found the comment that y is a 2D array quite confusing, because one of the two dimensions is trivial. By the same logic we could view it as a 3D, 4D, etc. array with multiple trivial dimensions. It would be more helpful to speak directly about numpy shapes in the hint: “Pay attention to the fact that p should be set to a (l,) shaped array for some length l, as opposed to a (l, 1) shaped array. Note also that y is a (l, 1) shaped array in the code.”
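To make the (l,) versus (l, 1) point concrete, here is a small sketch, assuming l = 27 and a uniform stand-in for y (both are my own choices for illustration, not the assignment's values). It relies on np.random.choice requiring a 1-dimensional p.

```python
import numpy as np

vocab_size = 27  # assumed vocabulary size, as mentioned earlier in the thread

# Stand-in for the model output: a (27, 1) column of probabilities.
y = np.full((vocab_size, 1), 1.0 / vocab_size)

# Flatten to shape (27,) before sampling.
p = y.ravel()

np.random.seed(0)
idx = np.random.choice(np.arange(vocab_size), p=p)

# Passing the (27, 1) array directly is rejected because p is not 1D.
try:
    np.random.choice(np.arange(vocab_size), p=y)
    raised = False
except ValueError:
    raised = True

print(y.shape, p.shape, raised)
```

So y can stay a (27, 1) column inside the forward pass, and only the copy handed to the sampler needs to be flattened.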
I am also confused by this assignment: why are we updating x to x<t+1>? Wouldn't that change the dimensions of x? Also, what can be assigned to x<t+1>? Can anyone please shed some light on this?