Course 5 Week 4 programming assignment #1

Hello,
I am completely confused by the start of the Course 5 Week 4 programming assignment. A few issues:

  • First, the docstring of the get_angles function says the function has to return an array of shape (pos, d). But pos is a vector…
  • Second, I apply the formula for the angle, but I don’t get why the even columns should equal the odd columns in the result, which is what the unit test checks.
  • Finally, this line: limit = (position - 1) / np.power(10000,14.0/16.0) suggests we are looking at the angle for the last position, at the 7th coordinate of the encoding. But then the unit test retrieves the last value of the angles matrix (result[position - 1, d_model -1]) to do the comparison.

So at this point, I am completely lost as to what pos and i represent exactly, and what is going on with this formula for the angles.

Any tips?
Thanks

There are quite a few threads about the get_angles() function in the forum.

(pos, d) means that the function returns a matrix with a set of ‘d’ values for each element in ‘pos’.

Re: even/odd: Prof Ng discusses this in the lectures.

That test case only looks at the last value in the last row. The test authors felt this was sufficient to verify that your function works correctly.
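To make the three points above concrete, here is a minimal NumPy sketch. It is not the assignment's reference solution: the function name get_angles_sketch and the value position = 4 are assumptions for illustration, d_model = 16 is inferred from the 14.0/16.0 in the quoted test line, and the i = k // 2 step is the reading of the formula that makes the even and odd columns match.

```python
import numpy as np

def get_angles_sketch(pos, k, d):
    """Hypothetical stand-in for the assignment's get_angles.

    pos: column vector of positions, shape (n_positions, 1)
    k:   row vector of encoding coordinates, shape (1, d)
    d:   size of the encoding
    Returns an array of shape (n_positions, d) through broadcasting.
    """
    i = k // 2  # coordinates 2i and 2i+1 share the same i
    return pos / np.power(10000, (2 * i) / d)

# Assumed sizes: position is illustrative, d_model matches the 14.0/16.0 above.
position, d_model = 4, 16
pos = np.arange(position)[:, np.newaxis]  # shape (4, 1)
k = np.arange(d_model)[np.newaxis, :]     # shape (1, 16)
angles = get_angles_sketch(pos, k, d_model)

print(angles.shape)  # (4, 16): one row per position, d values per row

# Each even column equals the odd column right after it.
print(np.allclose(angles[:, 0::2], angles[:, 1::2]))  # True

# The last column is k = 15, so 2 * (15 // 2) = 14, hence 14.0 / 16.0.
limit = (position - 1) / np.power(10000, 14.0 / 16.0)
print(np.isclose(angles[position - 1, d_model - 1], limit))  # True
```

So "shape (pos, d)" is shorthand for (number of positions, d), and the last entry of the last row is exactly the value the quoted limit line computes.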

Thank you for your answer. I managed to find the different threads. I am reassured that I was not the only one struggling with this. There is obviously a “problem” with the formula for the angles. Not that it’s not correct, but simply that it’s confusing and doesn’t fit the explanation given in the video lecture. IMHO, it would be less confusing to use something like:

[image: proposed formula for the angles, written in terms of k]

and then explain that k refers to the coordinates of the encoding, and that [image: floor brackets] refers to the floor of the number (this is the result of i // 2 in Python for any integer i).
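The formula image itself is not preserved in this text. A formula consistent with the description (k indexing the coordinates of the encoding, the floor being i // 2 in Python) and with the 14.0/16.0 exponent in the unit test would be the following. This is a reconstruction under those assumptions, not necessarily what the image showed, and the symbol θ is chosen here only for illustration.

```latex
\theta(pos, k) = \frac{pos}{10000^{\,2\lfloor k/2 \rfloor / d}}
```

With d = 16 and the last coordinate k = 15, the exponent is 2 · 7 / 16 = 14/16, which matches the limit line in the test.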

Thanks for your suggestion.

Where? Never heard anything about even/odd

Prof Ng does talk about even/odd, but not in a way that makes it clear that the formula to use in the code is the one with i // 2. Please look at the formula I shared in my previous answer to understand the logic better.

You saved my life.
Thank you so much @morningdew.
