I also found this confusing and was quite frustrated before I found this post. The instructions do not tell us to repeat the columns to use one for sine and one for cosine, so I had no idea what was going wrong.

In addition, IMO, the //2 structure is inelegant because it requires redundant data storage and is more complicated than it needs to be. It seems to me that a simpler way to implement this would be:

make an angle_rads table without duplicate columns

use
np.concatenate([[np.sin(x),np.cos(x)] for x in angle_rads]).T
to get the sines and cosines next to each other
(this is pseudocode I didn’t test it)
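For anyone who wants to try the idea above, here is a minimal NumPy sketch of one way it could work. It is not the assignment's reference solution: the helper name `positional_encoding` and its parameters are made up for illustration, and it assumes an even `d_model`. It uses `np.stack` plus `reshape` rather than `np.concatenate`, because stacking on a new last axis is what puts each sine next to its cosine.

```python
import numpy as np

def positional_encoding(num_positions, d_model):
    """Hypothetical helper: build the encoding from a half-width angle table."""
    pos = np.arange(num_positions)[:, np.newaxis]   # shape (num_positions, 1)
    pair = np.arange(d_model // 2)[np.newaxis, :]   # shape (1, d_model // 2)
    # One angle per (position, pair); no duplicated columns
    angle_rads = pos / np.power(10000.0, 2 * pair / d_model)
    # Stack sin and cos on a new last axis: shape (num_positions, d_model // 2, 2)
    stacked = np.stack([np.sin(angle_rads), np.cos(angle_rads)], axis=-1)
    # Flatten the last two axes so columns alternate sin, cos, sin, cos, ...
    return stacked.reshape(num_positions, d_model)

pe = positional_encoding(4, 8)
print(pe.shape)
```

Row 0 corresponds to pos = 0, so it should alternate 0 (sine) and 1 (cosine), which is a quick sanity check.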

But perhaps I am misunderstanding part of the logic behind the //2 idea?

Hi @zer2, actually there is a reminder under <1.2 - Sine and Cosine Positional Encodings>, saying
“Reminder: Use the sine equation when i is an even number, and the cosine equation when i is an odd number.”

Notice that in this slide, the "i" in the upper part (i=0, i=1…) and the "i" in the lower part (PE(pos, 2i), PE(pos,2i+1)) are actually different. Like many confused students, I plugged the “i” from the upper part into the lower part's computation and wasted about an hour without understanding what was going on.

Clearly we can do better by separating the two notations in this slide.

@Damon true, I did not notice that! However it is past “get_angles” in the assignment, and I did not think to read ahead to get hints about how to do earlier cells. Perhaps that comment or something like it could be brought up to the heading under 1.1?

I got confused for a while, but you’re being misled by the i, which only distinguishes odd and even indices. That’s the same thing as pairing even indices with sines and odd indices with cosines. Because sines and cosines alternate, we have to divide by 2, and that gives the number of (sine, cosine) groups. For example, let’s say d is equal to 512, pos=1
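A small sketch of that example (d = 512, pos = 1), written in plain Python so the index bookkeeping is visible. The helper name `encoding_value` is made up for illustration, and the formula is the usual sine/cosine one discussed in this thread, not code copied from the assignment.

```python
import math

d = 512      # encoding dimension from the example
pos = 1      # position from the example

num_pairs = d // 2   # number of (sine, cosine) groups

def encoding_value(pos, k, d):
    """Hypothetical helper: value at column k, sine for even k and cosine for odd k."""
    i = k // 2                                # pair index shared by columns 2i and 2i+1
    angle = pos / (10000 ** (2 * i / d))      # both columns of a pair use the same angle
    return math.sin(angle) if k % 2 == 0 else math.cos(angle)

# Column indices 0..5 map to pair indices 0, 0, 1, 1, 2, 2
pair_index = [k // 2 for k in range(6)]
first_values = [encoding_value(pos, k, d) for k in range(4)]
print(num_pairs, pair_index)
```

So with d = 512 there are 256 groups, and columns 0 and 1 share the exponent 0/512, columns 2 and 3 share 2/512, and so on.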

I was confused originally; I didn’t see what was wrong with an approach like what @zer2 mentioned above. But this comment here by @liangyuantong helped to clarify.
10000 must be raised to values of i as in a sequence like 0, 2, 4, etc., which is possible but may not be direct.

Still, it’s easy to make an array of angles like that, and then make an array of sine and cosine values (like pair-wise columns) for the angles.

Or did I still miss something about the necessity of having redundant columns?

What exactly is the need for having redundant pair-wise columns in angles?
Why not make a matrix of angles with rows like: 1/10000^{0/512}, 1/10000^{2/512}, ...
And then form the matrix having sin and cos of each column in angles as adjacent columns?
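As far as I can tell, numerically there is no need for the redundant columns: both routes give the same matrix. Here is a quick check under stated assumptions (small made-up sizes, even `d_model`, and my own variable names rather than the assignment's). My guess is that the duplicated columns are only a convenience so the angle table has the same shape as the final encoding.

```python
import numpy as np

num_positions, d_model = 5, 16   # small illustrative sizes (assumed)
pos = np.arange(num_positions)[:, np.newaxis]

# Approach A: full-width angle table, columns duplicated through k // 2
k = np.arange(d_model)[np.newaxis, :]
angles_full = pos / np.power(10000.0, 2 * (k // 2) / d_model)
pe_full = np.zeros((num_positions, d_model))
pe_full[:, 0::2] = np.sin(angles_full[:, 0::2])   # even columns get sine
pe_full[:, 1::2] = np.cos(angles_full[:, 1::2])   # odd columns get cosine

# Approach B: half-width angle table, one column per (sine, cosine) pair
i = np.arange(d_model // 2)[np.newaxis, :]
angles_half = pos / np.power(10000.0, 2 * i / d_model)
pe_half = np.empty((num_positions, d_model))
pe_half[:, 0::2] = np.sin(angles_half)
pe_half[:, 1::2] = np.cos(angles_half)

same = np.allclose(pe_full, pe_half)
print(same)
```

Approach B stores half as many angles; approach A keeps the angle table aligned column by column with the output.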

I totally agree! I have to admit that this has put me off. I was used to having very clear explanations of things in this specialization, despite the heavy notation. This one is the most confusing thing I have faced.
That being said, I realize it’s not easy to simplify these kinds of complex ideas. I think it would help a lot to distinguish the i in the angle formula from the i which is the coordinate of the encoding vectors.

Thanks @Damon, your answer helped clarify this.
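Maybe it would be even clearer to use the actual formula that explains the origin of that otherwise “magical” i // 2 :

PE(pos, k) = \sin(pos / 10000^{2 \lfloor k/2 \rfloor / d}) for even k, and PE(pos, k) = \cos(pos / 10000^{2 \lfloor k/2 \rfloor / d}) for odd k,

where k is the coordinate of the encoding, and \lfloor x \rfloor is the floor of the number x.
That’s where the i we see in the formula actually comes from :

i = \lfloor k/2 \rfloor

Then the i which is given as argument to the get_angles function would actually be k in the formula above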

@manifest @morningdew. Sheesh! This is really a good start for beginners who have no idea what is going on under the hood, with at least a small hint to start off with. I was breaking my head over this for at least half an hour.

I think formula (3) of this exercise should be updated or removed altogether from the exercise. The fact that on the left-hand side of formulas (1) and (2) the indices are i and i+1 respectively changes the whole meaning of how to interpret and code the inner part. Taking it out of context in (3) and requesting the students to implement it as it is written there is not just misleading but outright incorrect.

The answer to that question comes from these equations:

PE_{(pos, 2i)} = \sin(pos / 10000^{2i/d})
PE_{(pos, 2i+1)} = \cos(pos / 10000^{2i/d})

Note that the subindex of PE in each case corresponds to even or odd numbers, while the i in the argument of each function is the same. So for the first pair (\sin, \cos), i is 0, then 1 for the second pair, etc.