Question on reshape

roger.lee · January 11, 2023, 3:48pm

I would like to understand the difference between the following two calls in compute_attention_output in the assignment:

x = jnp.reshape(x, (-1, n_heads, seqlen, d_head))

x = jnp.reshape(x, (x.shape[0] / n_heads, n_heads, seqlen, d_head))

The first one produces no error, while the second one raises an error. Is there a difference in terms of implementation? Thank you!

arvyzukai · January 12, 2023, 7:09am

There is a bug in a test case:

I think you are talking about this error (an error in the unit test case, not in your code)?
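Separately from the test case bug, the two reshape calls in the question are not equivalent in Python 3: `/` is true division and returns a float, and reshape expects integer dimensions, whereas `-1` asks reshape to infer that dimension. Here is a minimal sketch of the difference. The sizes and the input shape `(n_batch * n_heads, seqlen, d_head)` are my own assumptions for illustration, not values taken from the assignment.

```python
import jax.numpy as jnp

# Hypothetical sizes, chosen only for illustration (not the assignment's values).
n_batch, n_heads, seqlen, d_head = 3, 2, 4, 5

# Assumed input layout: batch and heads folded into the leading dimension.
x = jnp.zeros((n_batch * n_heads, seqlen, d_head))

# -1 lets reshape infer the leading dimension from the total number of elements.
a = jnp.reshape(x, (-1, n_heads, seqlen, d_head))

# "/" is true division in Python 3, so this is the float 3.0, not the int 3.
float_dim = x.shape[0] / n_heads

# reshape expects integer dimensions, so passing a float dimension is rejected.
try:
    jnp.reshape(x, (float_dim, n_heads, seqlen, d_head))
    float_dim_rejected = False
except (TypeError, ValueError):
    float_dim_rejected = True

# "//" is floor division and keeps the dimension an int, so this call works.
b = jnp.reshape(x, (x.shape[0] // n_heads, n_heads, seqlen, d_head))

print(a.shape, b.shape, float_dim_rejected)
```

If the error you see mentions a non-integer shape, switching `/` to `//` (or keeping `-1`) should resolve it. If it is an assertion failure from the unit test instead, it is more likely the test case bug mentioned above.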

Cheers
