This is the last step in the assignment, where all the functions created earlier are called. All the previous graded functions passed their tests. However, for the Transformer function in the last step I get an AssertionError, as follows:
AssertionError                            Traceback (most recent call last)
<ipython-input-...> in <module>
      1 # UNIT TEST
----> 2 Transformer_test(Transformer, create_look_ahead_mask, create_padding_mask)
Query: the last step says to pass the decoder output through a linear layer and a softmax layer, and the comment suggests approximately 2 lines of code. However, the layer defined in the constructor is a single Dense layer with softmax activation. Is there an error in the comment, stating 2 lines instead of 1?
I am not able to figure out why I am receiving the assertion error. Please assist.
Which step in the encoder or decoder could be the issue, and how can I figure out where the error is (decoder or encoder)? If I change any input to either the Encoder or the Decoder, it raises an error.
I did some further investigation and observed that in the function scaled_dot_product_attention, the instruction for the attention weights reads "softmax is normalized on the last axis (seq_len_k) so that the scores add up to 1." However, when I output the attention weights and sum the columns in a row, the sum is 1.00000006, not exactly 1.
However, the execution reports that all tests passed.
The inputs to softmax are scaled_attention_logits and axis=-1.
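For reference, that function follows the pattern of the public TensorFlow Transformer tutorial. Here is a minimal sketch (random inputs assumed, not the graded solution) that reproduces the row-sum check described above:

```python
import tensorflow as tf

# Sketch of scaled dot-product attention, per the public TensorFlow tutorial
# this assignment is based on (not the graded solution itself).
def scaled_dot_product_attention(q, k, v, mask=None):
    matmul_qk = tf.matmul(q, k, transpose_b=True)         # (..., seq_len_q, seq_len_k)
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)
    if mask is not None:
        scaled_attention_logits += (mask * -1e9)          # suppress masked positions
    attention_weights = tf.nn.softmax(scaled_attention_logits, axis=-1)
    return tf.matmul(attention_weights, v), attention_weights

q = k = v = tf.random.normal((1, 4, 8))
_, weights = scaled_dot_product_attention(q, k, v)
print(tf.reduce_sum(weights, axis=-1))  # rows sum to ~1.0 (e.g. 1.00000006) in float32
```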
Well, we are doing floating point arithmetic here, so rounding errors are always a possibility. That would be a relatively large error for 64-bit floats, but it's perfectly reasonable for 32-bit floats. You can take a look at the IEEE 754 standard to understand the details, but the resolution of the mantissa of a binary32 float is on the order of 10^{-7}; for binary64, it's on the order of 10^{-15}.
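You can see that resolution directly with numpy; a quick illustration (not part of the assignment):

```python
import numpy as np

# Machine epsilon quantifies the mantissa resolution mentioned above.
print(np.finfo(np.float32).eps)   # ~1.19e-07, so a sum of 1.00000006 is expected noise
print(np.finfo(np.float64).eps)   # ~2.22e-16

# A float32 softmax row usually sums to something near, but not exactly, 1.
logits = np.random.randn(8).astype(np.float32)
weights = np.exp(logits) / np.exp(logits).sum()
print(weights.sum())              # e.g. 0.99999994 or 1.0000001
```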
Because of that rounding behavior, the graders can't use exact comparisons, but there are numpy primitives specifically designed to handle this situation. Have a look at some of the test cases in public_tests.py and you'll see that they typically use np.isclose or np.allclose to check your answers against the expected values.
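For example, with the 1.00000006 value reported above:

```python
import numpy as np

row_sum = np.float32(1.00000006)   # the attention-weight row sum observed above
print(row_sum == 1.0)              # False -- an exact comparison fails
print(np.isclose(row_sum, 1.0))    # True  -- within the default tolerances
print(np.allclose(np.float32([1.00000006, 0.99999994]), 1.0))  # True for whole arrays
```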
That "~2 lines" comment drove me crazy! I added a separate linear layer and nothing worked. When I saw this thread and realized we do not need 2 lines, I just deleted the linear layer and it worked.
I would suggest changing that "~2 lines" to "~1 line". Thanks.
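For anyone who lands here with the same confusion, a minimal sketch (layer names and sizes assumed, not copied from the assignment) of why one line is enough: a Dense layer with activation='softmax' is architecturally the same as a linear Dense layer followed by a separate softmax.

```python
import tensorflow as tf

target_vocab_size, d_model = 8000, 512

# One line: a linear projection with the softmax baked in
final_layer = tf.keras.layers.Dense(target_vocab_size, activation='softmax')

# Two lines: the same architecture as separate linear and softmax layers
# (weights are initialized independently, so outputs won't match numerically)
linear = tf.keras.layers.Dense(target_vocab_size)
softmax = tf.keras.layers.Softmax(axis=-1)

x = tf.random.normal((2, 10, d_model))   # stand-in for the decoder output
print(final_layer(x).shape)              # (2, 10, 8000); each row sums to ~1
print(softmax(linear(x)).shape)          # same shape, same computation
```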