C5W4E7 | Decoder | Rounding Error

Hello, for exercise 7, I am getting a "Wrong values in outd" error, which seems to be due to rounding errors.

I printed my_output, my_output - expected_output, and (my_output - expected_output)/expected_output, and the error seems to be less than 2.5%:

Could someone please help me find the issue? Many thanks!

expected_output of outd[1, 1]: [-0.2715261, -0.5606001, -0.861783, 1.69390933]
my_output: [-0.2784571 -0.56373984 -0.8536583 1.6958553 ]
my_output - expected_output: [-0.00693101 -0.00313973 0.00812471 0.00194597]
diff%: [ 2.5526116 0.56006664 -0.94277894 0.11488055]
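
For reference, this is roughly how I computed the numbers above (a quick NumPy check; the values are the ones printed by the test cell):

```python
import numpy as np

expected_output = np.array([-0.2715261, -0.5606001, -0.861783, 1.69390933])
my_output = np.array([-0.2784571, -0.56373984, -0.8536583, 1.6958553])

print(my_output - expected_output)                             # absolute difference
print(100 * (my_output - expected_output) / expected_output)   # difference in percent
```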

It's not a rounding error, because your code should not do any rounding at all.


@Fedi_ZOUARI, are you looping through all the required layers? Also worth checking: the parameters passed to self.pos_encoding and self.dropout. I recall having had some struggles there.
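
Roughly, the structure I mean looks like this. This is only a sketch with stand-in layers, not the notebook's code; names like dec_layers, pos_encoding, and dropout are assumptions:

```python
import tensorflow as tf

# Stripped-down sketch of a decoder forward pass, only to illustrate two points:
# (1) loop over *all* decoder layers, and (2) slice the positional encoding to the
# actual sequence length and pass the `training` flag through to dropout.
class DecoderSketch(tf.keras.layers.Layer):
    def __init__(self, num_layers, d_model, max_positions=50, rate=0.1):
        super().__init__()
        self.num_layers = num_layers
        # Stand-ins for the real decoder blocks and positional encodings.
        self.dec_layers = [tf.keras.layers.Dense(d_model) for _ in range(num_layers)]
        self.pos_encoding = tf.random.uniform((1, max_positions, d_model))
        self.dropout = tf.keras.layers.Dropout(rate)

    def call(self, x, training=False):
        seq_len = tf.shape(x)[1]
        x += self.pos_encoding[:, :seq_len, :]        # (2) slice to seq_len
        x = self.dropout(x, training=training)        # (2) forward the training flag
        for i in range(self.num_layers):              # (1) every layer, in order
            x = self.dec_layers[i](x)
        return x

out = DecoderSketch(num_layers=2, d_model=16)(tf.random.uniform((2, 5, 16)), training=True)
print(out.shape)  # (2, 5, 16)
```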

Juan


I have not gotten to this section of Course 5 yet, so I can't help you with the logic here; better to listen to what Juan says on that. But there is an important general point to make: an error in the third decimal place is not a rounding error. In 64-bit floating point, rounding errors are typically on the order of 10^{-16} or smaller relative to the values involved. Even in 32-bit floating point, the relative resolution of the mantissa is on the order of 10^{-7}. So an error in the third decimal place represents some kind of real error in your code.
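
For anyone curious, you can check those limits directly in NumPy (a quick sketch, assuming NumPy is available in the notebook environment):

```python
import numpy as np

# Machine epsilon: the smallest relative spacing between representable values.
print(np.finfo(np.float64).eps)   # about 2.2e-16 for 64-bit floats
print(np.finfo(np.float32).eps)   # about 1.2e-07 for 32-bit floats

# A discrepancy of ~1e-3 is many orders of magnitude larger than either,
# so it points to a logic error rather than floating-point rounding.
```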


Thanks, all, for your replies. I understand now that I should not have called this a "rounding error". I also re-read the notebook with all the hints and details; however, I am still unable to find the mistake I am making. Could anyone give me some hints on how to track down my error, or could someone check my notebook? Thanks in advance!

@Fedi_ZOUARI I will be more than happy to check your code and provide a more focused hint. Please send me a direct message with your code and this cue:

“DLS.C5.W4.E7 Decoder,Rounding Error”

Juan

@Juan_Olano, I think we have the issue sorted out.

I have the same issue. Could you please share how you solved it?

Thanks to @TMosh and @Juan_Olano, I was able to solve it. I had an error in the scaling of the embedding: I multiplied by the square root of the sequence length instead of the square root of the embedding dimension (d_model).
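
For anyone hitting the same thing, a minimal sketch of the scaling step (the function and variable names here are just placeholders, not the notebook's exact code):

```python
import tensorflow as tf

def scale_embeddings(x, embedding_dim):
    # x: embedded tokens, shape (batch_size, seq_len, embedding_dim).
    # Scale by sqrt(d_model), i.e. the embedding dimension, NOT by sqrt(seq_len).
    return x * tf.math.sqrt(tf.cast(embedding_dim, tf.float32))

# Tiny example: batch of 2 sequences, length 5, embedding dimension 16.
x = tf.random.uniform((2, 5, 16))
print(scale_embeddings(x, embedding_dim=16).shape)  # (2, 5, 16)
```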
