Week 4 Transformer Network: package versions

paulinpaloalto · August 14, 2024, 1:04am

If you are modifying things to run with the current versions of the packages, then you may well get different numeric results. Note that with TF it’s not really possible to get identical results from run to run, even when you set the random seeds. The issue is that the training is parallelized, and the order in which the parallel floating point operations are performed can vary from run to run, so that process is fundamentally non-deterministic. They may also have changed the behavior of the parallelization logic in the later versions. There is a flag you can set to get deterministic results, but it basically disables most of the parallelization, so it slows everything down. Here’s a post from mentor Raymond that explains this point.
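For anyone who wants to try the deterministic route, here is a minimal sketch. It assumes the flag in question is `tf.config.experimental.enable_op_determinism()`, which only exists in recent TF 2.x releases, so it will not work with the older version pinned in the course notebooks. The helper name `make_tf_reproducible` and the seed value 42 are just illustrative choices of mine.

```python
def make_tf_reproducible(seed: int = 42) -> None:
    """Seed the RNGs and ask TF for deterministic op implementations.

    Assumes a recent TF 2.x release; older versions lack these two calls.
    """
    # Imported inside the function so the helper can be defined
    # even in an environment where TensorFlow is not installed.
    import tensorflow as tf

    # Seeds Python's `random`, NumPy and TensorFlow in one call.
    tf.keras.utils.set_random_seed(seed)

    # Request deterministic kernels. Expect training to run slower,
    # and ops without a deterministic implementation may raise an error.
    tf.config.experimental.enable_op_determinism()
```

Call it once at the top of the notebook, before building the model. Even then, the results are only repeatable on the same hardware with the same package versions, so they still may not match the expected values in the assignment.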

Or they could simply have changed the implementation in other, more direct ways that affect the numerical precision of the outputs. Of course, even if the new output is more numerically accurate, it would still be “different” from the expected values.
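To see why a change in the order of operations is enough to change the answer, recall that floating point addition is not associative. The small pure-Python sketch below (the three values are arbitrary) adds the same numbers in two different groupings and then compares the results with a tolerance instead of exact equality.

```python
import math

values = [0.1, 0.2, 0.3]

# Same three numbers, two different groupings of the additions.
left_first = (values[0] + values[1]) + values[2]
right_first = values[0] + (values[1] + values[2])

# Exact comparison fails because the rounding errors differ.
print(left_first == right_first)  # False

# A tolerance-based comparison treats the two results as equal.
print(math.isclose(left_first, right_first, rel_tol=1e-9))  # True
```

With arrays, `np.allclose` plays the same role as `math.isclose`. If your outputs differ from the expected values only in the last few decimal places, this kind of rounding difference is the likely cause, rather than a bug in your code.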
