I checked my implementation against the “Additional Code Hints” section, and it still throws this error from the function helper_utils.verify_training_process. At first I thought I had forgotten to move the model to the device, since that is not done anywhere in the prior cells, but moving it didn’t help. It is also not clear to me from the code of verify_training_process which device-side assertion was triggered:
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
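A frequent cause of this assert during classification training is a target label outside the range [0, num_classes) reaching the loss function. That is only an assumption here, since the traceback does not confirm it. As a minimal sketch for narrowing it down, the snippet below sets CUDA_LAUNCH_BLOCKING as the message suggests and checks a batch of labels in plain Python. The helper find_out_of_range_labels and the sample values are hypothetical and not part of helper_utils.

```python
import os

# Set this before CUDA is initialised (ideally before importing torch) so that
# kernel launches run synchronously and the traceback points at the failing call.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def find_out_of_range_labels(labels, num_classes):
    """Return (index, label) pairs for labels outside [0, num_classes).

    Hypothetical helper for illustration; `labels` is any iterable of ints.
    """
    return [
        (i, int(y))
        for i, y in enumerate(labels)
        if not 0 <= int(y) < num_classes
    ]


# Made-up labels for a 5-class problem; with a PyTorch batch one could pass
# targets.tolist() instead (assuming `targets` is a 1-D integer tensor).
bad = find_out_of_range_labels([0, 2, 5, -1, 9], num_classes=5)
print(bad)  # [(2, 5), (3, -1), (4, 9)]
```

If the check finds nothing, running the same cell on the CPU usually replaces the device-side assert with an ordinary Python exception that names the failing operation.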