Did you read this instruction before the grader cell for the convolutional block?
If training is set to False, its weights are not updated with the new examples, i.e., when the model is used in prediction mode.
See this image
That way you can see all the places where you need to include training=training.
The same goes for the identity block, since it includes the same instructions.
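
To make the effect of the flag concrete, here is a minimal pure-Python sketch. It is a toy, not the Keras implementation and not the assignment's code: `ToyBatchNorm` and `toy_block` are hypothetical names, and the layer only centers the batch and tracks a moving mean. The point is just that the stored statistics change only when `training=True`, and that a block has to forward its own `training` argument to the layer.

```python
class ToyBatchNorm:
    """Toy 1-D layer that only tracks a moving mean (NOT Keras's implementation)."""

    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.moving_mean = 0.0

    def __call__(self, batch, training=False):
        if training:
            batch_mean = sum(batch) / len(batch)
            # The stored statistic is updated only in training mode.
            self.moving_mean = (self.momentum * self.moving_mean
                                + (1 - self.momentum) * batch_mean)
            mean = batch_mean
        else:
            # Prediction mode: reuse the stored statistic and leave it untouched.
            mean = self.moving_mean
        return [x - mean for x in batch]


def toy_block(batch, bn, training=False):
    # Hypothetical block: forwards its own `training` argument to the layer.
    return bn(batch, training=training)


bn = ToyBatchNorm()
toy_block([1.0, 2.0, 3.0], bn, training=False)
print(bn.moving_mean)  # unchanged by the prediction-mode call
toy_block([1.0, 2.0, 3.0], bn, training=True)
print(bn.moving_mean)  # now moved toward the batch mean
```

In Keras, the same idea corresponds to passing `training=training` when you call a BatchNormalization layer inside the block, so the layer follows whatever mode the block was called with.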