Exercise 3: Is 99.8% better than the state of the art?

Regarding Exercise 3: according to the MNIST handwritten digit database page (Yann LeCun, Corinna Cortes and Chris Burges), no current network has an accuracy higher than 99.8%. Are we being asked to beat the state of the art (SOTA), or is the 99.8% target measured on the training set?

Any tips on how to get that result?

You should beat the accuracy of 99.8% on the training set.
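
To make the distinction concrete: published MNIST results are reported on the held-out test set, while the target here is checked against the data the model was trained on, which is much easier to fit. Here is a minimal sketch of that check in plain Python. The names `accuracy`, `meets_target` and `TARGET` are made up for illustration and are not part of the exercise, and reading "beat" as strictly greater than 99.8% is an assumption.

```python
TARGET = 0.998  # 99.8%, the threshold mentioned in the exercise


def accuracy(predictions, labels):
    """Return the fraction of predictions that match their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def meets_target(train_predictions, train_labels, target=TARGET):
    """Check the target on the TRAINING set (assumes 'beat' means strictly greater)."""
    return accuracy(train_predictions, train_labels) > target


# Toy data: 1000 training labels with one wrong prediction, so accuracy is 0.999.
train_labels = [i % 10 for i in range(1000)]
train_predictions = list(train_labels)
train_predictions[0] = (train_predictions[0] + 1) % 10

print(meets_target(train_predictions, train_labels))
```

Running the same `accuracy` function on test-set predictions would give the number that is comparable to published results, and it will usually be lower than the training figure.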