Regarding exercise 3: according to the MNIST handwritten digit database page (Yann LeCun, Corinna Cortes and Chris Burges), no current network has an accuracy higher than 99.8%. Are we being asked to beat SOTA, or is that accuracy meant to be on the training set?
Any tips on how to get that result?