Difficulty Attaining Desired Accuracy - Architectural Issue?

Hi folks, I’d like some help completing the week 1 assignment. I’ve experimented with a variety of network architectures, but I cannot get my network to reach a training-set accuracy much above 90%.

Currently, I’m using five convolutional layers with 64 filters each, followed by two hidden dense layers with 512 and 64 neurons, respectively. All activation functions are ReLU except the last one, which is sigmoid. I’ve tried using a single dense layer, using three convolutional layers, changing the number of filters in each convolutional layer, and switching optimizers, and I still cannot exceed about 90% accuracy.

After about 10 epochs, training accuracy settles at roughly 88% and increases only very slowly from there. I’ve even trained for 50 epochs, and it never seems to exceed about 91% (after about 20 epochs, it fluctuates between 90% and 92%).

Can I get some suggestions on how to resolve this issue? What should we be thinking about in our network architecture to fit the training data appropriately?
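For reference, here is a rough sketch of the shape and parameter arithmetic for the network described above. The post does not state the input size, the kernel size, or whether pooling follows each convolution, so the 150x150 RGB input, the 3x3 kernels with "valid" padding, and the 2x2 max pooling used below are all assumptions. `conv_stack_summary` is a hypothetical helper written for this illustration, not part of any library.

```python
def conv_stack_summary(input_hw=150, in_channels=3, n_conv=5, filters=64,
                       kernel=3, pool=2, dense_units=(512, 64, 1)):
    """Trace output shapes and parameter counts for a stack of
    (conv -> max-pool) blocks followed by dense layers.

    Assumed, not taken from the post: square input, stride-1 'valid'
    convolutions, and a pooling layer after every convolution.
    """
    size, channels, total = input_hw, in_channels, 0
    rows = []
    for i in range(n_conv):
        # Each filter has kernel*kernel*channels weights plus one bias.
        params = (kernel * kernel * channels + 1) * filters
        size = size - kernel + 1   # 'valid' convolution, stride 1
        size = size // pool        # max pooling rounds down
        channels = filters
        total += params
        rows.append((f"conv{i + 1}+pool", (size, size, channels), params))

    features = size * size * channels  # length of the flattened feature map
    for j, units in enumerate(dense_units):
        # Each unit has one weight per input feature plus one bias.
        params = (features + 1) * units
        total += params
        rows.append((f"dense{j + 1}", (units,), params))
        features = units
    return rows, total


rows, total = conv_stack_summary()
for name, shape, params in rows:
    print(f"{name:12s} {str(shape):14s} {params:>8d}")
print("total parameters:", total)
```

Under these assumed settings, the fifth block leaves only a 2x2x64 feature map (256 values) to feed the dense layers. If the actual notebook uses a different input size, padding, or pooling layout, changing the arguments shows how much spatial information reaches the dense layers in that case.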

Please click my name and send me your notebook as an attachment in a direct message.