It’s great that you are trying all these experiments. We always learn something when we try to take the course material and apply or extend it.
My interpretation of your results is that dropout = 0.05 gave the best performance: you reached 80% test accuracy at epoch 400 there. Also notice that the curve after that point is less "bouncy", but it's all in the 72-74% range, so maybe that's irrelevant.
One other general comment: looking at the loss values is not all that useful. A particular loss value doesn't mean anything on its own; if you tell me the loss, I can't draw any conclusion from it. It's only useful to see whether it's going up or down, as a proxy for whether convergence is working. Accuracy is the Gold Standard or "sine qua non" here. Those are the numbers that really matter for assessing performance. Well, that and the compute cost it took you to get there.
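To make that concrete, the one number I'd record per epoch is just plain accuracy on the held-out test set. A trivial sketch (the function name and the toy data are mine, not from your experiments):

```python
def accuracy(preds, labels):
    """Fraction of predictions that exactly match their labels."""
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return correct / len(labels)

# Toy check: 4 of 5 predictions match, so accuracy is 0.8. Your 80% figure
# is the same statistic computed over the whole test set at epoch 400.
print(accuracy([1, 0, 1, 1, 0], [1, 0, 1, 0, 0]))  # 0.8
```

Logging this (rather than raw loss) per epoch is what lets you compare runs directly.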
The other general thing to say here is that this dataset is ridiculously small for a problem this complex. That's a strong argument against investing too much effort in this particular set of experiments. Here's a thread from a while back where I did some experiments perturbing the mix of "yes" and "no" samples between the train and test sets.
It might be a better idea to find a richer and more realistic dataset. One I've heard of, but haven't personally used, is the Kaggle Cats and Dogs Challenge, which has O(10^4) samples. Of course that means training will be a lot more expensive, but at least you'll have a better chance of learning generalizable things.