Hi, @Caleb.
Thanks very much for sharing your results here. I also did a few experiments using just plain ReLU (as opposed to Leaky ReLU). I was able to get 81% accuracy with n_h = 40, \alpha = 0.6 and 12k iterations.
It’s interesting that tanh seems to make it quite a bit easier to get good results on this particular problem. The general rule is that there is no single magic recipe for what will work best in any given situation.
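
In case it helps anyone following along, here is a minimal sketch of the three activations being compared, written for scalars in plain Python. The Leaky ReLU slope of 0.01 is my assumption for illustration, not a value taken from the experiments above, and the function names are hypothetical rather than the ones used in the assignment.

```python
import math

def relu(z):
    # Plain ReLU: zero for negative inputs, identity otherwise.
    return max(0.0, z)

def leaky_relu(z, slope=0.01):
    # Leaky ReLU: keeps a small slope for negative inputs.
    # slope=0.01 is an assumed value, chosen only for illustration.
    return z if z > 0 else slope * z

def tanh(z):
    # tanh: smooth, with outputs bounded in (-1, 1).
    return math.tanh(z)

# Print each activation at a few sample inputs for comparison.
for z in (-2.0, 0.0, 2.0):
    print(z, relu(z), leaky_relu(z), tanh(z))
```

The main difference shows up for negative inputs: plain ReLU is flat there, Leaky ReLU keeps a small slope, and tanh stays smooth and bounded.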