Understanding how intermediary (hidden) layers work

It’s as simple as it sounds: if your model can be trained on a less complex dataset, that’s great news!
In practice, a less complex dataset has fewer dimensions/features, so one or two hidden layers are usually sufficient. Larger dimension/feature counts, on the other hand, typically call for three to five hidden layers, as the sketch below illustrates.
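Here’s a minimal sketch of what that looks like in practice, assuming scikit-learn; the layer widths (16, 64, 32, and so on) are illustrative choices, not prescriptions:

```python
from sklearn.neural_network import MLPClassifier

# Simpler, low-dimensional dataset: one or two hidden layers usually suffice.
shallow = MLPClassifier(hidden_layer_sizes=(16,))

# Higher-dimensional dataset: three to five hidden layers may be warranted.
deeper = MLPClassifier(hidden_layer_sizes=(64, 32, 16))
```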

A common rule of thumb says the number of hidden neurons should be about 2/3 the size of the input layer, plus the size of the output layer. But that’s not always the case; the right count also depends on other factors such as the complexity of the training task, outliers, and the overall simplicity or complexity of the dataset.
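If you want to see the rule of thumb in code, here is one way to express it (the function name is mine, purely for illustration):

```python
import math

def rule_of_thumb_hidden_neurons(n_inputs: int, n_outputs: int) -> int:
    """Heuristic: ~2/3 of the input layer size, plus the output layer size."""
    return math.ceil(2 / 3 * n_inputs) + n_outputs

# Example: 30 input features and 1 output neuron -> 20 + 1 = 21 hidden neurons.
print(rule_of_thumb_hidden_neurons(30, 1))  # 21
```

Treat the result as a starting point for experimentation, not a final answer.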

Too few neurons can lead to underfitting, whereas too many can cause overfitting-like problems. Finding an optimum that balances these conditions is the real necessity.
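One simple way to look for that optimum is to sweep over a few hidden-layer sizes and compare training accuracy against validation accuracy. This is a hedged sketch using scikit-learn on synthetic data; every value here (sample counts, sizes, seeds) is an illustrative assumption:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic dataset, just for demonstration.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Too few neurons tends to underfit; too many tends to overfit.
for hidden in (2, 8, 32, 128):
    clf = MLPClassifier(hidden_layer_sizes=(hidden,), max_iter=500,
                        random_state=0).fit(X_train, y_train)
    print(hidden, clf.score(X_train, y_train), clf.score(X_val, y_val))
```

A size where training and validation scores are both high and close together is usually the sweet spot.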