Is there a way to optimize the number of connections per layer of a large neural network as its performance at predicting outputs improves?
By this do you mean “optimize the number of units in each hidden layer”?
Yes, for instance. I have the feeling that the connections between layers should adaptively decrease as the main drivers of the output are identified.
No, that doesn’t happen adaptively during standard training.
The size of a hidden layer is typically determined by experimentation.
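A minimal sketch of what “determined by experimentation” can look like in practice, assuming scikit-learn is available: cross-validate a few candidate hidden-layer widths and keep the best one. The dataset and the candidate sizes here are purely illustrative.

```python
# Sketch: pick a hidden-layer width by cross-validated search.
# The candidate widths and dataset are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic classification problem standing in for real data.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Try several widths for a single hidden layer; GridSearchCV keeps
# the one with the best cross-validated accuracy.
search = GridSearchCV(
    MLPClassifier(max_iter=500, random_state=0),
    param_grid={"hidden_layer_sizes": [(8,), (32,), (128,)]},
    cv=3,
)
search.fit(X, y)
print(search.best_params_["hidden_layer_sizes"])
```

Nothing here adapts the architecture during training; the search simply trains separate models at each width and compares them afterwards.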