If regularization makes our network behave more like a linear function, could we get a similar effect by manually reducing the number of units in the hidden layers? When would I choose one approach over the other? And why isn’t “smaller network” listed in the flow chart as one of the remedies for high variance?
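
To make the premise concrete, here is a minimal sketch of what I understand by "more like a linear function". It assumes a single tanh unit (the activation is my assumption, as are the weight values 0.1 and 2.0), and `max_linear_gap` is just a hypothetical helper name. The idea is that a regularization penalty pushes weights toward zero, and with small weights tanh stays in its nearly linear region.

```python
import math

def max_linear_gap(w, xs):
    """Largest gap between tanh(w*x) and the straight line w*x over the inputs xs."""
    return max(abs(math.tanh(w * x) - w * x) for x in xs)

# Inputs spread over [-2, 2] (an arbitrary illustrative range).
xs = [i / 10 for i in range(-20, 21)]

# Assumed weight sizes: a small weight, as a strong penalty might produce,
# and a larger weight, as training without a penalty might produce.
small_w_gap = max_linear_gap(0.1, xs)
large_w_gap = max_linear_gap(2.0, xs)

print(f"gap from linear with w=0.1: {small_w_gap:.4f}")
print(f"gap from linear with w=2.0: {large_w_gap:.4f}")
```

With the small weight the unit is almost indistinguishable from a line, while with the large weight it saturates. Removing hidden units instead would reduce how many such units exist rather than how nonlinear each one is, which is the difference I am asking about.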