C4W2 - question on doing well on benchmarks

To do well on the benchmarks, Andrew suggests building ensembles of models and averaging their results. Is the actual trick here that you avoid spending time finding the right set of hyperparameters for a single network? So in real-life application scenarios, is it always better to pick hyperparameters carefully and train one neural network X times longer than to train X neural networks?

It’s hard to compare training one NN for more epochs against an ensemble without looking at performance on a validation set. An ensemble usually improves worst-case performance, since averaging smooths out the errors of the individual models. You should evaluate all candidate models and the final ensemble on the validation set to make sure the ensemble doesn’t hurt the validation metric.
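As a minimal sketch of the averaging step (NumPy only; the probability arrays below are made-up stand-ins for the softmax outputs of three independently trained copies of the same architecture):

```python
import numpy as np

# Hypothetical softmax outputs from three trained copies of the model,
# for 4 examples and 3 classes. In practice these come from model.predict().
preds = [
    np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.2, 0.6]]),
    np.array([[0.6, 0.3, 0.1],
              [0.2, 0.7, 0.1],
              [0.5, 0.3, 0.2],
              [0.1, 0.3, 0.6]]),
    np.array([[0.8, 0.1, 0.1],
              [0.1, 0.6, 0.3],
              [0.2, 0.5, 0.3],
              [0.3, 0.1, 0.6]]),
]

# Ensemble prediction: average the per-model probabilities,
# then take the argmax over classes.
avg = np.mean(preds, axis=0)
labels = np.argmax(avg, axis=1)
```

Note that on the third example the models disagree (two lean toward class 1, one toward class 0), and the averaged probabilities settle the vote; this is the "worst-case smoothing" effect mentioned above.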

You’ll still have to train the candidate models that make up the ensemble. When using X copies of the same architecture, do hyperparameter tuning on one copy and use the best hyperparameters as a template for training the others (e.g. with different random seeds).

When discussing Machine Learning solutions and methods, “always” is a very rare concept.