Perf comparison on Feature Selection video

For the video on Feature Selection, were the models with different feature sets (after selection) trained for the exact same number of iterations, or were they stopped at the first sign of convergence? If it's the latter, what is the impact of different feature selections on (1) training time per iteration and (2) time to converge, on the same hardware?

All else remaining constant with respect to model hyperparameters, feature selection aims to improve training performance by using fewer features. Since the model is now trained on fewer but more informative features, the cost of each iteration goes down (each gradient or weight update touches fewer dimensions), and the model typically converges sooner than it did when the data contained a lot of noise.
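To make the effect concrete, here is a minimal, self-contained sketch (not the setup from the video; the data, learning rate, and stopping rule are all my own assumptions). It trains plain gradient descent on synthetic data twice: once with all features (3 informative + 27 noise) and once with only the informative ones, stopping each run at the first sign of convergence. Per-iteration cost scales with the number of features, so the selected-feature run is cheaper per step as well.

```python
import random

random.seed(0)

def make_data(n=150, noise_feats=27):
    """Synthetic regression data: 3 informative features plus pure-noise features."""
    true_w = [2.0, -1.0, 0.5]          # weights for the informative features only
    d = len(true_w) + noise_feats
    X, y = [], []
    for _ in range(n):
        x = [random.gauss(0, 1) for _ in range(d)]
        target = sum(w * xi for w, xi in zip(true_w, x)) + random.gauss(0, 0.1)
        X.append(x)
        y.append(target)
    return X, y

def train_gd(X, y, lr=0.01, tol=1e-6, max_iter=5000):
    """Gradient descent on MSE; stop when the loss improvement falls below tol.

    Each iteration costs O(n * d), so fewer features means cheaper iterations.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    prev_loss = float("inf")
    for it in range(1, max_iter + 1):
        grad = [0.0] * d
        loss = 0.0
        for x, t in zip(X, y):
            err = sum(wi * xi for wi, xi in zip(w, x)) - t
            loss += err * err
            for j in range(d):
                grad[j] += 2.0 * err * x[j]
        loss /= n
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
        if abs(prev_loss - loss) < tol:
            return it, loss
        prev_loss = loss
    return max_iter, loss

X, y = make_data()
full_iters, full_loss = train_gd(X, y)                      # all 30 features
sel_iters, sel_loss = train_gd([row[:3] for row in X], y)   # informative only
print(f"full features (30): iters={full_iters}, loss={full_loss:.4f}")
print(f"selected      ( 3): iters={sel_iters}, loss={sel_loss:.4f}")
```

On a run like this the selected-feature model does a tenth of the work per iteration (3 weights instead of 30), which is the "training time per iteration" part of the question; the iteration count to convergence also tends to drop when the removed features are pure noise, though that part depends on the data and stopping criterion.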