Hello there,
I just finished the C4W4 assignment and I have a question about this text:
" Notice that this is only changing the learning rate during the training process to give you an idea of what a reasonable learning rate is and should not be confused with selecting the best learning rate, this is known as hyperparameter optimization and it is outside the scope of this course."
So I don't understand why picking a learning rate by running one training experiment and watching how the loss changes with the learning rate doesn't count as selecting the best learning rate. How is that different from hyperparameter tuning? Or does hyperparameter optimization have to be done only through grid search or a tool like Hyperopt?
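For context, here is a minimal sketch of the kind of experiment the quoted passage describes: train once while the learning rate grows each epoch, then inspect loss vs. rate. The schedule constants below are illustrative assumptions, not taken from the assignment:

```python
# Sketch of a learning-rate-range experiment (assumed schedule:
# start tiny and multiply by a constant factor every epoch; the
# constants here are illustrative, not from the assignment).

def lr_schedule(epoch, base_lr=1e-8, growth=10 ** (1 / 20)):
    """Learning rate grows exponentially with the epoch index."""
    return base_lr * growth ** epoch

# Rates swept over a single 100-epoch training run.
lrs = [lr_schedule(e) for e in range(100)]

# After this one run you would plot recorded loss against lrs and
# eyeball a rate from the region where loss drops fastest and stays
# stable -- a rough estimate, not an optimized hyperparameter search.
```

My understanding is that this single sweep only gives a ballpark value, whereas hyperparameter optimization would compare full training runs at fixed candidate rates, but I'd like confirmation.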