I searched for this question online and came across a blog post ("A brief history of learning rate schedulers and adaptive optimizers") which says that we do not need a learning rate scheduler with adaptive optimizers like Adam. On the other hand, Prof. Ng says in this video (https://www.coursera.org/learn/deep-neural-network/lecture/hjgIA/learning-rate-decay) that reducing the learning rate over time may help speed up learning.
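For context, here is a minimal sketch of what I mean by combining Adam with a learning rate scheduler. It assumes PyTorch; the model, data, and schedule parameters (step size, decay factor) are just placeholders, not a recommendation.

```python
# Minimal sketch (assuming PyTorch): Adam plus a step-decay learning rate schedule.
# Model, data, and schedule parameters below are placeholders for illustration.
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                                   # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam with an initial learning rate
# Multiply the learning rate by 0.1 every 30 epochs (arbitrary choice of schedule).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(100):
    inputs = torch.randn(32, 10)                           # dummy batch
    targets = torch.randn(32, 1)
    loss = nn.functional.mse_loss(model(inputs), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()    # Adam update
    scheduler.step()    # decay the learning rate once per epoch
```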
I’d like to ask the community to share their thoughts on this topic.
Thanks!