I just finished my first course on machine learning, and I have a question about the learning rate:
Why not use a higher learning rate for the first iterations and then decrease it as training advances?
That is a known technique, usually called learning rate decay or scheduling (adaptive optimizers such as Adam take a related approach), but it’s not covered in the entry-level courses.
Yes, that is an interesting thought and a good intuition! As Tom says, there are more advanced techniques that involve managing the learning rate dynamically. I’m not familiar with the MLS curriculum, but Prof. Ng introduces several of these techniques in Course 2 of the Deep Learning Specialization (DLS). So “hold that thought” and learn about this when you get to DLS C2.
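In the meantime, here is a minimal sketch of one such schedule (inverse-time decay) applied to plain gradient descent on a toy quadratic loss. The loss, variable names, and hyperparameter values are illustrative assumptions, not taken from either course:

```python
import numpy as np

def grad(w):
    # Gradient of a toy quadratic loss f(w) = (w - 3)^2.
    return 2.0 * (w - 3.0)

w = 0.0           # initial parameter (arbitrary starting point)
alpha0 = 0.5      # large initial learning rate
decay_rate = 0.05 # controls how quickly the learning rate shrinks

for t in range(100):
    # Inverse-time decay: alpha starts high and shrinks each iteration,
    # so early steps are large and later steps are small.
    alpha = alpha0 / (1.0 + decay_rate * t)
    w -= alpha * grad(w)

print(w)  # converges toward the minimum at w = 3
```

The idea is exactly your intuition: take big steps while far from the minimum, then smaller steps to settle in without overshooting.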