Hi,
I need a bit of explanation on picking the learning rate from the graph that plots the loss values against the epoch and learning rate scales. In the video “Adjusting the Learning Rate Dynamically”, Lawrence mentioned that the best learning rate seemed to be somewhere between 10e-6 and 10e-5. He then picked 5e-6 as the learning rate. Perhaps my understanding of scientific notation is limited, but it seems to me that 5e-6 is not between 10e-6 and 10e-5, since 10e-6 equals 1e-5, which is already larger than 5e-6. Should it be 50e-6 instead? I just want to make sure I understand how to apply this technique correctly, because it is going to help me a lot. Thanks
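To make the question concrete, here is a small Python sketch of the comparison I have in mind. It assumes the values are read as Python float literals, where `10e-6` means 10 × 10^-6. The second range is only my guess at an alternative reading (that the spoken range meant 10^-6 to 10^-5); the video does not confirm it. The helper name `is_between` is made up for illustration.

```python
def is_between(value, low, high):
    """Return True if value lies strictly between low and high."""
    return low < value < high

# The two candidate learning rates from the question.
candidates = {"5e-6": 5e-6, "50e-6": 50e-6}

# Reading 1: the range exactly as written, as float literals.
# 10e-6 is 10 x 10^-6 and 10e-5 is 10 x 10^-5.
literal_range = (10e-6, 10e-5)

# Reading 2 (an assumption, not confirmed by the video):
# the range was meant as powers of ten, 10^-6 to 10^-5.
power_range = (10**-6, 10**-5)

results = {}
for name, value in candidates.items():
    results[name] = {
        "literal": is_between(value, *literal_range),
        "power": is_between(value, *power_range),
    }
    print(name, results[name])
```

Under the literal reading only 50e-6 falls inside the range, while under the powers-of-ten reading only 5e-6 does, so which reading was intended decides which value is correct.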