Hi Everyone,
Well… here we go again…
I am doing the course #2 week #3 assignment. I have genuinely tried several different model architectures, using sequences of convolutions and LSTMs.
Every training run takes at least half an hour, and the best slope I could achieve is 0.002, which is 4 times the required 0.0005.
Every time, the loss starts low and then inevitably rises.
The latest architecture I have takes 1 hour to train.
Could anyone provide some reasonable hints about how to beat this one?
Hi @Taranovski_Alex, I am not a mentor for this course, but as far as I remember the assignment is similar to the lab. The model may be a bit larger or extended, but not much different.
Hello @Taranovski_Alex! Following the same idea as the example of the week, I started from its model architecture and made some changes to understand what improves or worsens performance. During my tests, I didn't wait for each training run to finish, because the per-epoch output already gives a preview of whether the model is improving. My suggestion is to revisit the example of the week and make small changes, one at a time, to identify what helps.
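In case it helps, here is a minimal sketch of that kind of early check in plain Python. It assumes the "slope" is a least-squares fit of the loss against the epoch number, which may not match the grader's exact definition. The names `loss_slope` and `should_stop` and the 5-epoch window are hypothetical choices for illustration, not part of the assignment.

```python
def loss_slope(losses):
    """Least-squares slope of loss values against epoch index 0, 1, 2, ...

    Assumption: this is only one plausible way to define the slope;
    the assignment's grader may compute it differently.
    """
    n = len(losses)
    if n < 2:
        raise ValueError("need at least two epochs to fit a slope")
    mean_x = (n - 1) / 2
    mean_y = sum(losses) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(losses))
    var = sum((i - mean_x) ** 2 for i in range(n))
    return cov / var


def should_stop(losses, window=5, max_slope=0.0005):
    """Return True when the last `window` losses rise faster than `max_slope`.

    `window` is a hypothetical choice; 0.0005 is the threshold quoted
    in the thread.
    """
    if len(losses) < window:
        return False  # not enough epochs yet to judge the trend
    return loss_slope(losses[-window:]) > max_slope


# Example: a loss that climbs by 0.002 per epoch would be flagged.
rising = [0.500, 0.502, 0.504, 0.506, 0.508]
print(loss_slope(rising), should_stop(rising))
```

Feeding it the validation losses printed after each epoch lets you abandon a run after a few minutes instead of waiting the full half hour.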
Thanks Everyone for the hints!