I got an MSE of 5.22 and an MAE of 1.79.
But my grade says otherwise.
Please call tf.random.set_seed before building your model to reduce the randomness between your environment and the grader environment.
Use an instance of Callback to decide when to stop training. This stops the weights from updating once the grader criteria are met (a sketch is below).
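Something like this minimal sketch, for example. The seed value, metric names, and thresholds here are illustrative assumptions; adapt them to how you compiled your model:

```python
import tensorflow as tf

# Set the seed before building the model so results are closer to the grader's run.
tf.random.set_seed(51)

# Illustrative custom callback: stop training once target metrics are reached.
# Assumes the model is compiled with metrics=["mse", "mae"]; the thresholds
# below are placeholders, not the official grader values.
class StopOnTarget(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        if logs.get("mse", float("inf")) < 6.0 and logs.get("mae", float("inf")) < 2.0:
            print(f"\nTarget metrics reached at epoch {epoch + 1}; stopping training.")
            self.model.stop_training = True
```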
Couple of things:
- Set the seed so we can compare results.
- When using Conv1D, keep the kernel_size small, in the range [2, 3].
- Keeping the optimizer's learning rate around the midpoint of 1e-4 and 1e-3 seems to produce good results. (A sketch combining these is below.)
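A minimal sketch of those two hints together. This is not the assignment's required architecture; the layer choices and filter sizes are placeholder assumptions:

```python
import tensorflow as tf

# Illustrative fragment only: a Conv1D layer with a small kernel_size and an
# Adam learning rate roughly midway between 1e-4 and 1e-3. Layer sizes and
# the overall architecture are placeholders.
model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(filters=32, kernel_size=3, strides=1,
                           padding="causal", activation="relu",
                           input_shape=[None, 1]),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])

model.compile(loss=tf.keras.losses.Huber(),
              optimizer=tf.keras.optimizers.Adam(learning_rate=5e-4),
              metrics=["mse", "mae"])
```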
It’s still not working. I think it might be a problem with the platform.
One way to approach the problem in such a scenario is to build a model that performs very well. Please update your model definition so that it shoots for an MAE ~ 1.9 and an MSE < 6.
You can also use EarlyStopping with a high value for patience and set restore_best_weights to True, as in the sketch below.
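For instance, something along these lines. The monitored metric and patience value are assumptions; they depend on how you compile and fit your model:

```python
import tensorflow as tf

# Illustrative EarlyStopping setup: a large patience and
# restore_best_weights=True so the best epoch's weights are kept.
# Monitoring "val_mae" assumes validation data is passed to model.fit.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_mae",
    patience=20,
    restore_best_weights=True,
)

# history = model.fit(train_set, epochs=100,
#                     validation_data=valid_set,
#                     callbacks=[early_stop])
```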
One more thing to note: the exercise asks you not to use a Lambda layer.
Archiving a model has nothing to do with model performance.
A ticket has been filed with the staff regarding variations in results across the Coursera Jupyter environment and the grader environment.
Please click my name and message the best performing notebook you have as an attachment.
This is your model performance on the validation data:
mse: 6.95, mae: 2.05 for forecast
Here’s the cutoff required on the validation set to pass the assignment:
To pass this assignment your forecast should achieve a MSE of 6 or less and a MAE of 2 or less.
Do note that there is likely going to be some variation in model results between the Coursera lab environment and the grader environment. So, it’s best to achieve better performance on the validation set than the expected cutoff to have good confidence of passing the grader test.
Moving forward, share the grader feedback (expanded) on a public topic so that other learners who stumble across the same issue will find the post useful.
I’m now getting this on the training set
mse: 4.1215 - mae: 1.3855
and then mse: 6.17, mae: 1.94 on the validation set
The other time I’m getting this on the training set
mse: 4.6947 - mae: 1.6280
vs
mse: 6.77, mae: 2.03 on the validation set
The MSE is not going down anymore. Any ideas? (I have Bidirectional, Conv1D, dropout, a small window, and a standard batch size; I removed model.summary, used set_seed, and am using Adam; my kernel_size is in the proper range.)
Please click my name and message the best performing notebook you have as an attachment.
Hi Balaji,
Here’s my mse: 6.33, mae: 1.97 on the validation set.
[snippet deleted by mentor]
Cheers,
Michael
Ah, I just got a slightly better result:
mse: 6.29, mae: 1.96 for forecast
[snippet deleted by mentor]
Please don’t post your notebook in public. It’s okay to share the stacktrace in public posts so that others find it useful.
Here are a few hints to improve model performance:
- Don’t change the SPLIT_TIME config from its default, since it affects the validation set size.
- Pay attention to the dataset values, since the scale of the data affects training (see the sketch after this list).
- Refer to the ungraded labs for sample architectures and for how the data is scaled down to lower values.
- Use an early stopping callback to stop training when applicable and restore the best model weights.
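As an illustration of the scaling hint, one common way to push a series to lower values is a simple min-max rescale before windowing. The helpers below are a hypothetical sketch, not the assignment's provided code:

```python
import numpy as np

# Hypothetical helper: rescale a series to the [0, 1] range before training,
# keeping the statistics so forecasts can be mapped back afterwards.
def scale_series(series):
    series_min = np.min(series)
    series_max = np.max(series)
    scaled = (series - series_min) / (series_max - series_min)
    return scaled, series_min, series_max

def unscale_forecast(forecast, series_min, series_max):
    return forecast * (series_max - series_min) + series_min
```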
Balaji - thank you for the hints. Appreciate that!
Could you please remind me of the default value of SPLIT_TIME? Btw, I already have an EC callback in the notebook.
As for hints 2 and 3, could you elaborate? Which ungraded labs do you have in mind? (Are you talking about the lecture notes?) What about batch_size? I’d keep it at 32, but I’m not sure about window_size. Could you please also remind me what the default value of window_size is?
Here are the steps to refresh the workspace and get the starter code.
The WINDOW_SIZE given in the starter code can be left unchanged; LSTMs can deal with sequences of length 64.
Your approach towards BATCH_SIZE is in the right direction. (A windowing sketch is below.)
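To show where WINDOW_SIZE and BATCH_SIZE plug in, here is a windowing helper in the style of the course labs. The function name and defaults are assumptions, not the starter code itself:

```python
import tensorflow as tf

# Hypothetical windowing helper: turns a series into (window, next value)
# pairs. window_size=64 and batch_size=32 mirror the values discussed above.
def windowed_dataset(series, window_size=64, batch_size=32, shuffle_buffer=1000):
    dataset = tf.data.Dataset.from_tensor_slices(series)
    dataset = dataset.window(window_size + 1, shift=1, drop_remainder=True)
    dataset = dataset.flat_map(lambda w: w.batch(window_size + 1))
    dataset = dataset.shuffle(shuffle_buffer)
    dataset = dataset.map(lambda w: (w[:-1], w[-1]))
    return dataset.batch(batch_size).prefetch(1)
```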
See this topic to understand how the scale of training data influences model training.
There are a few ungraded labs for Course 4 Week 4. Here are the links: