C4W4 Assignment: Cannot achieve MSE below 6, MAE below 2

Hi guys, I’ve been at this for a few days now and have not been able to come up with an architecture that satisfies the course’s passing requirements.
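For reference, a minimal sketch of how the two passing metrics can be checked, assuming the thresholds from the thread title (MSE below 6, MAE below 2). The function names and the sample temperatures are made up for illustration and are not from the assignment.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def passes(y_true, y_pred, max_mse=6.0, max_mae=2.0):
    """True when both metrics are strictly below the assumed thresholds."""
    return mse(y_true, y_pred) < max_mse and mae(y_true, y_pred) < max_mae


# Made-up temperatures, for illustration only
actual = [10.0, 12.0, 15.0, 11.0]
forecast = [11.0, 10.0, 16.0, 12.0]
print(mse(actual, forecast), mae(actual, forecast), passes(actual, forecast))
```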

For the loss, I have always used Huber with its default settings, and for the optimizer I have used SGD. After experimenting with the learning-rate callback, I usually settled on a learning rate of about 1e-6. For metrics, I used “mae”.
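A minimal sketch of the kind of learning-rate sweep referred to here, assuming an exponential schedule that starts at 1e-8 and grows tenfold every 20 epochs. The starting value, the growth rate, and the name `lr_sweep` are illustrative assumptions, not taken from the removed code.

```python
START_LR = 1e-8          # assumed starting learning rate for the sweep
EPOCHS_PER_DECADE = 20   # assumed: the rate grows 10x every 20 epochs


def lr_sweep(epoch):
    """Learning rate to use at a given (0-indexed) epoch of the sweep."""
    return START_LR * 10 ** (epoch / EPOCHS_PER_DECADE)


# Print the rate at a few epochs of a hypothetical 100-epoch sweep
for epoch in (0, 20, 40, 60, 80, 100):
    print(f"epoch {epoch:3d}: lr = {lr_sweep(epoch):.1e}")
```

A one-argument function like this is the shape of schedule that `tf.keras.callbacks.LearningRateScheduler` accepts. The usual approach is to plot loss against learning rate after the sweep and pick a value from the region where the loss is still falling steadily.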

Some of the architectures I have tried:

[code removed - moderator]

There were many more but I will not post them all. I have also tried making the LSTMs bidirectional with no success. What direction should I take? Will Adam help instead of SGD? Different parameters?

I do not want the answer, just some guidance.

Thanks everyone.

New update:

Out of nowhere I attempted a new architecture and actually got the required MSE and MAE! I don’t want to spoil it for other people doing this course, but as a hint: I had to revamp the whole thing, and I started from something closer to Lab 3 from Week 4. My problem was with my optimizer.

Thank you for reading anyways. I am good.

Really happy to have finished this certificate!

First of all, check the learning rate. Second, why are you using a lambda to multiply by 400? The temperatures are at most 25, so maybe try multiplying by a similar number.
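To illustrate the point about the multiplier: if the last step of the model multiplies its output by a constant, the layers before it have to produce target / constant. A minimal sketch, assuming a peak temperature of about 25 as stated above; the function name is hypothetical.

```python
def required_raw_output(target, scale):
    """Value the layers before the multiplier must produce so that raw * scale == target."""
    return target / scale


max_temp = 25.0  # approximate maximum temperature mentioned above

for scale in (400.0, 25.0):
    raw = required_raw_output(max_temp, scale)
    print(f"scale {scale:5.0f}: raw output needed for {max_temp} degrees = {raw}")
```

With a factor of 400 the network has to squeeze its whole output range into values no larger than about 0.06, whereas a factor near 25 keeps the raw outputs on the order of 1, which may be easier to fit.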

I’ve achieved the desired results with a much simpler architecture. You can also use None in the input shape of the Conv1D layer; it will automatically pick up the window_size :slight_smile:

And yes, trying different optimizers might help.

Hi @christianc,
I have the same problem. Could you please tell me how to solve it?
Did you change your model architecture, or did you just revise the optimizer’s learning rate?

Hi Enna, can you please send me a PDF or HTML export of your code? I need to see the architecture you have used so far.
Chris

Hello, for the optimizer, can you please try Adam (default learning rate, none specified) and let me know what happens? You can send over another PDF of the result if there’s still trouble.

Example:
optimizer = tf.keras.optimizers.Adam()

Chris

Try this and tell me what happens. Keep the optimizer set to Adam().

[code removed - moderator]
Chris

Hi Chris,
I passed the test after I changed the Conv1D kernel size and input shape. Thanks a lot!!
Enna

You’re welcome. If anyone reaches out to you for help, please be sure to help them as well. We all must assist each other. Good luck with your studies,

Chris

Dear Chris,
Thank you for your encouraging messages. You are really helpful to people who are struggling.

Hello, sorry I don’t have a lot of time to troubleshoot, but check out this attachment. Compare to yours and see what you did wrong.
C

(Attachment Week 4_Assignment (1).html is missing)

Trying again.

Third attempt. Their messaging service is not letting me send the HTML file type.

[code removed - moderator]

Thank you, CHRISTIANC,

I had already passed the course before I sent you my greetings. I just appreciate that you are so generous with your help.

By the way, I used two memory layers, three dense layers, and one conv layer, and I barely achieved the required MAE.

Regards,

Ram

You’re welcome. If somebody asks you for help as well, please be sure to pass on the advice. We all must help each other.

C

Hello Ramzis, would you share your work with me? I am running into a lot of problems trying to optimize my model.

Write to me at akhmitzy@msu.edu from another email address of yours. I will try to help. It is unethical to simply hand over the solution. Laurence would not like it, and neither would Ng.

Thank you, @Ramzis_Akhmitzyanov. I passed the assignment.

Hi, @christianc,

Like others, I have been trying multiple architectures and hyperparameters.
My MAE keeps bottoming out around 2.2.
There does not seem to be any guidance on how to improve the network other than blind trial and error.
Are there any principles that could at least point me in a direction where I can incrementally improve the network?

Regards,
Marty

I was finally able to find an architecture and hyperparameters that worked. It would still be interesting to gain some intuition about which combinations of these elements work.