For the Week 4 assignment, I defined the simplest model possible: input > flatten > dense (with 1 neuron and sigmoid activation). My plan was to gradually build up from there to find the minimal architecture needed to solve the problem. To my surprise, the initial model quickly converged to 99.9% accuracy. I tried multiple times and got the same result. The grader scored it as 100%.
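For concreteness, here's a minimal sketch of that architecture (the 150×150×3 input shape and the compile settings are assumptions for illustration, not necessarily the assignment's exact values):

```python
import tensorflow as tf

# Minimal model: flatten the image, then a single sigmoid unit for binary classification.
# The 150x150x3 input shape and the compile settings below are assumptions for
# illustration; the assignment's data and settings may differ.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(150, 150, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
```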
Is this too good to be true, or are these examples just very easy to categorize? I thought I’d need to put some convolution layers in there, but I didn’t.
The task here was mostly about reaching the desired accuracy, which you did achieve with this model. That said, it doesn't mean this is an ideal model; as you can see, your loss is still in the range of 0.04.
You can try adding a convolution layer, practice, and resubmit; you will see how the accuracy and loss are affected by including a convolution layer.
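For example (only an illustration, not the exact solution), a single convolution block in front of the flatten could look something like this; the filter count, kernel size, and input shape are placeholder choices:

```python
import tensorflow as tf

# One convolution + pooling block before the original flatten/dense head.
# Filter count, kernel size, and input shape are illustrative only.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(150, 150, 3)),
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
```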
Thanks for the advice. I wasn’t paying attention to the loss at all, partly because I don’t know what a good loss value might be. I’m guessing it’s good when it gets so small that it’s displayed in scientific notation (e.g. 1.3167e-06)?
I briefly tried a few variations. Here are my observations:
- Allowing the minimal model to train for more epochs can sometimes get the loss that low (outcomes are somewhat random).
- Adding a convolution layer brings the loss down for the same number of epochs and similar accuracy, but it slows down training considerably.
- Adding multiple convolution layers does not noticeably improve the metrics, but it slows down training even more.
- If, instead of convolution layers, I add a dense layer before the output layer, the loss comes down to very small numbers in very few epochs, and training is fast.
- If I increase the size of the dense layer, there’s a point beyond which training converges more slowly.
- If I use both a convolution layer and a dense layer, I get somewhat faster loss improvements, but slower training than with the dense layer alone (both variants are sketched below, after this list).
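Roughly, the two variants look like this (layer sizes and input shape are just illustrative, not my exact code):

```python
import tensorflow as tf

# Variant A: extra dense layer before the output
# (trains fast, and the loss drops quickly).
dense_model = tf.keras.Sequential([
    tf.keras.Input(shape=(150, 150, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Variant B: convolution + dense
# (loss improves faster per epoch, but each epoch takes longer).
conv_dense_model = tf.keras.Sequential([
    tf.keras.Input(shape=(150, 150, 3)),
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
```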
It feels like a bit of a balancing act, trying to trade off model size and training speed. And a bigger model doesn’t seem to always mean faster convergence.
Great, so you did learn some important points. That said, although you understood the significance of each added layer and how it affects the loss and training speed, don’t forget that the dataset used here is much simpler than complex data; more precisely, it has only two significant features. When it comes to complex data with more than two features, or possibly with confounding features, that is where a simpler model might not get the result you are looking for.
Your idea of achieving a balance between training accuracy and loss is of course important, but it comes with a basic understanding of what kind of data you are handling.
Also, just to comment on your addition of a dense layer giving you better results: that comes from its role in connecting all the hidden units in a neural network.
I hope you have completed the Deep Learning Specialisation.
Being inquisitive is a good way to understand something better.
I don’t know if you are learning purely out of interest, but if you want to understand more complex models, then the TensorFlow Advanced Techniques specialisation and the NLP specialisation are good mind-benders.