C2_W4 Assignment: I've read most of the forum posts about low accuracy issues, but none of the solutions are working

I have read the various forum posts about the C2_W4 assignment, but none of them have helped. I am stuck at ~4% accuracy and can’t fix the issue.

  1. My expected outputs all look correct.

  2. I am not one-hot encoding, so I am using:
    loss = 'sparse_categorical_crossentropy' with softmax

  3. I have set my last layer to 24 nodes:
    dense_35 (Dense) (None, 24)

  4. I have already tried optimizer='adam' and optimizer='rmsprop' without luck

  5. I’ve already removed horizontal_flip, shear_range, and rotation_range from my train_datagen generator

  6. I’ve already experimented with different batch sizes, trying both the initial value of 32 and going all the way up to 1830
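For reference, here is a minimal sketch of the setup described in points 2 and 3 above (the hidden layers are placeholders, not the assignment's actual architecture): integer labels with no one-hot encoding pair with `sparse_categorical_crossentropy`, and the final Dense layer has 24 softmax units, one per class.

```python
import tensorflow as tf

# Placeholder architecture; only the output layer and compile settings
# mirror the points above.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(24, activation="softmax"),  # 24 output nodes
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # matches integer (non-one-hot) labels
    metrics=["accuracy"],
)

# Dummy forward pass: one 28x28x1 image yields 24 class probabilities.
probs = model(tf.zeros((1, 28, 28, 1)))
```

With this pairing the labels stay as integers 0 to 23; switching to plain `categorical_crossentropy` would instead require one-hot encoded labels.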

Current training output:

Epoch 1/15
858/858 [==============================] - 15s 16ms/step - loss: nan - accuracy: 0.0410 - val_loss: nan - val_accuracy: 0.0462
Epoch 2/15
858/858 [==============================] - 13s 16ms/step - loss: nan - accuracy: 0.0410 - val_loss: nan - val_accuracy: 0.0462
Epoch 3/15
858/858 [==============================] - 14s 16ms/step - loss: nan - accuracy: 0.0410 - val_loss: nan - val_accuracy: 0.0462

Epoch 14/15
858/858 [==============================] - 15s 18ms/step - loss: nan - accuracy: 0.0410 - val_loss: nan - val_accuracy: 0.0462
Epoch 15/15
858/858 [==============================] - 19s 22ms/step - loss: nan - accuracy: 0.0410 - val_loss: nan - val_accuracy: 0.0462

I saw that you’ve posted several different versions of this question today.

I’m not a mentor for that course, but I will observe that your accuracy is just about what you’d expect if you were making random predictions among 24 labels, or if you were always making the same prediction (i.e., always predicting 0 from a range of 0 to 23, regardless of the data). 1/24 ≈ 0.0417, which matches your 0.0410 almost exactly.

In my experience, this sort of issue is a huge clue.

Also observe the “loss: nan” values: nan means Not a Number, so the loss values are not valid. Loss should always be a positive real value.

Putting these together, I’d guess that your training method is not working.
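To illustrate why a loss turns into nan (a minimal NumPy sketch, not the course code): cross-entropy takes a logarithm of the predicted probability, and once a degenerate value like log(0) enters the computation, nan propagates through every subsequent update, so the model never learns anything past that point.

```python
import numpy as np

# Cross-entropy for a single sample is -log(p), where p is the predicted
# probability of the true class. A valid prediction gives a positive loss;
# a degenerate p = 0 gives inf, and inf - inf gives nan, which then
# poisons every later gradient update.
def cross_entropy(p_true):
    with np.errstate(divide="ignore", invalid="ignore"):
        return -np.log(p_true)

good = cross_entropy(0.5)   # ~0.693, a valid positive loss
bad = cross_entropy(0.0)    # inf
poisoned = bad - bad        # inf - inf -> nan
```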

Have you tried this post? Read all the comments; it should give you the solution.

Regards
DP

Hi @TMosh, thank you for your feedback. I suspected that my training method is not working, but I don’t know why it isn’t working or what I can do to fix it, which is why I am reaching out for help. Do you have any idea what might be going wrong? All of the “Expected output” fields are correct and I have followed the examples from this week’s teachings, so I’m not sure where to go from here.

Hi Deepti,

The post that you referenced was someone trying to further optimize their model from 86%, and I have reviewed the responses for this post. My model is leveling out at 4% so I think there’s something much different that is wrong with the model. I have been reviewing these posts but nothing is standing out to me as a reason why the accuracy would be so low. Would you be willing to take a look at my notebook?

Hi Nick,

Can you share your notebook in a personal DM? Click on my name and then Message.

Regards
DP

Honestly Nick,

When I did this notebook assignment, I remember following the week’s course videos and including every image augmentation parameter. But one needs to understand that the training model should be as simple as possible. The choice of batch_size, Dense layers, and optimiser is also important.

You can share your notebook; I will have a look.

Regards
DP

Hello Nick,

Errors in your notebook:

  1. You need to append each row of the file, not each line.
  2. Your reshape code for the labels and images is incorrect.
  3. As explained in the previous comment, your image augmentation for train_datagen needs to be as simple as possible. I know we get confused here and include everything the course instructor mentions in the week’s videos, but the fewer things you add, the better the accuracy.
  4. Your last Dense layer needs 26 units, as the number of categories is 26 letters of the alphabet. Your Dense layers and model compile also needed changes.
  5. You do not always need a Dropout layer.
  6. Use the Adam optimiser, as it gives good performance with little hyperparameter tuning.
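A hedged sketch of items 1 and 2 above (not the course-provided code; the exact helper name and file layout in the assignment may differ). In the sign-language CSV, each row is one label followed by 784 pixel values, i.e. a flattened 28x28 grayscale image, so you append whole rows and then reshape the collected pixels into (num_images, 28, 28).

```python
import numpy as np

# parse_rows is a hypothetical helper name for illustration.
def parse_rows(rows):
    labels, images = [], []
    for row in rows:                                 # one full row at a time (item 1)
        labels.append(float(row[0]))                 # first column is the label
        images.append([float(v) for v in row[1:]])   # remaining 784 pixel values
    labels = np.array(labels)
    images = np.array(images).reshape(-1, 28, 28)    # item 2: (N, 28, 28)
    return images, labels
```

If the reshape is wrong (for example, mixing labels into the pixel data), the images fed to the model are garbage and the training collapses exactly as shown in the log above.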

Please check the notebook thoroughly, and let me know once your model training reaches the required accuracy.
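For item 3, a minimal train_datagen might look like the following (a hedged sketch; the rescale value is standard practice, not copied from the assignment's solution):

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Minimal augmentation: only rescale pixel values into [0, 1].
# Extra parameters (shear, rotation, flips) can be added back one at a
# time once this baseline trains correctly.
train_datagen = ImageDataGenerator(rescale=1.0 / 255)
```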

Keep Learning!!!
Regards
DP