Why does this error keep showing up?
@balaji.ambresh Can you please check this?
Your image does not show the entire error message. Please post a new image.
Are you running the notebook via Coursera Labs, or on some other platform?
Thanks for creating the public topic.
The bottom of the trace shows that your input data is of string type instead of a numeric type:
Node: 'categorical_crossentropy/Cast'
Cast string to float is not supported
[[{{node categorical_crossentropy/Cast}}]] [Op:__inference_train_function_948]
Here’s the cell output you missed:
Actual:
Training images has shape: (27455, 28, 28) and dtype: <U3
Training labels has shape: (27455,) and dtype: <U2
Validation images has shape: (7172, 28, 28) and dtype: <U3
Validation labels has shape: (7172,) and dtype: <U2
Expected:
Training images has shape: (27455, 28, 28) and dtype: float64
Training labels has shape: (27455,) and dtype: float64
Validation images has shape: (7172, 28, 28) and dtype: float64
Validation labels has shape: (7172,) and dtype: float64
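If it helps to see the fix in isolation, here's a minimal sketch. It assumes the rows come from something like csv.reader, where every value is still a string; casting explicitly while building the arrays gives the expected float64 dtype:

```python
import numpy as np

# Rows as they typically come out of csv.reader: every value is still a string,
# which is what produces the '<U3' / '<U2' dtypes shown above.
rows = [
    ["3", "107", "118", "127"],   # first value = label, the rest = pixel values
    ["6", "155", "157", "156"],
]

# Cast explicitly while building the arrays so the dtype becomes float64.
labels = np.array([row[0] for row in rows], dtype="float64")
images = np.array([row[1:] for row in rows], dtype="float64")

print(labels.dtype, images.dtype)   # float64 float64
```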
Here are a few more mistakes that need to be fixed (a rough sketch of how these pieces typically fit together follows the list):
- The input shape parameter of the first layer of your model is incorrect. It needs to be set based on the shape of the training images.
- The loss function is incorrect.
- The number of neurons in the output layer is incorrect.
- The activation function of the final layer is incorrect.
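To make those points concrete, here's a minimal sketch of how those pieces relate to each other. The layer sizes and the NUM_CLASSES value are placeholders rather than the assignment's answers, and it assumes the 28x28 grayscale images get a channel dimension before training:

```python
import tensorflow as tf

NUM_CLASSES = 10  # placeholder only; take the real class count from the notebook's markdown

model = tf.keras.Sequential([
    # input_shape must match a single training image: 28x28 grayscale,
    # i.e. (28, 28, 1) once the channel dimension has been added
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    # multi-class problem: one neuron per class with a softmax activation
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer="adam",
    # sparse_categorical_crossentropy for integer labels,
    # categorical_crossentropy for one-hot encoded labels
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```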
It worked!! Thank you for pointing it out!
Please click my name and message your notebook as an attachment.
The number of nodes in the output layer is incorrect. Please look at the markdown at the top of the notebook to figure out the number of classes.
Now I’m tuning the parameters ever so slightly, and I’ve reached a point where, no matter what I change, the training accuracy approaches 0.99 but never reaches it.
What do I do?
Are you sure that 0.99 validation set accuracy isn’t good enough?
It’s never hitting 0.99
The last five epochs normally run from 98.5 to 98.97
The submission won’t be accepted until training accuracy >= 0.99 and validation accuracy >= 0.95.
Please read this topic.
Based on the plots, the first thing to notice is that the model is not meeting the desired training accuracy, so it is underfitting. You are pretty close to the required threshold. Here are a few things to try, which also line up with some of the suggestions offered on Wikipedia:
- Watch the lectures from the Deep Learning Specialization on Coursera (courses 2 and 3).
- Try different architectures with the tips mentioned in the other post. Your current architecture does follow the pointers from the other topic.
- Do pay attention to the augmentations. Without any augmentations, the training accuracy target is met within a few epochs, but with your augmentations the same NN doesn’t meet it. This points to two things:
a. The augmentations are making it harder for the network to fit the data within the epoch limit. So, redo the augmentations keeping the data distribution in mind (see the sketch right after this list).
b. The network architecture needs to change to improve accuracy.
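For example (illustrative ranges only, assuming ImageDataGenerator as used in the course), a modest setup keeps the augmented images looking like the validation data:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Modest, label-preserving augmentations: small rotations, shifts and zooms.
# Aggressive settings (large rotations, vertical flips, big shears) create
# images the validation set never contains, which is what stalls training accuracy.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
)

# No augmentation on the validation data, only rescaling.
validation_datagen = ImageDataGenerator(rescale=1.0 / 255)
```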
Hope this helps.
Thank you for these suggestions.
- I’ve finished them all and tried to apply them here.
- I’ve tried 1 and 2 Conv2D + max-pool layers, changing the number of filters a few times. I tried the same with 1, 2 and 3 Dense layers, with neuron counts in decreasing order.
- I tried combinations of augmentations, changed the ranges in 0.01 increments for multiple parameter combinations, and found the highest accuracy (0.989) for 1.17 to 1.21. I’ve tried other architectures for these augmentations too.
I’ve tried nearly 25 combinations and it’s peaking at 98.9 training accuracy. For other architectures with no augmentations, training accuracy hits 1 pretty fast, but validation accuracy gets stuck at 0.93-0.94.
What do I do?
Can you clarify which assignment you’re working on?
The forum area is DLS CNN, that’s DLS Course 4.
But the thread title says C2 W4, and C2 of DLS only has three weeks.
What is the title of the notebook you’re working on?
The category is now fixed.
When you say you’ve finished them all, does that include the Deep Learning Specialization?
Please note that it’s not just about throwing numbers around and performing random augmentations. Do pick the transformations that make sense considering how the training and test sets are distributed.
There are other TensorFlow layers, different kernel sizes, and model configurations you can try as well.
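For instance (an illustrative sketch only, not a recommended answer; NUM_CLASSES is a placeholder), a first block with a 5x5 kernel plus BatchNormalization and Dropout is one such variation:

```python
import tensorflow as tf

NUM_CLASSES = 10  # placeholder; use the class count from the notebook

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (5, 5), activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.3),   # regularization knob to trade training vs. validation accuracy
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
```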
I started with your notebook, removed 3 transformations, and changed the model to get this performance:
You can use early stopping to stop training once you reach the desired performance.
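A minimal sketch of that idea: a custom callback checks the logged metrics at the end of each epoch and stops once both thresholds quoted earlier are met (the generator names in the commented fit call are placeholders; tf.keras.callbacks.EarlyStopping is a built-in alternative that monitors a single metric):

```python
import tensorflow as tf

class StopAtTarget(tf.keras.callbacks.Callback):
    """Stop training once both accuracy thresholds quoted above are reached."""
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        if logs.get("accuracy", 0.0) >= 0.99 and logs.get("val_accuracy", 0.0) >= 0.95:
            print("\nReached the target accuracies, stopping training.")
            self.model.stop_training = True

# history = model.fit(train_generator,
#                     validation_data=validation_generator,
#                     epochs=30,
#                     callbacks=[StopAtTarget()])
```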
I meant the Deep Learning with TensorFlow specialisation
I tried kernel sizes of 3x3 and 2x2
Changed the model configurations too
What do you mean by “removed 3 transformations”? Do you mean Conv2D and max-pool layers?
Sorry for the confusion. I meant 3 augmentations.