Hi,
After 15 epochs, both training and validation losses become `nan`, and accuracy stagnates at 0.0410 for training and 0.0462 for validation. This persists despite setting the output units to 24 and using `sparse_categorical_crossentropy` as the loss function. Any insights or suggestions on resolving this would be greatly appreciated.
Thank you!
Hello @Muhammad_Usman5
Kindly let us know:
- The most important part is how you parse the data in the input grader cell; note that in this cell you need to append the labels with row[0] and the images with row[1:].
- Next, whether you have correctly reshaped the images and labels and converted them into array form using numpy.
- In the train/val generator grader cell you will have to add another dimension to the data: which axis did you include, and what did you use? (Hint: np.expand_dims.)
- Next, mention which parameters you used for the train and validation generators. Hint: keep it as simple as possible.
- Your batch size?
- Your model architecture: how many Conv2D and Dense layers did you choose, and what is the last Dense layer's activation?
- Your optimizer and metric choice.
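For the np.expand_dims hint above, a minimal sketch (using toy placeholder data, not the assignment's real arrays) looks like this:

```python
import numpy as np

# Toy batch of 5 grayscale 28x28 images (placeholder data only)
images = np.zeros((5, 28, 28))

# Add a channel dimension so the shape becomes (5, 28, 28, 1),
# which is what Keras generators expect for grayscale input.
images = np.expand_dims(images, axis=-1)  # axis=3 is equivalent here
print(images.shape)  # (5, 28, 28, 1)
```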
Regards
DP
Hi Deepti,
Thank you for your response and helpful suggestions. To address your points:
- I’ve implemented the corrections according to your instructions.
- I've used the code `img = np.expand_dims(img, axis=-1)`.
- Augmentation has been applied to the training generator, while none has been used for validation.
- The batch size has been set to 32.
- I’ve structured the model with 2 Conv2D layers, each with 64 (3,3) filters, followed by a flatten layer, a dense layer with 64 neurons and ReLU activation, a dropout layer of 20%, and finally, the output layer consisting of 24 classes with softmax activation.
- I’ve chosen the ‘adam’ optimizer and ‘accuracy’ as my metric of choice.
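For reference, the architecture described above could be sketched in Keras roughly as follows (the (28, 28, 1) input shape and the compile settings are assumed from the thread, not confirmed against the actual notebook):

```python
import tensorflow as tf

# Sketch of the described model: 2 Conv2D layers with 64 (3,3) filters,
# flatten, Dense(64, relu), 20% dropout, and a 24-class softmax output.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu',
                           input_shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(24, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```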
Despite these efforts, the problem still persists. Therefore, I would like to share the assignment notebook with you for your reference.
Thank you for your assistance.
M. Usman
{ASSIGNMENT NOTEBOOK REMOVED BY MODERATOR}
Kindly follow the code of conduct: you cannot share your assignment notebook on a public thread, as it is against the community guidelines.
Hello @Muhammad_Usman5
After review, the following mistakes are present:
- In `for row in csv_reader`, the way you have appended the labels and images is incorrect. Follow this: labels need to be appended with row[0] (please do not use float), and likewise images need to be appended with row[1:]. Then convert the images into an array using np.array and use .astype('float64') rather than np.float64, then reshape the images with the correct call as
images = np.reshape(images, (images.shape[0], 28, 28))
then convert the labels into an array using the same np.array and .astype('float64').
- In the train/val generator grader cell, while adding another dimension you can use axis=3.
- Another issue: in train_datagen you have used too many arguments to augment the images; keep it simple (HINT: use only rescale, width_shift, height_shift, fill_mode).
- In create_model, using the same number of units in your Dense layer as in your Conv2D layers doesn't seem the right choice for the model to learn anything; change the units for the last two Dense layers.
- Dropout is a powerful technique for preventing overfitting in neural networks. The general rule of thumb is to add dropout after the last pooling layer, but depending on the size of your dataset and the complexity of your model, you may want to add dropout to other layers as well. So, do you think you require a dropout layer?
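Putting the parsing and generator corrections above together, the two grader cells might look roughly like this (a sketch only; the function names parse_data_from_input and train_val_generators and the batch size of 32 are assumed from the thread, not confirmed against the actual assignment):

```python
import csv

import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator


def parse_data_from_input(filename):
    """Parse labels and 28x28 images from a sign-MNIST style CSV
    (each row: label, pixel1, ..., pixel784; the header row is skipped)."""
    with open(filename) as file:
        csv_reader = csv.reader(file, delimiter=',')
        next(csv_reader)  # skip the header row
        labels, images = [], []
        for row in csv_reader:
            labels.append(row[0])   # label as-is: no float() here
            images.append(row[1:])  # the remaining 784 pixel values
    images = np.array(images).astype('float64')
    images = np.reshape(images, (images.shape[0], 28, 28))
    labels = np.array(labels).astype('float64')
    return images, labels


def train_val_generators(training_images, training_labels,
                         validation_images, validation_labels):
    # Add the channel dimension: (N, 28, 28) -> (N, 28, 28, 1)
    training_images = np.expand_dims(training_images, axis=3)
    validation_images = np.expand_dims(validation_images, axis=3)

    # Keep augmentation simple: rescaling plus small shifts only
    train_datagen = ImageDataGenerator(rescale=1. / 255.,
                                       width_shift_range=0.1,
                                       height_shift_range=0.1,
                                       fill_mode='nearest')
    train_generator = train_datagen.flow(x=training_images,
                                         y=training_labels,
                                         batch_size=32)

    # No augmentation for validation, only rescaling
    validation_datagen = ImageDataGenerator(rescale=1. / 255.)
    validation_generator = validation_datagen.flow(x=validation_images,
                                                   y=validation_labels,
                                                   batch_size=32)
    return train_generator, validation_generator
```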
Regards
DP
Hi Deepti,
Thank you for your suggestions. I will implement the corrections and update you on the result.
I will also take note of the community’s code of conduct and refrain from sharing my notebook in public posts again.
Thank you.
M. Usman
Hi Deepti,
I’ve implemented the corrections, and everything works fine.
Thank you so much for your time and insights.
M. Usman