TensorFlow basic object detection fails (loss and accuracy are nan)

Alex_Rojas · October 18, 2023, 10:18pm

Hi all,

I am having trouble training a basic object detection model on my own data. I am following the lab closely and only modified it to read my own data. Currently, the loss and accuracy metrics are nan (for both the classification and bounding_box outputs), and the predictions are nan as well.
I used sparse_categorical_crossentropy, so my labels (for classification) can be integers.
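For reference, here is a minimal NumPy sketch of what sparse categorical cross-entropy computes for integer labels. It is not the Keras implementation: `sparse_cce` is a hypothetical helper and the probabilities are made-up values. The point it illustrates is that each integer label is used as an index into the predicted probability vector, so labels are expected to lie in `0 .. num_classes - 1`.

```python
import numpy as np

def sparse_cce(probs, labels, eps=1e-7):
    """Mean negative log-probability of the true class (illustrative only)."""
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0)
    labels = np.asarray(labels, dtype=np.int64)
    num_classes = probs.shape[1]
    # Each label is used as a column index, so it must be in [0, num_classes).
    if labels.min() < 0 or labels.max() >= num_classes:
        raise ValueError(f"labels must be in [0, {num_classes - 1}]")
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

probs = [[0.9, 0.1], [0.2, 0.8]]   # two samples, two classes (made-up values)
loss = sparse_cce(probs, [0, 1])   # integer labels, no one-hot encoding needed
```

With two output units, a label of 2 would be rejected by this helper. How a framework reacts to such a label can differ, so the label range is worth checking when a loss turns nan.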
A link to my notebook and training data is at:

https://github.com/arojas314/data-sharing/blob/main/Object_Detection_From_Scratch_to_share.ipynb

Thanks in advance!

Deepti_Prasad · October 31, 2023, 5:16am

Hello Alex,

Can you give some details about your data, that is, what data you are using and what you are training the model to do? I am asking so that I can better understand your notebook and look for a solution.

Also, kindly share a screenshot of the training output where the loss and accuracy are nan, as I cannot see your train and test files at the link you shared.

Regards
DP

TMosh · October 31, 2023, 6:06am

@Alex_Rojas, I’m going to edit your github link so that it doesn’t render the Markdown here on the forum.

Alex_Rojas · November 6, 2023, 1:12am

Hi @Deepti_Prasad ,

Thank you very much for replying. I feel like I am speeding through the course, and the lectures and code you provide are incredible, as is this community platform.

I am currently running into issues when running a basic OD model on my own RGB imagery from my drone.

I am training the model to detect checkerboards (ground control points) in drone imagery (images taken at nadir). The images are high-resolution RGB from a Sony camera. There are two classes: 1=GCP_RED and 2=GCP.
The images are included in the same repository as the Jupyter notebook, in .zip files called train.zip and test.zip.
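Since the class ids described here start at 1, one thing that may be worth checking is whether they are shifted to zero-based indices before training with a two-unit classification head. Whether the notebook already does this is not confirmed here. The mapping and helper names below are hypothetical, and the labels are made-up examples.

```python
import numpy as np

# Class ids as described in the post: 1 = GCP_RED, 2 = GCP.
ID_TO_INDEX = {1: 0, 2: 1}                  # hypothetical zero-based mapping
INDEX_TO_NAME = {0: "GCP_RED", 1: "GCP"}    # used to turn predictions back into names

def to_zero_based(class_ids):
    """Map raw class ids to indices in 0 .. num_classes - 1."""
    return np.array([ID_TO_INDEX[c] for c in class_ids], dtype=np.int32)

raw_labels = [1, 2, 2, 1]                   # made-up example labels
labels = to_zero_based(raw_labels)
names = [INDEX_TO_NAME[i] for i in labels.tolist()]
```

An alternative, under the same assumption, would be to keep the ids as they are and give the classification head three output units, leaving index 0 unused.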

I did not save the training output showing the nan accuracy and loss, but I could do so and add it to the same repository I linked above.

best,
Alex

Alex_Rojas · November 6, 2023, 2:42am

@Deepti_Prasad
The nan accuracy and loss output is included in the notebook.

Deepti_Prasad · November 6, 2023, 2:38pm

Hi Alex,

I will point out things in your notebook that I think need another look as I come across them.

The first thing I noticed is that you are using a batch size of only 1, which can be a cause of the accuracy and loss being nan.
You mentioned using two classes, so why are you using a softmax activation for your last layer?

Softmax is typically used as the activation function for multi-class classification problems, where a sample belongs to one of more than two classes.
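As a side note on the two-class case: softmax also works with exactly two classes. For two logits it gives the same probability as a sigmoid applied to their difference, so a two-unit softmax head is a valid setup and is unlikely, on its own, to explain nan values. A small NumPy check with made-up logits:

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(z - z.max())            # subtract the max for numerical stability
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

logits = np.array([0.3, 1.7])          # made-up logits for class 0 and class 1
p_softmax = softmax(logits)            # probabilities for both classes
p_sigmoid = sigmoid(logits[1] - logits[0])   # probability of class 1 only
```

Here `p_softmax[1]` and `p_sigmoid` agree up to floating-point error, which is why either head can be used for a two-class problem.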

I think you have converted the images into RGB files? I am not saying RGB is incorrect, but for an object detection algorithm working on images it is better to go with jpg or png files.
If you still want to use RGB files, you could use the YOLO architecture. The YOLO algorithm takes an image as input and then uses a simple deep convolutional neural network to detect objects in the image.
Can you try a different input shape? The current one feels too large for the layers you are using, which may be why your accuracy and loss are nan. Even if you keep the input shape of 424, 424, did you try training with smaller kernels of 2 and 1? The bounding box values for the last layer also seem to be high (I could be wrong here). I feel it is more about the format of the image files you are using.
Again, a batch_size of 1 is too small for training the model, so find a way to increase your batch size and then train your model.
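Before changing the input shape or batch size, it may help to rule out the data itself as the source of the nan values. The sketch below is a generic check, not something taken from the notebook: `check_batch` is a hypothetical helper, the arrays are made-up, and it assumes bounding boxes are meant to be normalised to the range 0 to 1 by image width and height.

```python
import numpy as np

def check_batch(images, boxes):
    """Return a list of problems found in one batch (empty list means none)."""
    problems = []
    if not np.isfinite(images).all():
        problems.append("images contain nan or inf")
    if not np.isfinite(boxes).all():
        problems.append("boxes contain nan or inf")
    # Assumption: boxes are normalised to [0, 1] by image width and height.
    if boxes.min() < 0.0 or boxes.max() > 1.0:
        problems.append("boxes fall outside [0, 1]")
    return problems

# Made-up batch: 4 images of 424x424x3 with one box per image.
rng = np.random.default_rng(0)
images = rng.random((4, 424, 424, 3), dtype=np.float32)
boxes = np.array([[0.1, 0.2, 0.4, 0.5]] * 4, dtype=np.float32)

issues = check_batch(images, boxes)            # normalised boxes
bad = check_batch(images, boxes * 424)         # boxes left in pixel units
```

If boxes are left in pixel units, the regression targets are hundreds of times larger than the assumed range, which is one way a loss can grow until it overflows.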

I may be right or wrong about the suggestions above; they are my personal view from looking at your notebook. Please feel free to disregard any suggestions you do not like.

Also, I wanted to know whether you have completed all the assignments of Course 3 of TF Advanced Techniques. If so, why not apply image segmentation, which is covered in the third week of the same course?

Regards
DP
