I’m finishing Week 1 of the Advanced Computer Vision with TensorFlow course on Coursera, and have encountered two problems with the assignment (predicting boxes around birds).
Problem 1: In the notebook provided, several of the helper functions contain what I believe to be a bug. They attempt to create ragged NumPy arrays. That is, these functions try to create a single NumPy array from images with different shapes. To the best of my knowledge, NumPy does not permit ragged arrays. The simple solution was to use a list of arrays instead. This is resolved at my end, but future students will encounter the same issue and have to waste time diagnosing and fixing it.
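To illustrate the ragged-array point, here is a minimal sketch. The image shapes are made up for illustration and are not taken from the course notebook. The exact behaviour of np.array on mismatched shapes depends on the NumPy version (recent releases raise a ValueError, older ones fall back to an array of Python objects with a warning), so the sketch handles both cases.

```python
import warnings

import numpy as np

# Two hypothetical images with different heights and widths (not from the notebook).
images = [
    np.zeros((200, 300, 3), dtype=np.uint8),
    np.zeros((150, 400, 3), dtype=np.uint8),
]

# Stacking differently shaped images never yields a regular numeric array:
# depending on the NumPy version, this either raises ValueError or produces
# a 1-D array of Python objects.
try:
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        stacked = np.array(images)
    is_regular = stacked.dtype != object
except ValueError:
    is_regular = False

# Keeping a plain Python list of arrays sidesteps the problem.
shapes = [img.shape for img in images]
print(is_regular)  # False
print(shapes)      # [(200, 300, 3), (150, 400, 3)]
```

Each image in the list can then be processed individually, for example resized to a common shape before batching.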
My second issue is that any .h5 model I save from the course notebook and upload to the form below is rejected with the message “Your model could not be loaded. Make sure it is a valid h5 file.” I’ve checked that my model obtains the required IoU score.
I missed the edit you made. Thank you for following the community guidelines.
Could you tell me how large your birds.h5 file is, and whether you followed the assignment instructions correctly when uploading your model for submission?
Also, can you confirm whether you have used tf.keras.optimizers.experimental.SGD instead of tf.keras.optimizers.SGD? The latter needs to be used for this assignment.
Kindly let me know whether this was the issue, or whether we need to dig further into it.
Kindly tag my name so that I don’t miss your response next time.
OK, kindly use the SGD optimiser below (this is mentioned in the instructions section of the assignment).
Your model seems to be too large.
Here is what you can do: try again tomorrow, since you said your GPU usage is used up for the day (as you know, it runs on a 24-hour cycle), and make the optimiser change mentioned before.
Then let me know tomorrow whether the issue persists. I hope your other code and outputs match the expected outputs in the assignment.
Also ensure you have deleted the previously downloaded model h5 file before you try again.
I’ve trained the model as requested on Google Colab, only for 10 epochs and with CPU. The IOU score is not high enough, but the h5 model should be valid. However, the error message is still this:
“Your model could not be loaded. Make sure it is a valid h5 file.”
The likely reason is that the IoU score is below what the assignment requires. Kindly DM me your notebook for review: click on my name and then Message.
You are not supposed to add any extra lines of code beyond those already provided; kindly remove
IMG_SIZE = (224,224,)
IMG_SHAPE = IMG_SIZE + (3,)
Next, for the code line
Create a mobilenet version 2 model object
What is base_model? You are defining it incorrectly.
YOU ARE SUPPOSED TO USE tf.keras.applications.mobilenet_v2.MobileNetV2
Also, for the input shape, pass the correct shape here rather than hard-coding it elsewhere; defining the size and shape separately, as you did, is incorrect.
def define_and_compile_model():
You are supposed to use loss='mse'
Also, avoid defining the optimizer and loss separately. Use them in the model.compile statement directly.
For exercise 6,
Get the steps per epoch (may be a few lines of code)
import math (YOU DO NOT REQUIRE THIS IMPORT)
steps_per_epoch = math.ceil(length_of_training_dataset/BATCH_SIZE) ==> incorrect; use the same approach you used for validation steps
For exercise 7, you are not supposed to use the batch_size parameter in model.fit; kindly remove it.
Let me know whether this worked after the corrections.
Just a heads up: Course 3 and Course 4 of this specialisation require you to avoid hard-coding anything and to follow the instructions given in the assignment.
@Deepti_Prasad Thanks for the feedback. I have a few doubts that maybe you can help me with.
You are not supposed to add any extra lines of code beyond those already provided; kindly remove
IMG_SIZE = (224,224,)
IMG_SHAPE = IMG_SIZE + (3,)
How does this affect the resulting tf model, or the saved .h5 model?
YOU ARE SUPPOSED TO USE tf.keras.applications.mobilenet_v2.MobileNetV2
steps_per_epoch = math.ceil(length_of_training_dataset/BATCH_SIZE) ==> incorrect; use the same approach you used for validation steps
Why is this? The notebook asks to define validation_steps separately from steps_per_epoch.
I noticed the homework notebook template got updated recently, removing the pip install of TF and Keras versions 2.8.0, and including a version check and conversion to 2.8.0 at the end of the notebook. What are the benefits of this change? My model passes the assert test in the new notebook.
Is there any reason to not run the homework notebooks locally on my own GPU vs. staying in Google Colab? The reason being, it would be more convenient to set the job to run locally overnight, without having to be present to prevent the runtime disconnecting.
Edit: I trained a model using 1 epoch after making the corrections you mentioned, but the same error message still persists:
Your model could not be loaded. Make sure it is a valid h5 file.
Send me the updated corrections you made via personal DM.
That comment meant you should define steps per epoch the same way you defined validation steps. It does not mean defining validation steps and steps per epoch together; as you will notice, training uses steps per epoch and not batch size.
Defining the MobileNet step incorrectly can have its own effect on the h5 model.
Yes, the assignment was updated due to the version issue and a runtime error.
May I know what the IoU score is this time? Also, I hope your h5 model is named bird.h5, as mentioned in the assignment.
The IoU score, defined as the fraction of examples for which the IoU is > 0.5, is ~0 because I only trained for 1 epoch. This was meant to be a quick test to see if the .h5 model is now correct for the submission grader. The name is birds.h5
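For readers unfamiliar with the metric, the definition above can be sketched as follows. This is only an illustration: the (xmin, ymin, xmax, ymax) box format, the function names and the example boxes are assumptions, not the notebook's own code.

```python
import numpy as np


def iou(box_a, box_b):
    """Intersection over union of two boxes in (xmin, ymin, xmax, ymax) form (assumed format)."""
    # Corners of the overlapping rectangle, if any.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    intersection = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def iou_score(pred_boxes, true_boxes, threshold=0.5):
    """Fraction of examples whose IoU exceeds the threshold."""
    ious = np.array([iou(p, t) for p, t in zip(pred_boxes, true_boxes)])
    return float(np.mean(ious > threshold))


# Hypothetical boxes: the first prediction overlaps well, the second does not.
true_boxes = [(0, 0, 10, 10), (0, 0, 10, 10)]
pred_boxes = [(0, 0, 10, 8), (5, 5, 15, 15)]
print(iou_score(pred_boxes, true_boxes))  # 0.5
```

With only one epoch of training, most predicted boxes would overlap poorly with the ground truth, which is consistent with a score near 0.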
May I know why you added that extra comma and space to the input shape when creating the MobileNet version 2 object?
input_shape = (224, 224, 3, ) (INCORRECT INPUT SHAPE; IT DOES NOT REQUIRE THAT EXTRA COMMA AND SPACE)
The instructions in the assignment say:
Set the following parameters:
input_shape: (height, width, channel): input images have height and width of 224 by 224, and have red, green and blue channels. (HERE 3 REFERS TO THESE 3 CHANNELS)
Same issue for def define_and_compile_model():
While defining the input layer, you incorrectly specified shape=(224,224,3,) (REMOVE THAT COMMA)
You haven’t made the correction for
Get the steps per epoch (may be a few lines of code), as previously commented.
KINDLY DEFINE THE STEPS PER EPOCH THE WAY YOU DEFINED THE VALIDATION STEPS, or follow the instruction from the assignment:
Alternatively, you can use // for integer division, % to check for a remainder after integer division, and an if statement.
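As a rough illustration of that instruction, the sketch below computes the number of steps with //, % and an if statement. The dataset length and batch size are made-up values, and the helper name is hypothetical; the variable names simply mirror those quoted earlier in the thread.

```python
import math


def steps_for(num_examples, batch_size):
    """Number of batches needed to cover num_examples (hypothetical helper)."""
    # Full batches from integer division.
    steps = num_examples // batch_size
    # One extra, partial batch if anything is left over.
    if num_examples % batch_size > 0:
        steps += 1
    return steps


# Made-up values for illustration only.
length_of_training_dataset = 1000
BATCH_SIZE = 64

steps_per_epoch = steps_for(length_of_training_dataset, BATCH_SIZE)
print(steps_per_epoch)  # 16

# For positive integers this matches rounding the division up.
print(steps_per_epoch == math.ceil(length_of_training_dataset / BATCH_SIZE))  # True
```

The same helper could be reused for the validation steps by passing the validation dataset length instead.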
When mentors go through an assignment and give a detailed response, we hope you read the comments and follow the instructions; I can see you did not change steps per epoch.
At the end of the assignment you will notice the below cell
assert tf version 2.8.0?
assert tf.__version__ == '2.8.0', f'You have TF{tf.__version__}. Please install the grader-compatible Tensorflow and select Runtime > Restart Session'
So make sure you run the above cell; basically, you need to restart the session.
Another major issue: changing the number of epochs to 1 is not allowed, as the instructions clearly state:
Prepare to Train the Model
You’ll fit the model here, but first you’ll set some of the parameters that go into fitting the model.
EPOCHS: You’ll train the model for 50 epochs
The value of 50 epochs is set by the auto-grader, so reducing it to 1 will not let you pass the assignment.
May I know why you added that extra comma and space to the input shape when creating the MobileNet version 2 object?
input_shape = (224, 224, 3, ) (INCORRECT INPUT SHAPE; IT DOES NOT REQUIRE THAT EXTRA COMMA AND SPACE)
Sure. The trailing comma is just a style choice when creating a tuple or list, which doesn’t make any difference to the code itself.
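A quick check of the trailing-comma point in plain Python, as a minimal sketch (the variable names other than IMG_SIZE and IMG_SHAPE are made up for illustration):

```python
# A trailing comma inside a multi-element tuple does not change its value.
with_comma = (224, 224, 3, )
without_comma = (224, 224, 3)
print(with_comma == without_comma)  # True

# Concatenating tuples, as in the lines quoted earlier in the thread,
# gives the same shape as well.
IMG_SIZE = (224, 224,)
IMG_SHAPE = IMG_SIZE + (3,)
print(IMG_SHAPE == without_comma)  # True

# The comma only matters for one-element tuples: (3,) is a tuple, (3) is an int.
print(type((3,)).__name__, type((3)).__name__)  # tuple int
```

So the two spellings of the input shape produce identical tuple objects as far as Python is concerned.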
I ran the notebook locally for 50 epochs, using TF and Keras 2.8.0, and the resulting h5 model was accepted by the grader, so this issue is now resolved for me.