Hi, is it normal to have quite a high training time per epoch for the model?
That depends on your steps-per-epoch code. Make sure you have defined it correctly for both the training and validation sets. Check the ungraded lab for hints on making the correction.
If not able to find, let me know.
Regards
DP
The steps per epoch were already defined in the Colab, and from the answer hints it seems to be the correct number of steps (125).
Best Regards,
Anoop
Not for the cell you are sharing. Go back to the previous cells where you need to define the steps per epoch for both datasets according to the batch size, as sketched below.
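For illustration only, here is a minimal sketch of that idea; the counts and names below are placeholders, not the notebook's actual variables:

# Placeholder values for illustration; use the dataset sizes and batch size from the notebook.
BATCH_SIZE = 64
train_count = 8000   # number of training examples (placeholder)
valid_count = 2000   # number of validation examples (placeholder)

# Each epoch should step through the whole split once.
steps_per_epoch = train_count // BATCH_SIZE      # 8000 // 64 = 125
validation_steps = valid_count // BATCH_SIZE     # 2000 // 64 = 31 (remainder dropped)

print(steps_per_epoch, validation_steps)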
Hello, I am having the same problem, and in the last few days there have been several posts about it. Maybe @Deepti_Prasad you should review the notebook.
I passed all the tests and increased the learning rate, but I still have the same problem.
Greetings
OK, the problem is the batch size, which isn't part of the exercises. You should clarify that in the notebook.
If one reads all the instructions properly, this part can be understood. Good that you caught it and debugged it yourself.
@Deepti_Prasad Hello, I passed the assignment and downloaded the h5 file,
but when I try to submit the assignment I get this error:
Could you help me? Please…
Greetings!
You might have passed the tests, but the assignment expects a certain score, which according to the grade you have not reached. So re-tweak your assignment by going through each cell where changes can be made.
Yes, as Deepti says, go back to your assignment and check all the cells; it might be the case that your IoU score is right but some other cell's code is not as expected. Also make sure you have the latest assignment with the default naming of the file!
Hello, thank you very much for your answer:
There is a dependency problem when I install packages:
I got the conv block correct.
I got the downsampling path correct:
I got the expected output of the FCN encoder correct:
The model behaves as expected, with the loss decreasing on both the training and validation datasets.
I got an IoU score of at least 0.6.
I've downloaded the latest homework Colab.
Could you please check whether there is a problem with the assignment?
Greetings
Hello @memoros77,
Click on either my name or @gent.spah's and send your downloaded notebook by message. Also kindly rename the assignment with your name, as a memoros copy, while sending. Do not post the notebook or code here; it is against community guidelines.
Regards
DP
Hello memoros,
Can you go through the ungraded labs once? I am also sharing a similar post thread where the issues were resolved by some model adjustments; you have a few similar errors, so kindly go through them and make the corrections. If the issue still persists, send the updated notebook.
In your convolution you are using a unit of 3; kindly stick to a unit, pool_size, and stride of 2 for all the conv_blocks, as in the rough sketch below.
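For illustration, a generic conv block sketch with pool_size and strides kept at 2; this is only a rough example assuming channels_last data format, not the graded solution:

import tensorflow as tf

def conv_block(inputs, filters):
    # Rough sketch only: two 3x3 convolutions, then 2x2 max pooling with stride 2.
    x = tf.keras.layers.Conv2D(filters, kernel_size=3, padding='same', activation='relu')(inputs)
    x = tf.keras.layers.Conv2D(filters, kernel_size=3, padding='same', activation='relu')(x)
    # Keep pool_size and strides constant at 2, as the instructions recommend.
    x = tf.keras.layers.MaxPooling2D(pool_size=2, strides=2)(x)
    return x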
Regards
DP
Hi, the problem seems to arise from the GPU session not being activated in Colab. It seems that the Colab session does not activate the GPU when running with TF 2.8. The same code runs with TF 2.15, and the time per epoch is as expected. But during assignment submission it runs into an issue, as the autograder is not compatible with TF 2.15.
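One quick way to confirm whether the Colab runtime actually has a GPU attached (assuming a standard TensorFlow install) is:

import tensorflow as tf

# An empty list here means TensorFlow is falling back to the CPU,
# which would explain the unusually long time per epoch.
print(tf.__version__)
print(tf.config.list_physical_devices('GPU'))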
Anoop,
Can you show me a screenshot of your submission grader output?
Your model could have an issue, as I can see your loss is not going towards 0.
Kindly have a look at your model algorithm once. Check in your model compile what loss and optimizer you used, and whether you have followed the instructions mentioned in the assignment and in my comment to memoros in this thread; that link also provides the necessary guidance.
If you are still stuck, let me know.
Regards
DP
@anoopebey you are right, I have the same problem: Colab doesn't activate the GPU when running TF 2.8, which is the required version; it only works with TF 2.15.
That could be the problem with the autograder.
In response to the post thread:
“While copy and pasting some of the codes, you have edited the grader cells which can also cause grader failure due to metadata editing.”
Yes, I have copied and pasted code, but from my local environment, because it is more efficient and cheaper to work there; we have to pay double, Coursera and Google Colab, to pass this assignment.
“In your encoder cell, Zero padding’s layer is not followed as per the instructions given.”
The Colab already gives us the correct padding; we don't have to do anything:
x = tf.keras.layers.ZeroPadding2D(((0, 0), (0, 96-input_width)))(img_input)
“In the decoder cell, the data_format is incorrect.” It is OK.
“Also while I was checking your codes your choice of activation in the decoder grader cell is incorrect.” It is OK.
I have taken a lot of courses from DeepLearning.AI, and it's the first time that I pass all the tests and get the right answers, yet the grader hasn't given a correct evaluation nor useful feedback on the error.
The loss is decreasing, the accuracy is increasing, and I got the correct IoU; this is frustrating.
I'm not a full-time student; I work 10 hours a day and try to find time to finish the courses. I don't have time to guess which of all the parameters match yours, much less when the autograder doesn't tell us where the error is.
Hi @memoros77, the workaround that worked for me was to run the code on a local machine with TF 2.10 and save the model. The model generated this way was accepted by the autograder. No change in code was required.
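For reference, a minimal sketch of saving a trained Keras model in HDF5 format; the tiny model and file name here are placeholders, not the assignment's:

import tensorflow as tf

# Placeholder model so the snippet runs on its own; in practice this is the trained FCN-8.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

# A .h5 extension saves in the legacy HDF5 format, which is the file that gets submitted.
model.save('model.h5')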
Hello @memoros77,
Corrections required:

- In def conv_block: your data_format should use the recall function of image_ordering, not image_ordering itself.
- In def FCN8, in your block code, follow the instructions given: "We recommend keeping the pool size and stride parameters constant at 2." Please also use a unit of 2, not 3, for all the conv_blocks.
- In the same def FCN8, for "upsample o above and crop any extra pixels introduced", kindly remove use_bias=False, data_format=IMAGE_ORDERING from the code.
- For "load the pool 4 prediction and do a 1x1 convolution to reshape it to the same shape of o above", remove data_format.
- Remove use_bias=False, data_format=IMAGE_ORDERING from the code line "upsample the resulting tensor of the operation you just did".
- In the code line "load the pool 3 prediction and do a 1x1 convolution to reshape it to the same shape of o above", again use the recall function of image_ordering for data_format, not image_ordering itself.
- Remove use_bias=False, data_format=IMAGE_ORDERING from the code line "upsample up to the size of the original image".
- For the model compile statement, use the Adam optimizer instead of SGD with momentum=0.9, nesterov=True (a compile sketch follows this list).
- The number of epochs you used is too high; try the same number of epochs as shown in the expected output, which is 70.
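A minimal compile sketch under that suggestion; the placeholder model, loss, and metrics below are illustrative only, so keep whatever the notebook specifies:

import tensorflow as tf

# Placeholder model so the snippet runs on its own; in the notebook this is the FCN-8 model.
model = tf.keras.Sequential([tf.keras.layers.Dense(12, activation='softmax', input_shape=(8,))])

# Swap SGD(momentum=0.9, nesterov=True) for Adam, as suggested above.
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])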
Make these corrections and let me know once your issue is resolved.
Sorry for the delay in responding; I had other notebooks to review.
Regards
DP
Deepti, thank you very much for your feedback. I will make the changes and let you know if it works.