Hi @OF4,
First of all, it looks like you have added more code than was required. Remember that you can still fail grading if you add anything beyond what was instructed. In the train model section, I noticed some extra code cells that are not present anywhere in the assignment notebook I worked on. If you have added anything outside the code markers ###Your code here to end here###, then kindly get a fresh copy of the assignment and redo it, writing code only within the markers.
Now, coming to your code:
In the load dataset section, you aren't following the instructions properly. Read them again: you need to use 80% of the data for training, and the remaining 20% is divided between the validation and test sets, meaning 10% goes to validation and the last 10% goes to test data.
For example, if I were told to assign 80% of a cats vs mouses dataset to training, the code line would be
train_data = tfds.load('cats_vs_mouses', split='train[:80%]', as_supervised=True)
For the validation data the split would then be 'train[80%:90%]', and lastly the split for the test data would be 'train[-10%:]'.
Remove all the other code you added in that cell.
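To see how those three slices fit together, here is a minimal sketch in plain Python of which example indices each percent slice would cover. The helper percent_slice and the dataset size of 1000 are made up for illustration, and the real TFDS split API has its own rounding rules, so treat this as a picture of the idea rather than the library's exact behaviour.

```python
def percent_slice(num_examples, start_pct, end_pct):
    """Return (start, stop) indices for a slice given in percent.

    Hypothetical helper, not part of TFDS. Negative percentages count
    from the end, like Python slicing.
    """
    def to_index(pct):
        if pct < 0:
            pct += 100
        return num_examples * pct // 100

    return to_index(start_pct), to_index(end_pct)


num_examples = 1000  # made-up dataset size

train_range = percent_slice(num_examples, 0, 80)     # like 'train[:80%]'
val_range = percent_slice(num_examples, 80, 90)      # like 'train[80%:90%]'
test_range = percent_slice(num_examples, -10, 100)   # like 'train[-10%:]'

print(train_range, val_range, test_range)
```

The three ranges do not overlap and together cover every example, which is what an 80/10/10 split should give you.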
-
For the build the classifier section, kindly remove the fourth convolution block from your model. I also noticed that you pass the kernel size before the input shape. That is fine if you use keyword arguments such as kernel_size=(3,3); if you don't, the arguments are matched by position, which can cause a positional mismatch, because you put the input shape after the kernel in the first convolution layer.
-
The next issue is, of course, in do_salience:
3(i) When resizing the image and normalising it to [0, 1], you don't need to specify an interpolation method, please remove it.
3(ii) Next, when defining the expected output using tf.one_hot, remember that we are working in batches, so passing just the label will produce an incorrect encoding. Instead, you need to repeat the label once per image using the first dimension of the batch, i.e. images.shape[0], so that the indexing matches the label encoding.
See an example here.
If label = 2, num_classes = 4, and the batch size images.shape[0] is 3:
- List creation:
[2] * 3 => [2, 2, 2]
- tf.one_hot:
[2, 2, 2] => [[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]
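The same example can be checked in TensorFlow. This is only a small sketch with the made-up numbers from above; batch_size stands in for images.shape[0], and the variable names are my own, not the ones the notebook asks for.

```python
import tensorflow as tf

label = 2        # class index shared by every image in the batch
num_classes = 4
batch_size = 3   # stands in for images.shape[0]

labels = [label] * batch_size                      # [2, 2, 2]
expected_output = tf.one_hot(labels, num_classes)  # one row per image

print(expected_output.numpy())
```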
3(iii) Now, when creating the gradient tape, you were supposed to cast the image, not convert it to a tensor, so you used the wrong TF function there; please use tf.cast. Also, when getting the model's predictions you passed training=False. I could not find anything in the exercise instructions asking you to do so, so training=False was not required.
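To show the general pattern of casting and then taking gradients with respect to the input, here is a rough sketch on toy data. The tiny model, the fake image batch and the choice of loss function are assumptions made purely for illustration; they are not the assignment's model or its required code.

```python
import tensorflow as tf

# Toy stand-ins (assumptions for illustration only)
images = tf.ones((3, 8, 8, 3), dtype=tf.uint8)  # fake batch of 3 tiny images
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation="softmax"),
])
expected_output = tf.one_hot([2] * images.shape[0], 4)

# tf.cast changes the dtype of an existing tensor; gradients need floats
inputs = tf.cast(images, tf.float32)

with tf.GradientTape() as tape:
    # inputs is a plain tensor, not a tf.Variable, so watch it explicitly
    tape.watch(inputs)
    predictions = model(inputs)
    loss = tf.keras.losses.categorical_crossentropy(expected_output, predictions)

# Gradient of the loss with respect to every input pixel
gradients = tape.gradient(loss, inputs)
print(gradients.shape)
```

The gradients come back with the same shape as the input batch, one value per pixel and color channel.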
3(iv) Next, to generate grayscale_tensor you used tf.reduce_max, whereas you were supposed to use tf.reduce_sum.
The difference between the two is:
tf.reduce_sum adds all the elements together along the given dimension. For example, for a tensor of [1,2,3] the result will be 1+2+3=6, whereas
tf.reduce_max chooses the largest value across the given dimension. For example, for a tensor of [1,2,3] the result will be 3, as it is the largest of the 3 values.
The instruction that must have confused you here is the one about taking the maximum absolute gradient across the color channels, but the absolute value is already handled by tf.abs, which you have used. What happens with tf.reduce_max is that, instead of adding up the absolute gradient values of all the channels, it keeps only the largest value for the grayscale tensor, causing an incorrect calculation.
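A quick way to see the difference for yourself is the short sketch below. The [1,2,3] tensor is the example from above; the single-pixel "gradient" values are made up just to show what happens along the color channel axis.

```python
import tensorflow as tf

values = tf.constant([1, 2, 3])
total = tf.reduce_sum(values)    # adds the elements: 1 + 2 + 3
largest = tf.reduce_max(values)  # keeps only the largest element
print(total.numpy(), largest.numpy())

# Made-up gradient for one pixel with 3 color channels, shape (1, 1, 1, 3)
grads = tf.constant([[[[-0.5, 0.25, 1.0]]]])
summed = tf.reduce_sum(tf.abs(grads), axis=-1)  # 0.5 + 0.25 + 1.0
maxed = tf.reduce_max(tf.abs(grads), axis=-1)   # only the largest channel
print(summed.numpy(), maxed.numpy())
```

Both calls collapse the last axis, so each result has shape (1, 1, 1), but the values differ: the sum keeps the contribution of every channel, while the max discards all but one.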
3(v) To normalise the pixel values to [0, 255], you are hard-coding min_val and max_val in separate steps. I am not saying that is incorrect, but avoid such multi-step code when the instruction wants you to implement it in a single line (this is just a suggestion, not a bug in the code). However, do not add that 1e-10 while normalising the values; that is incorrect and you need to remove it.
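For reference, min-max scaling to [0, 255] can be written as a single expression. The sketch below uses a made-up 2x2 tensor and my own variable names, so take it as one possible way of writing it rather than the exact line the notebook expects.

```python
import tensorflow as tf

grayscale_tensor = tf.constant([[1.0, 2.0], [3.0, 5.0]])  # made-up values

# Shift so the minimum becomes 0, scale so the maximum becomes 255,
# then cast to uint8, all in one expression
normalized_tensor = tf.cast(
    255 * (grayscale_tensor - tf.reduce_min(grayscale_tensor))
    / (tf.reduce_max(grayscale_tensor) - tf.reduce_min(grayscale_tensor)),
    tf.uint8,
)

print(normalized_tensor.numpy())
```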
3(vi) Next, your optional superimpose code is totally incorrect. Use .numpy() to convert the tensor back to the original image, and the cv2 library function to use is cv2.applyColorMap; next, apply 255 to this previous code, then superimpose the gradient color map on the image using cv2.addWeighted (8)
4. For the generate saliency map code for the untrained model and for 18 epochs, I would advise you to go to the end of the assignment, where the optional saliency for 95 epochs is mentioned, and just write the code the way it is written there. No extra code is required; the 'salient' name needs to be passed in the same code line instead of being recalled separately afterwards.
5. Lastly, in train your model, you just needed to load your model weights and then call model.fit on train_batches for 3 epochs, but from here on your file is filled with too much unnecessary code. I would sincerely suggest that you redo the assignment after reviewing the suggestions I have made here.
Let me know if you are still having issues. I hope this resolves your problem.
Regards,
Dr. Deepti