Model cannot learn!

I believe my model.compile() settings are not correct.


For this specific problem, I think the accuracy should be very high (i.e. >99%) because the segmented region is small.

However, my val_accuracy is not increasing, which I think means the model is not learning.
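
One caveat when the target region is small: a model that predicts only background can already score a very high pixel accuracy, so accuracy alone may not show whether anything was learned. A minimal sketch with a made-up 100x100 mask (the sizes and the "all background" prediction are assumptions for illustration, not taken from the assignment):

```python
# Toy ground-truth mask: a 10x10 foreground patch inside a 100x100 image,
# so only 1% of the pixels are foreground. Sizes are illustrative only.
H, W = 100, 100
y_true = [[1 if (45 <= r < 55 and 45 <= c < 55) else 0 for c in range(W)]
          for r in range(H)]

# A "model" that has learned nothing and predicts background everywhere.
y_pred = [[0] * W for _ in range(H)]


def pixel_accuracy(truth, pred):
    """Fraction of pixels whose predicted label matches the true label."""
    correct = sum(t == p
                  for t_row, p_row in zip(truth, pred)
                  for t, p in zip(t_row, p_row))
    return correct / (len(truth) * len(truth[0]))


def foreground_iou(truth, pred):
    """Intersection over union for the foreground class (label 1)."""
    pairs = [(t, p)
             for t_row, p_row in zip(truth, pred)
             for t, p in zip(t_row, p_row)]
    inter = sum(1 for t, p in pairs if t == 1 and p == 1)
    union = sum(1 for t, p in pairs if t == 1 or p == 1)
    return inter / union if union else 1.0


acc = pixel_accuracy(y_true, y_pred)
iou = foreground_iou(y_true, y_pred)
print(f"pixel accuracy = {acc:.2%}, foreground IoU = {iou:.2f}")
# prints: pixel accuracy = 99.00%, foreground IoU = 0.00
```

In this toy case the accuracy is 99% even though no foreground pixel is found, which is why a metric such as IoU is a more telling signal for segmentation.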

I think the cause is the model's compile configuration. Can anyone give me a hint on what it should be? (I have already tried combinations of SGD and Adam with mse, categorical_crossentropy, and some other settings.)


This is the decoder, I tried sigmoid and softmax, both did not work!


This is my conv_block, I got 5 of them which have filters = 32,64,128,256,256, respectively.

I tried RMSprop and it was a success! I got an IoU over 85%, which is amazing. How do I find the correct config? Is there an appropriate way other than trial and error?

Hi there,

Assuming everything else is fine (that is, you followed the assignment instructions properly), the choice of optimizer and loss when compiling the model is intentionally left for you to decide. You need to tweak the parameters, but there should be guidance in the videos and labs as well. Think about the outputs the model will classify and experiment with the optimizer.

Why is sigmoid recommended in this assignment and not softmax? There are 11 classes, right? Am I misunderstanding something?
Thank you very much!

Hi @juancopi81,

Suppose you have a chest X-ray that has multiple diseases. You would want to predict all the diseases that are present. So, even though you have multiple classes, you would use sigmoid so that you can tell the prediction of each class.
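
A small sketch of that difference, using made-up logits for three hypothetical diseases (the numbers are assumptions for illustration only). Sigmoid scores each class independently, so several classes can be "present" at once; softmax forces the scores to compete and sum to 1:

```python
import math


def sigmoid(z):
    """Independent probability for a single class."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    """Probabilities over mutually exclusive classes (they sum to 1)."""
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw model outputs (logits) for three diseases.
logits = [2.0, 1.5, -3.0]

sig = [sigmoid(z) for z in logits]
soft = softmax(logits)

# With sigmoid, each class gets its own yes/no decision at a 0.5 threshold.
present = [p > 0.5 for p in sig]

print("sigmoid:", [round(p, 3) for p in sig], "present:", present)
print("softmax:", [round(p, 3) for p in soft], "sum:", round(sum(soft), 3))
```

Here the first two classes both pass the 0.5 threshold under sigmoid, while softmax splits a single unit of probability between them.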

Hope this answers your question.

Hi @thearkamitra,

Thank you very much for your answer. Yes, it makes sense and answers my question. Still, I was getting bad results with sigmoid, so I changed to softmax with categorical_crossentropy (and Adam as the optimizer), and everything was much better.

Hi @juancopi81,

Yes, that makes sense. categorical_crossentropy is the loss that pairs with softmax, while binary_crossentropy is the loss that pairs with sigmoid.
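
To make the pairing concrete, here is a rough sketch of what each loss computes for a single pixel. The probabilities and targets are made up for illustration, and the functions are simplified stand-ins written by hand, not the Keras implementations:

```python
import math


def categorical_crossentropy(one_hot, probs):
    """Loss for mutually exclusive classes; expects softmax outputs."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


def binary_crossentropy(targets, probs):
    """Mean per-class yes/no loss; expects sigmoid outputs."""
    terms = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
             for t, p in zip(targets, probs)]
    return sum(terms) / len(terms)


# Softmax-style output: exactly one class is correct, probabilities sum to 1.
softmax_probs = [0.7, 0.2, 0.1]
one_hot = [1, 0, 0]
cce = categorical_crossentropy(one_hot, softmax_probs)

# Sigmoid-style output: each class is an independent yes/no decision.
sigmoid_probs = [0.9, 0.8, 0.1]
multi_hot = [1, 1, 0]
bce = binary_crossentropy(multi_hot, sigmoid_probs)

print(f"categorical cross-entropy = {cce:.4f}")
print(f"binary cross-entropy      = {bce:.4f}")
```

Categorical cross-entropy only looks at the probability given to the one true class, so it assumes the outputs form a distribution. Binary cross-entropy scores every class separately, which matches independent sigmoid outputs.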

Consider using a smaller kernel_size and activation='relu' in Conv2D. FYI, on my side it works better without cropping.
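
For reference, a minimal sketch of how these suggestions could fit together in Keras. The input shape, the kernel_size of 3, the filter count, and the tiny two-layer model are all assumptions for illustration; this is not the assignment's architecture. Only the 11 classes and the softmax with categorical_crossentropy and Adam combination come from this thread:

```python
import tensorflow as tf

NUM_CLASSES = 11            # class count mentioned in this thread
INPUT_SHAPE = (64, 64, 1)   # assumed image size, for illustration only

inputs = tf.keras.Input(shape=INPUT_SHAPE)

# Small 3x3 kernel with ReLU; padding="same" keeps the spatial size,
# so no cropping is needed to match the label mask.
x = tf.keras.layers.Conv2D(
    filters=32, kernel_size=3, padding="same", activation="relu"
)(inputs)

# Per-pixel class scores: softmax over the channel axis, one channel per class.
outputs = tf.keras.layers.Conv2D(
    filters=NUM_CLASSES, kernel_size=1, activation="softmax"
)(x)

model = tf.keras.Model(inputs, outputs)

# Softmax output paired with categorical_crossentropy (one-hot label masks).
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

print(model.output_shape)
```

If the labels were integer class maps rather than one-hot masks, sparse_categorical_crossentropy would be the matching loss instead.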