Assignment C3_W3

I have a couple of questions regarding the W3 assignment and the ungraded labs.

  1. I noticed that sigmoid is used in the activation layer of the last assignment to classify the output. However, in C3_W3_Lab_2_OxfordPets-UNet softmax is used. Why is that? Why are we not using softmax in the C3_W3_Assignment?
  2. To upscale in the decoder we used Conv2DTranspose(n, kernel_size=(4,4), strides=(2,2), use_bias=False). If the input size is (32,42,filter), the layer converts it to (66,86,n), and we then crop to get a 64x84 image. Why are we not using:
    Conv2DTranspose(n_classes, kernel_size=(2,2), strides=(2,2), use_bias=False) to get 64x84 without cropping? Also, what does use_bias do?
  3. If, instead of object detection, I want to do multi-label classification, what should the Y_train shape and the output activation function be? Would you please give me some references to study?
    I was hoping this course would cover multi-label classification, but it only covers object detection and masking.

Hi @Pouria_Baghaei,
“OxfordPets-UNet Softmax” most likely refers to the softmax activation used within a UNet architecture built for semantic segmentation on the Oxford Pets dataset. Softmax itself is the mathematical function that converts raw scores into probabilities that sum to 1, and it is commonly used in classification tasks.
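To make the difference concrete, here is a minimal pure-Python sketch of the two activations. The logit values are made up for illustration and are not taken from the lab or the assignment. Softmax normalizes a whole vector of scores so that exactly one class "wins" per pixel, while sigmoid squashes each score on its own; a single sigmoid output behaves like a two-class softmax in which the second logit is fixed at 0.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logit):
    """Squash one raw score into (0, 1), independently of any other score."""
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical per-pixel logits for a 3-class segmentation head.
pixel_logits = [2.0, 0.5, -1.0]
class_probs = softmax(pixel_logits)
predicted_class = class_probs.index(max(class_probs))

# Hypothetical single logit for a binary (yes/no) output.
binary_logit = 0.8
binary_prob = sigmoid(binary_logit)

# A sigmoid equals the first output of a two-class softmax over [z, 0].
two_class_prob = softmax([binary_logit, 0.0])[0]
```

As a rule of thumb, softmax suits outputs where each pixel or sample belongs to exactly one of several classes, and sigmoid suits binary or independent yes/no outputs.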

Try looking up the formula for the output size when upscaling or downscaling; it will clear up your doubt.

Output size = (input_size - 1) * stride + kernel_size (for a transposed convolution with no padding).
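Plugging the numbers from the question into this formula, read as the output size of a transposed convolution along one axis, reproduces both shapes. This is a small sketch that assumes no padding (Keras' `padding='valid'`) and a kernel at least as large as the stride; the helper name is hypothetical.

```python
def conv_transpose_output_size(input_size, kernel_size, stride):
    """Output length along one axis of a transposed convolution with no padding.

    Assumes kernel_size >= stride.
    """
    return (input_size - 1) * stride + kernel_size

# The 32x42 feature map from the question, upscaled with stride 2.
for kernel in (4, 2):
    height = conv_transpose_output_size(32, kernel, stride=2)
    width = conv_transpose_output_size(42, kernel, stride=2)
    print(f"kernel_size={kernel}: {height}x{width}")
# kernel_size=4: 66x86  (needs cropping to reach 64x84)
# kernel_size=2: 64x84  (no cropping needed)
```

So a 2x2 kernel with stride 2 does land on 64x84 directly. The formula only describes sizes: with a 4x4 kernel and stride 2, neighbouring kernel placements overlap in the output, whereas with a 2x2 kernel they do not, and the formula does not say which choice trains better.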

Hope this helps. The use_bias parameter is a layer argument that determines whether a bias term is added to the convolutional layer’s output. It is not specific to Conv2DTranspose, which performs transposed convolutions; other Keras layers such as Conv2D and Dense accept it as well. When use_bias is False, the output depends solely on the convolutional weights and the input.
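The effect of the bias term can be seen in a toy example. The following is a naive single-channel 1D transposed convolution written in plain Python, purely for illustration; it is not how Keras implements the layer, and the function name, weights, and bias value are all made up.

```python
def conv_transpose_1d(inputs, kernel, stride, bias=None):
    """Naive 1D transposed convolution (no padding) for a single channel.

    bias=None mimics use_bias=False; a float mimics a learned bias term.
    """
    out_len = (len(inputs) - 1) * stride + len(kernel)
    output = [0.0] * out_len
    for i, x in enumerate(inputs):
        for k, w in enumerate(kernel):
            # Each input value "stamps" a scaled copy of the kernel into the output.
            output[i * stride + k] += x * w
    if bias is not None:
        output = [y + bias for y in output]
    return output

signal = [1.0, 2.0, 3.0]
weights = [1.0, 0.5, 0.5, 1.0]  # kernel_size = 4

no_bias = conv_transpose_1d(signal, weights, stride=2)
with_bias = conv_transpose_1d(signal, weights, stride=2, bias=0.1)

# Without a bias, an all-zero input always maps to an all-zero output.
zeros_out = conv_transpose_1d([0.0, 0.0], weights, stride=2)
```

Here `with_bias` is just `no_bias` shifted by a constant. One common reason to drop the bias is that the layer is followed by batch normalization, whose own learned shift makes a separate bias redundant; whether that is the motivation in the assignment is an assumption on my part.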

For multi-label classification, y_train should be a 2D array in which each row corresponds to an instance and each column corresponds to a class. Each element is a binary value (1 if the instance has that label, 0 otherwise). If you have n instances and m classes, the shape of y_train would be (n, m).
Multi-Label Image Classification
Sequential Model API
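As a rough sketch of the (n, m) label layout described above, and of the output activation the question asked about: the usual pairing for multi-label problems is one sigmoid output per class, trained with binary cross-entropy, so that each label is decided independently. The class names, samples, logits, and the 0.5 threshold below are hypothetical.

```python
import math

CLASSES = ["cat", "dog", "outdoor"]  # hypothetical label set, m = 3

def to_multi_hot(labels, classes=CLASSES):
    """Encode a set of labels as a row of 0/1 values, one column per class."""
    return [1 if c in labels else 0 for c in classes]

# Four hypothetical samples; a sample may have zero, one, or several labels.
samples = [{"cat"}, {"dog", "outdoor"}, set(), {"cat", "dog", "outdoor"}]
y_train = [to_multi_hot(s) for s in samples]  # shape (n, m) = (4, 3)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_labels(logits, threshold=0.5):
    """Apply a sigmoid to each class logit and keep classes above the threshold."""
    return [c for c, z in zip(CLASSES, logits) if sigmoid(z) >= threshold]

# Hypothetical raw outputs of a model's final layer for one image.
predicted = predict_labels([1.2, -0.3, 2.0])
```

Unlike softmax, the per-class probabilities here do not have to sum to 1, which is what allows several labels to be active for the same sample.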