How is non-max suppression able to keep both trees when their bounding boxes have an IOU greater than 0.5? Is it because the two trees have different aspect ratios and hence get different-sized anchor boxes?
How does the network learn which anchor box to assign to which object?
How can I mathematically represent the IOU of these trees and put a number on it using the IOU formula?
Do we only have to specify the output vector y during training, or do we need to draw bounding boxes on the input images as well?
How do I choose between R-CNN and YOLO? Say I have a problem of detecting fish underwater.
A general question: in the convolutional implementation of sliding windows, Prof. Ng said that an FC layer is the same as a conv layer with a 1×1 output, formed by convolving with filters of the same size as the previous layer, since each output channel is an arbitrary linear function of the previous layer's activations. I'm unable to see how this equates to an FC layer. I thought an FC layer is formed by flattening the last conv layer, with each neuron in the conv layer connected to the vertical stack of neurons in the FC layer.
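To make this last question concrete, here is a minimal NumPy sketch of the equivalence as I understand it. The shapes (a 5×5×16 activation volume and 8 output units) and the helper name `conv_valid` are my own assumptions for illustration, and the bias term is left out. It compares "flatten, then multiply by an FC weight matrix" with "convolve with 8 filters that are each as large as the whole volume".

```python
import numpy as np

def conv_valid(a, filters):
    """'Valid' convolution (cross-correlation, as in most deep learning
    frameworks) of an (H, W, C) volume with (fh, fw, C, K) filters."""
    H, W, C = a.shape
    fh, fw, _, K = filters.shape
    out = np.zeros((H - fh + 1, W - fw + 1, K))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = a[i:i + fh, j:j + fw, :]
            # Each of the K output channels is a weighted sum over the patch.
            out[i, j, :] = np.tensordot(patch, filters, axes=3)
    return out

rng = np.random.default_rng(0)
a = rng.standard_normal((5, 5, 16))           # previous layer's activations
filters = rng.standard_normal((5, 5, 16, 8))  # 8 filters, each 5x5x16

# Conv view: the filter covers the whole volume, so the output is 1x1x8.
conv_out = conv_valid(a, filters)

# FC view: flatten the 400 activations and use the same weights as a 400x8 matrix.
fc_weights = filters.reshape(-1, 8)
fc_out = a.reshape(-1) @ fc_weights

print(conv_out.shape)                               # (1, 1, 8)
print(np.allclose(conv_out.reshape(-1), fc_out))    # True

# On a larger input the same filters slide, giving one FC output per window.
bigger = rng.standard_normal((6, 6, 16))
print(conv_valid(bigger, filters).shape)            # (2, 2, 8)
```

If this is right, the two views use exactly the same 400×8 weights, and the only difference is that the conv form can also be applied to a larger input, where it produces one FC-style output per window position.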