Dear Tutor,
I am continuing to inspect the RetinaNet architecture through the Object Detection API, and I ran the following commands:
1) detection_model._feature_extractor.summary()
Model: "ResNet50V1_FPN"
Layer (type)                                      Output Shape                 Param #
bottom_up_block5_conv (Conv2D)                    multiple                     589824
bottom_up_block5_batchnorm (FreezableBatchNorm)   multiple                     1024
bottom_up_block5 (Lambda)                         multiple                     0
bottom_up_block6_conv (Conv2D)                    multiple                     589824
bottom_up_block6_batchnorm (FreezableBatchNorm)   multiple                     1024
bottom_up_block6 (Lambda)                         multiple                     0
model (Functional)                                [(None, None, None, 256),    23561152
                                                   (None, None, None, 512),
                                                   (None, None, None, 1024),
                                                   (None, None, None, 2048)]
FeatureMaps (KerasFpnTopDownFeatureMaps)          multiple                     2099968
=================================================================
Total params: 26,842,816
Trainable params: 26,787,648
Non-trainable params: 55,168
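For context, here is roughly how I build and initialize detection_model before calling summary(); the pipeline config path is just a placeholder for my local file, and I run one dummy batch first so that all the layers get built:

import tensorflow as tf
from object_detection.utils import config_util
from object_detection.builders import model_builder

# Rough sketch of how I build detection_model (config path is a placeholder).
configs = config_util.get_configs_from_pipeline_file(
    'ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/pipeline.config')
detection_model = model_builder.build(model_config=configs['model'],
                                      is_training=False)

# Running one dummy batch builds all layers so that summary() works.
image = tf.zeros([1, 640, 640, 3], dtype=tf.float32)
preprocessed, true_shapes = detection_model.preprocess(image)
_ = detection_model.predict(preprocessed, true_shapes)

detection_model._feature_extractor.summary()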
2) detection_model._feature_extractor.get_layer('model').summary()
This gives me all the ResNet50 backbone layers, with the following outputs:
detection_model._feature_extractor.get_layer('model').outputs
[<KerasTensor: shape=(None, None, None, 256) dtype=float32 (created by layer 'conv2_block3_out')>,
 <KerasTensor: shape=(None, None, None, 512) dtype=float32 (created by layer 'conv3_block4_out')>,
 <KerasTensor: shape=(None, None, None, 1024) dtype=float32 (created by layer 'conv4_block6_out')>,
 <KerasTensor: shape=(None, None, None, 2048) dtype=float32 (created by layer 'conv5_block3_out')>]
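Since the 'model' layer is itself a Keras Functional model, I can call it on a dummy image to see concrete shapes for these four backbone outputs (the shapes in the comment are what I would expect for a 640x640 input):

import tensorflow as tf

backbone = detection_model._feature_extractor.get_layer('model')
dummy = tf.zeros([1, 640, 640, 3], dtype=tf.float32)
c2, c3, c4, c5 = backbone(dummy)
for name, tensor in zip(['C2', 'C3', 'C4', 'C5'], [c2, c3, c4, c5]):
    print(name, tensor.shape)
# I expect strides 4/8/16/32, i.e. roughly:
# C2 (1, 160, 160, 256), C3 (1, 80, 80, 512),
# C4 (1, 40, 40, 1024), C5 (1, 20, 20, 2048)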
3) detection_model._feature_extractor.get_layer('FeatureMaps').summary()
Model: "FeatureMaps"
Layer (type)                                      Output Shape                 Param #
projection_3 (Conv2D)                             multiple                     524544
projection_2 (Conv2D)                             multiple                     262400
projection_1 (Conv2D)                             multiple                     131328
nearest_neighbor_upsampling (Lambda)              multiple                     0
nearest_neighbor_upsampling (Lambda)              multiple                     0
smoothing_2_conv (Conv2D)                         multiple                     589824
smoothing_2_batchnorm (FreezableBatchNorm)        multiple                     1024
smoothing_2 (Lambda)                              multiple                     0
smoothing_1_conv (Conv2D)                         multiple                     589824
smoothing_1_batchnorm (FreezableBatchNorm)        multiple                     1024
smoothing_1 (Lambda)                              multiple                     0
=================================================================
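As a sanity check on this block, I tried to reproduce the parameter counts by hand, assuming the usual FPN layout: 1x1 projection convolutions (with bias) that bring C3/C4/C5 down to 256 channels, and 3x3 smoothing convolutions whose bias is replaced by the FreezableBatchNorm that follows them:

depth = 256
# 1x1 projections with bias: in_channels * depth + depth
for name, in_ch in [('projection_1', 512), ('projection_2', 1024), ('projection_3', 2048)]:
    print(name, 1 * 1 * in_ch * depth + depth)   # 131328, 262400, 524544
# 3x3 smoothing convs without bias (bias handled by the batch norm)
print('smoothing_conv', 3 * 3 * depth * depth)   # 589824
print('batchnorm', 4 * depth)                    # 1024 (gamma, beta, moving mean/var)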
--------------------------- Let's concentrate on this block ---------------------------
My questions:
- I need to determine the output feature maps from here, but I can't find how.
Inspecting this layer's attributes gives:
print(detection_model._feature_extractor.get_layer('FeatureMaps').top_layers)        # 1 conv layer
print(detection_model._feature_extractor.get_layer('FeatureMaps').reshape_blocks[0]) # empty
print(detection_model._feature_extractor.get_layer('FeatureMaps').residual_blocks)   # 2 conv layers
print(detection_model._feature_extractor.get_layer('FeatureMaps').top_down_blocks)   # 2 Lambda layers
- How could I determine the outputs of these layers? What do the residual blocks represent? Are they all fed to the box predictor? (A sketch of what I am trying is just below.)
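To get at the actual output feature maps, I am trying to call the feature extractor directly on a preprocessed dummy image, assuming that, since it is a tf.keras.Model, its call() simply runs the feature extraction and returns the list of maps:

import tensorflow as tf

image = tf.zeros([1, 640, 640, 3], dtype=tf.float32)
preprocessed, _ = detection_model.preprocess(image)
feature_maps = detection_model._feature_extractor(preprocessed)
for i, fmap in enumerate(feature_maps):
    print('feature map', i, fmap.shape)
# I expect 5 maps with 256 channels each, matching the box predictor input
# shapes I list below (80x80 down to 5x5 for a 640x640 input).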
Now, when inspecting the input to the box predictor through the following command:
detection_model._box_predictor._build_input_shape
This gives 5 input tensors:
[TensorShape([1, 80, 80, 256]),
TensorShape([1, 40, 40, 256]),
TensorShape([1, 20, 20, 256]),
TensorShape([1, 10, 10, 256]),
TensorShape([1, 5, 5, 256])]
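To check whether all five maps really reach the box predictor, I also looked at the prediction dictionary returned by a full predict() call on a dummy image; my understanding (to be confirmed) is that the SSD meta architecture exposes them there under 'feature_maps':

import tensorflow as tf

image = tf.zeros([1, 640, 640, 3], dtype=tf.float32)
preprocessed, true_shapes = detection_model.preprocess(image)
prediction_dict = detection_model.predict(preprocessed, true_shapes)

for fmap in prediction_dict['feature_maps']:
    print('feature map fed forward:', fmap.shape)
print('box_encodings:', prediction_dict['box_encodings'].shape)
print('class_predictions:', prediction_dict['class_predictions_with_background'].shape)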
- The original paper only talks about 3, doesn't it?
Sorry for the long message, but I need to visualize the output of the feature extractor and then determine how the box predictor uses it. Here is roughly how I am trying to visualize it, reusing prediction_dict from the predict() call above and averaging each map over its channels with matplotlib:
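import matplotlib.pyplot as plt
import tensorflow as tf

feature_maps = prediction_dict['feature_maps']  # from the predict() call above
fig, axes = plt.subplots(1, len(feature_maps), figsize=(15, 3))
for ax, fmap in zip(axes, feature_maps):
    ax.imshow(tf.reduce_mean(fmap[0], axis=-1))  # average over the 256 channels
    ax.set_title('x'.join(str(d) for d in fmap.shape[1:3]))
    ax.axis('off')
plt.show()
# (With a real test image instead of the zero dummy, these panels should show
# where each pyramid level responds.)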
I will be grateful if you can help, thanks a lot.
Best regards,
Olfa