Checkpoint object detection API

Dear all,

I am trying to understand these lines of code:

It is clear that detection_model._box_predictor is an instance of the class WeightSharedConvolutionalBoxPredictor, but when inspecting the code of that class I can’t find an attribute ._base_tower_layers_for_heads. Could I have more information about where this attribute is defined, whether in a parent class or somewhere else? I need to inspect the model layers of the RetinaNet model. Thanks for your help.

Hello @Olfa,

Is this the correct specialisation and course selected for your topic?

It seems to be from the TensorFlow Advanced Techniques Course 3 Week 2 Zombie Detection assignment.

If so, kindly change it to the one related to your topic so that your issue is addressed promptly by the respective course mentors. Let me know if the topic is related to the one I asked about, as I am one of the mentors for that course.

Also kindly share the attribute error you are getting related to box predictors.



Thank you for your answers.
Yes, it is, and I changed the subject accordingly.

There is no error, but I want to inspect the model layers and attributes.
Furthermore, I noticed that only the two layers defined by the following variables are fine-tuned:

var WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalBoxHead/BoxPredictor/kernel:0
var WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalBoxHead/BoxPredictor/bias:0
var WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead/ClassPredictor/kernel:0
var WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead/ClassPredictor/bias:0

The other layers and variables related to the box_predictor tower are not fine-tuned, whereas the course says that both the classification and the box prediction subnets are retrained. Am I understanding the fine-tuning correctly?
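One way to reproduce this selection can be sketched in plain Python on the variable names alone: keep only the names that start with one of the two head prefixes. The four head names are copied from the list above; the two tower names and the helper `select_by_prefix` are hypothetical placeholders added for contrast, not names confirmed in this thread.

```python
# Prefixes of the two prediction heads, taken from the variable names above.
HEAD_PREFIXES = (
    "WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalBoxHead",
    "WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead",
)


def select_by_prefix(variable_names, prefixes=HEAD_PREFIXES):
    """Return the names that start with any of the given prefixes."""
    return [name for name in variable_names if name.startswith(tuple(prefixes))]


all_names = [
    # Hypothetical tower variable names, for illustration only.
    "WeightSharedConvolutionalBoxPredictor/BoxPredictionTower/conv2d_0/kernel:0",
    "WeightSharedConvolutionalBoxPredictor/ClassPredictionTower/conv2d_0/kernel:0",
    # Head variable names as printed in the post above.
    "WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalBoxHead/BoxPredictor/kernel:0",
    "WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalBoxHead/BoxPredictor/bias:0",
    "WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead/ClassPredictor/kernel:0",
    "WeightSharedConvolutionalBoxPredictor/WeightSharedConvolutionalClassHead/ClassPredictor/bias:0",
]

to_fine_tune = select_by_prefix(all_names)
for name in to_fine_tune:
    print(name)
```

With real model variables, the same test could be applied to each variable's `name` attribute, so only the head kernels and biases end up in the list passed to the optimizer while the tower variables keep their restored weights.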


Hello @Olfa,

It is good to have doubts and discussion.

Only these two layers were selected because of fine-tuning: the model is required to make better predictions based on selected features.

I am sharing a few images from the assignment that mention this part.

This one is at the beginning:

Then where all the layers were defined:

Then, while fine-tuning all 4 variables, the expected output was the two layers you mention.

Another reason, I feel, is that this prediction model is built with only 5 training images and a small batch, so the box-predictor-related biases were included in the fine-tuning.

Yes, one can use both the classification and the prediction subnets for retraining, depending on the dataset, the images used, the model, the batch size, and the desired prediction and object detection.

So it is not mandatory to train all the layers of the model to detect the desired object.

Let me know if you have any further doubts.



OK, thanks a lot, this is clear now.

Another question concerns the attributes of the WeightSharedConvolutionalBoxPredictor class, since detection_model._box_predictor is an instance of WeightSharedConvolutionalBoxPredictor. How did you determine that _base_tower_layers_for_heads is an attribute of this class? It is not defined in that class nor in the parent class. Could you help clarify the structure of this class?

Thanks again
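A general Python point that may help with this kind of search: an attribute assigned as `self.<name> = ...` inside a method such as `__init__` or `build` lives on the instance, so it never appears as a class-level definition, even though it is written in that class's source. The sketch below shows how to tell the two cases apart. The two classes are hypothetical stand-ins, not the Object Detection API classes, and `where_is_attribute` is a helper invented for this illustration.

```python
class BaseBoxPredictor:
    """Hypothetical stand-in for a parent predictor class."""

    def predict(self):
        return "predictions"


class SharedBoxPredictor(BaseBoxPredictor):
    """Hypothetical stand-in for a predictor that sets attributes in __init__."""

    def __init__(self):
        # Assigned on the instance at construction time, so it is stored in
        # the instance __dict__ and not in any class __dict__.
        self._base_tower_layers_for_heads = {"box_encodings": [], "classes": []}


def where_is_attribute(obj, name):
    """Return 'instance', the name of the defining class, or None."""
    if name in vars(obj):
        return "instance"
    for cls in type(obj).__mro__:
        if name in vars(cls):
            return cls.__name__
    return None


predictor = SharedBoxPredictor()
print(where_is_attribute(predictor, "_base_tower_layers_for_heads"))
print(where_is_attribute(predictor, "predict"))
print([cls.__name__ for cls in type(predictor).__mro__])
```

If the first call reports an instance attribute on the real `detection_model._box_predictor`, searching the class source for the text `self._base_tower_layers_for_heads` should show which method assigns it. That is an expectation to verify against the installed source, not something confirmed in this thread.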


Hello @Olfa,

This explanation is going to be a lengthy one, as I am explaining it from the beginning.

Once we restore the weights from the checkpoint, we reuse parts of RetinaNet: the feature extraction layers and the bounding box regression layer. I am sharing images of each step of the assignment for better understanding.

Once the detection model is built, we check the class variables in the detection model, which show how the box predictor is related to WeightSharedConvolutionalBoxPredictor.

Next, _box_predictor is inspected for its class type, which gives the output you mention: object_detection.predictors.convolutional_keras_box_predictor.WeightSharedConvolutionalBoxPredictor

After this inspection, the variables of the box_predictor are checked, which gives the details below.

So when one inspects _box_prediction_head and _prediction_heads in the convolutional box predictor file, they point to the bounding box predictions and to class_predict_heads, which is a list of heads that predict the classes.

Hence _box_predictor._prediction_heads contains a dictionary that points to both prediction layers (the bounding boxes and the class, i.e. the category).

Now, when defining checkpoints for the desired layers, one checkpoint is defined for the box predictor and then one for the model. The model checkpoint points to this box predictor as well as to the prediction layers, so the model leads to the box prediction head identified earlier.

Then one defines box_predictors_checkpoints to be the checkpoint for these two layers of the detection_model box predictor by using tf.train.Checkpoint, which knows that the box prediction head is the prediction layer for bounding boxes. This gives you the same variable as in the previous exercise, when one was restoring the checkpoint and inspecting the box prediction head.

I know it is a lengthy read; read it again to see how the pieces are related. Feel free to ask if it is confusing or if you have more doubts.



Thank You for this clear and thorough explanation :tulip:


Happy to discuss!!!

Keep Learning!!!


Dear Tutor,

I am continuing to inspect the RetinaNet architecture through the Object Detection API, and I ran the following commands:

1) detection_model._feature_extractor.summary()

Model: “ResNet50V1_FPN”

Layer (type)                   Output Shape                 Param #
bottom_up_block5_conv (Conv    multiple                     589824
bottom_up_block5_batchnorm     multiple                     1024
bottom_up_block5 (Lambda)      multiple                     0
bottom_up_block6_conv (Conv    multiple                     589824
bottom_up_block6_batchnorm     multiple                     1024
bottom_up_block6 (Lambda)      multiple                     0
model (Functional)             [(None, None, None, 256)     23561152
                               , (None, None, None, 512
                               (None, None, None, 1024
                               (None, None, None, 2048
FeatureMaps (KerasFpnTopDow    multiple                     2099968

Total params: 26,842,816
Trainable params: 26,787,648
Non-trainable params: 55,168

2) detection_model._feature_extractor.get_layer('model').summary()

Here it gives me all the ResNet50 backbone layers, with the following outputs:
[<KerasTensor: shape=(None, None, None, 256) dtype=float32 (created by layer 'conv2_block3_out')>,
<KerasTensor: shape=(None, None, None, 512) dtype=float32 (created by layer 'conv3_block4_out')>,
<KerasTensor: shape=(None, None, None, 1024) dtype=float32 (created by layer 'conv4_block6_out')>,
<KerasTensor: shape=(None, None, None, 2048) dtype=float32 (created by layer 'conv5_block3_out')>]

3) detection_model._feature_extractor.get_layer('FeatureMaps').summary()



Model: “FeatureMaps”

Layer (type)                   Output Shape    Param #
projection_3 (Conv2D)          multiple        524544
projection_2 (Conv2D)          multiple        262400
projection_1 (Conv2D)          multiple        131328
nearest_neighbor_upsampling    multiple        0
nearest_neighbor_upsampling    multiple        0
smoothing_2_conv (Conv2D)      multiple        589824
smoothing_2_batchnorm (Free    multiple        1024
smoothing_2 (Lambda)           multiple        0
smoothing_1_conv (Conv2D)      multiple        589824
smoothing_1_batchnorm (Free    multiple        1024
smoothing_1 (Lambda)           multiple        0
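A quick arithmetic check on this summary may help relate it to the backbone. It assumes (this is not confirmed in the thread) that the projection layers are 1x1 convolutions with bias onto 256 channels, that the smoothing layers are 3x3 convolutions without bias, and that each batch-norm layer holds 4 values per channel. Under those assumptions the three projections take inputs with 2048, 1024 and 512 channels, which are the channel counts of the last three backbone outputs listed earlier.

```python
def conv2d_params(kernel, in_channels, out_channels, use_bias):
    """Parameter count of a square 2D convolution."""
    weights = kernel * kernel * in_channels * out_channels
    return weights + (out_channels if use_bias else 0)


# Assumed layer shapes; the expected counts are the ones printed by summary().
projection_params = {
    "projection_3": conv2d_params(1, 2048, 256, use_bias=True),
    "projection_2": conv2d_params(1, 1024, 256, use_bias=True),
    "projection_1": conv2d_params(1, 512, 256, use_bias=True),
}
smoothing_conv_params = conv2d_params(3, 256, 256, use_bias=False)
batchnorm_params = 4 * 256  # gamma, beta, moving mean, moving variance

feature_maps_total = (
    sum(projection_params.values())
    + 2 * smoothing_conv_params
    + 2 * batchnorm_params
)

print(projection_params)
print(smoothing_conv_params, batchnorm_params, feature_maps_total)
```

The total reproduces the 2099968 parameters reported for the FeatureMaps layer in the first summary, which suggests that the top-down part of the FPN consumes three backbone outputs.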


---------------------------Let’s concentrate on this block ----------------------------

My questions:

  • I need to determine the output feature maps from here, but I can’t find out how.

Inspecting these layers gives:
1 conv layer: print(detection_model._feature_extractor.get_layer('FeatureMaps').top_layers)
Empty: print(detection_model._feature_extractor.get_layer('FeatureMaps').reshape_blocks[0])
2 conv layers: print(detection_model._feature_extractor.get_layer('FeatureMaps').residual_blocks)
2 Lambda layers: print(detection_model._feature_extractor.get_layer('FeatureMaps').top_down_blocks)

  • How could I determine the output of these layers? What do the residual blocks represent? Are they all fed to the box predictor?

Now, when displaying the input to the box predictor through the following command:


This gives 5 input tensors:

[TensorShape([1, 80, 80, 256]),
TensorShape([1, 40, 40, 256]),
TensorShape([1, 20, 20, 256]),
TensorShape([1, 10, 10, 256]),
TensorShape([1, 5, 5, 256])]

  • Doesn’t the original paper talk about only 3?
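For reference, the five shapes above are what one would get from a feature pyramid with levels 3 to 7 on a 640x640 input, where level l has stride 2^l. The level range and the helper `pyramid_shapes` are assumptions made for this sketch; they are not read from the model configuration. As far as I recall, the RetinaNet paper also describes levels P3 to P7, with P3 to P5 computed from backbone outputs and P6 and P7 added by extra stride-2 convolutions, which would reconcile "3" with the 5 tensors seen here.

```python
import math


def pyramid_shapes(image_size, min_level=3, max_level=7, depth=256, batch=1):
    """Feature map shapes for a square input, assuming stride 2**level."""
    shapes = []
    for level in range(min_level, max_level + 1):
        stride = 2 ** level
        side = math.ceil(image_size / stride)
        shapes.append((batch, side, side, depth))
    return shapes


for level, shape in zip(range(3, 8), pyramid_shapes(640)):
    print(f"level {level}: stride {2 ** level}, shape {shape}")
```

The two coarsest levels would then plausibly correspond to the bottom_up_block5 and bottom_up_block6 layers in the first summary, although that mapping is an inference from the layer names.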

Sorry for being so long, but I need to visualize the output of the feature extractor and then determine how the box_predictor uses it.

I will be grateful if you can help, thanks a lot.

Best regards,

So how could Inspect


This isn’t specific to a course assignment, right?

No, it isn’t in the course assignment.

But it is related to understanding the RetinaNet implementation in the TensorFlow API.

I don’t know if I can discuss this topic here.

Thank you again


Of course you can discuss it here.

Okay, since you say you need to determine the output feature maps, there must have been some criterion given to you.

Your feature extraction would depend on the checkpoints for the model configuration you have created, based on the object you are trying to detect.

For a clearer picture, one needs to understand, when you show the feature extractor maps, on what basis or criterion you want your box predictor to predict.

From your question I understood that you have mapped out all the features and now want to know how one chooses which features are fed to the box predictor, right?

I am also tagging a person who can give his view, @ai_curious. He loves to talk about prediction boxes :slight_smile:

Kevin, can you look at her question above and give your feedback and review too?



Yes, I am using the same configuration: object detection for one class, using the same RetinaNet model (ResNet50 + FPN + 640x640 image) defined in the Zombie-Duck assignments. No modification was made to the model.

It seems that 5 output layers are defined and that they should all interact with the box_predictor (1 conv layer defined by top_layers, 2 conv layers defined in residual_blocks, and 2 Lambda layers defined by top_down_blocks).

I understand that the feature extractor has more than one output that should be fed into the box_predictor (to ensure the multi-resolution aspect). But I noticed that here we have 5 layers, as I understand it, while in the main implementation there are 3 coming from the FPN?

More interestingly, I need to visualize the different tensors produced by these 5 layers. Moreover, if I want to inspect this block, for example:


I get: ListWrapper([<keras.layers.core.lambda_layer.Lambda object at 0x7f84ec3023d0>]) <keras.layers.core.lambda_layer.Lambda object at 0x7f84ec3023d0>. It seems that the layers are stored in a ListWrapper. What is a ListWrapper? How can a model manage layers stored in ListWrappers?
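On the ListWrapper question: as far as I know, ListWrapper is the wrapper that TensorFlow substitutes for a plain Python list when the list is assigned as an attribute of a Keras layer or tf.Module, so that the layers and variables inside the list are tracked. It still supports indexing and iteration like a normal list. The sketch below is not TensorFlow's implementation; it is a pure-Python illustration of the idea, and `TrackedList` and `TrackingModule` are hypothetical names.

```python
class TrackedList(list):
    """Hypothetical stand-in for a wrapper such as ListWrapper: still a list."""

    def __repr__(self):
        return f"TrackedList({list.__repr__(self)})"


class TrackingModule:
    """Hypothetical owner that wraps any plain list assigned as an attribute."""

    def __setattr__(self, name, value):
        # Swap plain lists for the wrapper so the owner can find them later.
        if isinstance(value, list) and not isinstance(value, TrackedList):
            value = TrackedList(value)
        super().__setattr__(name, value)

    def tracked_items(self):
        """Collect every element held in wrapped list attributes."""
        items = []
        for value in vars(self).values():
            if isinstance(value, TrackedList):
                items.extend(value)
        return items


module = TrackingModule()
module.top_down_blocks = ["lambda_a", "lambda_b"]  # strings stand in for layers
module.name = "FeatureMaps"

print(module.top_down_blocks)      # printed through the wrapper's repr
print(module.top_down_blocks[0])   # indexing works as on a plain list
print(module.tracked_items())
```

In the same spirit, indexing the real attribute (for example with `[0]`) returns the underlying Lambda layer object, which is what the second half of the printed output above shows.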

Thank you for your answers

What I am not ab


Hello Olfa,

Can you share the images you are talking about?

The reason you cannot visualise a tensor for these layers is that the Lambda layer output is not a tensor.

I can answer more precisely if I get to see what you are doing with the ResNet50.

You also mentioned you didn’t change anything with the model.

Can I know what output you are getting for the above feature maps?

One needs to understand that, in the assignment, the zombie class was defined before the prediction box was looked at.

So I need to know whether you defined the classes as in your Zombie-Duck assignment (otherwise the same model would probably not be suitable).

Kindly share images of the steps you are following for your feature extraction.

