Why does model = Model(pre_trained_model.input, x) include the layers before 'mixed7'?

In the file “C2_W3_Lab_1_transfer_learning.ipynb”, model = Model(pre_trained_model.input, x) includes all the layers before ‘mixed7’. This is not intuitive to me, because last_output only contains the output of ‘mixed7’ (although I know those values are computed by all the layers from the beginning up to ‘mixed7’). The later steps only add Flatten, Dense, Dropout and Dense layers to customise the end of the network. Can I ask why TF designs the code in this way?
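For reference, here is roughly the relevant code from the lab, put together as one runnable sketch (variable names as quoted in this thread; imports added for completeness, the actual notebook may differ slightly):

from tensorflow.keras import layers, Model
from tensorflow.keras.applications.inception_v3 import InceptionV3

# Load InceptionV3 without its classification head
pre_trained_model = InceptionV3(input_shape=(150, 150, 3),
                                include_top=False,
                                weights=None)

# Pick the 'mixed7' layer and take its output tensor
last_layer = pre_trained_model.get_layer('mixed7')
last_output = last_layer.output

# Add the custom classification head
x = layers.Flatten()(last_output)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(1, activation='sigmoid')(x)

# Build the new model from the original input to the new output
model = Model(pre_trained_model.input, x)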

Can I ask if someone can explain this to me?

Sorry for the delay in responding, @Cheng_Zhang1. Can you confirm whether the lab you are talking about is an ungraded lab or an assignment lab?

If it is an ungraded lab, can you share it here?

This is it:

Can I ask this question again?

I am sorry I missed your response. Please let me go through the lab and then respond.

Regards
DP

Hello @Cheng_Zhang1

Actually, based on the lab you shared, it says the model includes the layers up to mixed7, not just the layers before mixed7, so it does include mixed7 itself.

The whole reason for choosing a selective part of an old trained model is to save time and cost: building a huge new model from scratch is expensive, so we pick the layers that are most focused on feature extraction (such as the convolution layers) and freeze the weights of some layers. Freezing them (which does not affect the accuracy the old trained model has already achieved) reduces training time.
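For example, freezing typically looks like this in Keras (a sketch; the lab may do exactly this, but treat it as illustrative):

# Keep the pre-trained weights fixed during training
for layer in pre_trained_model.layers:
    layer.trainable = False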

Again, sorry for the delayed response. I totally missed it; if you don’t tag me, I don’t get a notification.

Regards
DP

@Deepti_Prasad Thank you for your help.

I think I understand the logic behind the code. What I do not follow is why the code includes the layers “up to” mixed7 (i.e. all the layers up to mixed7), instead of “only” mixed7 (i.e. just the single mixed7 layer).

This code “only” selects the mixed7 layer, not the layers before mixed7:
last_layer = pre_trained_model.get_layer('mixed7')

This code “only” adds Flatten, Dense, Dropout and Dense layers to customise the end of the network:
x = layers.Flatten()(last_output)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(1, activation='sigmoid')(x)

Then why would this code suddenly include the “layers before mixed7”, plus mixed7, plus the added customised end layers?
model = Model(pre_trained_model.input, x)

@Cheng_Zhang1 it selects the layers up to mixed7, then keeps the specific convolution layers that have already learned features in the old trained model, and freezes the layers it does not want to retrain. So it is basically choosing the mixed7 layer along with only part of the network, not the whole network.

Then it only adds the required layers, such as a Flatten layer (because we are working with a new input shape) and then the new layers on top.

Also, for transfer learning it is advised to add only dense layers on top.

In this model statement:
pre_trained_model = InceptionV3(input_shape = (150, 150, 3),
                                include_top = False,
                                weights = None)

x is the combination of the pre_trained model (with its selected convolution layers up to and including the mixed7 layer) and the new layers: as you can see, the Flatten layer takes last_output from the previous code line as its input.

I will give a general scenario from a doctor-patient app. Say a patient travels from Mumbai to Bangalore for treatment. The only information taken is demographic plus the necessary details such as medical history, surgical history, medications, any habits, and any recent illness, but not every minute detail of his past case history. The doctor in Bangalore then uses this information, together with his current condition, to treat the present illness and achieve a better, healthy outcome.
So here the pre-trained model is the patient with his brief case history, and x is the same patient with the present illness, who is treated to get a healthy outcome.

Hope I didn’t confuse too much :slight_smile:

Feel free to ask more doubts

Regards
DP

@Deepti_Prasad Thank you for your help. Let me elaborate on my confusion in more detail:

len(pre_trained_model.layers) = 311
i.e. it has 311 layers:
input_1 (InputLayer)
conv2d (Conv2D)
batch_normalization
…
mixed0
…
mixed1
…
mixed2
…
mixed3
…
mixed4
…
mixed5
…
mixed6
…
mixed7
…
mixed8
…
mixed9
…
mixed10
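A listing like the one above can be obtained with, for example, the following sketch (not the exact notebook cell):

print(len(pre_trained_model.layers))       # 311
for layer in pre_trained_model.layers:
    print(layer.name, type(layer).__name__)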

The “last_layer” only refers to the ‘mixed7’ layer:
last_layer = pre_trained_model.get_layer('mixed7')

“last_output” is only the output of ‘mixed7’, not “all the layers up to mixed7”.
last_output = last_layer.output
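For example (assuming the lab’s variable names), printing the shape shows it is just the mixed7 output tensor:

print(last_layer.output_shape)   # (None, 7, 7, 768)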

Then why, suddenly, is x the combination of the pre_trained model with the selected convolution layers and the mixed7 layer?
Which code explicitly selects the layers before mixed7?
x = layers.Flatten()(last_output)

Hello @Cheng_Zhang1

This part has all the layers up to mixed7, and the layers that were not needed were frozen.

This last_output code is a combination of the pre_trained_model with mixed7.

x becomes a combination of the pre_trained model (selected up to mixed7) because the previously created last_output is used in the new architecture: the Flatten layer flattens last_output, and the result is then passed through the newly added dense layers. Hence x becomes a combination of the pre_trained_model with the new dense layers added.
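To put it another way: last_output is a symbolic tensor that still references the graph that produced it, so when Model() is given pre_trained_model.input as the input and x as the output, the functional API traces back from x through last_output and pulls in every layer on that path. A rough way to see this (a sketch; sub_model is introduced here only for illustration):

# A model built only from last_output already contains every layer
# on the path from the input up to mixed7:
sub_model = Model(pre_trained_model.input, last_output)
print(len(sub_model.layers))   # fewer than 311: just the layers up to mixed7

# The full new model additionally contains the Flatten/Dense/Dropout head:
model = Model(pre_trained_model.input, x)
print(len(model.layers))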

Regards
DP

@Deepti_Prasad

As shown in the screenshot, you can see that ‘pre_trained_model’ has 311 layers, including layers before and after mixed7.

Then, after the code of ‘last_layer’ and ‘last_output’, the ‘pre_trained_model’ still has 311 layers, i.e. before and after mixed7.

So, why do you say, “pre_trained_model: this part has all the layers up to mixed7”?

Up to mixed7 because it is mentioned in the lab, which means it includes the old model up to mixed7.

The last_layer code is picking out the last layer of the base model that will be used in the new model architecture being created.

If you are asking why the last layer output was created, the simple reason is so that the base model can be reused to create the new model.

My understanding is that last_layer only selects the mixed7 layer, not the layers before mixed7 from InceptionV3.

@lmoroney Could you please take a look at my question? Thank you very much!


I’m not sure that tagging lmoroney is going to do anything useful. He doesn’t appear to be active here. He’s been a member for 6 months, and has visited the DLAI forum exactly once since then.

Thank you for noting this. Can you comment on my question? Or, may I ask whether I have expressed my confusion well?

I haven’t commented on this thread before, because I do not understand the confusion.

My question is, which line of code sets the model to have a subset of layers of InceptionV3 from the beginning to mixed7?

@Cheng_Zhang1

The output of the pre_trained_model summary would show you all the mixed layers, because that code line takes information from the old model as a whole.

Then the next line, with last_layer, is stating to use the pre_trained model only till mixed7 and not beyond that.

Regards
DP

But I do not think “the next line, with last_layer, is stating to use the pre_trained model only till mixed7 and not beyond that”.

You can see that the last layer output shape is (None, 7, 7, 768), which only refers to mixed7, NOT including the layers before mixed7.