Why does model = Model(pre_trained_model.input, x) include the layers before 'mixed7'?

In the file “C2_W3_Lab_1_transfer_learning.ipynb”, model = Model(pre_trained_model.input, x) includes all the layers before ‘mixed7’. This is not intuitive to me, because last_output only contains the output of ‘mixed7’ (although I know those values are computed by all the layers from the beginning up to ‘mixed7’). The later steps only add Flatten, Dense, Dropout and Dense layers to customise the end of the network. Can I ask why TF designs the code in this way?
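For reference, here is roughly the relevant code from the lab, put together as one runnable sketch (variable names as quoted in this thread; imports added for completeness, the actual notebook may differ slightly):

from tensorflow.keras import layers, Model
from tensorflow.keras.applications.inception_v3 import InceptionV3

# Load InceptionV3 without its classification head
pre_trained_model = InceptionV3(input_shape=(150, 150, 3),
                                include_top=False,
                                weights=None)

# Pick the 'mixed7' layer and take its output tensor
last_layer = pre_trained_model.get_layer('mixed7')
last_output = last_layer.output

# Add the custom classification head
x = layers.Flatten()(last_output)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(1, activation='sigmoid')(x)

# Build the new model from the original input to the new output
model = Model(pre_trained_model.input, x)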

Can I ask if someone can explain this to me?

Sorry for the delay in responding, @Cheng_Zhang1. Can you confirm whether the lab you are talking about is an ungraded lab or an assignment lab?

If it is an ungraded lab, can you share it here?

This is it:

Can I ask this question again?

I am sorry I missed your response. Please let me go through the lab and then respond.

Regards
DP

Hello @Cheng_Zhang1

Actually, based on the lab you shared, it says the model includes the layers up to mixed7, not just the layers before mixed7, so it does include mixed7 itself.

The whole reason for choosing a selective part of an old trained model is to save time and cost: building a huge new model from scratch is expensive, so we pick the layers that are most focused on feature extraction (such as the convolution layers) and freeze the weights of some layers. Freezing them (which does not affect the accuracy the old trained model has already achieved) reduces training time.
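For example, freezing typically looks like this in Keras (a sketch; the lab may do exactly this, but treat it as illustrative):

# Keep the pre-trained weights fixed during training
for layer in pre_trained_model.layers:
    layer.trainable = False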

Again, sorry for the delayed response. I totally missed it; if you don’t tag me, I don’t get a notification.

Regards
DP

@Deepti_Prasad Thank you for your help.

I think I understand the logic behind the code. What I do not follow is why the code includes the layers “up to” mixed7 (i.e. all the layers up to mixed7), instead of “only” mixed7 (i.e. just the single mixed7 layer).

This code “only” selects the mixed7 layer, not the layers before mixed7:
last_layer = pre_trained_model.get_layer('mixed7')

This code “only” adds Flatten, Dense, Dropout and Dense layers to customise the end of the network:
x = layers.Flatten()(last_output)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(1, activation='sigmoid')(x)

Then why would this code suddenly include the “layers before mixed7”, plus mixed7, plus the added customised end layers?
model = Model(pre_trained_model.input, x)

@Cheng_Zhang1 it selects the layers up to mixed7, then keeps the specific convolution layers that have already learned features in the old trained model, and freezes the layers it does not want to retrain. So it is basically choosing the mixed7 layer along with only part of the network, not the whole network.

Then it only adds the required layers, such as a Flatten layer (because we are working with a new input shape) and then the new layers on top.

Also, for transfer learning it is advised to add only dense layers on top.

In this model statement:
pre_trained_model = InceptionV3(input_shape = (150, 150, 3),
                                include_top = False,
                                weights = None)

x is the combination of the pre_trained model (with its selected convolution layers up to and including the mixed7 layer) and the new layers: as you can see, the Flatten layer takes last_output from the previous code line as its input.

I will give a general scenario from a doctor-patient app. Say a patient travels from Mumbai to Bangalore for treatment. The only information taken is demographic plus the necessary details such as medical history, surgical history, medications, any habits, and any recent illness, but not every minute detail of his past case history. The doctor in Bangalore then uses this information, together with his current condition, to treat the present illness and achieve a better, healthy outcome.
So here the pre-trained model is the patient with his brief case history, and x is the same patient with the present illness, who is treated to get a healthy outcome.

Hope I didn’t confuse too much :slight_smile:

Feel free to ask more doubts

Regards
DP

@Deepti_Prasad Thank you for your help. Let me elaborate on my confusion in more detail:

len(pre_trained_model.layers) = 311
i.e. it has 311 layers:
input_1 (InputLayer)
conv2d (Conv2D)
batch_normalization
…
mixed0
…
mixed1
…
mixed2
…
mixed3
…
mixed4
…
mixed5
…
mixed6
…
mixed7
…
mixed8
…
mixed9
…
mixed10
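A listing like the one above can be obtained with, for example, the following sketch (not the exact notebook cell):

print(len(pre_trained_model.layers))       # 311
for layer in pre_trained_model.layers:
    print(layer.name, type(layer).__name__)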

The “last_layer” only refers to the ‘mixed7’ layer:
last_layer = pre_trained_model.get_layer('mixed7')

“last_output” is only the output of ‘mixed7’, not “all the layers up to mixed7”.
last_output = last_layer.output
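For example (assuming the lab’s variable names), printing the shape shows it is just the mixed7 output tensor:

print(last_layer.output_shape)   # (None, 7, 7, 768)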

Then why, suddenly, is x the combination of the pre_trained model with the selected convolution layers and the mixed7 layer?
Which code explicitly selects the layers before mixed7?
x = layers.Flatten()(last_output)

Hello @Cheng_Zhang1

This part has all the layers up to mixed7, and the layers that were not needed were frozen.

This last_output code is a combination of the pre_trained_model with mixed7.

x becomes a combination of the pre_trained model (selected up to mixed7) because the previously created last_output is used in the new architecture: the Flatten layer flattens last_output, and the result is then passed through the newly added dense layers. Hence x becomes a combination of the pre_trained_model with the new dense layers added.
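To put it another way: last_output is a symbolic tensor that still references the graph that produced it, so when Model() is given pre_trained_model.input as the input and x as the output, the functional API traces back from x through last_output and pulls in every layer on that path. A rough way to see this (a sketch; sub_model is introduced here only for illustration):

# A model built only from last_output already contains every layer
# on the path from the input up to mixed7:
sub_model = Model(pre_trained_model.input, last_output)
print(len(sub_model.layers))   # fewer than 311: just the layers up to mixed7

# The full new model additionally contains the Flatten/Dense/Dropout head:
model = Model(pre_trained_model.input, x)
print(len(model.layers))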

Regards
DP

@Deepti_Prasad

As shown in the screenshot, you can see that ‘pre_trained_model’ has 311 layers, including layers before and after mixed7.

Then, after the code of ‘last_layer’ and ‘last_output’, the ‘pre_trained_model’ still has 311 layers, i.e. before and after mixed7.

So, why do you say, “pre_trained_model: this part has all the layers up to mixed7”?

Up to mixed7 because it is mentioned in the lab, which means it includes the old model up to mixed7.

The last_layer code is picking out the last layer of the base model that will be used in the new model architecture being created.

If you are asking why the last layer output was created, the simple reason is so that the base model can be reused to create the new model.

My understanding is that last_layer only selects the mixed7 layer, not the layers before mixed7 from InceptionV3.

@lmoroney Could you please take a look at my question? Thank you very much!


I’m not sure that tagging lmoroney is going to do anything useful. He doesn’t appear to be active here. He’s been a member for 6 months, and has visited the DLAI forum exactly once since then.

Thank you for noting this. Can you comment on my question? Or, may I ask whether I have expressed my confusion well?

I haven’t commented on this thread before, because I do not understand the confusion.

My question is, which line of code sets the model to have a subset of layers of InceptionV3 from the beginning to mixed7?

@Cheng_Zhang1

The output of the pre_trained_model summary would show you all the mixed layers, because that code line takes information from the old model as a whole.

Then the next line, with last_layer, is stating to use the pre_trained model only till mixed7 and not beyond that.

Regards
DP

But I do not think “the next line, with last_layer, is stating to use the pre_trained model only till mixed7 and not beyond that”.

You can see that the last layer output shape is (None, 7, 7, 768), which only refers to mixed7, NOT including the layers before mixed7.