In this Jupyter notebook we take the inception_v3 model, remove its final layers, and replace them with our own DNN. What confuses me is how exactly this is being done.
What I’m reading in this code looks redundant to me. First, it appears that TensorFlow already ships the Inception model as tensorflow.keras.applications.inception_v3; but then we download a specific weights file with wget, one with notop in the filename. If we are downloading the weights, why are we also importing the model from the library at the top of the code? (I’ve sketched what I think is the one-call alternative right after the code below.)
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras import Model
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras import layers
# Download the pre-trained weights. No top means it excludes the fully connected layer it uses for classification.
!wget --no-check-certificate \
    https://storage.googleapis.com/mledu-datasets/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5 \
    -O /tmp/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5
# Set the weights file you downloaded into a variable
local_weights_file = '/tmp/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5'
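For reference, my (possibly wrong) understanding is that keras.applications can fetch these same notop ImageNet weights by itself, which is why the wget step looks redundant to me. A minimal sketch of the one-call alternative I have in mind, assuming the notebook simply chose the explicit download instead:

# Sketch of the one-call alternative I had in mind (my assumption, not what
# the notebook does): let keras.applications download the notop ImageNet
# weights itself instead of fetching them with wget.
from tensorflow.keras.applications.inception_v3 import InceptionV3

alt_model = InceptionV3(input_shape = (150, 150, 3),
                        include_top = False,
                        weights = 'imagenet')  # fetches the notop weights automatically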
Now we use the imported InceptionV3 class and then load the downloaded weights into it, and there’s no explanation in the notebook for why it’s done this way. Does the notop.h5 file provide only the weight values, while the InceptionV3 class provides the architecture itself? And why couldn’t we provide these weights directly to a plain tensorflow.keras.Model instead? (I’ve added a small check of my understanding after the next code block.)
# Initialize the base model.
# Set the input shape and remove the dense layers.
pre_trained_model = InceptionV3(input_shape = (150, 150, 3),
                                include_top = False,
                                weights = None)
# Load the pre-trained weights you downloaded.
pre_trained_model.load_weights(local_weights_file)
# Freeze the weights of the layers.
for layer in pre_trained_model.layers:
    layer.trainable = False
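To test my guess that the InceptionV3 class only builds the architecture (randomly initialized) while the .h5 file supplies the numeric weight values, I would compare one layer's weights before and after load_weights. A rough check of my own, not from the notebook:

# Rough check of my mental model: the class builds the architecture with
# random weights, and load_weights() only overwrites the numbers.
import numpy as np

m = InceptionV3(input_shape = (150, 150, 3), include_top = False, weights = None)
before = m.layers[1].get_weights()[0].copy()   # first conv kernel, random init
m.load_weights(local_weights_file)
after = m.layers[1].get_weights()[0]
print(np.array_equal(before, after))           # I expect False: values were replaced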
Here we select a specific layer of the model, but it’s unclear whether doing this also returns all of the prior layers. My guess is that we are just selecting a single layer object, and that part of its metadata records which prior layers connect to it, but that isn’t spelled out anywhere. (I’ve added a small experiment after the next code block to test this.)
# Choose `mixed7` as the last layer of your base model
last_layer = pre_trained_model.get_layer('mixed7')
print('last layer output shape: ', last_layer.output_shape)
last_output = last_layer.output
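If my hunch is right, get_layer only returns the layer object itself, and last_output is a symbolic tensor that already knows its entire upstream graph, so a model ending at that tensor should pull in every prior layer automatically. A small experiment of my own (not in the notebook) to check that:

# My own check: a Model built from the original input to the mixed7 output
# should contain every layer up to and including mixed7, found by tracing
# the graph backwards from last_output.
sub_model = Model(pre_trained_model.input, last_output)
print(type(last_layer).__name__)                             # Concatenate
print(len(sub_model.layers), len(pre_trained_model.layers))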
It now looks like we take the output of this selected layer and feed it as the input to the layers of our new DNN. We then create a new model from the pre_trained_model.input object, but it’s not clear whether the layers that came after the 'mixed7' layer in the original model are also pulled in along with pre_trained_model.input. (I compare the two models' layer lists after the summary call below.)
# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024, activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1, activation='sigmoid')(x)
# Append the dense network to the base model
model = Model(pre_trained_model.input, x)
# Print the model summary. See your dense network connected at the end.
model.summary()
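This is the point where I would want to confirm that the layers after mixed7 really are gone. Comparing the layer names of the base model and the new model seems like the obvious diagnostic (my own check, not part of the notebook):

# My own diagnostic: the new model should be missing every layer that sat
# after mixed7 in the base model (mixed8 ... mixed10 and everything between).
base_names = {l.name for l in pre_trained_model.layers}
new_names = {l.name for l in model.layers}
print(len(base_names), len(new_names))                 # new model should be smaller
print('mixed7' in new_names)                           # I expect True
print('mixed8' in new_names, 'mixed10' in new_names)   # I expect False False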
Our new model does seem to include the added dense layers while excluding any layers that came after the 'mixed7' layer in the original model, as the summary excerpt below shows. I’m just not clear on the mechanism by which those layers get excluded. (The toy example at the very end is my attempt to reproduce this behaviour in isolation.)
Layer (type)                Output Shape        Param #    Connected to
...
mixed7 (Concatenate)        (None, 7, 7, 768)   0          ['activation_154[0][0]',
                                                             'activation_157[0][0]',
                                                             'activation_162[0][0]',
                                                             'activation_163[0][0]']

flatten (Flatten)           (None, 37632)       0          ['mixed7[0][0]']

dense (Dense)               (None, 1024)        38536192   ['flatten[0][0]']

dropout (Dropout)           (None, 1024)        0          ['dense[0][0]']

dense_1 (Dense)             (None, 1)           1025       ['dropout[0][0]']
==================================================================================================
Total params: 47,512,481
Trainable params: 38,537,217
Non-trainable params: 8,975,264
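To reproduce the behaviour in isolation, here is a toy functional-API example I put together, assuming that Model(inputs, outputs) traces the graph backwards from the output tensor and simply never visits layers that only exist downstream of it:

# Toy reproduction of what I think is happening: Model(inputs, outputs)
# walks the graph backwards from `outputs`, so a layer that sits after the
# chosen tensor is never included in the new model.
from tensorflow.keras import layers, Model

inp = layers.Input(shape=(4,))
a = layers.Dense(8, name='a')(inp)
b = layers.Dense(8, name='b')(a)         # stands in for the layers after mixed7
trimmed = Model(inp, a)                  # stop at the intermediate tensor

print([l.name for l in trimmed.layers])  # I expect only the input layer and 'a'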