Course 4, Week 3, Assignment 2, Exercise 1

Hello,

I am a bit confused with this exercise:
{Moderator Edit: Solution Code Removed}
I don't get why we initialize the same convolutional layer twice.

Also, I don't get how we initialize MaxPooling2D and Dropout. Why are we giving conv as an argument here? What is the class and what is the object there?
What is actually happening when we write this: next_layer = MaxPooling2D(pool_size=(2,2))(conv)

Have another look at the architecture of the network that we are implementing here. Cascading two conv layers back to back is not a NOP just because they have the same hyperparameters. The second conv layer is applied to the output you get from the first conv layer, and each of the two layers has its own weights.

Then we optionally have a dropout layer and also optionally a max pooling layer.
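To see why two identically configured conv layers in a row are not redundant, here is a toy sketch in plain Python. The `conv1d_valid` helper and the 2-tap kernel are made up for illustration (this is not Keras code), and unlike a real network the same kernel is reused for both steps, but the data flow is the point: the second step sees the output of the first.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal (no padding) and return the weighted sums."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


signal = [1, 2, 4, 8, 16]
kernel = [1, 1]  # hypothetical kernel; same "hyperparameters" for both steps

once = conv1d_valid(signal, kernel)   # first layer sees the raw signal
twice = conv1d_valid(once, kernel)    # second layer sees the first layer's output

print(once)   # [3, 6, 12, 24]
print(twice)  # [9, 18, 36]
```

In the real model the two Conv2D layers would also learn different weights, so the difference is even larger than this toy suggests.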

The point is that Keras layer classes (subclasses of Layer) take a set of hyperparameters and return an instance of the layer. That instance is a callable object: you invoke it with a tensor as input and it produces a tensor as output. In this instance the result of this call:

MaxPooling2D(pool_size=(2,2))

is a layer object that you can call like a function. Then you invoke it with the input tensor conv and it returns an output tensor.
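If the class versus object distinction is the confusing part, a minimal plain-Python sketch may help. `ToyScale` is a hypothetical stand-in for a layer, not anything from Keras; it only shows that constructing an object and calling it are two separate steps.

```python
class ToyScale:
    """Toy stand-in for a Keras layer (not real Keras code)."""

    def __init__(self, factor=2):
        # Runs when you write ToyScale(factor=3): just stores the hyperparameter.
        self.factor = factor

    def __call__(self, inputs):
        # Runs when you write layer(inputs): does the actual computation.
        return [self.factor * value for value in inputs]


layer = ToyScale(factor=3)   # ToyScale is the class, layer is an object (instance)
outputs = layer([1, 2, 3])   # calling the object like a function

print(type(layer).__name__)  # ToyScale
print(outputs)               # [3, 6, 9]
```

Here `ToyScale` plays the role of MaxPooling2D, `layer` plays the role of the configured pooling layer, and `[1, 2, 3]` plays the role of conv.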

Please see this thread for a good explanation of how to use both the Keras Sequential and Functional models.

Hello @Anja_Petrovic!

I am deleting the image you posted as it has solution code and sharing it is not allowed. Also, you posted this in the General Category but it belongs to DLS course 4, right? Consider moving it to the correct category. Read this on how to do that.

Best,
Saif.

As you go further on the deep learning journey, it probably makes sense at some point to begin answering questions like this by looking directly at the source code, which is available on GitHub. Specifically, https://github.com/keras-team/keras/blob/v2.12.0/keras/layers/pooling/max_pooling2d.py#L26

Here is how I would decompose and answer your question using that reference material.

MaxPooling2D is a Python class, a defined type. It extends the class Pooling2D, which in turn extends the class Layer. MaxPooling2D is-a Pooling2D is-a Layer.

Of interest to us for this discussion, MaxPooling2D has an __init__() function and a call() function in its public API.

MaxPooling2D(pool_size=(2,2)) invokes the __init__() function, which looks like this:

    def __init__(
        self,
        pool_size=(2, 2),
        strides=None,
        padding="valid",
        data_format=None,
        **kwargs
    ):

Here the pool_size parameter is being passed in explicitly, even though it is the same as the default, which is (2, 2). The __init__() function immediately calls the base Pooling2D class __init__() function, which in turn calls the base Layer class __init__() function (and so on up the inheritance hierarchy). The result of this chain of __init__() calls is a fully specified and initialized object instance. You can think of it as doing this:

max_pool_2d_instance = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))
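The chain of constructor calls can be sketched with three toy classes. Everything named `Toy...` and the `init_order` list are made up for illustration; the only behavior borrowed from Keras is the documented rule that strides defaults to pool_size when it is None.

```python
class ToyLayer:
    def __init__(self, name=None):
        # Top of the hierarchy: runs last in the call chain, finishes first.
        self.name = name or type(self).__name__.lower()
        self.init_order = ["ToyLayer"]


class ToyPooling2D(ToyLayer):
    def __init__(self, pool_function, pool_size, strides, **kwargs):
        super().__init__(**kwargs)  # hand the remaining arguments up to ToyLayer
        self.pool_function = pool_function
        self.pool_size = pool_size
        # Assumption modeled on the Keras docs: strides falls back to pool_size.
        self.strides = pool_size if strides is None else strides
        self.init_order.append("ToyPooling2D")


class ToyMaxPooling2D(ToyPooling2D):
    def __init__(self, pool_size=(2, 2), strides=None, **kwargs):
        # The subclass only decides which pooling function to use.
        super().__init__(max, pool_size=pool_size, strides=strides, **kwargs)
        self.init_order.append("ToyMaxPooling2D")


layer = ToyMaxPooling2D(pool_size=(2, 2))
print(layer.init_order)  # ['ToyLayer', 'ToyPooling2D', 'ToyMaxPooling2D']
print(layer.strides)     # (2, 2)
```

By the time the constructor returns, the one object carries the state set up by every class in the hierarchy.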

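Now that you have an object instance, you can call it like a function. That goes through the base Layer class's __call__() function, which in turn runs the instance's call() function. Here’s what the source of that looks like…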

    def call(self, inputs):
        if self.data_format == "channels_last":
            pool_shape = (1,) + self.pool_size + (1,)
            strides = (1,) + self.strides + (1,)
        else:
            pool_shape = (1, 1) + self.pool_size
            strides = (1, 1) + self.strides
        outputs = self.pool_function(
            inputs,
            ksize=pool_shape,
            strides=strides,
            padding=self.padding.upper(),
            data_format=conv_utils.convert_data_format(self.data_format, 4),
        )
        return outputs

tl;dr: you call the instance with the inputs, which runs call(), and you get back the outputs. In Python …

max_pooling_layer_inputs = assume_this_was_defined_already

max_pooling_outputs = max_pool_2d_instance(max_pooling_layer_inputs)
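Here is a rough plain-Python sketch of that call path, using nested lists instead of tensors. `ToyLayer` and `ToyMaxPool2D` are hypothetical stand-ins; the real Layer.__call__() does a lot of extra bookkeeping that this toy skips, and the toy assumes strides equal to the pool size.

```python
class ToyLayer:
    def __call__(self, inputs):
        # Real Keras does extra bookkeeping here; this toy only delegates.
        return self.call(inputs)


class ToyMaxPool2D(ToyLayer):
    def __init__(self, pool_size=(2, 2)):
        self.pool_size = pool_size

    def call(self, inputs):
        # Take the max over each non-overlapping pool_size window.
        ph, pw = self.pool_size
        rows, cols = len(inputs), len(inputs[0])
        return [
            [
                max(inputs[r + i][c + j] for i in range(ph) for j in range(pw))
                for c in range(0, cols - pw + 1, pw)
            ]
            for r in range(0, rows - ph + 1, ph)
        ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 9, 8],
    [7, 6, 3, 2],
]

pool = ToyMaxPool2D(pool_size=(2, 2))
pooled = pool(feature_map)  # __call__ runs, which runs call()
print(pooled)               # [[4, 5], [7, 9]]
```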

It turns out that Python allows you to write expressions tersely and combine multiple steps into a single expression. In this case, you can combine the steps for invoking __init__() and call() into a single line of code. And since in this exercise the max pooling layer's input is none other than the output of the conv layer, you can write the entire thing as …

max_pooling_outputs = MaxPooling2D(pool_size = (2, 2))(conv)
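One consequence of the one-line form is worth spelling out, since it relates to the original question about writing the conv layer twice: every `SomeLayer(...)` expression builds a fresh object. The sketch below uses a made-up `ToyDense` class whose `layer_id` stands in for "this layer's own set of weights"; it is only meant to show the difference between constructing twice and reusing one instance.

```python
import itertools


class ToyDense:
    _ids = itertools.count()  # hands out a new id per constructed object

    def __init__(self, units):
        self.units = units
        self.layer_id = next(ToyDense._ids)  # stands in for "own weights"
        self.calls = 0

    def __call__(self, inputs):
        self.calls += 1
        return inputs  # toy layer: passes the data through unchanged


x = [1.0, 2.0]

# Constructing twice: two distinct layers, each called once.
first = ToyDense(4)
second = ToyDense(4)
y = second(first(x))
print(first is second)            # False
print(first.calls, second.calls)  # 1 1

# Reusing one instance: a single layer called twice.
shared = ToyDense(4)
z = shared(shared(x))
print(shared.calls)               # 2
```

In Keras terms, the first pattern gives two layers with independent weights, while the second applies the same weights twice.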

For convenience of writing (though not of reading) you often see Keras Functional models using the single variable x for both the input and the output variables. Think of that as x on the left hand side of the assignment is an output, which is then used on the right hand side of the assignment as an input to a subsequent layer.

conv_layer_output = Conv2D(...)(conv_layer_input)
pooling_layer_output = MaxPooling2D(...)(conv_layer_output)
dropout_layer_output = Dropout(...)(pooling_layer_output)

is often written instead as

x = Input(...)
x = Conv2D(...)(x)
x = MaxPooling2D(...)(x)
x = Dropout(...)(x)
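The reuse of x is ordinary Python name rebinding, nothing specific to Keras. A small sketch with three made-up list functions standing in for layers shows that the two styles compute the same thing:

```python
def double(values):
    return [2 * v for v in values]


def add_one(values):
    return [v + 1 for v in values]


def keep_positive(values):
    return [v for v in values if v > 0]


inputs = [-2, 0, 3]

# Style 1: a separate name for every intermediate result.
doubled = double(inputs)
shifted = add_one(doubled)
named_result = keep_positive(shifted)

# Style 2: rebind the single name x at every step.
x = inputs
x = double(x)         # x on the right is the input, x on the left is the output
x = add_one(x)
x = keep_positive(x)

print(named_result)   # [1, 7]
print(x)              # [1, 7]
```

The trade-off is that with the single-name style the intermediate results are no longer reachable by name afterwards, which matters if you need one of them later (for a skip connection, for example).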

HTH