Inheriting from Layer vs Model for recurring building blocks

In the lecture, Laurence talks about defining custom Model classes for recurring building blocks / sub-networks in a larger model architecture. But in the ResNet-like example, he uses Layer as the base class for CNNResidual and DNNResidual. At first sight, I found a Layer itself containing potentially many Layers a little counter-intuitive, but admittedly it seems arbitrary to worry about sub-Layers within Layers while accepting the concept of sub-Models within Models…

To the point: what are good reasons, or rules of thumb, for choosing one base class (Layer) over the other (Model) for such building blocks?
I would say it depends on how you think you might (re)use them. In the class hierarchy, a Model is-a Layer, a Layer is-a Module, and thus a Model is a Module too. However, a Layer is NOT a Model: a Layer has no compile() or fit() API, for example. In this sense Model extends Layer and adds behavior. Model also extends the state of Layer - it has an attribute that is a collection of Layers. Layer, by default, does not, though you can add one to a Layer you define yourself.
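You can verify the hierarchy directly:

```python
import tensorflow as tf

print(issubclass(tf.keras.Model, tf.keras.layers.Layer))  # True:  a Model is-a Layer
print(issubclass(tf.keras.layers.Layer, tf.Module))       # True:  a Layer is-a Module
print(issubclass(tf.keras.layers.Layer, tf.keras.Model))  # False: a Layer is NOT a Model
```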
The more likely you are to use the object standalone as a container of multiple Layers, and do training on it as a single unit, the more you should reuse what Model brings along. The more likely you are to use the object as a stackable component in a larger context, especially if you will use more than one instance in that context, and train it only as part of a larger whole, the more you would prefer Layer.
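For example, a residual block written as a Layer subclass (a minimal sketch in the spirit of the lecture's CNNResidual, not the exact lab code) is exactly such a stackable component:

```python
import tensorflow as tf

class CNNResidual(tf.keras.layers.Layer):
    # A stackable building block: no compile()/fit(); trained only as part of a larger whole
    def __init__(self, n_layers, n_filters, **kwargs):
        super().__init__(**kwargs)
        self.hidden = [
            tf.keras.layers.Conv2D(n_filters, 3, activation='relu', padding='same')
            for _ in range(n_layers)
        ]

    def call(self, inputs):
        x = inputs
        for layer in self.hidden:
            x = layer(x)
        return inputs + x  # residual add; assumes inputs already have n_filters channels
```

Does that make sense?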
Yes, that makes a lot of sense. So, avoid the extra baggage that Model brings with it (on top of Layer) unless it’s actually needed. And think of a Layer more freely as some encapsulated structural unit, rather than literally as “one layer”. Anyway, plenty of food for thought. Thanks!
This is a similar question to the one we have a bit further down on this page:
Glad we both came to the same conclusion in both places.
I like this comment string from the Model class source on GitHub:
`Model` groups layers into an object with training and inference features.
Here is the corresponding description from the Layer class:
This is the class from which all layers inherit.
A layer is a callable object that takes as input one or more tensors and that outputs one or more tensors. It involves *computation*, defined in the `call()` method, and a *state* (weight variables).
I also notice that while both Layer and Model accept one or more tensor inputs upon which they perform operations to produce one or more tensor outputs, Model requires a specific type of input tensor: one created by an instance of the InputLayer class.
I can really give myself a headache reading this code. It looks to me like keras.Input() is a factory method that builds instances of InputLayer. It actually returns the InputLayer’s outputs:
```python
def Input():
    ...
    input_layer = InputLayer(**input_layer_config)
    ...
    outputs = input_layer._inbound_nodes[0].outputs
    if isinstance(outputs, list) and len(outputs) == 1:
        return outputs[0]
    else:
        return outputs
```
So a Model’s inputs are the output(s) of an InputLayer. While a Layer’s inputs, as best as I can tell, can be any tensor, or dict/list/tuple of tensors.
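A quick check of what Input() hands back confirms this:

```python
import tensorflow as tf

x = tf.keras.Input(shape=(32,))
print(type(x))  # a symbolic KerasTensor (the InputLayer's output), not the InputLayer itself
```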
This is a very detailed and interesting explanation of the question, thank you for your effort. It’s just wonderful that your investigation went into this depth!
Thank you both for the insightful discussion, and sorry for missing the related earlier discussion (When to use Layers or Model as parent class) – I went straight for the “Week 4” sub-forum and overlooked the general one…
Interesting point about the requirement for a Model’s inputs to come from InputLayers – even though that seems to apply primarily when using Model “neat”, within the context of the Functional API. It is no longer the case for the custom Model subclasses derived in the W4 lectures/lab to construct ResNet (given the specific way the init and call methods are defined). So at least, custom Model subclasses can be made to stack/re-combine as flexibly as custom Layers (but of course with much additional functionality/overhead that may not actually be needed).
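For concreteness, here is a minimal sketch (not the exact lab code) of what I mean by a Model subclass being stacked inside another model, which works precisely because a Model is-a Layer:

```python
import tensorflow as tf

class DNNResidual(tf.keras.Model):  # subclassing Model rather than Layer
    def __init__(self, n_layers, n_units, **kwargs):
        super().__init__(**kwargs)
        self.hidden = [tf.keras.layers.Dense(n_units, activation='relu')
                       for _ in range(n_layers)]

    def call(self, inputs):
        x = inputs
        for layer in self.hidden:
            x = layer(x)
        return inputs + x  # assumes inputs already have n_units features

# Because a Model is-a Layer, it stacks like any other layer:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(64,)),
    DNNResidual(2, 64),
    tf.keras.layers.Dense(1),
])
```

Does that sound fair?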
Generally speaking, it is fair to say that (though I am not sure there are many practical applications of “Model subclasses can be made to stack/re-combine”).
I just came across an interesting use case in the NLP domain. Here, a Layer is instantiated and then executed inside a user-defined function…it isn’t part of a Model at all. Notionally, it looks like this…
```python
import tensorflow as tf

def vectorize_text(vectorize_layer, text):
    text = tf.expand_dims(text, -1)
    return vectorize_layer(text)  # <== invokes vectorize_layer.call()

# custom_standardization, max_features, and sequence_length are defined elsewhere
my_vectorization_layer = tf.keras.layers.TextVectorization(
    standardize=custom_standardization,
    max_tokens=max_features,
    output_mode='int',
    output_sequence_length=sequence_length)

some_text_as_vector = vectorize_text(my_vectorization_layer, some_text)
```
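One detail the sketch glosses over: a TextVectorization layer has to learn its vocabulary before it can map tokens, typically via adapt() on the raw training text:

```python
# train_text is assumed here to be a tf.data.Dataset (or array) of raw strings
my_vectorization_layer.adapt(train_text)
```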
The TensorFlow doc for Layer starts its narrative with “A layer is a callable object…”. So you can use the instance call() method to do something, e.g. vectorize text, perform a convolution, etc., without ever building or training a full model.
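For instance, a convolution can be run directly on a tensor with no model anywhere in sight:

```python
import tensorflow as tf

conv = tf.keras.layers.Conv2D(filters=4, kernel_size=3)
result = conv(tf.random.normal((1, 28, 28, 3)))  # weights are built on this first call
print(result.shape)  # (1, 26, 26, 4)
```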
In the NLP case, I had a Sequential model that was defined to accept vectorized strings into its first layer, an Embedding layer. That means the inputs look like this…
```
Vectorized review (<tf.Tensor: shape=(1, 250), dtype=int64, numpy=
array([[  86,   17,  260,    2,  222,    1,  571,   31,  229,   11, 2418,
           1,   51,   22,   25,  404,  251,   12,  306,  282,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
        ...]])>
```
which is not particularly human-readable. But by defining a new model that wraps the old one and adds the text vectorization layer, you can pass in strings directly…
```python
from tensorflow.keras import layers

# max_features and embedding_dim are defined elsewhere
requires_vectorized_input_model = tf.keras.Sequential([
    layers.Embedding(max_features + 1, embedding_dim),
    layers.Dropout(0.2),
    layers.GlobalAveragePooling1D(),
    layers.Dropout(0.2),
    layers.Dense(1)])

accepts_unvectorized_strings_model = tf.keras.Sequential([
    my_vectorization_layer,
    requires_vectorized_input_model,
    layers.Activation('sigmoid')
])
```
I take the first model, which requires vectorization, and stack it below the instance of tf.keras.layers.TextVectorization that I built above. So there are some interesting ideas here: creating a Layer as a Layer, then combining it with an existing Model, treated as a Layer, to make a new Model. The standalone Layer instance’s call() method can also be invoked in procedural code regardless of whether it has ever been used in a model or had its weights trained.
Now I can pass in human-readable text strings and transparently see how the sentiment analysis turns them into floating point values, driven primarily by the encoding in the Embedding layer.
```python
examples = [
    "The movie was great. Hilarious. Beautiful!",
    "The movie was pretty good",
    "The movie was okay.",
    "The movie was terrible",
    "The worst most awful ridiculous pathetic movie ever!"
]

accepts_unvectorized_strings_model.predict(examples)
```

which outputs:

```
array([[0.7906576 ],
       [0.5475819 ],
       [0.4600771 ],
       [0.37825677],
       [0.08336403]], dtype=float32)
```
Interesting stuff, and an interesting application. It seems surprisingly easy to predict sentiment accurately using a simple notebook, without complex transformations.
I think that many interesting combinations can be created with the use of layers for different applications. I had the idea that layers can be used independently of a model to, say, transform data that is then used further down the processing stream; but I still think that in order to achieve a full learning cycle you need the properties of the model.
Exactly. There is no learning when using a Layer outside a model…just computation.
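A tiny check makes the point concrete: calling a Layer repeatedly computes outputs but never changes its weights, since no optimizer is involved:

```python
import numpy as np
import tensorflow as tf

dense = tf.keras.layers.Dense(2)
_ = dense(tf.ones((1, 3)))             # first call builds the (randomly initialized) weights
w_before = dense.kernel.numpy().copy()
_ = dense(tf.ones((1, 3)))             # calling again is pure computation
assert np.array_equal(dense.kernel.numpy(), w_before)  # weights unchanged: no learning
```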