Inheriting from Layer vs Model for recurring building blocks

I just came across an interesting use case in the NLP domain. Here, a Layer is instantiated and then executed inside a user-defined function… it isn't part of a Model at all. Notionally, it looks like this…

import tensorflow as tf

def vectorize_text(vectorize_layer, text):
  text = tf.expand_dims(text, -1)
  return vectorize_layer(text)  # <== invokes vectorize_layer.call()

# custom_standardization, max_features, and sequence_length are assumed to be
# defined earlier (the standardization function, vocabulary size, and padded length).
my_vectorization_layer = tf.keras.layers.TextVectorization(
    standardize=custom_standardization,
    max_tokens=max_features,
    output_mode='int',
    output_sequence_length=sequence_length)

# Note: the layer needs a vocabulary before it can map words to integers,
# e.g. by calling my_vectorization_layer.adapt(...) on the raw training text.

some_text_as_vector = vectorize_text(my_vectorization_layer, some_text)

The TensorFlow doc for Layer starts its narrative with "A layer is a callable object…". So you can use the instance's call() method to do something, e.g. vectorize text, perform a convolution, etc., without ever building or training a full model.
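As a minimal sketch of that idea (the layer choice and shapes here are just illustrative, not from the example above), a freshly constructed layer can be called directly on a tensor…

import tensorflow as tf

# A standalone Conv2D layer, never placed inside a Model.
conv = tf.keras.layers.Conv2D(filters=4, kernel_size=3, padding='same')

# Calling it like a function builds its (untrained, randomly initialized)
# weights on first use and runs the convolution.
images = tf.random.normal((2, 28, 28, 1))  # batch of 2 single-channel images
features = conv(images)                    # invokes conv.call() under the hood
print(features.shape)                      # (2, 28, 28, 4)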

In the NLP case, I had a Sequential model that was defined to accept vectorized strings into its first layer, an Embedding layer. That means the inputs look like this…

Vectorized review (<tf.Tensor: shape=(1, 250), dtype=int64, numpy=
array([[  86,   17,  260,    2,  222,    1,  571,   31,  229,   11, 2418,
           1,   51,   22,   25,  404,  251,   12,  306,  282,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,

which is not particularly human-readable. But by defining a new model that wraps the old one and adds the text vectorization layer, you can pass strings in directly…

from tensorflow.keras import layers

# max_features and embedding_dim are assumed to be defined earlier
# (vocabulary size and embedding width).
requires_vectorized_input_model = tf.keras.Sequential([
  layers.Embedding(max_features + 1, embedding_dim),
  layers.Dropout(0.2),
  layers.GlobalAveragePooling1D(),
  layers.Dropout(0.2),
  layers.Dense(1)])

accepts_unvectorized_strings_model = tf.keras.Sequential([
  my_vectorization_layer,
  requires_vectorized_input_model,
  layers.Activation('sigmoid')
])

I take the first model, which requires vectorized input, and stack it below the instance of tf.keras.layers.TextVectorization that I built above. So there are some interesting ideas here: creating a Layer as a standalone Layer, then combining it with an existing Model (used as if it were a Layer) to make a new Model. The standalone Layer instance's call() can also be invoked from procedural code, whether or not it has ever been used in a model or had its weights trained.
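This works because a Keras Model is itself a subclass of Layer, which is what lets a whole model be dropped into another model's layer list. A quick sanity check (illustrative only)…

# A Model is also a Layer, so a complete model can be stacked inside a
# Sequential alongside ordinary layers.
print(issubclass(tf.keras.Model, tf.keras.layers.Layer))                   # True
print(isinstance(requires_vectorized_input_model, tf.keras.layers.Layer))  # True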

Now I can pass in human-readable text strings and transparently see how the sentiment analysis turns them into floating point values, primarily via the encoding learned in the Embedding layer.

examples = [
    "The movie was great. Hilarious. Beautiful!",
    "The movie was pretty good",
    "The movie was okay.",
    "The movie was terrible",
    "The worst most awful ridiculous pathetic movie ever!"
]

accepts_unvectorized_strings_model.predict(examples)

array([[0.7906576 ],
       [0.5475819 ],
       [0.4600771 ],
       [0.37825677],
       [0.08336403]], dtype=float32)
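To see the intermediate representation rather than just the final sigmoid score, one option (a sketch, assuming the layer order defined above and an already-adapted vectorization layer) is to run the pieces by hand…

# Run one example through the vectorization layer and then the Embedding layer
# to inspect the intermediate floating point representation.
sample = tf.constant(["The movie was great. Hilarious. Beautiful!"])

token_ids = my_vectorization_layer(sample)                    # shape (1, sequence_length), int64
embedding_layer = requires_vectorized_input_model.layers[0]   # the Embedding layer
embedded = embedding_layer(token_ids)                         # shape (1, sequence_length, embedding_dim), float32
print(token_ids.shape, embedded.shape)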