Lambda layer: why multiply by 200?

In C4_W4_Lab_1_LSTM, it is written:

  • tf.keras.layers.Lambda(lambda x: x * 200)

Why 200?

My thought was that multiplying by 100 would turn the output into some kind of percentage, i.e. a probability. But what is the goal of multiplying by 200?

Did I miss something?


There is no probability involved in this time series problem.
Guess it’s a demo of a lambda layer.
You don’t need the last lambda layer at all. @DLAI-QA-Team FYI

Dear @Manu,
Aside from the question of "why 200?", I want to correct a mistaken presumption: multiplying by 100 had nothing to do with probability. The purpose of the lambda layer previously, with its factor of 100, was to rescale the values output by the RNN layer. The RNN output is like the output of the hyperbolic tangent function, with all values lying between -1 and +1, as described in one of the videos in Course 4, Week 3.

Because the values in our series were positive and much greater than one, the factor of 100 rescaled the RNN output to roughly the same order of magnitude as our data.

In my own code with the RNN, I tested a lambda function that was (x + 1) * 50.
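
In case it helps, here is a minimal sketch of what such a rescaling layer can look like in a Keras model. The layer sizes, window_size, and the overall architecture are my assumptions for illustration, not the course notebook verbatim:

    # Minimal sketch (not the notebook verbatim): an RNN model whose output
    # is rescaled by a final Lambda layer. Layer sizes and window_size are assumed.
    import tensorflow as tf

    window_size = 20  # assumed input window length

    model = tf.keras.models.Sequential([
        tf.keras.layers.SimpleRNN(40, return_sequences=True,
                                  input_shape=[window_size, 1]),
        tf.keras.layers.SimpleRNN(40),
        tf.keras.layers.Dense(1),
        # The tanh-based RNN layers keep activations in (-1, 1), so the Dense
        # output tends to be small; the Lambda rescales it toward the data's scale.
        # (x + 1) * 50 maps (-1, 1) to (0, 100); x * 100 maps it to (-100, 100).
        tf.keras.layers.Lambda(lambda x: (x + 1) * 50),
    ])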

@balaji.ambresh submits that the lambda layer (with its factor of 200) is not needed at all in this exercise, which uses LSTMs rather than simple RNNs, and I certainly believe him. I am (so far!) ignorant of its purpose when used on an LSTM output.

Sorry if my post was unclear. I was trying to hint that the reader can remove the lambda layer entirely and still get good results.

There are 2 models in the notebook. Both of them have a lambda layer doing a multiplication as the final layer.

My approach involves the following changes to each model (a rough sketch of the changed setup follows the list):

  1. Removing the last lambda layer, i.e. the output comes straight from the final Dense layer.
  2. Removing the custom learning-rate callback.
  3. Changing the optimizer to 'adam'.
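
For reference, here is a rough sketch of what the changed setup might look like; the layer sizes, window_size, and the Huber loss are my assumptions, not the notebook's exact code:

    # Rough sketch of the "Changed" setup described above.
    import tensorflow as tf

    window_size = 20  # assumed

    model = tf.keras.models.Sequential([
        tf.keras.layers.LSTM(32, return_sequences=True,
                             input_shape=[window_size, 1]),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(1),  # change 1: output comes straight from Dense, no Lambda
    ])

    model.compile(loss=tf.keras.losses.Huber(),  # loss choice assumed
                  optimizer='adam',              # change 3: 'adam' optimizer
                  metrics=['mae'])

    # change 2: train without the custom LearningRateScheduler callback, e.g.
    # history = model.fit(train_set, epochs=500)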

These are the training results:

Original = The one from the course.
Changed = Based on the above changes.

First model (with the Bidirectional layer), at the end of 100 epochs:
Original: MAE = 19.1179, loss = 18.6217
Changed: MAE = 5.2348, loss = 4.8212

Second model (without the Bidirectional layer), at the end of 500 epochs:
Original: MAE = 4.3817, loss = 3.9108
Changed: MAE = 1.7249, loss = 1.3768

The last lambda layer helps push the weights down: since the last Dense layer's output gets multiplied by 200 or 400 to produce the prediction, the Dense layer itself only has to produce small values.
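
For intuition, a back-of-the-envelope illustration (the numbers are made up, not taken from the notebook):

    # Made-up numbers for illustration: with a final Lambda(lambda x: x * 200),
    # the Dense layer only has to emit a small value to reach the scale of the
    # series, so its weights (and the gradients through it) can stay small.
    target = 70.0                      # a series value of roughly this size (assumed)
    needed_with_lambda = target / 200  # 0.35 -> what the Dense output must be
    needed_without_lambda = target     # 70.0 -> what it must be with no rescaling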

As far as the choice of 200 and 400 goes, I don't know why those particular numbers were picked. So, I'll vote on this post.

@cognion, @Manu, please upvote the original post as well.
