Help understanding Lambda layers

Hi,

I watched the video, but I'm still not able to fully understand the Lambda layers in the LSTM model.

tf.keras.layers.Lambda(lambda x: tf.expand_dims(x, axis=-1), input_shape=[None]),
...
tf.keras.layers.Lambda(lambda x: x * 100.0)

Would you mind explaining them to me, please? I would be happy to get more insight into those two layers.

Thank you

Have a look at this recent post from yesterday:

Got it, thanks.
But in this example, why do we need to multiply at the end and expand the dimensions at the beginning?

Hello @lirone,

In the video on Lambda layers in week 3, Laurence Moroney says:

The first Lambda layer will be used to help us with our dimensionality.
If you recall, when we wrote the window dataset helper function, it returned two-dimensional batches of windows on the data, with the first dimension being the batch size and the second the number of timesteps.
But an RNN expects three dimensions: the batch size, the number of timesteps, and the series dimensionality.
With the Lambda layer, we can fix this without rewriting our window dataset helper function.
Using the Lambda, we just expand the array by one dimension.
By setting the input shape to None, we're saying that the model can take sequences of any length.

Similarly, if we scale up the outputs by 100, we can help training.

In other words, we needed the input dimensionality to be different from the one we get from the helper function we wrote to prepare the data. Thus, instead of rewriting the whole helper function, we can create a layer that expands the dimensionality. None of the "standard", already present layers could do the job, so we built a Lambda layer.
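To make this concrete, here is a minimal sketch of what that first Lambda layer does to the shapes (the batch and window sizes are just illustrative, not from the course):

import tensorflow as tf

# The windowed dataset yields 2-D batches of shape (batch_size, window_size),
# but an RNN layer expects 3-D input: (batch_size, timesteps, features).
batch = tf.random.uniform((32, 20))        # (batch_size=32, window_size=20)
expanded = tf.expand_dims(batch, axis=-1)  # one feature per timestep

print(batch.shape)     # (32, 20)
print(expanded.shape)  # (32, 20, 1)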

The same idea applies to the last layer. We want to multiply the output by 100, and we can use a Lambda layer to do that.
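Put together, the two Lambda layers just wrap the recurrent stack: one expands the dimensions on the way in, the other rescales on the way out. A rough sketch (the LSTM sizes here are illustrative, not the exact course model):

import tensorflow as tf

model = tf.keras.models.Sequential([
    # Expand (batch, timesteps) -> (batch, timesteps, 1) on the way in
    tf.keras.layers.Lambda(lambda x: tf.expand_dims(x, axis=-1),
                           input_shape=[None]),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
    # Rescale the prediction to the magnitude of the series on the way out
    tf.keras.layers.Lambda(lambda x: x * 100.0),
])
model.compile(loss="mse", optimizer="adam")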

I hope it is clear (or at least clearer :sweat_smile:) now,
Best,
Davide


I got it! Thank you @dtisi9!
Would you mind explaining again why we need to multiply the output by 100?

Thank you

Hello,

The main idea is that the default activation function in the RNN is tanh (the hyperbolic tangent, see Wikipedia). The tanh function has a codomain in the [-1, 1] range, but the values of the time series we want to predict are normally in the tens, like the 40s, 50s, 60s, and 70s, so multiplying by 100 should help the training. It is a way to bridge the gap between the predictions you would get from the Dense layer and the real data you want to approximate.
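Here is a quick numeric sketch of that range mismatch (the inputs are made up):

import numpy as np

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(np.tanh(x))          # stays in (-1, 1): [-0.995 -0.762  0.     0.762  0.995]
print(np.tanh(x) * 100.0)  # rescaled to roughly (-100, 100), the scale of the data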

Best,
Davide


Hey @dtisi9, just to clarify: we multiply by 100 to align the output of the model with the actual data, right? So that model.predict will give accurate results and the loss will be adequate. So if the units of 'y' are in the 1000s, we should multiply by 1000, and so on. Did I get it right?

Yes, you got the main idea. It is not a rule, just something that may help the training.
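So, hypothetically, for a series whose values are in the thousands you could scale accordingly in the same way:

# Hypothetical: series values in the thousands, so scale by 1000 instead
tf.keras.layers.Lambda(lambda x: x * 1000.0)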
