I don’t understand why we need to scale the last layer, since the model is linear and it is trained with the actual Y values. As far as I remember, we did not change the scale of the x, y input data, so why should the output be different? I know it has been explained that in an RNN the output of SimpleRNN is between -1 and 1 and we scale up the output (although that also does not make sense to me, because we train the model on the actual variables and Y values), but in the C4W4_L3 lab we used convolution and LSTM. Why do we multiply the final Dense result by 400?
Inputs to the model are not preprocessed to a small scale. Please see this topic on the effect of learning rate on model convergence based on the scale of data.
Since we want to compare the model's predicted and actual values in the original scale, the Lambda layer helps keep the model weights low.
I understand normalization and its benefits, but in our case we did not normalize the input layer; we just added (lambda x: x*400) after the last Dense layer. My assumption is that whatever output we get from the last Dense layer will be multiplied by 400. Am I right?
Your understanding is correct.
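To make it concrete, here is a minimal sketch of the idea (not the lab's actual Conv/LSTM model; the layer sizes and window_size are made up for illustration, only the final Lambda multiplier matches the lab):

import tensorflow as tf

# The network itself can learn small weights and outputs; the final
# Lambda layer rescales whatever the last Dense layer produces back
# to the range of the original series (roughly 0-400 in the lab data).
window_size = 20

model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="relu", input_shape=[window_size]),
    tf.keras.layers.Dense(1),
    # Every output of the Dense layer above is simply multiplied by 400.
    tf.keras.layers.Lambda(lambda x: x * 400.0),
])

model.compile(loss=tf.keras.losses.Huber(), optimizer="adam")
model.summary()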
I think I got the idea, thanks! I have one more question, not related to this topic. I noticed that in some courses (I took a couple of Andrew’s courses) we add a bias parameter (x0 = 1) to the input variable X, but we did not do that in the past courses. Would you please explain why we do not add it to the input data?
Thanks,
Pouria
Bias is a parameter we add to a processing unit. For instance, each node in a Dense layer has a bias term. Here’s an example:
In [2]: model = tf.keras.Sequential([
   ...:     tf.keras.layers.Dense(2, input_shape=[10])])

In [3]: model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 2)                 22
=================================================================
Total params: 22
Trainable params: 22
Non-trainable params: 0
_________________________________________________________________
There are 2 units in the Dense layer. Each unit has 10 learnable parameters based on the input shape, plus 1 additional parameter for the bias term. So, the total number of parameters is 2 * (10 + 1) = 22.
We want to compute w^T \cdot X + b, where b is the bias term. How this is done is up to the creator of the framework.
If there were just one array, the bias and the weights would live in the same array, which would also require the dataset to include an additional constant-1 entry for the bias term. It’s a lot easier to keep the bias term out of the weights matrix, since data preparation becomes much simpler.
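As a rough illustration of the two conventions (a sketch with made-up numbers, not anything from the course code):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))   # 5 samples, 10 features
w = rng.normal(size=(10,))     # weights
b = 0.5                        # bias kept as a separate parameter

# Convention 1: bias stored separately from the weights.
out_separate = X @ w + b

# Convention 2: bias folded into the weights; the data must then carry
# an extra constant-1 column for the bias term.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
w_aug = np.append(w, b)
out_folded = X_aug @ w_aug

print(np.allclose(out_separate, out_folded))  # True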
So, if I am not using a NN in TensorFlow, for example if I use logistic regression in sklearn, then do I have to add the bias to the input X parameter (x0 = 1)?
You don’t have to add a bias term to your input data when using an sklearn model; the estimator fits the intercept for you (for example, LogisticRegression has fit_intercept=True by default).
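A quick sketch with toy data (the features and labels here are invented, just to show that no constant column is added by hand):

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                  # raw features, no column of ones
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # toy labels

# fit_intercept=True is the default, so the bias is learned internally.
clf = LogisticRegression().fit(X, y)

print(clf.coef_)       # learned weights, shape (1, 3)
print(clf.intercept_)  # learned bias term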