Lambda Layers-lab C4_W3_Lab_1_RNN

Hi,

If you refer to the video ‘Lambda Layers’, Lawrence mentions that the default activation of RNN layers is tanh, so to get back the original scale of the values we multiply by 100.
My query is: since the tanh activation produces values between -1 and 1, how does multiplying by 100 work? And what if the input values have a range different from the one in the lab exercise?

Regards
Aroonima

Multiplying by 100 scales whatever value comes out of the tanh layer.
A better way is to bring the input data to a smaller scale by standardizing each data point as $\frac{x_i - \mu}{\sigma}$ before training. This way, you don’t have to keep adjusting the Lambda layer as the dataset changes.
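For example, here is a minimal sketch of that standardization (the toy series below just stands in for the lab’s series_train):

import numpy as np

series_train = np.array([10.0, 50.0, 120.0, 80.0, 60.0])  # toy stand-in for the lab's series

# Standardize with statistics computed on the training split only
mu, sigma = series_train.mean(), series_train.std()
series_scaled = (series_train - mu) / sigma

# After predicting on the scaled data, invert to recover the original units
restored = series_scaled * sigma + mu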

Hi Balaji,

I believe standardizing the input to center it around the mean is one thing; getting the desired target value as a prediction is another. Even if we scale the inputs, the LSTM layer will produce output in the range -1 to 1 due to the default tanh activation. If we change this activation to relu, we can do away with the Lambda layer (see the sketch below). So we can choose the activation according to the desired output, and the Lambda layer has been introduced in this exercise for a better understanding of its use.
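A minimal sketch of the two options, assuming a single-feature windowed input as in the lab (layer sizes are illustrative):

import tensorflow as tf

# Option A (the lab's approach): tanh-bounded recurrent output, rescaled by a Lambda layer
model_a = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=[None, 1]),
    tf.keras.layers.Dense(1),
    tf.keras.layers.Lambda(lambda x: x * 100.0),
])

# Option B: relu activation on the LSTM, so no Lambda layer is needed
model_b = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, activation='relu', input_shape=[None, 1]),
    tf.keras.layers.Dense(1),
])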

In general, it’s good to restrict NN inputs to a small range for the sake of faster convergence. See this post for an example.

Let’s consider the first exercise in Course 1 (housing price prediction based on the number of rooms). Go ahead and set the ys to the actual price, i.e. 50000 + num_rooms * 50000, and build the model with a Lambda layer, say tf.keras.layers.Lambda(lambda x: x * 50000). The model loss is going to be nan during training.
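A quick way to reproduce that, roughly following the Course 1 exercise (the exact numbers and layer sizes are illustrative):

import numpy as np
import tensorflow as tf

xs = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
ys = 50000.0 + xs * 50000.0  # actual prices instead of "hundreds of thousands"

model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, input_shape=[1]),
    tf.keras.layers.Lambda(lambda x: x * 50000.0),
])
model.compile(optimizer='sgd', loss='mean_squared_error')
history = model.fit(xs, ys, epochs=10, verbose=0)
print(history.history['loss'])  # the huge targets and scaled gradients blow up to inf/nan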

I recommend standardizing the features instead of using Lambda layers or changing the output activation of the LSTM from tanh to relu.

Sure, scaling the inputs to smaller values would give better results. I shall try scaling the inputs in this lab exercise. It’s a bit of a challenge, as it requires a lot of coding since you can’t use a standard scaler from sklearn directly here.

Thanks
Aroonima

You can use StandardScaler. Before creating the dataset, fit_transform on series_train:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
dataset = windowed_dataset(scaler.fit_transform(np.expand_dims(series_train, axis=-1)).flatten())  # flatten the (n, 1) output back to 1-D

Here are 3 more methods you’ll find useful to measure model performance after training:

  1. scaler.transform to transform new data with the learnt parameters from series_train
  2. scaler.inverse_transform to get the data back in the original scale.
  3. tensor_instance.numpy() to convert tf.Tensor to a numpy representation.
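Putting the three together, here is a sketch of an evaluation pass (model_forecast, series_valid, and window_size are placeholders for whatever the lab defines):

# 1. scale the validation data with the statistics learnt from series_train
scaled_valid = scaler.transform(np.expand_dims(series_valid, axis=-1)).flatten()
# 3. forecast on the scaled series, then convert the tf.Tensor output to numpy
forecast = model_forecast(model, scaled_valid, window_size).numpy()
# 2. undo the scaling so the metric is computed in the original units
forecast = scaler.inverse_transform(forecast.reshape(-1, 1)).flatten()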