However, in Week 2 Lab 2, I noticed that the shape of the Input layer of a dense model is (window_size,). I then tried to change it to (window_size, 1) and got the following results, which are clearly wrong.
Each unit in a dense layer connects to all of its inputs: it computes a dot product with its weights and then adds a single bias term. When it comes to time series, the same weights are applied to every timestep.
If this is unclear, please take the Deep Learning Specialization.
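To see the weight sharing concretely, here is a minimal sketch (the shapes and unit count are just example values): the same kernel and bias that transform one timestep transform every other timestep.

import tensorflow as tf

# A Dense layer applied to a (batch, timesteps, features) tensor reuses the
# same kernel and bias at every timestep.
dense = tf.keras.layers.Dense(units=4)
x = tf.random.normal((2, 20, 3))     # 2 sequences, 20 timesteps, 3 features
y = dense(x)                         # shape (2, 20, 4)

# Applying the kernel to a single timestep by hand gives the same numbers.
manual = tf.matmul(x[:, 0, :], dense.kernel) + dense.bias
print(float(tf.reduce_max(tf.abs(manual - y[:, 0, :]))))  # ~0.0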
The Dense layer changes only the last dimension. Consider the following setup:
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20, 3)),   # 20 timesteps, 3 features per timestep
    tf.keras.layers.Dense(units=4)])
The Dense layer has 4 units. Each unit connects to the 3 features, which gives 4 * 3 = 12 weights plus 4 bias terms, so the total number of parameters is 16. All timesteps are processed via vectorization in a single shot.
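Printing the summary is an easy way to verify this, continuing the setup above:

model.summary()
# Output shape of the Dense layer: (None, 20, 4) -- only the last dimension changed
# Trainable params: 4 units * 3 features + 4 biases = 16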
Specifying the shape as (None, 20) works fine for a time series problem as long as there is just 1 feature per timestep. In this case, the network treats it like any other regression problem. Since there are 20 inputs per row of a batch, a single-unit Dense layer on that input has 20 weights and 1 bias.
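For example, a quick sketch with a window of 20 values and a single output unit (the unit count is just an assumption for illustration):

import tensorflow as tf

single_feature_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),   # batch shape (None, 20): 20 inputs, no feature axis
    tf.keras.layers.Dense(units=1)])      # 20 weights + 1 bias = 21 parameters

single_feature_model.summary()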
Time series problems have shape (BATCH SIZE, NUM TIMESTEPS, NUM FEATURES PER TIMESTEP). When the number of features per timestep is 1, some people treat it as a typical regression problem and use a Dense layer on the input.
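The two views of the same window differ only by a trailing feature axis; a short sketch:

import tensorflow as tf

window = tf.range(20, dtype=tf.float32)              # one window of a univariate series

as_regression_row = tf.reshape(window, (1, 20))      # (batch, features): ordinary regression view
as_sequence       = tf.reshape(window, (1, 20, 1))   # (batch, timesteps, features): time series view
print(as_regression_row.shape, as_sequence.shape)    # (1, 20) (1, 20, 1)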
An RNN processes 1 timestep at a time. Without setting return_sequences to True, only the output of the last timestep is returned.
For theoretical knowledge of recurrent neural networks, please see the Deep Learning Specialization (Course 5). From an API perspective, the input shape is 3D, hence the additional 1 at the end: it is the single feature per timestep.
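A short sketch (using SimpleRNN here as an assumption; the same shape rules hold for LSTM or GRU) shows both points, the 3D input with the trailing 1 and the effect of return_sequences:

import tensorflow as tf

inputs = tf.keras.layers.Input(shape=(20, 1))   # 20 timesteps, 1 feature per timestep -> the trailing 1

last_only = tf.keras.layers.SimpleRNN(units=8)(inputs)                         # (None, 8): last timestep only
all_steps = tf.keras.layers.SimpleRNN(units=8, return_sequences=True)(inputs)  # (None, 20, 8): every timestep

print(last_only.shape, all_steps.shape)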
Yes (as long as the shape criteria are met). Based on my answers to the first 3 questions, you should know that the number of units that interact with the data is different between a Dense layer and an RNN layer. Consider printing the model summary while using an RNN to see the difference.
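As a side-by-side sketch (single-unit layers chosen purely for illustration), compare the two summaries:

import tensorflow as tf

dense_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(units=1)])      # 20 weights + 1 bias = 21 parameters

rnn_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20, 1)),
    tf.keras.layers.SimpleRNN(units=1)])  # 1*1 kernel + 1*1 recurrent kernel + 1 bias = 3 parameters

dense_model.summary()
rnn_model.summary()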