However, in Week 2 Lab 2, I noticed that the shape of the Input layer of a dense model is (window_size,). I then tried to change it to (window_size, 1) and got the following results, which are clearly wrong.
Each unit in a dense layer connects to all of its inputs: it computes a dot product with its weights and then adds a single bias term. When it comes to time series, the same weights are applied to every timestep.
If this is unclear, please take the Deep Learning Specialization.
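To see the weight sharing concretely, here is a minimal sketch (the shapes and unit count are just example values): the same kernel and bias that transform one timestep transform every other timestep.

import tensorflow as tf

# A Dense layer applied to a (batch, timesteps, features) tensor reuses the
# same kernel and bias at every timestep.
dense = tf.keras.layers.Dense(units=4)
x = tf.random.normal((2, 20, 3))     # 2 sequences, 20 timesteps, 3 features
y = dense(x)                         # shape (2, 20, 4)

# Applying the kernel to a single timestep by hand gives the same numbers.
manual = tf.matmul(x[:, 0, :], dense.kernel) + dense.bias
print(float(tf.reduce_max(tf.abs(manual - y[:, 0, :]))))  # ~0.0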
The Dense layer changes only the last dimension. Consider the following setup:
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20, 3)),   # 20 timesteps, 3 features per timestep
    tf.keras.layers.Dense(units=4)])
The Dense layer has 4 units. Each unit connects to the 3 features, which gives 4 * 3 = 12 weights plus 4 bias terms, so the total number of parameters is 16. All timesteps are processed via vectorization in a single shot.
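Printing the summary is an easy way to verify this, continuing the setup above:

model.summary()
# Output shape of the Dense layer: (None, 20, 4) -- only the last dimension changed
# Trainable params: 4 units * 3 features + 4 biases = 16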
Specifying the shape as (None, 20) works fine for a time series problem as long as there is just 1 feature per timestep. In this case, the network treats it like any other regression problem. Since there are 20 inputs per row of a batch, a single-unit Dense layer on that input has 20 weights and 1 bias.
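For example, a quick sketch with a window of 20 values and a single output unit (the unit count is just an assumption for illustration):

import tensorflow as tf

single_feature_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),   # batch shape (None, 20): 20 inputs, no feature axis
    tf.keras.layers.Dense(units=1)])      # 20 weights + 1 bias = 21 parameters

single_feature_model.summary()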
Time series problems have shape (BATCH SIZE, NUM TIMESTEPS, NUM FEATURES PER TIMESTEP). When the number of features per timestep is 1, some people treat it as a typical regression problem and use a Dense layer on the input.
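The two views of the same window differ only by a trailing feature axis; a short sketch:

import tensorflow as tf

window = tf.range(20, dtype=tf.float32)              # one window of a univariate series

as_regression_row = tf.reshape(window, (1, 20))      # (batch, features): ordinary regression view
as_sequence       = tf.reshape(window, (1, 20, 1))   # (batch, timesteps, features): time series view
print(as_regression_row.shape, as_sequence.shape)    # (1, 20) (1, 20, 1)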
An RNN processes 1 timestep at a time. Without setting return_sequences to True, only the output of the last timestep is returned.
For theoretical knowledge of recurrent neural networks, please see the Deep Learning Specialization (Course 5). From an API perspective, the input shape is 3D, hence the additional 1 at the end: it is the single feature per timestep.
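A short sketch (using SimpleRNN here as an assumption; the same shape rules hold for LSTM or GRU) shows both points, the 3D input with the trailing 1 and the effect of return_sequences:

import tensorflow as tf

inputs = tf.keras.layers.Input(shape=(20, 1))   # 20 timesteps, 1 feature per timestep -> the trailing 1

last_only = tf.keras.layers.SimpleRNN(units=8)(inputs)                         # (None, 8): last timestep only
all_steps = tf.keras.layers.SimpleRNN(units=8, return_sequences=True)(inputs)  # (None, 20, 8): every timestep

print(last_only.shape, all_steps.shape)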
Yes (as long as the shape criteria are met). Based on my answers to the first 3 questions, you should know that the number of units that interact with the data is different between a Dense layer and an RNN layer. Consider printing the model summary while using an RNN to see the difference.
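As a side-by-side sketch (single-unit layers chosen purely for illustration), compare the two summaries:

import tensorflow as tf

dense_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(units=1)])      # 20 weights + 1 bias = 21 parameters

rnn_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20, 1)),
    tf.keras.layers.SimpleRNN(units=1)])  # 1*1 kernel + 1*1 recurrent kernel + 1 bias = 3 parameters

dense_model.summary()
rnn_model.summary()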