Understanding number of parameters in an RNN

I have written a very simple RNN and am trying to understand how Keras arrives at the number of parameters. The code reports 24 learnable parameters for layer l1, but I was expecting 100. My understanding was that the RNN concatenates the hidden state (length 4 in this case) with the input vector (length 20 in this case) to give a combined input of size 24; with 4 hidden units that should give 24 x 4 weights plus 4 biases = 100 parameters in total for layer l1. But on running the code below I get only 24 parameters for layer l1, as if the input vector were of size 1. What am I missing here? Isn't it supposed to have a different weight for each step in my time-series window of size 20, rather than a single weight for the entire window?

import tensorflow as tf

l0 = tf.keras.layers.Input(shape=(20, 1))
l1 = tf.keras.layers.SimpleRNN(4)
l2 = tf.keras.layers.Dense(1)

model = tf.keras.models.Sequential([l0, l1, l2])

# Print the model summary
model.summary()

My expected count was: (20 features + 4 hidden inputs) x 4 hidden units + 4 biases = 24 x 4 + 4 = 100

This post is in the wrong topic. Mentors don’t monitor the “General Discussion” forum very often.

I’ve moved this to DLS Course 5 forum, as that seems to be the matching topic.

An RNN layer's weights are shared across all the timesteps of an input, which is why training it is called backpropagation through time (BPTT). Your input shape (20, 1) means 20 timesteps with 1 feature each, so the input size per timestep is 1, not 20. In DLS notation, the parameter shapes for your model are:
W_{ax} = (1, 4) # (input.shape[-1], num_hidden_units)
W_{aa} = (4, 4) # (num_hidden_units, num_hidden_units)
b_a = (4, ) # (num_hidden_units, )

The total number of parameters is 4 + 16 + 4 = 24, which matches what model.summary() reports for layer l1.
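To make the arithmetic explicit, here is a small sketch of the count (plain Python; `simple_rnn_params` is a hypothetical helper name, not a Keras API):

```python
def simple_rnn_params(features, units):
    # W_ax (features x units) + W_aa (units x units) + b_a (units)
    return features * units + units * units + units

# Input shape (20, 1): 20 timesteps, 1 feature per timestep
print(simple_rnn_params(1, 4))   # 24 -- what Keras reports
# The 100 the question expected would need 20 features per timestep:
print(simple_rnn_params(20, 4))  # 100
```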

See this link for SimpleRNNCell
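To see the weight sharing concretely, here is a minimal NumPy sketch of the SimpleRNN recurrence (random weights for illustration only): the same three parameter arrays are reused at every one of the 20 timesteps, so the window length never enters the parameter count.

```python
import numpy as np

rng = np.random.default_rng(0)
T, features, units = 20, 1, 4   # 20 timesteps, 1 feature, 4 hidden units

x = rng.standard_normal((T, features))        # one input sequence
W_ax = rng.standard_normal((features, units)) # input-to-hidden weights
W_aa = rng.standard_normal((units, units))    # hidden-to-hidden weights
b_a = np.zeros(units)                         # bias

a = np.zeros(units)              # hidden state
for t in range(T):               # the SAME weights are reused at every step
    a = np.tanh(x[t] @ W_ax + a @ W_aa + b_a)

print(W_ax.size + W_aa.size + b_a.size)  # 24 learnable parameters
```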