Hey,
I have been working on building a time series LSTM model to predict energy consumption of a building. My data set consists of date, energy consumption and temperature.
In the DeepLearning.AI TensorFlow course, Laurence Moroney uses either a single dense layer whose number of neurons equals the number of predicted outputs, or two dense layers where the first has more than one neuron (see the architecture below), for his time series LSTM model.
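e.g.:
import tensorflow as tf
# window_size: number of past time steps per input window (defined elsewhere)
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, input_shape=[window_size], activation='relu'),
    tf.keras.layers.Dense(150, activation='relu'),
    tf.keras.layers.Dense(1)
])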
Could anyone kindly enlighten me on why two dense layers are better than just one? And how do you choose the number of neurons in the first dense layer?
For reference, here is the time series LSTM model I have built:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

lstm_model = Sequential()
# Input shape is (time steps, features), taken from the training array
lstm_model.add(LSTM(128, activation="relu", return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2])))
lstm_model.add(Dropout(0.1))
lstm_model.add(LSTM(128, activation="relu", return_sequences=True))
lstm_model.add(Dropout(0.1))
lstm_model.add(LSTM(40, activation="relu", return_sequences=False))
lstm_model.add(Dropout(0.1))
# Output layer (predicting master_consumption)
lstm_model.add(Dense(10, activation="relu"))
lstm_model.add(Dense(1))
lstm_model.summary()
Model: "sequential"
 Layer (type)                Output Shape              Param #
=================================================================
 lstm (LSTM)                 (None, 48, 128)           69120
 dropout (Dropout)           (None, 48, 128)           0
 lstm_1 (LSTM)               (None, 48, 128)           131584
 dropout_1 (Dropout)         (None, 48, 128)           0
 lstm_2 (LSTM)               (None, 40)                27040
 dropout_2 (Dropout)         (None, 40)                0
 dense (Dense)               (None, 10)                410
 dense_1 (Dense)             (None, 1)                 11
=================================================================
Total params: 228165 (891.27 KB)
Trainable params: 228165 (891.27 KB)
Non-trainable params: 0 (0.00 Byte)
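As a sanity check on the summary, the parameter counts can be reproduced by hand. This is only a minimal sketch, assuming the standard Keras LSTM layout (four gates, each with an input kernel, a recurrent kernel and a bias) and 6 input features per time step. The feature count is not stated anywhere in this post; it is inferred from the 69120 parameters of the first LSTM layer. The helper names lstm_params and dense_params are made up for illustration.

```python
def lstm_params(units, input_dim):
    # 4 gates, each with an input kernel (input_dim x units),
    # a recurrent kernel (units x units) and a bias (units)
    return 4 * (units * (input_dim + units) + units)


def dense_params(units, input_dim):
    # One weight per input-output pair plus one bias per output unit
    return units * input_dim + units


# Assumption: 6 features per time step, inferred from the 69120 figure
n_features = 6

counts = {
    "lstm": lstm_params(128, n_features),
    "lstm_1": lstm_params(128, 128),
    "lstm_2": lstm_params(40, 128),
    "dense": dense_params(10, 40),
    "dense_1": dense_params(1, 10),
}
total = sum(counts.values())

for name, n in counts.items():
    print(name, n)
print("total", total)
```

Under these assumptions the numbers match the summary, and the two dense layers together account for only 421 of the 228165 parameters, so almost all of the model's capacity sits in the LSTM layers.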
Thanks for your help!