Here’s what our Course 5, Programming Assignment 1 (Building a Recurrent NN Step by Step) says about the “forget gate”:
"* The “forget gate” is a tensor containing values between 0 and 1.
- If a unit in the forget gate has a value close to 0, the LSTM will “forget” the stored state in the corresponding unit of the previous cell state.
- If a unit in the forget gate has a value close to 1, the LSTM will mostly remember the corresponding value in the stored state."
Based on that definition, shouldn’t it be called the “remember” gate?
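For concreteness, the behavior described in that quote can be sketched numerically. The values below are made up for illustration; the point is only the element-wise multiplication of the forget gate against the previous cell state:

```python
import numpy as np

# Hypothetical previous cell state and forget-gate activations.
# The forget gate comes from a sigmoid, so each entry lies in (0, 1).
c_prev = np.array([2.0, -3.0, 0.5])   # stored state from the previous step
f_gate = np.array([0.0, 1.0, 0.5])    # gate values: forget, keep, half-keep

# The LSTM multiplies the previous cell state by the forget gate:
# entries near 0 erase that unit, entries near 1 preserve it.
kept = f_gate * c_prev
print(kept)
```

So a gate value of 1 means "remember fully" and 0 means "forget fully", which is exactly what prompts the naming question: the gate's *value* expresses how much to remember, even though it is named for what happens at the low end.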