The forget (vs. remember) gate

Here’s what our Course 5, Programming Assignment 1 (Building a Recurrent NN Step by Step) mentions about the “forget gate”:

"* The “forget gate” is a tensor containing values between 0 and 1.

  • If a unit in the forget gate has a value close to 0, the LSTM will “forget” the stored state in the corresponding unit of the previous cell state.
  • If a unit in the forget gate has a value close to 1, the LSTM will mostly remember the corresponding value in the stored state."

Based on that definition, shouldn’t it be called the “remember” gate?
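
Just to check my understanding, the gate values stay between 0 and 1 because they come out of a sigmoid of the previous hidden state and the current input. Here is a tiny self-contained sketch of how I picture it (the variable names and shapes are mine, not necessarily the notebook's):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

n_a, n_x, m = 5, 3, 1                       # hidden units, input size, batch size
rng = np.random.default_rng(0)
Wf = rng.standard_normal((n_a, n_a + n_x))  # forget-gate weights (made-up init)
bf = np.zeros((n_a, 1))                     # forget-gate bias
a_prev = rng.standard_normal((n_a, m))      # previous hidden state
xt = rng.standard_normal((n_x, m))          # current input

concat = np.concatenate((a_prev, xt), axis=0)
forget_gate = sigmoid(Wf @ concat + bf)     # every entry lies strictly between 0 and 1
```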

Hi @sgomezgomez,

I can see your point, but that is the name :wink:.

Personally, I am quite comfortable emphasizing its ability to forget, because forgetting is the feature the LSTM adds. Without the “forget” gate, the network always attempts to remember, so it seems to me there is no point in emphasizing how it can remember.
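
To make that concrete, here is a tiny sketch (not the assignment's code, just an illustration with made-up numbers) of what the gate does to the stored cell state:

```python
import numpy as np

c_prev = np.array([5.0, 5.0])     # previously stored cell state (two units)
forget = np.array([0.05, 0.95])   # forget-gate outputs: one near 0, one near 1
update = np.array([0.5, 0.5])     # update ("input") gate
candidate = np.array([1.0, 1.0])  # new candidate values

c_next = forget * c_prev + update * candidate
print(c_next)  # [0.75 5.25] -> the first unit forgot its 5, the second mostly kept it
```

If the forget gate were stuck at 1, the old value would always carry over, so it is the ability to drop it that the name highlights.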

Cheers,
Raymond