Forget gate vs. Update gate in LSTM

Hi,
Could you please elaborate on the differences between the two?
The formulas for these two are quite similar in structure; they just use different parameters (W, b).
The “forget” gate is responsible for deciding which parts of the previous state to pass on to the next step, and
the “update” gate also seems to decide which parts to pass on.
I don’t understand why we need two gates that do the same thing.
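
If I’m reading the notebook correctly, the two formulas are (in the course’s notation, where $a^{\langle t-1 \rangle}$ is the previous hidden state and $x^{\langle t \rangle}$ is the current input):

$$\Gamma_f^{\langle t \rangle} = \sigma\!\left(W_f\left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_f\right)$$
$$\Gamma_u^{\langle t \rangle} = \sigma\!\left(W_u\left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_u\right)$$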

You might want to consider watching the LSTM lectures again: Prof Ng covered all of this in some detail, and the instructions in the notebook also explain it. The two gates have their own weights because they do different things:

The purpose of the “forget” gate is to detect when some previously saved state (written by an earlier “update” gate) is no longer relevant. The example given in lecture is a sentence whose subject changes from singular to plural: the saved fact that the subject was singular needs to be forgotten.

The purpose of the “update” gate is to figure out which new information at the current time step is relevant and should be saved, because it may be needed later (and will eventually be discarded by a later “forget” gate).
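
To make that concrete, here is a minimal NumPy sketch of one LSTM cell-state update (the function and variable names are mine, not from the notebook). Even though the two gate formulas have the same shape, the gates are applied to different signals: the forget gate scales the old cell state, while the update gate scales the newly computed candidate values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_state_update(a_prev, x_t, c_prev, Wf, bf, Wu, bu, Wc, bc):
    """One LSTM cell-state update (a minimal sketch, not the notebook's code).

    a_prev: previous hidden state, shape (n_a, 1)
    x_t:    current input,         shape (n_x, 1)
    c_prev: previous cell state,   shape (n_a, 1)
    """
    concat = np.concatenate([a_prev, x_t], axis=0)  # stack [a^<t-1>, x^<t>]

    gamma_f = sigmoid(Wf @ concat + bf)  # forget gate: same formula shape...
    gamma_u = sigmoid(Wu @ concat + bu)  # update gate: ...but separate weights
    c_tilde = np.tanh(Wc @ concat + bc)  # candidate values for the new state

    # The gates act on DIFFERENT signals:
    #   gamma_f decides how much of the OLD state c_prev to keep,
    #   gamma_u decides how much of the NEW candidate c_tilde to write.
    c_t = gamma_f * c_prev + gamma_u * c_tilde
    return c_t

# Tiny usage example with random weights: n_a = 4 hidden units, n_x = 3 inputs.
rng = np.random.default_rng(0)
n_a, n_x = 4, 3
Wf, Wu, Wc = (rng.standard_normal((n_a, n_a + n_x)) for _ in range(3))
bf, bu, bc = (np.zeros((n_a, 1)) for _ in range(3))
c_t = lstm_cell_state_update(rng.standard_normal((n_a, 1)),
                             rng.standard_normal((n_x, 1)),
                             np.zeros((n_a, 1)), Wf, bf, Wu, bu, Wc, bc)
print(c_t.shape)  # (4, 1)
```

Because `c_prev` and `c_tilde` carry different information, the two gates need independent weights so each can learn its own criterion for what to keep and what to write.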

Thank you for the clarification.