Grokking LSTM and GRUs some questions (Week 1 and 2)

Hi @talwrii,

The intuition behind the LSTM is much the same as the one you described for the GRU.

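The main difference is that the LSTM has separate forget and update gates, each with its own sigmoid activation applied to the current input and the previous hidden state, whereas the GRU uses a single update gate for both roles.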
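To make the gate difference concrete, here is a minimal sketch in plain Python. It is my own illustration, not code from the course: the function names are hypothetical, the memory cell is a single scalar rather than a vector, and the gate pre-activations are passed in directly as logits instead of being computed from learned weights.

```python
import math


def sigmoid(x):
    # Squashes a gate pre-activation (logit) into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))


def gru_cell_update(c_prev, c_candidate, update_logit):
    # GRU-style memory update (hypothetical helper, scalar case):
    # one update gate decides how much to write, and (1 - gate)
    # decides how much of the old memory to keep.
    gamma_u = sigmoid(update_logit)
    return gamma_u * c_candidate + (1.0 - gamma_u) * c_prev


def lstm_cell_update(c_prev, c_candidate, update_logit, forget_logit):
    # LSTM-style memory update (hypothetical helper, scalar case):
    # the update gate and the forget gate are computed separately,
    # so writing and keeping are no longer tied together.
    gamma_u = sigmoid(update_logit)
    gamma_f = sigmoid(forget_logit)
    return gamma_u * c_candidate + gamma_f * c_prev


c_prev, c_candidate = 0.8, -0.3

gru_value = gru_cell_update(c_prev, c_candidate, update_logit=1.5)

# Because sigmoid(-x) == 1 - sigmoid(x), choosing forget_logit = -update_logit
# makes the LSTM update collapse to the GRU update.
lstm_tied = lstm_cell_update(c_prev, c_candidate, update_logit=1.5, forget_logit=-1.5)

# With independent gates the LSTM can write a lot AND keep a lot.
lstm_free = lstm_cell_update(c_prev, c_candidate, update_logit=1.5, forget_logit=4.0)

print(gru_value, lstm_tied, lstm_free)
```

So the GRU update is the special case of the LSTM update in which the forget gate is forced to equal one minus the update gate. The full LSTM also has an output gate, which this sketch leaves out.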

GRUs and LSTMs matter most when the input is a very long sequence. Tasks such as predicting the next word, or translating a chunk of a corpus within a long sequence of words, can be handled with a GRU or an LSTM.
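As a rough illustration of why the gates help on long sequences, the sketch below repeats a GRU-style memory update for 1000 time steps. It is a toy example under my own assumptions: a scalar memory cell, a fixed gate logit at every step, a candidate value of 0, and a hypothetical helper name `carry_memory`.

```python
import math


def sigmoid(x):
    # Squashes a gate pre-activation (logit) into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))


def carry_memory(c0, steps, update_logit, candidate=0.0):
    # Repeatedly apply c = gate * candidate + (1 - gate) * c.
    # The gate logit is held constant here purely for illustration;
    # in a trained network it would change at every time step.
    gamma_u = sigmoid(update_logit)
    c = c0
    for _ in range(steps):
        c = gamma_u * candidate + (1.0 - gamma_u) * c
    return c


# Update gate almost closed: the stored value survives the whole sequence.
kept = carry_memory(1.0, steps=1000, update_logit=-20.0)

# Update gate half open: the stored value is overwritten within a few steps.
lost = carry_memory(1.0, steps=1000, update_logit=0.0)

print(kept, lost)
```

When the update gate stays close to 0, the memory cell is copied forward almost unchanged, which is how information from early in a long sequence can still be available much later.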

I am sharing a link about your query on how the LSTM is similar to the GRU,

Feel free to ask or give any feedback.

Regards
DP