How can an LSTM or GRU decide what to forget or remember?

Hi, I was wondering how we can actually control which data an LSTM or GRU stores in its memory cell.
I am basing my question on the first week's lectures on GRUs and LSTMs, mainly the lecture about the GRU.

For instance, take the sentence: “the cat/cats, which is …, drinks/drink milk”.
Due to the vanishing gradient problem, a basic RNN cannot handle this singular/plural agreement across such a long gap.
In the GRU lecture, Andrew Ng said that we can set the update gate to 1 for “cat”, keep the update gate at 0 for the words between the commas, and set it to 1 again for “drinks/drink”, so that at the end the memory cell contains the info about the noun and the verb and the network can learn the grammar itself. He added that we can easily set the update gate to 0 if (Wu[c, x] + bu) is negative, since the sigmoid will then return a value close to 0.
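To show what I mean, here is a minimal NumPy sketch of the simplified GRU step from the lecture (no relevance gate; the names Wu, bu, Wc, bc follow the lecture notation, and the shapes are made up just for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(c_prev, x_t, Wu, bu, Wc, bc):
    # One step of the simplified GRU from the lecture (no relevance gate)
    concat = np.concatenate([c_prev, x_t])             # stack [c<t-1>, x<t>]
    gamma_u = sigmoid(Wu @ concat + bu)                # update gate: ~0 when pre-activation is very negative
    c_tilde = np.tanh(Wc @ concat + bc)                # candidate new memory value
    return gamma_u * c_tilde + (1 - gamma_u) * c_prev  # gate ~0 -> old memory kept unchanged

# Toy shapes, just to run the function once
n_c, n_x = 4, 3
rng = np.random.default_rng(0)
Wu = rng.standard_normal((n_c, n_c + n_x)); bu = np.zeros(n_c)
Wc = rng.standard_normal((n_c, n_c + n_x)); bc = np.zeros(n_c)
c = gru_step(np.zeros(n_c), rng.standard_normal(n_x), Wu, bu, Wc, bc)
```

As the sketch shows, gamma_u is computed entirely from Wu and bu, which is exactly what confuses me: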

My question is: how do we make it end up negative? Isn't that the algorithm's own calculation, conducted automatically? Or do we manually hardcode the update gate to 0 for each input we want to mute? I thought the algorithm learns by itself what to forget or remember, but I am unsure whether I understood correctly, and I still wonder how an LSTM or GRU decides what to keep (update) and what to forget.

It might be that I have not correctly understood the fundamental principle behind the algorithm's calculations, so I will re-study the lectures. But could you kindly explain how these algorithms decide what information to keep or ignore?


Did you find an answer to your question?

I have the exact same question! It would be helpful if anyone could answer this!

Have you found an answer to this question yet? I think many students have the same question while trying to figure out what Andrew meant. He didn't mean manually setting the gates to 1 or 0. Please note that every gate has its own weights (e.g., Wu). When we train the GRU or LSTM model, those weights are learned by gradient descent, and afterwards the gates automatically function as gates, knowing when to be 1 or 0 (actually any value between 0 and 1). Hope this helps, though there are a lot of details to be explained and this is only a summary.
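If it helps, here is a quick sketch (in PyTorch, which is just my choice for illustration, not something from the course) showing that the gate weights are ordinary trainable parameters that receive gradients like any other weights, so gradient descent, not a human, decides when each gate opens or closes:

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=8, hidden_size=16, batch_first=True)

# weight_ih_l0 stacks the input weights of all three gates (reset, update, new);
# they are regular learnable parameters, not values anyone sets by hand.
print(gru.weight_ih_l0.shape)              # torch.Size([48, 8]) = 3 gates x 16 units

x = torch.randn(4, 10, 8)                  # toy batch: (batch, time, features)
out, h = gru(x)
loss = out.pow(2).mean()                   # stand-in loss, only to drive backprop
loss.backward()

# The gate weights receive gradients, so training adjusts when each gate
# outputs values near 1 (update the memory) or near 0 (keep the old memory).
print(gru.weight_ih_l0.grad is not None)   # True
```

After enough training on a task like subject-verb agreement, the update gate learns on its own to stay near 0 across the clause between the commas and to open near 1 at the noun and the verb.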