LSTM video

Here Andrew says that with the peep-hole connection, there is a one-to-one connection between the elements of the memory unit and the elements of the gate.


However, if you look at the expanded equation here, each element in the memory unit impacts each element in the gate. Am I misunderstanding this?

Can you give a time mark for where you see this in the lecture?

(Update: I found it, the “peephole connection” is discussed starting at 6:51)

Based on this simplified notation that Andrew is using for RNNs, I’m not sure whether your expanded notation is correct.

You are looking at the wrong timestamp. Start watching from the beginning of the 7th minute.

I watched the entire video, start to finish.

And I also watched the entire “Recurrent Neural Network Model” video, in order to understand the notation that Andrew is using.

Didn’t you find the term peep-hole connection?

Perhaps the most common one is that instead of just having the gate values be dependent only on a_{t-1} and x_t, sometimes people also sneak in there the value c_{t-1} as well. This is called a peephole connection.

(7:09) Not a great name maybe, but if you see peephole connection, what that means is that the gate values may depend not just on a_{t-1} and on x_t, but also on the previous memory cell value. And the peephole connection can go into all three of these gates’ computations.

So that’s one common variation you see of LSTMs. One technical detail is that these are, say, 100-dimensional vectors. If you have a 100-dimensional hidden memory cell unit, then so is this. And so, say, the fifth element of c_{t-1} affects only the fifth element of the corresponding gates. So that relationship is one-to-one, where not every element of the 100-dimensional c_{t-1} can affect all elements of the gates, but instead the first element of c_{t-1} affects the first element of the gates, the second element affects the second element, and so on.

But if you ever read a paper and see someone talk about the peephole connection, that’s what they mean: that c_{t-1} is used to affect the gate value as well.

Here are the exact subtitles from the video.
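
To make the one-to-one point in the transcript concrete, here is a minimal NumPy sketch. It is only an illustration under assumptions: the sizes, the weight values, and the names `w_peep` and `W_full` are made up for this example and do not come from the lecture. It bumps a single element of c_{t-1} and checks which gate elements change under an elementwise peephole versus a hypothetical dense one.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_a, n_x = 4, 3  # tiny sizes for illustration (the lecture talks about 100)

# Made-up inputs and weights, chosen only so the effect is easy to see.
a_prev = np.array([0.1, -0.2, 0.3, 0.0])   # a_{t-1}
x_t = np.array([0.5, -0.5, 0.2])           # x_t
c_prev = np.array([0.4, -0.1, 0.2, 0.3])   # c_{t-1}

W_a = np.full((n_a, n_a), 0.1)
W_x = np.full((n_a, n_x), 0.1)
b = np.zeros(n_a)

w_peep = np.array([0.5, -0.3, 0.8, 0.2])   # one peephole weight per cell element
W_full = np.full((n_a, n_a), 0.1)          # hypothetical dense peephole matrix

def gate_one_to_one(c):
    # Elementwise product: element i of c only reaches element i of the gate.
    return sigmoid(W_a @ a_prev + W_x @ x_t + w_peep * c + b)

def gate_dense(c):
    # Matrix product: every element of c reaches every element of the gate.
    return sigmoid(W_a @ a_prev + W_x @ x_t + W_full @ c + b)

# Perturb only element 2 of c_{t-1}.
c_bumped = c_prev.copy()
c_bumped[2] += 1.0

changed_one_to_one = ~np.isclose(gate_one_to_one(c_prev), gate_one_to_one(c_bumped))
changed_dense = ~np.isclose(gate_dense(c_prev), gate_dense(c_bumped))

print("one-to-one peephole, changed gate elements:", changed_one_to_one)
print("dense peephole, changed gate elements:     ", changed_dense)
```

With the elementwise version only gate element 2 moves; with the dense version all of them do. The elementwise product `w_peep * c` is the same as multiplying by a diagonal matrix, which is one way to read "the fifth element of c_{t-1} affects only the fifth element of the gates".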

Yes, I did. That segment of the video starts at the time mark I gave.

I don’t need the subtitles, I’ve watched the whole video.

Then do you find my equation correct now? Andrew says that people use this version of LSTM as well. I have just expanded the first term there.
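
For reference, the two possible readings of the expanded peephole term can be written side by side. This is a sketch under assumptions: it uses the course’s compact gate notation for the update gate as I understand it, and the split weight names W_{uc}, W_{ua}, W_{ux} and the vector w_{uc} are labels introduced here for illustration, not symbols from the lecture slides.

```latex
% Gate without a peephole, compact notation:
\Gamma_u = \sigma\!\left(W_u\,[\,a^{\langle t-1\rangle},\, x^{\langle t\rangle}\,] + b_u\right)

% Reading A (dense): W_{uc} is a full matrix, so every element of
% c^{<t-1>} can affect every element of the gate.
\Gamma_u = \sigma\!\left(W_{uc}\, c^{\langle t-1\rangle} + W_{ua}\, a^{\langle t-1\rangle} + W_{ux}\, x^{\langle t\rangle} + b_u\right)

% Reading B (one-to-one): the peephole weights form a vector w_{uc}
% applied elementwise, equivalent to a diagonal W_{uc}.
\Gamma_u = \sigma\!\left(w_{uc} \odot c^{\langle t-1\rangle} + W_{ua}\, a^{\langle t-1\rangle} + W_{ux}\, x^{\langle t\rangle} + b_u\right)
```

The transcript’s statement that the fifth element of c_{t-1} affects only the fifth element of the gate matches reading B. An expansion in which the c_{t-1} term is multiplied by a full matrix corresponds to reading A.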