LSTM vs bidirectional RNN

I am following the course and arrived at the video describing LSTMs and bidirectional RNNs.

I am sorry if this question feels unclear or dumb.

It feels like the motivation behind these two techniques, and the issue they are trying to resolve, is the same: letting parts of the input that are “far” from the current prediction influence it, i.e. creating “links” between distant input tokens.

Do we still need an LSTM when we “fold” the input the way a bidirectional RNN does?

You can have unidirectional LSTM and bidirectional LSTM-based networks, as well as other types based on different cells! The RNN cells can be simple RNN, LSTM, or GRU.
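To make that concrete, here is a minimal sketch (assuming TensorFlow/Keras, since the thread doesn’t say which framework you’re using; the unit counts are arbitrary) showing that the cell type and the directionality are independent choices:

```python
from tensorflow.keras import layers

# Cell type and directionality are independent choices:
uni_rnn  = layers.SimpleRNN(64)                         # plain RNN, forward only
uni_lstm = layers.LSTM(64)                              # LSTM, forward only
bi_rnn   = layers.Bidirectional(layers.SimpleRNN(64))   # plain RNN, both directions
bi_lstm  = layers.Bidirectional(layers.LSTM(64))        # LSTM, both directions
bi_gru   = layers.Bidirectional(layers.GRU(64))         # GRU, both directions
```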


Right! The point of Bidirectional is that at any given timestep the cell can “see” (be influenced by or learn from) both the previous and the future timesteps. In a unidirectional LSTM (or any other type of RNN), that’s not true: it can only be influenced by what happened in the past timesteps.

The point of LSTM versus a plain vanilla RNN is that it can more easily learn from things that happened far in the past. Then when you add Bidirectional, it can learn from things far in the future as well.
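A quick way to see the “both directions” part (again just a sketch, assuming TensorFlow/Keras): the Bidirectional wrapper runs one copy of the layer forward over the sequence and another backward, and by default concatenates their outputs, which is why the output dimension doubles.

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((1, 10, 8))  # (batch, timesteps, features)

# Forward-only LSTM: one final hidden state per sequence
print(layers.LSTM(32)(x).shape)                         # (1, 32)

# Bidirectional: forward and backward passes, concatenated
print(layers.Bidirectional(layers.LSTM(32))(x).shape)   # (1, 64)
```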


Thanks to both of you.
My take on the problems solved by bidirectional RNNs and LSTM-based RNNs:

  • bidirectional: takes future events into account
  • LSTM: takes “far” events into account (past or future, depending on the architecture)