Can anyone clarify LSTMs to me? Better with an example

Please provide as much detail as you can.

Great question!

If you’re starting out, I’d highly recommend Andrew Ng’s videos in Course 5 of the Deep Learning Specialization (Sequence Models) — he gives one of the most beginner-friendly walkthroughs of LSTMs. Also, Course 3 of the NLP Specialization (on Sequence Models for NLP) builds on that and applies LSTMs to real tasks like text generation and machine translation.

But what is an LSTM?

An LSTM (Long Short-Term Memory) is a type of recurrent neural network (RNN) designed to better capture patterns over longer sequences. Standard RNNs tend to “forget” earlier parts of the input if the sequence is long — LSTMs solve this by adding gates that control what to keep, update, or forget.

Each LSTM unit has:

  • A forget gate (what to discard from memory),
  • An input gate (what new information to store),
  • And an output gate (what to pass to the next layer/time step).

This gating mechanism helps it remember important context — like grammar or meaning — over long sentences.
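To make the three gates concrete, here's a minimal sketch of a single LSTM step in plain Python. It's a scalar (one-unit) version just to show the math; the weights in `W` are made-up placeholders, not learned values, and a real implementation would use vectors and matrices.

```python
import math

def sigmoid(x):
    # Squashes any number into (0, 1) -- gates use this as a "how much" dial.
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W):
    """One LSTM time step for a single unit.

    W maps each gate name to (input weight, recurrent weight, bias).
    These weights are illustrative; in practice they are learned.
    """
    f = sigmoid(W["f"][0] * x + W["f"][1] * h_prev + W["f"][2])   # forget gate
    i = sigmoid(W["i"][0] * x + W["i"][1] * h_prev + W["i"][2])   # input gate
    g = math.tanh(W["g"][0] * x + W["g"][1] * h_prev + W["g"][2]) # candidate memory
    o = sigmoid(W["o"][0] * x + W["o"][1] * h_prev + W["o"][2])   # output gate

    c = f * c_prev + i * g       # new cell state: keep some old, write some new
    h = o * math.tanh(c)         # new hidden state: what gets passed onward
    return h, c

# Processing a sequence: feed inputs (here, arbitrary numbers) one at a time.
h, c = 0.0, 0.0
W = {"f": (0.5, 0.5, 0.0), "i": (0.5, 0.5, 0.0),
     "g": (0.5, 0.5, 0.0), "o": (0.5, 0.5, 0.0)}
for x in [0.2, -0.1, 0.7]:
    h, c = lstm_step(x, h, c, W)
```

Note how the cell state update `c = f * c_prev + i * g` is additive: when the forget gate is near 1, old memory flows through almost untouched, which is exactly what lets LSTMs hold onto context over long sequences.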

Simple example

Let’s say you’re feeding an LSTM the sentence:

“The clouds are dark and it looks like it’s going to…”

You want it to predict the next word. The correct prediction might be “rain”.

An LSTM is able to use context from earlier in the sentence (“clouds”, “dark”) to realize that “rain” is a more likely continuation than, say, “snow” or “shine”.

Where a basic RNN might forget the early part (“The clouds”), an LSTM can retain that memory, allowing it to make better predictions.
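You can see this forgetting-vs-retaining difference in a toy numeric comparison. The sketch below hand-picks weights (purely for illustration, nothing is learned here) and tracks how well each model carries a signal seen at step 1 across 50 "empty" time steps:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

signal = 1.0   # something important seen early, like "clouds"
steps = 50

# Plain RNN: the state is repeatedly squashed through tanh and shrinks.
h_rnn = signal
for _ in range(steps):
    h_rnn = math.tanh(0.5 * h_rnn)   # recurrent weight 0.5, no new input

# LSTM cell state: with the forget gate saturated near 1 and the input
# gate near 0, the additive update c = f*c + i*g leaves memory intact.
c = signal
for _ in range(steps):
    f = sigmoid(10.0)    # forget gate ~1: keep the memory
    i = sigmoid(-10.0)   # input gate ~0: write nothing new
    c = f * c + i * 0.0

print(h_rnn)  # tiny: the early signal has all but faded
print(c)      # still near 1: the LSTM kept it
```

This is of course a caricature (real gates react to the input at every step), but it captures why the gated, additive cell-state update gives LSTMs a much longer effective memory than a vanilla RNN's repeated squashing.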
