From the week 1 assignment:
Implementing this using just a Recurrent Neural Network (RNN) with LSTMs can work for short- to medium-length sentences, but can result in vanishing gradients for very long sequences.
I would like to clarify: when I am planning the NN, at what point should I choose attention models? From what sequence length onward?
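For intuition on the vanishing-gradient point above, here is a minimal sketch. It assumes PyTorch (the course material quoted here doesn't specify a framework), and it uses a plain `nn.RNN` because the effect is easiest to see there; all sizes are illustrative:

```python
# Sketch: gradients w.r.t. the earliest input typically shrink as the
# sequence gets longer in a plain RNN. Illustrative only, not course code.
import torch
import torch.nn as nn

for seq_len in (10, 50, 200):
    rnn = nn.RNN(input_size=8, hidden_size=8, batch_first=True)
    x = torch.randn(1, seq_len, 8, requires_grad=True)
    out, _ = rnn(x)
    # Loss depends only on the final timestep, as in a seq2seq encoder.
    loss = out[:, -1, :].sum()
    loss.backward()
    # Gradient flowing back to the very first input token; this norm
    # usually collapses toward zero as seq_len grows.
    print(seq_len, x.grad[0, 0].norm().item())
```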
Oh, I've found the answer in the next reading:
You can see how this will be an issue for very long sentences (e.g. 100 tokens or more) because the context of the first parts of the input will have very little effect on the final vector passed to the decoder.
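To make the "final vector" part of that quote concrete, here is a minimal sketch of the bottleneck, again assuming PyTorch with illustrative dimensions:

```python
# Sketch: in a plain encoder-decoder, the whole input, however long, is
# squeezed into one fixed-size vector before reaching the decoder.
import torch
import torch.nn as nn

embed_dim, hidden_dim, seq_len = 32, 64, 100
encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

tokens = torch.randn(1, seq_len, embed_dim)   # 100 embedded input tokens
outputs, (h_n, c_n) = encoder(tokens)

# Only the final hidden state is passed on; information about the first
# tokens must survive 100 update steps to still be represented in it.
print(outputs.shape)  # torch.Size([1, 100, 64]) -- per-token states, unused here
print(h_n.shape)      # torch.Size([1, 1, 64])   -- the single context vector
```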
Really? 100 tokens doesn't seem like a very large amount of incoming data. Do I understand correctly that other incoming data (objects from a database, for example) is included in these 100 tokens?
It says a reasonable limit for one input is 100 tokens!
So I can't pass additional objects from a database to the NN if the input exceeds 100 tokens?
Yeah, in that case it's better to use another type of model, maybe Transformers!
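For completeness, here is a sketch of why attention avoids the fixed-vector bottleneck: the decoder can query every encoder output directly. This is a hedged illustration using PyTorch's `nn.MultiheadAttention`, not the course's own implementation:

```python
# Sketch: with attention, the context is a weighted mix over ALL input
# positions, so early tokens contribute regardless of sequence length.
import torch
import torch.nn as nn

hidden_dim, seq_len = 64, 300   # well past the ~100-token RNN comfort zone
attn = nn.MultiheadAttention(embed_dim=hidden_dim, num_heads=4, batch_first=True)

encoder_outputs = torch.randn(1, seq_len, hidden_dim)  # one state per input token
decoder_state = torch.randn(1, 1, hidden_dim)          # current decoder query

context, weights = attn(decoder_state, encoder_outputs, encoder_outputs)
print(context.shape)  # torch.Size([1, 1, 64])
print(weights.shape)  # torch.Size([1, 1, 300]) -- one weight per input token
```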