Week 1. I’m not seeing the ‘link’ between generating a ‘possible’ next word and answering a prompt?
Should I see such a link (RNN to LLM)?
You mean something in between RNNs and LLMs? An RNN is one type of NLP model, while LLMs are mostly based on the transformer architecture.
Generating the next word makes sense. But going from that to ‘answering a question’ seems like a very big leap?
This is not a course that explains those architectures in detail; the DLS and NLP specializations do that.
Yes, it is a big leap, but the larger point is that LLMs (at least at the current SoTA) are not really “thinking” in the sense that humans do. An answer to a question is just a sequence of words and the way it chooses the next word is based on patterns that it has learned by “digesting” the training corpus of actual word sequences that the system designers gathered for that purpose. It is just doing a sophisticated version of pattern matching based on the training data that was fed to it. If the sequence starts with the question you asked, then what are the patterns of word sequences that would be likely to follow that?
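Just to make that concrete, here is a minimal sketch of the idea. This is a toy bigram model, not a real LLM, and the corpus, function names, and numbers are all made up for illustration. It only shows the basic loop: count word-sequence patterns in a training corpus, then “answer” by repeatedly picking a likely next word after the prompt.

```python
# Toy sketch: "answering" as repeated next-word prediction conditioned on the prompt.
# Everything here (corpus, names, sizes) is invented for illustration only.
import random
from collections import defaultdict, Counter

# A hypothetical tiny "training corpus" of word sequences
corpus = [
    "what is the capital of france ? the capital of france is paris .",
    "what is the capital of japan ? the capital of japan is tokyo .",
]

# Learn the "patterns": which word tends to follow which word in the training data
next_counts = defaultdict(Counter)
for line in corpus:
    words = line.split()
    for w, nxt in zip(words, words[1:]):
        next_counts[w][nxt] += 1

def generate(prompt, max_words=12):
    """Repeatedly pick a likely next word; the prompt steers the continuation."""
    words = prompt.split()
    for _ in range(max_words):
        candidates = next_counts.get(words[-1])
        if not candidates:
            break
        # sample proportionally to how often each continuation appeared in training
        choices, weights = zip(*candidates.items())
        words.append(random.choices(choices, weights=weights)[0])
        if words[-1] == ".":
            break
    return " ".join(words)

print(generate("what is the capital of france ?"))
```

Because this toy model only looks at the single previous word, it can happily wander off and answer with “tokyo” for a question about France. Real LLMs condition on the whole prompt at once, which is exactly what the transformer’s attention mechanism (mentioned further down in this thread) is for.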
Just to belabor the point a little more: there is no sense in which the LLM “understands” the meaning of the question. All it is doing is repeating patterns that it has learned from its training data. That’s why the quality of the training set is so critical: if you just feed it junk, then the output will also be junk. Just a high-powered version of the traditional GIGO axiom … E.g., if you accidentally scrape some websites that contain non-factual statements, then you may well get equivalent statements in the output of the LLM.
Even with a very carefully vetted training set, the LLM will still sometimes produce “hallucinations” or confabulations in the output. The people working on this have spent quite a bit of effort using techniques like Reinforcement Learning to damp down the frequency of hallucinations, but (at least as far as I’ve heard as of early February 2025) no one really has a solution to that problem.
Thanks. Still a mystery to me, internal to the transformer setup I guess.
LLMs are just really big statistical text-sequence predictors.
The transformer architecture lets them condition on a long sequence of words, so they implement something that imitates context-awareness.
The link between generating the next possible word and answering a prompt lies in how language models predict text. While traditional RNNs generate sequences based only on the context seen so far, transformer-based LLMs take this further by using attention mechanisms to efficiently weigh the relevance of all previous tokens.
In essence, answering a prompt is just an extension of predicting the next word, guided by the broader context of the prompt, the patterns learned during training, and fine-tuning.
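To make the attention point a bit more concrete, here is a rough NumPy sketch of causal scaled dot-product attention, the core operation that lets each position weigh every earlier token in the prompt. The dimensions, random weights, and variable names are all placeholders, not a real model.

```python
# Rough sketch of causal scaled dot-product attention (toy sizes, random weights).
# All shapes and values here are assumptions made up for illustration.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

seq_len, d_model = 5, 8                    # 5 tokens in the prompt, 8-dim embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(seq_len, d_model))    # stand-in for token embeddings

# In a real model W_q, W_k, W_v are learned; here they are random placeholders.
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
Q, K, V = X @ W_q, X @ W_k, X @ W_v

scores = Q @ K.T / np.sqrt(d_model)        # how relevant each token is to each other token
mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
scores[mask] = -np.inf                     # causal mask: only earlier tokens are visible
weights = softmax(scores, axis=-1)         # each row sums to 1: a weighting over prior tokens
context = weights @ V                      # context-aware representation fed to the next-word prediction

print(weights.round(2))  # the last row shows how the final position attends over the whole prompt
```

The last row of the printed matrix is the key intuition: when the model predicts the word after your question, it is mixing information from every token of the question, weighted by learned relevance, rather than just looking at the most recent word.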