How do LLMs work when doing mathematical calculations?

LLMs are trained on text data, and they mostly generate text responses for different use cases. I want to know how LLMs understand the language of mathematics. If I want to calculate 2^4, how is that a language-based problem?

Hi @Sahil_Arafat

Welcome to the community.

This is a good question. And the answer is: LLMs don’t understand anything. They generate text, lines of code, or even math calculations based on patterns.

That means that if an LLM was trained on a lot of wrong math expressions, it will mimic them. In simple terms, it will repeat the expressions it saw during training as a pattern. Of course, the generated answer will depend on the prompt, but in general, the model will generate the “best” answer based on a probabilistic approach.

They create a probability map of words and sequences of words based on large bodies of training data harvested from web pages, e-books, e-journals, etc.

https://www.maastrichtuniversity.nl/large-language-models-and-education
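
To make the "probability map" idea concrete, here is a toy sketch in Python. It is not how a real LLM is built (real models use neural networks over subword tokens and can generalize beyond exact matches); it only counts which token followed a given context in a small, made-up corpus. The corpus and the names `train`, `next_token_probs` and `complete` are all hypothetical, chosen just for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: most lines state 2^4 correctly, one is wrong.
corpus = [
    "2 ^ 4 = 16",
    "2 ^ 4 = 16",
    "2 ^ 4 = 16",
    "2 ^ 4 = 8",
    "2 ^ 3 = 8",
    "3 ^ 2 = 9",
]

def train(lines):
    """Count which token follows each preceding context (all tokens so far)."""
    table = defaultdict(Counter)
    for line in lines:
        tokens = line.split()
        for i in range(1, len(tokens)):
            context = tuple(tokens[:i])
            table[context][tokens[i]] += 1
    return table

def next_token_probs(table, prompt):
    """Turn the raw counts for a prompt into a probability map."""
    counts = table.get(tuple(prompt.split()), Counter())
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}

def complete(table, prompt):
    """Pick the most probable next token, or None for an unseen prompt."""
    probs = next_token_probs(table, prompt)
    return max(probs, key=probs.get) if probs else None

model = train(corpus)
print(next_token_probs(model, "2 ^ 4 ="))  # probabilities for "16" and "8"
print(complete(model, "2 ^ 4 ="))          # most frequent continuation
print(complete(model, "2 ^ 5 ="))          # never seen in the corpus

# Same code, but trained mostly on a wrong expression: it mimics the error.
bad_model = train(["2 ^ 4 = 8"] * 3 + ["2 ^ 4 = 16"])
print(complete(bad_model, "2 ^ 4 ="))
```

No exponentiation is ever computed here. The "answer" to 2^4 is just the continuation that appeared most often after that prompt, so a corpus full of wrong expressions produces a wrong answer with the same confidence.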

Best regards
elirod

Barely, and not very well.
