LLMs are trained on text data and mostly generate text responses for different use cases. I want to know how LLMs understand the language of mathematics. If I want to calculate 2^4, how is that a language-based problem?
Welcome to the community.
This is a good question, and the answer is: LLMs don’t understand anything. They generate text, lines of code, or even math calculations based on patterns.
That means that if a model was trained on a lot of wrong math expressions, it will mimic them. In simple terms, it will repeat the same expressions it saw during training as a pattern. The generated answer will of course depend on the prompt, but in general the model generates the “best” answer based on a probabilistic approach.
They build a probability map of words and sequences of words from large bodies of training data harvested from web pages, e-books, e-journals, etc.
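To make the idea concrete, here is a minimal sketch of that probabilistic approach: a toy n-gram-style model that only counts which token followed which context in its (hypothetical, made-up) training text. It never computes 2^4; it just repeats the most frequent continuation it has seen, wrong examples included. Real LLMs are vastly more sophisticated, but the principle of predicting the next token from patterns is the same.

```python
from collections import Counter, defaultdict

# Toy "training corpus": arithmetic written out as plain text.
# (Invented data for illustration; real LLMs train on web-scale text.)
corpus = [
    "2 ^ 4 = 16",
    "2 ^ 4 = 16",
    "2 ^ 4 = 15",   # a wrong example the model absorbs just like the others
    "2 ^ 3 = 8",
    "3 ^ 2 = 9",
]

# Count which token follows each context prefix.
continuations = defaultdict(Counter)
for line in corpus:
    tokens = line.split()
    for i in range(1, len(tokens)):
        continuations[tuple(tokens[:i])][tokens[i]] += 1

def predict(context_tokens):
    """Return the continuation seen most often in training, or None."""
    counts = continuations[tuple(context_tokens)]
    return counts.most_common(1)[0][0] if counts else None

print(predict(["2", "^", "4", "="]))  # "16": the majority pattern, not computed math
```

Note that if the wrong line "2 ^ 4 = 15" had been the majority in the corpus, the model would confidently answer "15". That is exactly the mimicry described above.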
https://www.maastrichtuniversity.nl/large-language-models-and-education
Best regards
elirod
Barely, and not very well.