LLM sequence probabilities

I have written a library for controlling LLMs using a language definition.

This is similar to using context-free grammars to control LLMs, except my grammars are not context-free.

Now I want to do stuff with probabilities of tokens and sequences of tokens.

I am considering adding a new feature to the grammar specification.
The idea is to have a way to adjust the probabilities of the various grammar branches. For example (in pseudo notation), a rule X:

X : Yes | No

This means that you expect either ‘Yes’ or ‘No’ and nothing else. What I want is to be able to adjust the probabilities of the branches.

For example:

X : Yes * 0.5 | No

would mean that ‘Yes’ is now half as probable as ‘No’, in a “completely neutral context”. So the LLM would need the probability of ‘Yes’ to be twice as high as that of ‘No’ for it to be chosen over ‘No’ as the next sequence of tokens. The idea is basically that the LLM would have to be very certain that ‘Yes’ is the right word to continue with for it to choose that. In other words, it would err on the side of caution, choosing ‘No’ in this example.
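
To make the intended semantics concrete, here is a minimal sketch of the selection rule, assuming we already have a probability for each branch (all names and numbers here are made up for illustration):

```python
# Minimal sketch of the weighted branch selection described above.
# The weights are hypothetical grammar annotations, not part of any library.
branch_weights = {"Yes": 0.5, "No": 1.0}  # the 0.5 penalizes 'Yes'

def pick_branch(model_probs: dict[str, float],
                weights: dict[str, float]) -> str:
    # Scale each branch's model probability by its grammar weight and take
    # the argmax: 'Yes' wins only if P(Yes) > 2 * P(No).
    return max(model_probs, key=lambda b: model_probs[b] * weights[b])

print(pick_branch({"Yes": 0.6, "No": 0.4}, branch_weights))  # No  (0.30 < 0.40)
print(pick_branch({"Yes": 0.7, "No": 0.3}, branch_weights))  # Yes (0.35 > 0.30)
```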

I understand that there are a couple of problems with this.

First of all, converting logits to probabilities could lose a lot of precision, so even if the math were correct, doing this by looking at probabilities might not work in practice.
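
That said, I think the precision issue can be sidestepped by never leaving log space: multiplying a probability by a weight w is equivalent, up to renormalization, to adding log(w) to the corresponding logit, because softmax exponentiates. A small sketch, assuming PyTorch tensors and made-up values:

```python
import math

import torch

logits = torch.tensor([2.1, 1.3, -0.4])
weight, token_id = 0.5, 0

# Route 1: convert to probabilities, scale, renormalize.
probs = torch.softmax(logits, dim=-1)
probs[token_id] *= weight
probs /= probs.sum()

# Route 2: add log(weight) to the logit and take softmax directly.
biased = logits.clone()
biased[token_id] += math.log(weight)

# Both routes yield the same distribution (and hence the same ranking),
# but route 2 never round-trips through probability space.
print(probs)
print(torch.softmax(biased, dim=-1))
```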

Secondly, I’m not sure we’re really talking about probabilities here. It may be that the selection of the next token is a matter of sorting order that doesn’t map to actual probabilities. It could be possible to map the sorting order to “probabilities”, though, and even to normalize the probabilities of a list of alternatives.
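
On the sorting-order point: softmax is strictly increasing, so ordering tokens by logit and ordering them by probability give the same result, and a softmax over just the grammar-legal alternatives yields normalized “probabilities” for that list. A sketch with made-up values:

```python
import torch

logits = torch.tensor([3.0, 1.0, 0.5, -2.0])
full_probs = torch.softmax(logits, dim=-1)
# Softmax preserves order, so the two sorts agree.
assert torch.equal(logits.argsort(), full_probs.argsort())

# Hypothetical ids of the tokens that may start 'Yes' and 'No'.
legal_ids = torch.tensor([0, 2])
local_probs = torch.softmax(logits[legal_ids], dim=-1)
print(local_probs)  # normalized over just the legal alternatives; sums to 1
```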

Thirdly, I’m not sure whether it is possible to compare probabilities of sequences of different lengths, or whether I would have to accept a solution that only looks at the first token in each sequence.
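
One heuristic I have seen for comparing sequences of different lengths is length normalization: compare the mean log-probability per token instead of the raw sum, since longer sequences accumulate more negative terms. A sketch with made-up numbers:

```python
# A sequence's log-probability is the sum of its per-token conditional
# log-probabilities; dividing by length gives a mean that is comparable
# across sequences of different lengths. The numbers are invented.
def sequence_logprob(token_logprobs: list[float]) -> float:
    return sum(token_logprobs)

def length_normalized(token_logprobs: list[float]) -> float:
    return sequence_logprob(token_logprobs) / len(token_logprobs)

yes = [-0.2]                       # hypothetical one-token branch
no_thank_you = [-0.3, -0.1, -0.1]  # hypothetical three-token branch

print(sequence_logprob(yes), sequence_logprob(no_thank_you))    # -0.2 vs -0.5
print(length_normalized(yes), length_normalized(no_thank_you))  # -0.2 vs ~-0.167
```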

Does anyone have any experience with this? Perhaps I have misunderstood how LLMs work.

Also, I want the weight mechanism to be easy to control, with predictable outcomes, and I understand that may be difficult to do well.

I’m hoping for some input on this.

If it is relevant, here is the project URL:

Have you studied Decision Trees?

No, I can’t say I’m familiar with Decision Trees. What is their relevance to what I’m trying to do?

Your description of using Yes/No thresholds (and potentially having to adjust the threshold values) brought decision trees to mind.

I understand. The Yes|No example is just an example of a grammar, though.

I think I just want to be able to adjust probabilities. I did this in an earlier version of this project: I converted logits to probabilities, multiplied by a factor, and then converted back to logits. It didn’t seem like the correct way to do it, and it didn’t work as expected.

In the current implementation, which I call eponec, I deal with logits in a LogitsProcessor, in the correct way according to the documentation and the examples shipped with transformers. Therefore I left that feature out. Now I want to put it back in, but I want to do it correctly and not hackishly.
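
To show the direction I have in mind (this is only a sketch, not eponec’s actual code), here is what a weight-aware LogitsProcessor could look like: it adds log(weight) to the logits of the tokens that can begin a weighted branch, so everything stays in logit space:

```python
import math

import torch
from transformers import LogitsProcessor

class BranchWeightLogitsProcessor(LogitsProcessor):
    """Sketch: bias the logits of branch-starting tokens by log(weight).

    `weighted_token_ids` is a hypothetical mapping from token id to grammar
    weight, e.g. {yes_token_id: 0.5}; it is not part of transformers.
    """

    def __init__(self, weighted_token_ids: dict[int, float]):
        self.bias = {tid: math.log(w) for tid, w in weighted_token_ids.items()}

    def __call__(self, input_ids: torch.LongTensor,
                 scores: torch.FloatTensor) -> torch.FloatTensor:
        # Adding log(w) to a logit multiplies that token's (unnormalized)
        # probability by w, matching the grammar's branch weight.
        for tid, b in self.bias.items():
            scores[:, tid] += b
        return scores
```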

Sorry, I don’t have much to offer on this.