I just thought I’d ask here to get some feedback and improve my understanding.
When using the Hallucinations validator in the course, an LLM response that looks quite good is often rejected by the validator.
For example:

query = "What vegan options are available"
response = "The vegan options available are the Veggie Supreme pizza with vegan cheese (available upon request)."
validation message = Error Message: Validation failed for field with errors: The following sentences are hallucinated: ['The vegan options available are the Veggie Supreme pizza with vegan cheese (available upon request).']
To me, the above response seems OK.
I have coded up an alternative approach (a rough sketch follows the list):
- Take the response and process it with spaCy to strip stop-words, spaces, and punctuation, and extract the proper nouns (PROPN), nouns, and verbs.
- Use a fast multi-pattern matcher (ahocorasick.Automaton) to determine how many of the filtered response words appear in the source corpus.
- If a threshold number of them appear in the corpus, then the response is (probably) not a hallucination.
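
For concreteness, here is a rough sketch of what I mean (the corpus text, the function names, and the 0.7 threshold are placeholders I made up for illustration, not anything from the course):

```python
import ahocorasick
import spacy

# Assumes: pip install spacy pyahocorasick
# and:     python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

KEEP_POS = {"PROPN", "NOUN", "VERB"}  # proper nouns, nouns, verbs


def content_words(text: str) -> list[str]:
    """Drop stop-words, spaces, and punctuation; keep lemmas of PROPN/NOUN/VERB tokens."""
    return [
        tok.lemma_.lower()
        for tok in nlp(text)
        if tok.pos_ in KEEP_POS
        and not (tok.is_stop or tok.is_punct or tok.is_space)
    ]


def build_automaton(corpus_texts: list[str]) -> ahocorasick.Automaton:
    """Index every content word of the source corpus for fast word lookups."""
    automaton = ahocorasick.Automaton()
    for text in corpus_texts:
        for word in content_words(text):
            automaton.add_word(word, word)
    automaton.make_automaton()
    return automaton


def grounded_fraction(response: str, automaton: ahocorasick.Automaton) -> float:
    """Fraction of the response's content words that also occur in the corpus."""
    words = content_words(response)
    if not words:
        return 1.0  # nothing substantive to check
    hits = sum(1 for w in words if w in automaton)  # exact whole-word membership
    return hits / len(words)


# Example usage: accept the response only if enough of its words are grounded.
THRESHOLD = 0.7  # placeholder; would need tuning on real data
corpus = [
    "Our menu includes the Veggie Supreme pizza. "
    "Vegan cheese is available upon request."
]
automaton = build_automaton(corpus)
score = grounded_fraction(
    "The vegan options available are the Veggie Supreme pizza with vegan cheese.",
    automaton,
)
print(f"grounded fraction: {score:.2f}",
      "-> probably not a hallucination" if score >= THRESHOLD
      else "-> flag as possible hallucination")
```

One design note: I index the corpus words once and do whole-word lookups against the automaton, rather than rebuilding an automaton per query; substring matching via automaton.iter() would also work but can over-match (e.g. "cat" inside "category").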
Is the above approach valid?