Other Models for "Build Your Own Hotdog vs. Not a hotdog"

First of all, thanks for a great course!

I’m currently trying to improve the hotdog classifier code, so I changed the generator line to use:

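import outlines
from outlines.samplers import greedy

# vmodel is the vision model already loaded earlier in the course notebook
hotdog_or_not = outlines.generate.choice(
    vmodel,
    ['hotdog', 'not a hotdog'],
    sampler=greedy(),
)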

It works, but the last image is classified as “hotdog”. I then tried changing the model to “HuggingFaceTB/SmolVLM-500M-Instruct”, but the performance got worse: everything is classified as “not a hotdog”.

Can someone please explain why that is the case? Are there other models that perform better than “SmolVLM-256M-Instruct”? Note that I tried these two, but I am getting errors:

  • HuggingFaceTB/SmolVLM2-2.2B-Instruct
  • openbmb/MiniCPM-Llama3-V-2_5

Sharing solutions is not allowed :face_with_peeking_eye: :shushing_face:.

However, you did a good job with your code! To solve the issue with the last image, consider enforcing a strict classification so that the output exactly matches one of the two valid answers:

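answer_regex = r"(hotdog|not a hotdog)"

# outlines.generate.regex takes the pattern as its second argument;
# outlines.generate.text does not accept a regex parameter.
hotdog_or_not = outlines.generate.regex(
    vmodel,
    answer_regex,
    sampler=greedy(),
)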
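
If it helps, here is a small standard-library sketch for checking which strings the answer pattern accepts. It does not call the model or outlines at all; the helper name `is_valid_answer` and the candidate strings are made up for illustration.

```python
import re

# Same pattern as in the snippet above.
ANSWER_REGEX = r"(hotdog|not a hotdog)"
VALID = re.compile(ANSWER_REGEX)


def is_valid_answer(text: str) -> bool:
    """Return True only if the whole string is one of the two labels."""
    # fullmatch requires the entire string to match, so extra words,
    # punctuation, or different capitalization are rejected.
    return VALID.fullmatch(text) is not None


# Hypothetical candidate outputs, just to exercise the pattern.
candidates = ["hotdog", "not a hotdog", "Hotdog", "not hotdog", "hotdog."]
results = {c: is_valid_answer(c) for c in candidates}

for candidate, ok in results.items():
    print(f"{candidate!r}: {ok}")
```

Only the two exact labels pass; anything else (different casing, trailing punctuation, a missing word) is rejected, which is the kind of strictness the regex is meant to enforce on the generated text.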



Hope this helps!