Llama2 70B model wrong?

Came here to see if anyone had commented on this, but perhaps it’s still too early.

I’m referring to the section where the 70B model is used to evaluate the responses of all the models on the neighbours problem.

Firstly, its own answer, that they are not neighbours, was wrong.
Secondly, so was its evaluation that 7B and 70B were correct and 13B incorrect.

I know I’m not comparing like for like; this is just an illustration of the current gap. Here is ChatGPT 4’s answer (without even needing to prompt about the ‘not enough information’):

Based on the information given:

  • Billy and Teddy are neighbours.
  • Teddy and Lenny are not neighbours.

There is no direct information provided about the relationship between Billy and Lenny in terms of being neighbours. Without further information, we cannot determine whether Billy and Lenny are neighbours or not.

EDIT: it occurred to me afterwards that perhaps, for some people, there’s a distinction between neighbours and next-door neighbours. If the definition of neighbours is this extended version, then the answers are correct. I’ll try it with perhaps a less ambiguous question.
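
To make the two readings concrete, here is a rough Python sketch. It is my own toy model, not anything from the original test. It assumes "next-door neighbours" means adjacent houses on a single street, and that the extended sense means "lives in the same neighbourhood"; the function names and the sizes (4 houses, 3 areas) are made up for illustration. Each function collects every possible answer to "are Billy and Lenny neighbours?" that is consistent with the two given facts.

```python
from itertools import permutations, product


def next_door_outcomes(n_houses=4):
    """Possible Billy-Lenny answers if 'neighbours' means adjacent houses.

    Assumption: houses are numbered 0..n_houses-1 along one street and
    each person lives in a different house.
    """
    outcomes = set()
    for billy, teddy, lenny in permutations(range(n_houses), 3):
        billy_teddy = abs(billy - teddy) == 1
        teddy_lenny = abs(teddy - lenny) == 1
        # Keep only layouts that match the two stated facts.
        if billy_teddy and not teddy_lenny:
            outcomes.add(abs(billy - lenny) == 1)
    return outcomes


def same_area_outcomes(n_areas=3):
    """Possible Billy-Lenny answers if 'neighbours' means same neighbourhood.

    Assumption: each person lives in exactly one of n_areas neighbourhoods.
    """
    outcomes = set()
    for billy, teddy, lenny in product(range(n_areas), repeat=3):
        # Keep only assignments that match the two stated facts.
        if billy == teddy and teddy != lenny:
            outcomes.add(billy == lenny)
    return outcomes


if __name__ == "__main__":
    print("next-door reading:", next_door_outcomes())
    print("same-neighbourhood reading:", same_area_outcomes())
```

Under the next-door reading both True and False turn up, so "not enough information" is the right answer. Under the same-neighbourhood reading only False turns up, which would make "not neighbours" correct, assuming that is what the model had in mind.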