Which local GPU is better for Inference

Which GPU is best for running AI models locally?

It depends. But in general: as much VRAM as possible, preferably an Nvidia Ada or Ampere generation card. I would go with Nvidia, unless you prefer configuring and setting things up instead of working on AI :wink:
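
As a rough rule of thumb (my own back-of-envelope math, not a hard spec): VRAM needed for inference is roughly parameter count × bytes per parameter, plus some overhead for the KV cache and activations. A quick sketch of that arithmetic:

```python
# Back-of-envelope VRAM estimate for running a model locally.
# The 20% overhead factor is my own rough assumption for KV cache
# and activations; real usage varies with context length and runtime.

def estimate_vram_gb(params_billions: float, bits_per_param: int,
                     overhead: float = 1.2) -> float:
    weight_gb = params_billions * bits_per_param / 8  # 1B params @ 8-bit = 1 GB
    return weight_gb * overhead

for params, bits in [(7, 16), (7, 4), (13, 4), (70, 4)]:
    print(f"{params}B @ {bits}-bit: ~{estimate_vram_gb(params, bits):.1f} GB VRAM")
```

By that estimate, a 4-bit 7B model fits comfortably in 12 GB, while a 4-bit 70B model (~42 GB) is roughly where a 48 GB card starts to earn its price.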
One of the “best” workstation GPUs currently available from Nvidia is the RTX 6000 with 48 GB of VRAM; it costs about 8k USD.
But an RTX 3060, in the 12 GB VRAM version, at 300–400 USD is an easy and cheap entry point.
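
To see what your own card reports, a minimal check (assuming a CUDA-enabled PyTorch build is installed):

```python
import torch

# Report the first local Nvidia GPU's name and total VRAM.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No CUDA-capable GPU detected")
```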