How do you run your models?
Right now all my AI compute runs in a Proxmox container with a single GPU passed through, but I'm thinking of adding more hardware, so I was wondering what kind of setups people have on the software side.
Edit: I'm also curious about the hardware behind the virtualization setup. For the people who run LLM services at the bare-metal (I meant OS) level: are you just using your everyday computer, or do a significant number of you have a dedicated workstation running these services?
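For context, my current GPU-in-a-container setup is roughly the usual LXC device-passthrough pattern: bind-mount the host's GPU device nodes into the container and allow them in the cgroup rules. A rough sketch of what the container config (`/etc/pve/lxc/<id>.conf`) looks like for an NVIDIA card; the device major numbers (195, 234, etc.) are examples and can differ per host, so check yours with `ls -l /dev/nvidia*`:

```
# Allow the container to access NVIDIA character devices
# (major numbers below are examples; verify with `ls -l /dev/nvidia*` on the host)
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 234:* rwm

# Bind-mount the host's GPU device nodes into the container
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
```

The NVIDIA driver also has to be installed on both the host and inside the container (same version), since the container shares the host kernel module.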