> Would a 12 GB RTX 3060 be good for their needs? Might be a way to save some money?

My personal experience is with Stable Diffusion rather than LLMs, but yeah, I consider the RTX 3060 12GB the cheapest sensible entry. Anything below 12GB runs out of VRAM too fast once you start increasing resolution or adding extensions (ControlNet).
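If you want to see how much headroom you actually have before a run, PyTorch can report it directly. A quick sketch, assuming a CUDA build of PyTorch:

```python
import torch

# torch.cuda.mem_get_info() returns (free_bytes, total_bytes)
# for the current CUDA device.
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info()
    print(f"VRAM: {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB")
else:
    print("No CUDA device found")
```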
Basically, my tier list for AI workloads right now would be roughly:
- 3060 12GB
- 4060Ti 16GB
- 3090 24GB
- 4090 24GB
A 3090 is nominally slower than a 4080, but the minute Stable Diffusion runs out of VRAM and has to start caching into system RAM, everything bogs down so much that a 3090 that hasn't yet run out of space will still finish first.
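You can actually watch how close you are to that cliff by measuring the peak allocation of a single generation. A rough sketch using the diffusers library; the model id and prompt are just example placeholders:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Reset the peak-allocation counter, run one generation, then read it back.
torch.cuda.reset_peak_memory_stats()
image = pipe("an astronaut riding a horse", height=768, width=768).images[0]
print(f"Peak VRAM used: {torch.cuda.max_memory_allocated() / 1e9:.1f} GB")
```

If that peak number is bumping against your card's total, the next resolution step up is where things start spilling into system RAM.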
I don't bother recommending AMD cards for this, because basically everything in AI is written for CUDA, and CUDA-targeted code is far easier to run on an Nvidia card.
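You can see that bias in the device-selection boilerplate of practically every AI repo: CUDA first, everything else treated as a fallback. An illustrative sketch of the usual pattern:

```python
import torch

# The de facto pattern: try CUDA, fall back to Apple's mps, then CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")   # Nvidia, the path everything is tested on
elif torch.backends.mps.is_available():
    device = torch.device("mps")    # Apple Silicon
else:
    device = torch.device("cpu")    # slow fallback
print(f"Running on: {device}")
```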
I know your pain... An A100 80GB would be great, but even pre-owned it's still way outside my price range (like $20K).
Unfortunately, I doubt that Nvidia will release a high-VRAM consumer GPU anytime soon. They have a financial incentive to steer any AI-related sales toward their high-margin enterprise cards, after all.
The odds of a new RTX Titan with 48GB of VRAM (or something along those lines) seem nil.
> I got myself an M2 Max MacBook Pro to run that sort of thing. It's expensive, but at 96GB of RAM, it's got... 80-85GB of effective VRAM.

I've thought about that, but as you say: still too slow for Stable Diffusion.
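For reference, this is roughly what running it on Apple Silicon looks like through diffusers' mps backend. A sketch, assuming a recent torch/diffusers combo; the model id and prompt are placeholders:

```python
import torch
from diffusers import StableDiffusionPipeline

# Stable Diffusion on Apple Silicon via PyTorch's mps backend.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("mps")
pipe.enable_attention_slicing()  # recommended on mps to keep memory use down

image = pipe("an astronaut riding a horse").images[0]
image.save("out.png")
```

The unified memory means you basically never OOM, but the per-image time is nowhere near a 3090's, let alone a 4090's.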
I'll just stick with my 4090 and hope that someone does something clever with VRAM management.
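To be fair, diffusers already ships a few of those tricks; they just trade speed for VRAM. A rough sketch, assuming a recent diffusers version (the offload call needs the accelerate package; model id and prompt are placeholders):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Trade speed for VRAM: slice the attention computation, tile the VAE
# decode, and offload idle submodules to system RAM between steps.
pipe.enable_attention_slicing()
pipe.enable_vae_tiling()
pipe.enable_model_cpu_offload()  # handles device placement itself

image = pipe("a cabin in the woods at dusk", height=1024, width=1024).images[0]
```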