Thread #737137393
File: sillytavern_logo.png (115.6 KB)
GLM, Deepseek, Kimi, or Gemini Flash for filthy poorfags such as myself?
>inb4 local
I don't have the hardware to run a decent model.
27 Replies
>>737138529
31b is insanely good for a local model but it doesn't stack up to 300b+ cloud models, of course
26b4a needs abliteration for lolishit because moes are better at refusing but it's fast and light enough i'm thinking about running it 24/7 on a second card as an assistant
>>737139337
https://www.reddit.com/r/SillyTavernAI/comments/1roxt1c/freaky_frankimstein_swansong_final_kimi_k25_think/
try this preset
>>737138976
You can use smaller models for sure with 12gb vram. 24b would be pushing it but may work with cpu split. Some of the better dense models like gemma4 31b would probably be very slow unless you get at least 16gb vram or quant it into lobotomy.
If you've got the ram for it you could run mixture-of-experts models like qwen3.5 35ba3b or gemma4 26ba4b. Those aren't as smart as dense models but are much faster when offloaded onto ram.
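the "24b is pushing it on 12gb" math above can be sketched roughly: weight memory is parameter count times bits per weight, plus some slack for KV cache and buffers. this is a back-of-the-envelope estimate only; the ~4.5 bits/weight figure (typical of a Q4_K_M-class GGUF quant) and the flat overhead value are assumptions, and real usage varies with context length and runtime.

```python
def estimate_vram_gb(params_b: float, bits_per_weight: float, overhead_gb: float = 1.5) -> float:
    """Rough VRAM (GB) to hold a model's weights plus a flat overhead.

    params_b: parameter count in billions (e.g. 24 for a 24B model)
    bits_per_weight: effective bits per weight after quantization
                     (~4.5 is a common ballpark for a mid-size quant)
    overhead_gb: assumed flat allowance for KV cache and buffers
    """
    weight_gb = params_b * bits_per_weight / 8  # billions of params * bytes per param
    return weight_gb + overhead_gb

# a 24B model at ~4.5 bits/weight is ~13.5 GB of weights alone,
# already over a 12 GB card before KV cache, hence the cpu split
print(round(estimate_vram_gb(24, 4.5), 1))  # -> 15.0
```

by the same math a 12B model at ~4.5 bits/weight is ~6.8 GB of weights, which is why 12gb cards handle that size comfortably.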