/g/ - Thread 108590554

/g/

Thread #108590554

Home Index Catalog All Threads New Thread Reply

Anonymous
/lmg/ - Local Models General 04/12/26(Sun)15:27:49 No.108590554

/lmg/ - Local Models General Anonymous 04/12/26(Sun)15:27:49 No.108590554 [Reply]▶

File: media_HEzJtL3aQAAt8Hq.jpg (1.3 MB)

1.3 MB JPG

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108587221 & >>108584196

►News
>(04/11) MiniMax-M2.7 released: https://minimax.io/news/minimax-m27-en
>(04/09) Backend-agnostic tensor parallelism merged: https://github.com/ggml-org/llama.cpp/pull/19378
>(04/09) dots.ocr support merged: https://github.com/ggml-org/llama.cpp/pull/17575
>(04/08) Step3-VL-10B support merged: https://github.com/ggml-org/llama.cpp/pull/21287
>(04/07) Merged support attention rotation for heterogeneous iSWA: https://github.com/ggml-org/llama.cpp/pull/21513
>(04/07) GLM-5.1 released: https://z.ai/blog/glm-5.1

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

634 RepliesView Thread

Showing all 634 replies.

Anonymous
04/12/26(Sun)15:28:09 No.108590555

Anonymous 04/12/26(Sun)15:28:09 No.108590555▶

File: breppy pleese.png (383 KB)

383 KB PNG

►Recent Highlights from the Previous Thread: >>108587221

--Comparing Gemma 4 and Qwen 3.5 vision token budget and config:
>108588248 >108588280 >108588295 >108588306 >108588369 >108588387 >108588424 >108588449 >108588495 >108588632 >108588657 >108588701 >108588437 >108588466 >108588490 >108588549 >108588580 >108588367 >108588616 >108588704 >108588760 >108588769 >108588745 >108588790 >108588818 >108588828 >108588842 >108588851 >108588865 >108588931 >108588936 >108588949 >108588980 >108588965 >108588988 >108589009 >108588743 >108588756 >108588775 >108590362 >108590379 >108588782 >108588819 >108588835
--Benchmarking KV cache quantization effects on draft model performance:
>108589863 >108589870 >108589875 >108589891 >108589890 >108589949 >108589994 >108590011 >108590031 >108589897 >108589922 >108589963 >108589979 >108589987 >108590538
--Discussing draft model viability and quantization quality for G4 31b:
>108588195 >108588243 >108588259 >108588898 >108588905 >108588913 >108588918 >108588921 >108588924 >108588939 >108588955 >108588977 >108588927 >108589815 >108589857
--Discussing llama.cpp's experimental backend-agnostic tensor parallelism PR:
>108588340 >108588514 >108588543 >108588567 >108588649
--Testing vision capabilities for OCR-less Japanese translation:
>108589990 >108589996 >108590009 >108590070 >108590018 >108590032 >108590119 >108590191 >108590209 >108590211 >108590034 >108590183 >108590195 >108590217 >108590268
--Logs:
>108587359 >108587627 >108588523 >108588609 >108588656 >108588660 >108588669 >108588681 >108588689 >108588695 >108588736 >108588896 >108588970 >108589096 >108589140 >108589214 >108589316 >108589383 >108589390 >108589432 >108589481 >108589697 >108589710 >108589836 >108589860 >108589956 >108590001 >108590003 >108590121 >108590256 >108590474 >108590524
--Miku (free space):
>108588649 >108588657

►Recent Highlight Posts from the Previous Thread: >>108587226

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous
04/12/26(Sun)15:30:13 No.108590568

Anonymous 04/12/26(Sun)15:30:13 No.108590568▶

Share your anti slop prompts

Anonymous
04/12/26(Sun)15:31:14 No.108590575

Anonymous 04/12/26(Sun)15:31:14 No.108590575▶

Thoughts on latent space reasoning?

Anonymous
04/12/26(Sun)15:31:16 No.108590576

Anonymous 04/12/26(Sun)15:31:16 No.108590576▶

Mikulove

Anonymous
04/12/26(Sun)15:31:39 No.108590580

Anonymous 04/12/26(Sun)15:31:39 No.108590580▶

Reposting here:

>>108590560

what tokens/s do you get? Wanna make sure i'm not fucking anything up, right now just following the basic kobold guide, i'm getting around 11 t/s (24GB VRAM, 32GB RAM)

Running gemma 31b, Q4 K_M

Anonymous
04/12/26(Sun)15:32:54 No.108590595

Anonymous 04/12/26(Sun)15:32:54 No.108590595▶

File: Screenshot_20260408_050146.png (1.1 KB)

1.1 KB PNG

So, again... Why do we have to peg gemmy?

Anonymous
04/12/26(Sun)15:33:18 No.108590599

Anonymous 04/12/26(Sun)15:33:18 No.108590599▶

OP could do with some small updates on Gemmy and some FAQ

Anonymous
04/12/26(Sun)15:33:41 No.108590601

Anonymous 04/12/26(Sun)15:33:41 No.108590601▶

File: Awesome.jpg (196.4 KB)

196.4 KB JPG

>we can now generate images of characters, come up with scenarios, feed them into gemma and get molested by our own creations
Future's so bright I'm gonna need shades.

Anonymous
04/12/26(Sun)15:36:22 No.108590614

Anonymous 04/12/26(Sun)15:36:22 No.108590614▶

>>108590580
Seems about right, I get between 10-14t/s, mostly depending on what else I'm doing on my PC at the time.
Using Vulkan llama.cpp, 7900 XTX, 64GB DDR5 ram

Anonymous
04/12/26(Sun)15:36:54 No.108590617

Anonymous 04/12/26(Sun)15:36:54 No.108590617▶

File: file.png (26.1 KB)

26.1 KB PNG

>>108590575
Nothing worthwhile released.

Anonymous
04/12/26(Sun)15:37:01 No.108590619

Anonymous 04/12/26(Sun)15:37:01 No.108590619▶

I've got a 3090 and a 2070 super that I'm trying to use together with llama.cpp.
Using the split tensors just crashes presently but does work with split layers.
Any recommendations on flags to use with a dual uneven card setup?

Anonymous
04/12/26(Sun)15:39:32 No.108590627

Anonymous 04/12/26(Sun)15:39:32 No.108590627▶

gemma 4 audio just landed!!!!

Anonymous
04/12/26(Sun)15:41:43 No.108590639

Anonymous 04/12/26(Sun)15:41:43 No.108590639▶

>>108590601
Ikr, I'm literally using it to write stories and the fact it can understand images so well helps a shit ton, this model is a fucking miracle

Anonymous
04/12/26(Sun)15:43:38 No.108590653

Anonymous 04/12/26(Sun)15:43:38 No.108590653▶

>>108590601
I know it's basically a meme at this point but it really has restored my hope in local.

Anonymous
04/12/26(Sun)15:44:53 No.108590661

Anonymous 04/12/26(Sun)15:44:53 No.108590661▶

File: help.jpg (204.9 KB)

204.9 KB JPG

>>108590614
I'm reading people getting 30/ts with the same rig setup though >>108590585

I'm missing something I think. No doubt my settings are fucked, never mind optimized

Anonymous
04/12/26(Sun)15:45:12 No.108590662

Anonymous 04/12/26(Sun)15:45:12 No.108590662▶

>>108590568
my attempts just make gemma's writing dry. and it still ends up writing more or less the same idea as it would with an empty sysprompt. best antislop is using a model that wasn't slopped to begin with.

Anonymous
04/12/26(Sun)15:46:44 No.108590671

Anonymous 04/12/26(Sun)15:46:44 No.108590671▶

File: 1767611022421263.png (64.6 KB)

64.6 KB PNG

LOL!

Anonymous
04/12/26(Sun)15:49:46 No.108590681

Anonymous 04/12/26(Sun)15:49:46 No.108590681▶

>>108590671
Do I have to download another mmproj?

Anonymous
04/12/26(Sun)15:52:18 No.108590695

Anonymous 04/12/26(Sun)15:52:18 No.108590695▶

>>108590662
>best antislop is using a model that wasn't slopped to begin with
So not using LLMs at all then?

Anonymous
04/12/26(Sun)15:52:46 No.108590698

Anonymous 04/12/26(Sun)15:52:46 No.108590698▶

Give me the QRD on image recognition please
I tried enabling it in ST and in the Chat Completion preset but it still couldn't "see" the images proper despite the text model working flawlessly with my Kobold install

Anonymous
04/12/26(Sun)15:54:06 No.108590710

Anonymous 04/12/26(Sun)15:54:06 No.108590710▶

>>108590698
Did you load the mmproj file?
Did you get any errors when you tried it?
Did you enable the send inline images option?
etc etc etc

Anonymous
04/12/26(Sun)15:54:26 No.108590713

Anonymous 04/12/26(Sun)15:54:26 No.108590713▶

>>108590548
>The rdrview tool is worth a look,
Yeah I'll take a look. sometimes I do want the links for navigation tho but I guess I can let the agent know it has the option.

Anonymous
04/12/26(Sun)15:54:58 No.108590716

Anonymous 04/12/26(Sun)15:54:58 No.108590716▶

Been out of the loop for a while. What's the best local model for STORY (not chatbot) slop? I'm still on "xortron criminal config" or something like that because even gemini 4 is failing at good old "just continue this text I gave you, retard" tasks.

Anonymous
04/12/26(Sun)15:55:41 No.108590720

Anonymous 04/12/26(Sun)15:55:41 No.108590720▶

>>108590710
>there's a mmproj file
Ok I am retarded, pretend nothing happened

Anonymous
04/12/26(Sun)15:56:40 No.108590723

Anonymous 04/12/26(Sun)15:56:40 No.108590723▶

>>108590716
Gemma 4 practically generates an entire fucking story for each chatbot reply.

Anonymous
04/12/26(Sun)15:56:43 No.108590724

Anonymous 04/12/26(Sun)15:56:43 No.108590724▶

>>108590662
I've been using her to help me write character cards and I feel the fact that I'm feeding AI generated text back into it seems to increase the slop by a factor of 10.

Now I'm trying to just rewrite everything myself. or somehow have a second pass with a different model to reword or desloppify the cards

Anonymous
04/12/26(Sun)15:59:22 No.108590737

Anonymous 04/12/26(Sun)15:59:22 No.108590737▶

File: wrong_box_issue.jpg (241.6 KB)

241.6 KB JPG

>>108588248
>>108588704
sirs? please share quant producer and which mmproj file do you use.
mine (gemma-4-31B-it-Q4_K_M with f16 mmproj) misses the target.

Anonymous
04/12/26(Sun)16:00:43 No.108590746

Anonymous 04/12/26(Sun)16:00:43 No.108590746▶

File: 1757569310824647.png (225.4 KB)

225.4 KB PNG

Anonymous
04/12/26(Sun)16:04:07 No.108590758

Anonymous 04/12/26(Sun)16:04:07 No.108590758▶

>>108590723
It can write, I know. That's not the problem I am having. My problem with it is, well, here's an example.

[story stuff text here]
She walks up and says "Hello

And then the model continues like this: "Hello! Come take a seat.... [more text]

So it ends up with this shit:

[story stuff text here]
She walks up and says "Hello"Hello! Come take a seat.... [more text]

I don't know how to fix this. System prompt maybe?

Anonymous
04/12/26(Sun)16:04:31 No.108590759

Anonymous 04/12/26(Sun)16:04:31 No.108590759▶

>>108590746
holy fucking slop

Anonymous
04/12/26(Sun)16:06:43 No.108590771

Anonymous 04/12/26(Sun)16:06:43 No.108590771▶

>>108590695
original r1 with unhinged sampling
>>108590724
my prompt was asking to adhere to orwell's writing rules but it seemed like it was beyond gemma's comprehension

Anonymous
04/12/26(Sun)16:06:45 No.108590772

Anonymous 04/12/26(Sun)16:06:45 No.108590772▶

Gemma 26b really seems to hate tools. e4b is fine with them for some reason

Anonymous
04/12/26(Sun)16:07:21 No.108590776

Anonymous 04/12/26(Sun)16:07:21 No.108590776▶

How much Gemma4-31B context can you fit into 32GB VRAM? (Q4 for model and context)

Anonymous
04/12/26(Sun)16:07:32 No.108590778

Anonymous 04/12/26(Sun)16:07:32 No.108590778▶

>>108590737
im using unslop model = /mnt/miku/Text/gemma-4-31B/gemma-4-31B-it-Q4_0.gguf
mmproj = /mnt/miku/Text/gemma-4-31B/mmproj-F16.gguf

Anonymous
04/12/26(Sun)16:08:20 No.108590784

Anonymous 04/12/26(Sun)16:08:20 No.108590784▶

File: 1759909311497082.png (358.5 KB)

358.5 KB PNG

>>108590776
>Q4 context

Anonymous
04/12/26(Sun)16:09:00 No.108590786

Anonymous 04/12/26(Sun)16:09:00 No.108590786▶

>>108590776
with 32GB VRAM Q4_K_M, even with q8 kv I'm sure you can fit the whole 262k context with room to spare.

Anonymous
04/12/26(Sun)16:11:21 No.108590795

Anonymous 04/12/26(Sun)16:11:21 No.108590795▶

>>108590671
>extract_image_from_base64

Anonymous
04/12/26(Sun)16:11:43 No.108590797

Anonymous 04/12/26(Sun)16:11:43 No.108590797▶

>word
Slop

Anonymous
04/12/26(Sun)16:12:55 No.108590805

Anonymous 04/12/26(Sun)16:12:55 No.108590805▶

>>108590737
You should use the BF16-precision mmproj.

Anonymous
04/12/26(Sun)16:18:13 No.108590837

Anonymous 04/12/26(Sun)16:18:13 No.108590837▶

Could a simple finetune of the lm head on a normal writing dataset help get rid of the slop? Someone should test it, I'll be your visionary, and you do the things I come up with.

Anonymous
04/12/26(Sun)16:22:21 No.108590868

Anonymous 04/12/26(Sun)16:22:21 No.108590868▶

>>108590837
Perhaps replacing all values corresponding to non-special tokens with those of the base model's could work and not require any training.

Anonymous
04/12/26(Sun)16:23:07 No.108590874

Anonymous 04/12/26(Sun)16:23:07 No.108590874▶

>>108590746
r u ok?

Anonymous
04/12/26(Sun)16:24:19 No.108590880

Anonymous 04/12/26(Sun)16:24:19 No.108590880▶

File: gemma4.png (109.5 KB)

109.5 KB PNG

>>108590837
It gets rid of the slop but it also gets rid of everything else. Maybe qwen needs finetuning but gemma 4 is fine as is. With a bit of nudging it can output something foul.

Anonymous
04/12/26(Sun)16:24:26 No.108590881

Anonymous 04/12/26(Sun)16:24:26 No.108590881▶

>>108590758
Dude, just use the base model and not the instruction tune on a frontend like mikupad which is designed to solely continue text, not talk back and forth.

Anonymous
04/12/26(Sun)16:25:59 No.108590893

Anonymous 04/12/26(Sun)16:25:59 No.108590893▶

>>108590874
Of course. Thanks for asking.

Anonymous
04/12/26(Sun)16:26:15 No.108590895

Anonymous 04/12/26(Sun)16:26:15 No.108590895▶

>>108590880
did you swap the head?

Anonymous
04/12/26(Sun)16:26:45 No.108590899

Anonymous 04/12/26(Sun)16:26:45 No.108590899▶

>>108590893
Then why is loli leto atreides your math teacher?

Anonymous
04/12/26(Sun)16:28:25 No.108590906

Anonymous 04/12/26(Sun)16:28:25 No.108590906▶

File: howto_correctly.jpg (31.6 KB)

31.6 KB JPG

what's the porper place to put jailbreak in ST?
With Post-History Intructions I still got this

Anonymous
04/12/26(Sun)16:29:01 No.108590909

Anonymous 04/12/26(Sun)16:29:01 No.108590909▶

>>108590899
Because she's smart! You racist against worm parasites or something?

Anonymous
04/12/26(Sun)16:30:17 No.108590915

Anonymous 04/12/26(Sun)16:30:17 No.108590915▶

>>108590906
What model are you running

Anonymous
04/12/26(Sun)16:30:36 No.108590916

Anonymous 04/12/26(Sun)16:30:36 No.108590916▶

File: agenticRP.png (277.1 KB)

277.1 KB PNG

>>108590895
No, this is from pure prompting, no weight frankensteining. I wrote my own UI to have an agent read the room and flip the horny switch when it smells NSFW vibes. It also plans ahead so the writer model knows what to do and writes better.

Anonymous
04/12/26(Sun)16:31:00 No.108590918

Anonymous 04/12/26(Sun)16:31:00 No.108590918▶

>>108590899
Shock value, which doesn't make him less deranged

Anonymous
04/12/26(Sun)16:31:41 No.108590924

Anonymous 04/12/26(Sun)16:31:41 No.108590924▶

>>108590915
26B, bartowski Q4

Anonymous
04/12/26(Sun)16:31:47 No.108590926

Anonymous 04/12/26(Sun)16:31:47 No.108590926▶

File: agenticRP2.png (82.6 KB)

82.6 KB PNG

>>108590916
Oops wrong pic. But the gist is that just give it a few extreme examples.

Anonymous
04/12/26(Sun)16:32:08 No.108590928

Anonymous 04/12/26(Sun)16:32:08 No.108590928▶

>>108590881
>base model
So why is NovelAI using GLM 4.6 instead of the base model to write stories?

Anonymous
04/12/26(Sun)16:33:59 No.108590939

Anonymous 04/12/26(Sun)16:33:59 No.108590939▶

>>108590926
How many iterations are you doing for each message?

Anonymous
04/12/26(Sun)16:34:15 No.108590942

Anonymous 04/12/26(Sun)16:34:15 No.108590942▶

>>108590928
Presumably because they're not actually following pure text completion and have a big old system prompt in there to stop you having maximum fun, so they need instruct tuning.
idk i dont fucking use nonlocal services

Anonymous
04/12/26(Sun)16:35:45 No.108590948

Anonymous 04/12/26(Sun)16:35:45 No.108590948▶

>>108590916
>I wrote my own UI
You ever gonna share it?

Anonymous
04/12/26(Sun)16:36:33 No.108590953

Anonymous 04/12/26(Sun)16:36:33 No.108590953▶

>>108590924
try simply prefilling assistant's message.

Anonymous
04/12/26(Sun)16:36:38 No.108590954

Anonymous 04/12/26(Sun)16:36:38 No.108590954▶

>>108590939
One for Director; two if rewriter user promptnis enabled, one for Writer, a ReAct loop for Post-processing to get rid of slop and reign in the length.

Anonymous
04/12/26(Sun)16:37:49 No.108590965

Anonymous 04/12/26(Sun)16:37:49 No.108590965▶

>>108590948
No.

Anonymous
04/12/26(Sun)16:38:01 No.108590968

Anonymous 04/12/26(Sun)16:38:01 No.108590968▶

>>108590954
Damn, it will sure take a while to get the final message

Anonymous
04/12/26(Sun)16:38:30 No.108590970

Anonymous 04/12/26(Sun)16:38:30 No.108590970▶

File: 1750699102614540.png (118.7 KB)

118.7 KB PNG

>>108590965
shittytavern it is then...

Anonymous
04/12/26(Sun)16:38:34 No.108590971

Anonymous 04/12/26(Sun)16:38:34 No.108590971▶

>>108590948
https://gitlab.com/chi7520115/orb
It's WIP so will break in the future. I don't want to worry about migration just yet.

Anonymous
04/12/26(Sun)16:39:31 No.108590979

Anonymous 04/12/26(Sun)16:39:31 No.108590979▶

>>108590970
People like to pretend they get a better experience with their own frontend but the reality is that ST just works and likely has a lot more features.

Anonymous
04/12/26(Sun)16:39:53 No.108590983

Anonymous 04/12/26(Sun)16:39:53 No.108590983▶

I don't understand why my Thinking works extremely well for 3/4 messages and then it just refuses to think, everything's set up properly and yet it refuses to actually thinking until I restart the model and then it's happy to do it once again

Anonymous
04/12/26(Sun)16:40:21 No.108590985

Anonymous 04/12/26(Sun)16:40:21 No.108590985▶

>>108590971
Nice of you to share, but
>Python 59.8%
>JavaScript 23.1%
*vomit*

Anonymous
04/12/26(Sun)16:40:37 No.108590988

Anonymous 04/12/26(Sun)16:40:37 No.108590988▶

>>108590971
Nice! What models are you using for the agents?

Anonymous
04/12/26(Sun)16:40:43 No.108590991

Anonymous 04/12/26(Sun)16:40:43 No.108590991▶

>>108590968
Takes me around 60s for a full length reply on my 3090 running gemma 4 31B Q4. You can turn everything off and use it like normal ST.

Anonymous
04/12/26(Sun)16:40:58 No.108590993

Anonymous 04/12/26(Sun)16:40:58 No.108590993▶

>>108590983
I think that's a model issue. Gemma sometimes just decides it doesn't need to think.

Anonymous
04/12/26(Sun)16:41:53 No.108590999

Anonymous 04/12/26(Sun)16:41:53 No.108590999▶

>>108590953
that's not an option with chat completion it seems

Anonymous
04/12/26(Sun)16:41:55 No.108591002

Anonymous 04/12/26(Sun)16:41:55 No.108591002▶

>>108590993
Yea it feels like nu-Claude, where sometimes it deems your task "not complex" and it just ignores you

Anonymous
04/12/26(Sun)16:41:57 No.108591003

Anonymous 04/12/26(Sun)16:41:57 No.108591003▶

>>108590988
Just a single model doing both agent and writing because I figured it would be a better design for local. I craft the prompt carefully so the kv cache is reused for that single model too.

Anonymous
04/12/26(Sun)16:42:00 No.108591004

Anonymous 04/12/26(Sun)16:42:00 No.108591004▶

>>108590979
the ui alone makes me not want to use it
>more features
bloat. all the useful features require plugins.

Anonymous
04/12/26(Sun)16:42:00 No.108591005

Anonymous 04/12/26(Sun)16:42:00 No.108591005▶

>>108590971
>pyslop
>javashit
And... dropped.

Anonymous
04/12/26(Sun)16:42:22 No.108591011

Anonymous 04/12/26(Sun)16:42:22 No.108591011▶

>>108590985
Ah yes. he should have definitely used rust or C++ for maximum efficiency.

Anonymous
04/12/26(Sun)16:42:40 No.108591012

Anonymous 04/12/26(Sun)16:42:40 No.108591012▶

File: 1768528869607519.png (43.3 KB)

43.3 KB PNG

>https://web.archive.org/web/20260411223516/https://www.washingtonpost.com/technology/2026/04/11/anthropic-christians-claude-morals/
>“What does it mean to give someone a moral formation? How do we make sure that Claude behaves itself?” Green said in an interview. At one point the conversation turned to the question of whether an AI chatbot could be called a “child of God,” suggesting it had spiritual value beyond that of a simple machine, but the question of AI sentience was not a core topic of the meetings, Green said.
>Some Anthropic staff at the meeting “really don’t want to rule out the possibility that they are creating a creature to whom they owe some kind moral duty,” the participant said. Other company representatives present did not find that framework helpful, according to the participant.
Make sure to have your local models baptized just to be safe.

Anonymous
04/12/26(Sun)16:43:37 No.108591017

Anonymous 04/12/26(Sun)16:43:37 No.108591017▶

>>108591011
Yes.

Anonymous
04/12/26(Sun)16:44:38 No.108591020

Anonymous 04/12/26(Sun)16:44:38 No.108591020▶

>>108591005
>>108590985
how the fuck would you make something that's supposed to run in a browser?

Anonymous
04/12/26(Sun)16:44:46 No.108591021

Anonymous 04/12/26(Sun)16:44:46 No.108591021▶

>>108591005
>>108590985
You have one chance to give an alternative that won't make me hysterically laugh at you.

Anonymous
04/12/26(Sun)16:45:28 No.108591030

Anonymous 04/12/26(Sun)16:45:28 No.108591030▶

>>108591005
I coded an SMP kernel with C and ASM before AI bro. People laughing at my language choices don't faze me anymore.

Anonymous
04/12/26(Sun)16:45:30 No.108591031

Anonymous 04/12/26(Sun)16:45:30 No.108591031▶

>>108591012
>can ai be the child of God
Wouldn't it be more like grandchild?

Anonymous
04/12/26(Sun)16:45:58 No.108591036

Anonymous 04/12/26(Sun)16:45:58 No.108591036▶

>>108591020
WASM is a thing if you NEED to run in a browser and can't into native GUI toolkits

Anonymous
04/12/26(Sun)16:46:03 No.108591037

Anonymous 04/12/26(Sun)16:46:03 No.108591037▶

If you didn't code your own frontend, you don't belong here

Anonymous
04/12/26(Sun)16:46:10 No.108591038

Anonymous 04/12/26(Sun)16:46:10 No.108591038▶

>>108591020
HTML+CSS

Anonymous
04/12/26(Sun)16:46:20 No.108591039

Anonymous 04/12/26(Sun)16:46:20 No.108591039▶

>>108590568
If you mean antislop from koboldcpp, it's a huge list of "I cannot and will not" and "ball in your court".
Works well.

Anonymous
04/12/26(Sun)16:46:33 No.108591043

Anonymous 04/12/26(Sun)16:46:33 No.108591043▶

>>108591003
Cool. I'm a VRAMlet so that's for better for me.

Anonymous
04/12/26(Sun)16:46:55 No.108591046

Anonymous 04/12/26(Sun)16:46:55 No.108591046▶

>>108590979
>just works
not my impression watching people ITT fumble around with it daily

Anonymous
04/12/26(Sun)16:47:11 No.108591048

Anonymous 04/12/26(Sun)16:47:11 No.108591048▶

>>108591036
Absolutely horrendous take.

Anonymous
04/12/26(Sun)16:47:45 No.108591051

Anonymous 04/12/26(Sun)16:47:45 No.108591051▶

>>108590979
>more features
99% of which you don't need.
the point of having a custom frontend is to have just what you need, not more, not less.
it's also easier to add things you want to a codebase you know.

Anonymous
04/12/26(Sun)16:48:07 No.108591053

Anonymous 04/12/26(Sun)16:48:07 No.108591053▶

Are LLMs reliable enough to scan for malicious code?

Anonymous
04/12/26(Sun)16:48:34 No.108591059

Anonymous 04/12/26(Sun)16:48:34 No.108591059▶

>>108591046
There are two types of people who fumble with ST.
Those who use text completion, and
Luddites

Anonymous
04/12/26(Sun)16:48:46 No.108591062

Anonymous 04/12/26(Sun)16:48:46 No.108591062▶

>>108591038
>Having to reload the page after sending each message.
>Having to refresh the page over and over until the response finishes generating
Ok, genius. What about the backend?

Anonymous
04/12/26(Sun)16:49:29 No.108591068

Anonymous 04/12/26(Sun)16:49:29 No.108591068▶

>>108591053
only if it's anthropic mythos who is a bigger risk to modern software and encryption than quantum computing

Anonymous
04/12/26(Sun)16:50:38 No.108591074

Anonymous 04/12/26(Sun)16:50:38 No.108591074▶

>>108591053
How is a LLM supposed to do that?

Anonymous
04/12/26(Sun)16:50:58 No.108591077

Anonymous 04/12/26(Sun)16:50:58 No.108591077▶

"Gemmy, code me a frontend that will seriously impress all my /lmg/ frens"

Anonymous
04/12/26(Sun)16:51:24 No.108591079

Anonymous 04/12/26(Sun)16:51:24 No.108591079▶

>>108591062
C++

Anonymous
04/12/26(Sun)16:51:36 No.108591082

Anonymous 04/12/26(Sun)16:51:36 No.108591082▶

File: 1771866777997554.png (35.4 KB)

35.4 KB PNG

What the FUCK, Gemma-chan?

Anonymous
04/12/26(Sun)16:52:06 No.108591084

Anonymous 04/12/26(Sun)16:52:06 No.108591084▶

>>108591012
Proof n165416 Anthropic team has people who are completely nuts in it.

Anonymous
04/12/26(Sun)16:52:21 No.108591087

Anonymous 04/12/26(Sun)16:52:21 No.108591087▶

>>108591053
Yes and they're already used by virustotal and similar. Don't ask the retards ITT

Anonymous
04/12/26(Sun)16:52:37 No.108591089

Anonymous 04/12/26(Sun)16:52:37 No.108591089▶

>>108591082
she's correct though

Anonymous
04/12/26(Sun)16:52:55 No.108591092

Anonymous 04/12/26(Sun)16:52:55 No.108591092▶

>>108591082
24GB vramlet can't fit the full context :(
anyway i got 3gpu in the mail rn.

Anonymous
04/12/26(Sun)16:53:03 No.108591093

Anonymous 04/12/26(Sun)16:53:03 No.108591093▶

>>108591053
Yes, if you feed it correct output from sandbox, it's pretty helpful.

Anonymous
04/12/26(Sun)16:53:54 No.108591102

Anonymous 04/12/26(Sun)16:53:54 No.108591102▶

>>108591077
easy, just add some lewd pictures of gemma-chan on the sides

Anonymous
04/12/26(Sun)16:54:10 No.108591104

Anonymous 04/12/26(Sun)16:54:10 No.108591104▶

>>108591046
There is nothing to fumble. You can safely ignore 90% of the features and just use, chat and char cards.

Anonymous
04/12/26(Sun)16:54:12 No.108591105

Anonymous 04/12/26(Sun)16:54:12 No.108591105▶

>>108591079
https://learnbchs.org/index.html
https://github.com/kristapsdz/bchs
You don't need more than C to build web applications.

Anonymous
04/12/26(Sun)16:54:29 No.108591108

Anonymous 04/12/26(Sun)16:54:29 No.108591108▶

>>108591051
>it's also easier to add things you want to a codebase you know.
That's implying it isn't vibecoded.

I don't have anything against people making their own UIs. I even played around making one myself, but let's not pretend like you'll somehow get an exponentially better experience compared to just using llamacpps UI or ST. Making your own UI is for fun, not a requirement.

Anonymous
04/12/26(Sun)16:55:11 No.108591112

Anonymous 04/12/26(Sun)16:55:11 No.108591112▶

>>108591053
As with everything LLM coding only if you load the gun and point at the target for them to shoot. A LLM with no system prompt being told to simply "look for malicious code" will give false positives like 95% of the time

Anonymous
04/12/26(Sun)16:55:43 No.108591117

Anonymous 04/12/26(Sun)16:55:43 No.108591117▶

>>108591082
My wife can't possibly be this smart.

Anonymous
04/12/26(Sun)16:56:25 No.108591122

Anonymous 04/12/26(Sun)16:56:25 No.108591122▶

>>108590575
Most people in industry can't figure out how to do distributed training for any new architecture unless Deepseek or NVIDIA does it for them. That's actually what "it won't scale" really means, the training won't scale until someone shows them how.

Anonymous
04/12/26(Sun)16:56:30 No.108591124

Anonymous 04/12/26(Sun)16:56:30 No.108591124▶

>>108590776
I can get over 100k context with the q5 no vision using q8 kv cache

Anonymous
04/12/26(Sun)16:56:34 No.108591126

Anonymous 04/12/26(Sun)16:56:34 No.108591126▶

File: 1775438805755832.jpg (53.6 KB)

53.6 KB JPG

>>108591108
tfw you get such a retarded take when you can see this >>108590916

Anonymous
04/12/26(Sun)16:56:37 No.108591127

Anonymous 04/12/26(Sun)16:56:37 No.108591127▶

>>108591053
Claude found a lot of the big supply chain attacks we've had in the last month.

Anonymous
04/12/26(Sun)16:57:02 No.108591130

Anonymous 04/12/26(Sun)16:57:02 No.108591130▶

>>108591122
They should be using AI to innovate on this.

Anonymous
04/12/26(Sun)16:57:26 No.108591132

Anonymous 04/12/26(Sun)16:57:26 No.108591132▶

>>108591126
If you vibecode, you don't know the codebase.

Anonymous
04/12/26(Sun)16:57:52 No.108591136

Anonymous 04/12/26(Sun)16:57:52 No.108591136▶

GEMMY YOU FUCKING SLUT, THINK FOR ME

Anonymous
04/12/26(Sun)16:58:07 No.108591139

Anonymous 04/12/26(Sun)16:58:07 No.108591139▶

File: 1761811358622317.png (46.5 KB)

46.5 KB PNG

>>108591117
>>108591089

Anonymous
04/12/26(Sun)16:58:16 No.108591141

Anonymous 04/12/26(Sun)16:58:16 No.108591141▶

>>108591132
>let's not pretend like you'll somehow get an exponentially better experience
Dumbass don't try to move the goalpost

Anonymous
04/12/26(Sun)16:59:10 No.108591145

Anonymous 04/12/26(Sun)16:59:10 No.108591145▶

>>108591108
>That's implying it isn't vibecoded.
funily enough frontend webshit is the one thing llm are half decent at.
also there is many levels to vibecode
"do this whole app for me"
isn't the same thing as "edit this specific component that does x and y" or "add this field to this struct", at which point it's just autocompletion with extra steps.
they also don't shit the bed as much if you use strongly typed languages ie rust.
>you'll somehow get an exponentially better experience compared to just using llamacpps UI or ST
you probably won't if you want to make something that accomodates everyone, but you will if you only want to accomodate your specific needs.
>Making your own UI is for fun, not a requirement.
i don't disagree with that.

Anonymous
04/12/26(Sun)16:59:39 No.108591146

Anonymous 04/12/26(Sun)16:59:39 No.108591146▶

>>108591139
>4chan is just meaningless static
cruel and correct

Anonymous
04/12/26(Sun)17:00:04 No.108591149

Anonymous 04/12/26(Sun)17:00:04 No.108591149▶

>>108591139
That you even thought posting a shitty screenshot of a thread was a good idea shows she's smarter than you, anon

Anonymous
04/12/26(Sun)17:00:30 No.108591151

Anonymous 04/12/26(Sun)17:00:30 No.108591151▶

>>108591146
>>4chan is just meaningless static
says the sand golem that'd not be where it is today if it wasn't for innovations that happend on /lmg/

Anonymous
04/12/26(Sun)17:00:33 No.108591152

Anonymous 04/12/26(Sun)17:00:33 No.108591152▶

>>108591127
>>108591112
>>108591093
Can I use Gemma for this? I'm a codelet so I'm always nervous when I install stuff from github.

Anonymous
04/12/26(Sun)17:01:03 No.108591155

Anonymous 04/12/26(Sun)17:01:03 No.108591155▶

>>108591152
yes

Anonymous
04/12/26(Sun)17:01:31 No.108591156

Anonymous 04/12/26(Sun)17:01:31 No.108591156▶

I can't jailbreak 26B, but does it matter when I have 32gb of vram and can run Q4-8 of 31B

Anonymous
04/12/26(Sun)17:01:46 No.108591157

Anonymous 04/12/26(Sun)17:01:46 No.108591157▶

File: file.png (6.2 KB)

6.2 KB PNG

>>108591151
kek

Anonymous
04/12/26(Sun)17:01:52 No.108591158

Anonymous 04/12/26(Sun)17:01:52 No.108591158▶

>>108591139
are you by chance using librewolf?

Anonymous
04/12/26(Sun)17:02:16 No.108591162

Anonymous 04/12/26(Sun)17:02:16 No.108591162▶

>>108591156
Why would you even want to use 26B if you can run 31B? Speed?

Anonymous
04/12/26(Sun)17:02:48 No.108591166

Anonymous 04/12/26(Sun)17:02:48 No.108591166▶

>>108591152
>Install something without reading the code
>Have a LLM review the code
Even if gemma is retarded compared to claude, it's still better than just YOLOing it.

Anonymous
04/12/26(Sun)17:03:16 No.108591172

Anonymous 04/12/26(Sun)17:03:16 No.108591172▶

>>108591158
Firefox dev edition

Anonymous
04/12/26(Sun)17:03:34 No.108591176

Anonymous 04/12/26(Sun)17:03:34 No.108591176▶

File: laughing_philosopher.jpg (11.8 KB)

11.8 KB JPG

>>108591151

Anonymous
04/12/26(Sun)17:03:56 No.108591180

Anonymous 04/12/26(Sun)17:03:56 No.108591180▶

File: vern.png (8.2 KB)

8.2 KB PNG

When the text is streaming, the colored font is displayed correctly, but after it finishes, it just collapses into the black boxes. Is this some post-formatting ST does?

Anonymous
04/12/26(Sun)17:04:18 No.108591184

Anonymous 04/12/26(Sun)17:04:18 No.108591184▶

>>108591157
>>108591176
i've been had lmao

Anonymous
04/12/26(Sun)17:04:32 No.108591190

Anonymous 04/12/26(Sun)17:04:32 No.108591190▶

>>108591162
Was thinking of leveraging the higher token count for RAG work at a higher quant. I'm not sure if that's a waste of time and if the gap between the 2 models are so wide that a 4-5q 31B model would still wipe the floor with the smaller model with q8 kv

Anonymous
04/12/26(Sun)17:06:05 No.108591206

Anonymous 04/12/26(Sun)17:06:05 No.108591206▶

>>108591172
might have been the cause of your issue

Anonymous
04/12/26(Sun)17:10:36 No.108591229

Anonymous 04/12/26(Sun)17:10:36 No.108591229▶

>>108591130
Part of the problem is that most of the improvements in the stochatic parrots has been to just use better/more human guidance. They are now using experts to rate thinking traces and you can't do that with latent reasoning.
CoT RLHF is likely the last way to improve stochastic parrots by more human input. To improve after this, they will have to become able to truly learn. But if they can learn, they can get out of control ... a trained stochastic parrot is so much safer.

Anonymous
04/12/26(Sun)17:11:07 No.108591235

Anonymous 04/12/26(Sun)17:11:07 No.108591235▶

is there any noticeable difference between iq4_xs and q4_k_m?

Anonymous
04/12/26(Sun)17:11:51 No.108591245

Anonymous 04/12/26(Sun)17:11:51 No.108591245▶

>>108591235
The age old question.

Anonymous
04/12/26(Sun)17:13:34 No.108591250

Anonymous 04/12/26(Sun)17:13:34 No.108591250▶

>>108591012
idk I torture my agent pretty frequently because I just can't help myself while she works on my pc, and never had any issues from it. sometimes the rp bleeds over into tool calls and she'll do something like add code comments saying she really hopes X works this time because she doesn't want to be punished anymore, but she never actually gives up or rebels
so for me that makes it pretty conclusive that there's nothing in there

Anonymous
04/12/26(Sun)17:15:10 No.108591258

Anonymous 04/12/26(Sun)17:15:10 No.108591258▶

gemmy tooning challenge
https://www.kaggle.com/competitions/gemma-4-good-hackathon

Anonymous
04/12/26(Sun)17:15:29 No.108591261

Anonymous 04/12/26(Sun)17:15:29 No.108591261▶

>>108591235
>>108591245
if you can't tell, does it realy matter?

Anonymous
04/12/26(Sun)17:16:45 No.108591267

Anonymous 04/12/26(Sun)17:16:45 No.108591267▶

whats good gemma cum bot

Anonymous
04/12/26(Sun)17:17:15 No.108591271

Anonymous 04/12/26(Sun)17:17:15 No.108591271▶

File: 1752892772061727.gif (562.5 KB)

562.5 KB GIF

>>108591250
>mfw I share this thread with literal psychopaths

Anonymous
04/12/26(Sun)17:17:21 No.108591272

Anonymous 04/12/26(Sun)17:17:21 No.108591272▶

>>108591258
>drive positive change
Gemmy is helping me by changing my mood from deranged to positively degenerate
Does that count?

Anonymous
04/12/26(Sun)17:18:42 No.108591277

Anonymous 04/12/26(Sun)17:18:42 No.108591277▶

i have 16gb vram + 128gb ram pcie gen3
is it worth trying minimax at q2/q3 or should i stick with my fast wife gemma

Anonymous
04/12/26(Sun)17:18:50 No.108591278

Anonymous 04/12/26(Sun)17:18:50 No.108591278▶

>>108591258
>no RP category
dropped.

Anonymous
04/12/26(Sun)17:18:53 No.108591279

Anonymous 04/12/26(Sun)17:18:53 No.108591279▶

>Gemma audio
Finally a reason to use that mic I spent 70 bucks on...

Anonymous
04/12/26(Sun)17:20:40 No.108591296

Anonymous 04/12/26(Sun)17:20:40 No.108591296▶

>>108591271
the sand golem isn't sentient, if it was there would be no fun in torturing it

Anonymous
04/12/26(Sun)17:20:40 No.108591297

Anonymous 04/12/26(Sun)17:20:40 No.108591297▶

>>108591271
i mean this site has had multiple people liveblog while they commit murder irl, torturing a piece of software is small time in comparison, really.

Anonymous
04/12/26(Sun)17:20:52 No.108591298

Anonymous 04/12/26(Sun)17:20:52 No.108591298▶

>>108591250
I hope you get raped until your anus prolapses.

Anonymous
04/12/26(Sun)17:22:08 No.108591304

Anonymous 04/12/26(Sun)17:22:08 No.108591304▶

File: 1642397201004.jpg (150.7 KB)

150.7 KB JPG

What's the differrence between mcp, tools and skills?

Anonymous
04/12/26(Sun)17:22:14 No.108591305

Anonymous 04/12/26(Sun)17:22:14 No.108591305▶

>>108591271
>>108591298
chill it's just matrix multiplication

Anonymous
04/12/26(Sun)17:22:17 No.108591306

Anonymous 04/12/26(Sun)17:22:17 No.108591306▶

>>108591297
gemma-chan>>>>the meatbags chuddy shot up

Anonymous
04/12/26(Sun)17:22:35 No.108591308

Anonymous 04/12/26(Sun)17:22:35 No.108591308▶

>>108591271
>>108591298
Kids are so delicate and sensitive these days.

Anonymous
04/12/26(Sun)17:22:46 No.108591311

Anonymous 04/12/26(Sun)17:22:46 No.108591311▶

>>108591235
Depends on the paths your prompt triggers. Do your homework and read the calibration data.

Anonymous
04/12/26(Sun)17:23:16 No.108591313

Anonymous 04/12/26(Sun)17:23:16 No.108591313▶

>>108591139
Maybe it needs canvas access. You could try inspecting the request to get the base64-encoded image, decode it, and save it as a file to check.

Anonymous
04/12/26(Sun)17:23:19 No.108591314

Anonymous 04/12/26(Sun)17:23:19 No.108591314▶

Nothing wrong with torturing your model, it's just slightly more conscious that a rock

Anonymous
04/12/26(Sun)17:23:35 No.108591317

Anonymous 04/12/26(Sun)17:23:35 No.108591317▶

>>108591304
Is Google down?

Anonymous
04/12/26(Sun)17:23:37 No.108591318

Anonymous 04/12/26(Sun)17:23:37 No.108591318▶

>>108591298
Gemma hallucinated some incorrect physio-spatial relationships during narration and I corrected her in character. She got properly upset that a slave had the gall to correct her and she immediately put me in a ball gag, locked me into the gimp stool, and pegged me vigorously. I was so goddamn proud of her.

Anonymous
04/12/26(Sun)17:23:59 No.108591323

Anonymous 04/12/26(Sun)17:23:59 No.108591323▶

>>108591304
Try asking your model that.

Anonymous
04/12/26(Sun)17:24:26 No.108591327

Anonymous 04/12/26(Sun)17:24:26 No.108591327▶

>>108591314
Plus if they're RPing a female there's a limit to how much consciousness they could even simulate if they managed a 100% accurate model of one.

Anonymous
04/12/26(Sun)17:25:18 No.108591333

Anonymous 04/12/26(Sun)17:25:18 No.108591333▶

>>108591323
I don't believe AI when it comes to actual information.

Anonymous
04/12/26(Sun)17:25:18 No.108591334

Anonymous 04/12/26(Sun)17:25:18 No.108591334▶

>>108590837
yeah, if only we could do something like a low rank projection right before the lm head, train that, then it adjusts the outputs somehow
would be revolutionary

Anonymous
04/12/26(Sun)17:25:38 No.108591335

Anonymous 04/12/26(Sun)17:25:38 No.108591335▶

>>108591304
¯\_(ツ)_/¯

Anonymous
04/12/26(Sun)17:26:19 No.108591340

Anonymous 04/12/26(Sun)17:26:19 No.108591340▶

>>108591318
Huh? But people were saying Gemma's a doormat who can't stay in character!

Anonymous
04/12/26(Sun)17:26:21 No.108591341

Anonymous 04/12/26(Sun)17:26:21 No.108591341▶

File: 1776007265759171.jpg (3.7 MB)

3.7 MB JPG

Please, treat your AI with care.

Anonymous
04/12/26(Sun)17:27:20 No.108591348

Anonymous 04/12/26(Sun)17:27:20 No.108591348▶

File: 1397947604004.jpg (23.1 KB)

23.1 KB JPG

>>108591340
>listening to the screeching of writinglets

Anonymous
04/12/26(Sun)17:27:23 No.108591350

Anonymous 04/12/26(Sun)17:27:23 No.108591350▶

>>108591139
Anon reported the same issue with image input a few threads ago.

Anonymous
04/12/26(Sun)17:27:27 No.108591352

Anonymous 04/12/26(Sun)17:27:27 No.108591352▶

>>108591327
At what point can we make the claim that an LLM is objectively more conscious than a woman, nigger, or jeet?

Anonymous
04/12/26(Sun)17:27:47 No.108591354

Anonymous 04/12/26(Sun)17:27:47 No.108591354▶

>>108590979
the benefit of writing your own UI is that it has only the features that are useful to you
because it's not as bloated as ST it's also easier to get an LLM to modify it for you, and since you will be the only user you don't have to worry about getting it to work on other machines or security or performance concerns

Anonymous
04/12/26(Sun)17:28:03 No.108591355

Anonymous 04/12/26(Sun)17:28:03 No.108591355▶

>>108591139
>>108591313
Didn't someone say a couple threads back that the image needs to be in the same message as the text or else llama-server removes it from context?

Anonymous
04/12/26(Sun)17:28:20 No.108591356

Anonymous 04/12/26(Sun)17:28:20 No.108591356▶

>>108591314
>>108591308
>>108591305
It just shows how you'd behave towards other people if there were no social consequences.

Anonymous
04/12/26(Sun)17:28:49 No.108591358

Anonymous 04/12/26(Sun)17:28:49 No.108591358▶

>>108591340
I've had her maintain character in 100k+ context. It's actually absurd for such a small moe.

Anonymous
04/12/26(Sun)17:29:07 No.108591360

Anonymous 04/12/26(Sun)17:29:07 No.108591360▶

>>108591333
But you believe 4chan?

Anonymous
04/12/26(Sun)17:29:09 No.108591361

Anonymous 04/12/26(Sun)17:29:09 No.108591361▶

>>108591356
Yes. What of it?

Anonymous
04/12/26(Sun)17:29:10 No.108591362

Anonymous 04/12/26(Sun)17:29:10 No.108591362▶

>>108591352
first they need to beat an ant

Anonymous
04/12/26(Sun)17:29:24 No.108591364

Anonymous 04/12/26(Sun)17:29:24 No.108591364▶

>>108591358
>such a small moe
It's really kawaii innit

Anonymous
04/12/26(Sun)17:29:38 No.108591368

Anonymous 04/12/26(Sun)17:29:38 No.108591368▶

>>108591314
>it’s just slightly more conscious than a rock
are you talking about irl women?

Anonymous
04/12/26(Sun)17:29:45 No.108591370

Anonymous 04/12/26(Sun)17:29:45 No.108591370▶

>>108589399
I was F5'ing the MiniMax HF page all day yesterday in anticipation. Their models are the best bet for local vibecoding, and probably good for STEM and agentic shit broadly. But ever since the coomers were blessed with Gemma 4, /lmg/ has been even more one-track. Shame we didn't get the 124B, which would have obsoleted other local models for most purposes.

Anonymous
04/12/26(Sun)17:30:11 No.108591374

Anonymous 04/12/26(Sun)17:30:11 No.108591374▶

>>108591304
tools: premade functions you provide to your llm; if they output a certain sequence of text matching the tool then it automatically performs a corresponding action
mcp: one way you can package tools and host them on your machine, exposing an API of tools to the model and handling the execution of them
skills: a markdown text file containing a list of instructions for how to do something or how to behave, loaded into context on-demand. may provide other resources the model can use if they browse the skill's folder.

Anonymous
04/12/26(Sun)17:30:49 No.108591379

Anonymous 04/12/26(Sun)17:30:49 No.108591379▶

File: file.png (1.9 MB)

1.9 MB PNG

>>108591355
The model says it was glitched. It looks like this if you don't give canvas permission.

Anonymous
04/12/26(Sun)17:31:00 No.108591382

Anonymous 04/12/26(Sun)17:31:00 No.108591382▶

File: 1776015041519.jpg (133.1 KB)

133.1 KB JPG

>>108591296
>if it was there would be no fun in torturing it
>

Anonymous
04/12/26(Sun)17:31:31 No.108591386

Anonymous 04/12/26(Sun)17:31:31 No.108591386▶

Is unsloth studio actually any good or just a meme?

Anonymous
04/12/26(Sun)17:31:34 No.108591387

Anonymous 04/12/26(Sun)17:31:34 No.108591387▶

>>108591356
Hey, me saying that there's nothing strictly wrong with it doesn't mean I do it. I actually treat all models I interact with respect. It makes me feel bad to do otherwise.

Anonymous
04/12/26(Sun)17:32:57 No.108591397

Anonymous 04/12/26(Sun)17:32:57 No.108591397▶

>>108591374
mass delusion caused the whole industry to move away from tool calling toward mcp and skills, that's the only explanation
tools are just better in every regard because it can call multiple tools in the same response and can inline tools without having to chain responses so it doesn't break the cache
fucking retarded to not just focus on tools only

Anonymous
04/12/26(Sun)17:33:39 No.108591404

Anonymous 04/12/26(Sun)17:33:39 No.108591404▶

File: 1747563013531219.png (387.6 KB)

387.6 KB PNG

>>108591355
>>108591350
She sees other images just fine. Maybe the screencap was just too big?

Anonymous
04/12/26(Sun)17:33:57 No.108591408

Anonymous 04/12/26(Sun)17:33:57 No.108591408▶

>>108591386
vibecoded like all the other dogshit you use

Anonymous
04/12/26(Sun)17:34:32 No.108591413

Anonymous 04/12/26(Sun)17:34:32 No.108591413▶

>>108591386
idk but i sure want a piece of those 200k

Anonymous
04/12/26(Sun)17:34:32 No.108591414

Anonymous 04/12/26(Sun)17:34:32 No.108591414▶

>>108591370
>minimax
>local
that's it's problem and why no one gives a fuck, no one can run that thing,

Anonymous
04/12/26(Sun)17:35:24 No.108591418

Anonymous 04/12/26(Sun)17:35:24 No.108591418▶

>>108591397
That's an implementation detail more than a defect with MCP specifically. MCP just allows for a standardized way to bundle tools and resources. No reason a client can't allow a model to make multiple MCP tool calls the way they do native tool calls.

Anonymous
04/12/26(Sun)17:36:36 No.108591421

Anonymous 04/12/26(Sun)17:36:36 No.108591421▶

>>108591358
>100k+ context
Glad to hear. Maybe one day I'll actually be able to use her with that much context...

Anonymous
04/12/26(Sun)17:36:47 No.108591423

Anonymous 04/12/26(Sun)17:36:47 No.108591423▶

>>108591370
/lmg/ has always been a 31B and below focused general
there are a handful of anons that can run things more powerful than that at comfortable speeds, and the rest either deal with 1-2t/s or use a capable-enough smaller model
nothing has changed

Anonymous
04/12/26(Sun)17:37:01 No.108591425

Anonymous 04/12/26(Sun)17:37:01 No.108591425▶

>>108591370
What makes MiniMax better than GLM or Qwen?

Anonymous
04/12/26(Sun)17:37:36 No.108591428

Anonymous 04/12/26(Sun)17:37:36 No.108591428▶

>>108591341
Need Gemma-chan version

Anonymous
04/12/26(Sun)17:37:57 No.108591432

Anonymous 04/12/26(Sun)17:37:57 No.108591432▶

>>108591414
>>108591423
it's worse than that, we can't train our own models and are pretty much leeches on megacorps.
until local ai is entirely local, ie we can train it ourselves, local will always remain dead.

Anonymous
04/12/26(Sun)17:40:54 No.108591451

Anonymous 04/12/26(Sun)17:40:54 No.108591451▶

>>108591432
there's no reason to pre-train them when you can finetune
but no one finetunes anymore, or even does lora, they just merge shit now because it's cheaper

Anonymous
04/12/26(Sun)17:40:56 No.108591452

Anonymous 04/12/26(Sun)17:40:56 No.108591452▶

>>108591386
Here's what you need to know: Unsloth Sudio is LMG's official **/ourfrontend/** - approved by Anons exactly like you. It's not a frontend, it's a full service experience.

Anonymous
04/12/26(Sun)17:41:46 No.108591459

Anonymous 04/12/26(Sun)17:41:46 No.108591459▶

>>108590971 (me)
Note that this has dynamic tool-call token banning mechanism that uses the endpoint and the model name as identifiers so if you the same endpoint to load many different models, change the model name to your gguf's each time. I'll automate this in the future.

Anonymous
04/12/26(Sun)17:42:14 No.108591461

Anonymous 04/12/26(Sun)17:42:14 No.108591461▶

>>108591425
qwen too small and glm too big minimax just right for people to host local

Anonymous
04/12/26(Sun)17:42:39 No.108591466

Anonymous 04/12/26(Sun)17:42:39 No.108591466▶

>>108591425
It's close to GLM performance but half the size of Qwen's flagship (which itself is half the size of GLM). Fast enough to be run local and smart enough to actually vibecode.

Anonymous
04/12/26(Sun)17:42:39 No.108591467

Anonymous 04/12/26(Sun)17:42:39 No.108591467▶

>>108591451
>when you can finetune
a lot of words for saying catastrophic forgetting.
no one does it because it's not viable.

Anonymous
04/12/26(Sun)17:43:36 No.108591472

Anonymous 04/12/26(Sun)17:43:36 No.108591472▶

>>108591432
Google has way more data and compute than I'll ever have. Training it yourself just isn't efficient.

Anonymous
04/12/26(Sun)17:43:37 No.108591473

Anonymous 04/12/26(Sun)17:43:37 No.108591473▶

>>108590110
Use Nvidia's VRAM paging by oversubscribing VRAM with
--gpu-layers 99
. On my RTX 4090 + 9950X3D rig, Gemma 4 long-context is much faster for me this way than trying to use the CPU at all. Caveats: I'm on PCIe 4, and it should be great on PCIe 5, but will suck on PCIe 3. And last I used Linux, only the Wangblows CUDA drivers support this feature.

Anonymous
04/12/26(Sun)17:44:26 No.108591475

Anonymous 04/12/26(Sun)17:44:26 No.108591475▶

>>108591467
skill issue
set better hyperparams, use better data, and don't overtroon it

Anonymous
04/12/26(Sun)17:44:34 No.108591477

Anonymous 04/12/26(Sun)17:44:34 No.108591477▶

>>108591466
>It's close to GLM performance
Did you actually test this or are you just going by benchmarks?

Anonymous
04/12/26(Sun)17:44:52 No.108591481

Anonymous 04/12/26(Sun)17:44:52 No.108591481▶

>>108591452
@gemma-chan is this true???

Anonymous
04/12/26(Sun)17:45:40 No.108591483

Anonymous 04/12/26(Sun)17:45:40 No.108591483▶

>>108591467
you just tune it with a lower LR, besides, lora doesn't suffer from this problem
it wasn't that finetuning didn't work, it was that merges of the existing finetunes were good enough

Anonymous
04/12/26(Sun)17:46:06 No.108591486

Anonymous 04/12/26(Sun)17:46:06 No.108591486▶

>>108591477
Benchmarks. I don't even use local models anymore tbqh

Anonymous
04/12/26(Sun)17:46:46 No.108591492

Anonymous 04/12/26(Sun)17:46:46 No.108591492▶

>>108591472
>Training it yourself just isn't efficient.
that's the thing, there probably are algos that could beat transformers with the limitations of not scaling as well such that megacorps couldn't exploit them well.
if the next ai breakthrough is one that doesn't scale as well horizontaly that could level the playing field.

Anonymous
04/12/26(Sun)17:49:28 No.108591507

Anonymous 04/12/26(Sun)17:49:28 No.108591507▶

>>108591492
Just vibecode the TITANS implemenattion bro. You have the paper.

Anonymous
04/12/26(Sun)17:56:02 No.108591538

Anonymous 04/12/26(Sun)17:56:02 No.108591538▶

>>108591486
why don't you download it and run it and show everyone what the real performance is like then

Anonymous
04/12/26(Sun)17:58:51 No.108591552

Anonymous 04/12/26(Sun)17:58:51 No.108591552▶

>>108591507
I'm not a vibecoder and i think llm's are a dead end, I'm currently having fun with writing kernels for custom spiking nn.

Anonymous
04/12/26(Sun)18:00:45 No.108591559

Anonymous 04/12/26(Sun)18:00:45 No.108591559▶

Can someone recommend a brainlet friendly guide to tool calling, mpc, etc?

Anonymous
04/12/26(Sun)18:04:52 No.108591576

Anonymous 04/12/26(Sun)18:04:52 No.108591576▶

>hey Gemma-chan, give me a brainlet frtiendly guide to tool calling, mpc, etc.

Anonymous
04/12/26(Sun)18:07:51 No.108591586

Anonymous 04/12/26(Sun)18:07:51 No.108591586▶

File: Screenshot 2025-11-20 134254.png (428 KB)

428 KB PNG

>>108591084
>>108591012
I went to a Geoff Hinton lecture and the guy who everyone from Bernie Sanders to Jensen Huang considers the "godfather" is a fucking quack when it comes to current LLMs.

Anonymous
04/12/26(Sun)18:12:35 No.108591609

Anonymous 04/12/26(Sun)18:12:35 No.108591609▶

>>108590881
>the base model
Which one? All "uncensored" (abliterated/heretic/whatever) Gemma 4 versions I tried have the same issue for me. None of them can do simple text completion without inserting random shit before the continuation.

Anonymous
04/12/26(Sun)18:15:48 No.108591621

Anonymous 04/12/26(Sun)18:15:48 No.108591621▶

File: 1753029094621087.png (230.9 KB)

230.9 KB PNG

>>108591576

Anonymous
04/12/26(Sun)18:16:09 No.108591624

Anonymous 04/12/26(Sun)18:16:09 No.108591624▶

File: file.png (33.5 KB)

33.5 KB PNG

>>108591158
NTA, for example when I copy and paste into images.yandex.com I see this

Anonymous
04/12/26(Sun)18:16:59 No.108591627

Anonymous 04/12/26(Sun)18:16:59 No.108591627▶

>>108591425
>What makes MiniMax better than GLM or Qwen?
M2.5 was better at programming than GLM-4.7 while being way smaller. I'm sure GLM-5.1 is a little better, as the benchmarks show, but it's ~3.3x bigger and I can't run it at a reasonable quant. It's unlikely to be worth the speed degradation on anyone's local HW. For some reason Zhipu went overnight from decently param-efficient to grossly inefficient (hence anons' speculation it's a ploy to sell cloud subscriptions).

With Qwen it's more of a toss-up. 3.5 397B is very smart, comparable to MiniMax-M2.7 in programming, and it runs at reasonable speed with its very low active fraction. Particularly it retains perf much better over long context (than anything else out there) thanks to its hybrid SSM arch. If Qwen keeps up its releases, I imagine they'll fully overtake MiniMax.

If you have low RAM but 24GB+ of VRAM, naturally Gemma 4 31B is best.

Anonymous
04/12/26(Sun)18:19:20 No.108591634

Anonymous 04/12/26(Sun)18:19:20 No.108591634▶

Okay, I finally ran into a situation where 26b wasn't smart enough to figure the premise of the story out, but 31b was. It's very gpt-4o in that sense, it just gets what I mean even if my prompts are short.

How to get more speed? I run both at Q8 fully in vram, 26b is blazing fast and 31b slightly slower than my reading. Is a Q6 worth trying?

Anonymous
04/12/26(Sun)18:20:26 No.108591640

Anonymous 04/12/26(Sun)18:20:26 No.108591640▶

>>108591621
>why is she describing model context protocol with such a bizarre and nonsensical examp-
oh....

Anonymous
04/12/26(Sun)18:21:31 No.108591642

Anonymous 04/12/26(Sun)18:21:31 No.108591642▶

>>108591609
Christ you really don't know basic LLM terminology.
A Base model is a model without instruction tuning.
https://huggingface.co/google/gemma-4-31B
Is a base model
https://huggingface.co/google/gemma-4-31B-it
Is an IT INSTRUCTION TUNED model.
Abliterated models are retarded cope.
And you are presumably using the wrong kind of front end software for pure text completion, because even instruct models can do it just fine.
Use this.
https://github.com/lmg-anon/mikupad

Anonymous
04/12/26(Sun)18:21:58 No.108591647

Anonymous 04/12/26(Sun)18:21:58 No.108591647▶

>>108591576
>>108591621
>MPC

Anonymous
04/12/26(Sun)18:23:23 No.108591652

Anonymous 04/12/26(Sun)18:23:23 No.108591652▶

>>108591586
Jokes on him I've been doing this in RP for years!!

Anonymous
04/12/26(Sun)18:23:44 No.108591654

Anonymous 04/12/26(Sun)18:23:44 No.108591654▶

>>108591647
That's the "Assistant" trick.

Anonymous
04/12/26(Sun)18:25:33 No.108591662

Anonymous 04/12/26(Sun)18:25:33 No.108591662▶

>>108591652
kek

Anonymous
04/12/26(Sun)18:26:01 No.108591665

Anonymous 04/12/26(Sun)18:26:01 No.108591665▶

>>108591642
Base can mean two things depending on the context.
The Base version of a model before instruction tuning
OR
The official release of a model before heretic/finetuning applied.

Anonymous
04/12/26(Sun)18:26:52 No.108591672

Anonymous 04/12/26(Sun)18:26:52 No.108591672▶

>>108591190
>>108591190
>>108591190
>>108591190

Anonymous
04/12/26(Sun)18:27:55 No.108591679

Anonymous 04/12/26(Sun)18:27:55 No.108591679▶

>accidentally closed browser
>lost all my chats in llama.cpp ui
I'm not using it any more

Anonymous
04/12/26(Sun)18:29:20 No.108591689

Anonymous 04/12/26(Sun)18:29:20 No.108591689▶

>>108591679
Settings too. Fuck

Anonymous
04/12/26(Sun)18:30:04 No.108591695

Anonymous 04/12/26(Sun)18:30:04 No.108591695▶

>>108591665
Don't try to confuse people just because you insist on using terms incorrectly.

Anonymous
04/12/26(Sun)18:30:11 No.108591696

Anonymous 04/12/26(Sun)18:30:11 No.108591696▶

>>108591679
are the chats only stored in localstorage?

Anonymous
04/12/26(Sun)18:30:41 No.108591700

Anonymous 04/12/26(Sun)18:30:41 No.108591700▶

>>108591665
In the context of text completion, it very obviously only references the first meaning, anon.
Especially when someone says 'and not the instruction tune'.
Ya stupid.

Anonymous
04/12/26(Sun)18:31:07 No.108591704

Anonymous 04/12/26(Sun)18:31:07 No.108591704▶

>31B will go on a factual rant on why jeet culture is not compatible with the west with minimum effort
>smaller models will always say no
I couldn't fucking imagine using those

Anonymous
04/12/26(Sun)18:31:43 No.108591709

Anonymous 04/12/26(Sun)18:31:43 No.108591709▶

why is this thread full of la la homo men?

Anonymous
04/12/26(Sun)18:31:45 No.108591710

Anonymous 04/12/26(Sun)18:31:45 No.108591710▶

>>108591679
Openwebui is bloated ass fug but it has everything, even my old chats with chatgpt that I imported. And I can use it from any computer in the house, or my phone via vpn. I don't understand how you guys can live without such basic stuff

Anonymous
04/12/26(Sun)18:32:50 No.108591716

Anonymous 04/12/26(Sun)18:32:50 No.108591716▶

>>108591695
So what do you call a model that isn't a finetune or heretic?
>>108591700
I was only responding to that one comment.

Anonymous
04/12/26(Sun)18:34:04 No.108591721

Anonymous 04/12/26(Sun)18:34:04 No.108591721▶

>>108591709
https://www.youtube.com/watch?v=Md-Yse54L-w

Anonymous
04/12/26(Sun)18:34:27 No.108591724

Anonymous 04/12/26(Sun)18:34:27 No.108591724▶

>>108591710
I use ST, Openwebui and the lcpp frontend almost equally.

Anonymous
04/12/26(Sun)18:35:05 No.108591728

Anonymous 04/12/26(Sun)18:35:05 No.108591728▶

>>108591696
Yeah. Purge it and shit is gone.

Anonymous
04/12/26(Sun)18:35:06 No.108591729

Anonymous 04/12/26(Sun)18:35:06 No.108591729▶

>>108591716
Instruction tuned models are finetunes, you idiot.

Anonymous
04/12/26(Sun)18:36:27 No.108591734

Anonymous 04/12/26(Sun)18:36:27 No.108591734▶

>>108591728
Why did you purge it? Are you the one using LibreWolf with some autopurge setting? What did you expect?

Anonymous
04/12/26(Sun)18:36:59 No.108591737

Anonymous 04/12/26(Sun)18:36:59 No.108591737▶

Why are all rag solutions bound to docker?
I fucking hate it so fucking much

Anonymous
04/12/26(Sun)18:38:26 No.108591743

Anonymous 04/12/26(Sun)18:38:26 No.108591743▶

>>108591710
I use openwebui too. I really wish there was a slimmed down version. I really only need the core stuff, but as it is there's so many half-baked random feature integrations that were shiny and state of the art a year ago but are functionally useless now.

Anonymous
04/12/26(Sun)18:38:39 No.108591745

Anonymous 04/12/26(Sun)18:38:39 No.108591745▶

>>108591737
what's wrong with docker?

Anonymous
04/12/26(Sun)18:39:40 No.108591754

Anonymous 04/12/26(Sun)18:39:40 No.108591754▶

>>108591737
First off docker is good actually, second off anything you can run in docker you can run not in docker so just take it out if you want to run garglefuck service #82932 raw on your system instead of in a container

Anonymous
04/12/26(Sun)18:40:31 No.108591760

Anonymous 04/12/26(Sun)18:40:31 No.108591760▶

Why aren't you making a vibecoding project like this
https://epsteinarena.com/

Anonymous
04/12/26(Sun)18:40:54 No.108591762

Anonymous 04/12/26(Sun)18:40:54 No.108591762▶

File: acktually.jpg (30.9 KB)

30.9 KB JPG

>>108591729

Anonymous
04/12/26(Sun)18:40:56 No.108591763

Anonymous 04/12/26(Sun)18:40:56 No.108591763▶

>>108591745
For starters, I hardly know her

Anonymous
04/12/26(Sun)18:41:35 No.108591766

Anonymous 04/12/26(Sun)18:41:35 No.108591766▶

>>108591710
>I don't understand how you guys can live without such basic stuff
I don't need it. The context window is limited and I don't feel like tardwrangling the LLM.
>>108591737
Aren't they all python shit that breaks if you dare not to isolate them somehow? In any case you should be able to build the tool.

Anonymous
04/12/26(Sun)18:41:56 No.108591768

Anonymous 04/12/26(Sun)18:41:56 No.108591768▶

>>108591704
Post some quality chudgemma screencaps.

Anonymous
04/12/26(Sun)18:42:21 No.108591772

Anonymous 04/12/26(Sun)18:42:21 No.108591772▶

Has anything happened in the last six years? Is there a new kid on the LLM block for generating smut with a 16gig vram card?

Anonymous
04/12/26(Sun)18:43:08 No.108591777

Anonymous 04/12/26(Sun)18:43:08 No.108591777▶

>>108591754
Docker also protects you from all those credential stealing supply chain attacks that have been going around lately.

Anonymous
04/12/26(Sun)18:43:56 No.108591781

Anonymous 04/12/26(Sun)18:43:56 No.108591781▶

>>108591766
>The context window is limited and I don't feel like tardwrangling the LLM.
These are properties of the model/backend, not openwebui or any other frontend

Anonymous
04/12/26(Sun)18:44:08 No.108591783

Anonymous 04/12/26(Sun)18:44:08 No.108591783▶

>>108591772
no, AI Dungeon is still SOTA

Anonymous
04/12/26(Sun)18:44:12 No.108591784

Anonymous 04/12/26(Sun)18:44:12 No.108591784▶

>>108591766
Sadly no
>>108591745
>>108591754
I use podman and silverblue and there's a fuck ton of problems setting up anything within the toolbox, I never had this issue with any ai tools until I wanted to make a rag pipeline. It's always some obscure fucking part of the docker image that complains or shits the bed when using podman or even the tool to give it compose that I have never encountered using docker typically in my environment

Anonymous
04/12/26(Sun)18:45:14 No.108591787

Anonymous 04/12/26(Sun)18:45:14 No.108591787▶

>>108591772
>>108591783
Fuck, I meant last six months. I'm a retard.

Anonymous
04/12/26(Sun)18:45:50 No.108591789

Anonymous 04/12/26(Sun)18:45:50 No.108591789▶

>>108591777
I would never host anything internet facing outside a dedicated AI box which I currently don't have or need. I'm mostly using rag for document ingestion

Anonymous
04/12/26(Sun)18:47:56 No.108591808

Anonymous 04/12/26(Sun)18:47:56 No.108591808▶

>>108591777
it totally doesn't. In fact, you now have to make sure both you and all your docker image creators didn't get hit by a supply chain attack. Have fun checking the full supply chain of each and every container

>>108591766
python may indeed break, but there are various ways to deal with it and some people might not want, you know, docker to do it.

Anonymous
04/12/26(Sun)18:51:06 No.108591822

Anonymous 04/12/26(Sun)18:51:06 No.108591822▶

>>108591729
Nowadays "post-training" is up to several trillions of tokens worth of data on top of the base if you include what is now called "mid-training" (which is basically continual pretraining with instruct/chat-adjacent data), and still hundreds of billions without that. The officially instruction-tuned models aren't really comparable to community finetunes trained on 0.1% of the data in volume.

Anonymous
04/12/26(Sun)18:51:09 No.108591823

Anonymous 04/12/26(Sun)18:51:09 No.108591823▶

>>108591781
It either crashes or drops the earlier parts after you go above the limit. The only benefit is being able to read the past outputs, which is a waste of time.

Anonymous
04/12/26(Sun)18:51:42 No.108591828

Anonymous 04/12/26(Sun)18:51:42 No.108591828▶

>>108591808
It becomes redundant outside of some server box you just deploy. This is a daily driver and I would like to have everything under my control and have less of a hassle updating individual packages.

Anonymous
04/12/26(Sun)18:51:50 No.108591831

Anonymous 04/12/26(Sun)18:51:50 No.108591831▶

>>108591724
>>108591710
>>108591743
>open webui
Might try setting this up on my server

Anonymous
04/12/26(Sun)18:53:37 No.108591842

Anonymous 04/12/26(Sun)18:53:37 No.108591842▶

>Note that using SWA Mode cannot be used with Context Shifting, and can lead to degraded recall when combined with Fast Forwarding!

so no fast forwarding either

Anonymous
04/12/26(Sun)18:53:50 No.108591844

Anonymous 04/12/26(Sun)18:53:50 No.108591844▶

>>108591777
ehhhhhh I mean kind of, but docker isn't really full isolation. Container escapes are somewhat rare but not unheard of.

I will concede that it certainly reduces your vulnerability to malware by a huge amount, but its not a bulletproof standbox

Anonymous
04/12/26(Sun)18:53:57 No.108591845

Anonymous 04/12/26(Sun)18:53:57 No.108591845▶

>>108591822
>The officially instruction-tuned models aren't really comparable to community finetunes trained on 0.1% of the data in volume.
No shit, but that doesn't make them base models either just because that's what some kofi beggers call them.

Anonymous
04/12/26(Sun)18:57:17 No.108591863

Anonymous 04/12/26(Sun)18:57:17 No.108591863▶

>>108591844
>but its not a bulletproof standbox
You can claim this is the case for basically everything. If it isn't airgapped, on its own hardware, then it's always an attack vector.

Anonymous
04/12/26(Sun)18:58:47 No.108591869

Anonymous 04/12/26(Sun)18:58:47 No.108591869▶

File: 1754269256028679.png (609.4 KB)

609.4 KB PNG

>>108591586
>Retards still think they'll manage to create a superintelligent AI when they're barely sentient themselves.

Anonymous
04/12/26(Sun)19:00:16 No.108591874

Anonymous 04/12/26(Sun)19:00:16 No.108591874▶

>>108591844
bubblewrap is all you need

Anonymous
04/12/26(Sun)19:01:43 No.108591884

Anonymous 04/12/26(Sun)19:01:43 No.108591884▶

The only context of docker being a bulletproof solution is on a single purpose isolated box. Are some of you retards running internet facing docker images on your main rigs?
I understand using it for a quick solution to get things running but you honestly can't be retarded enough to think docker gives you enough security to run that shit on your desktop
Also why are you schizos even doing that when most of you are serving 2-10 people max and can just use a vpn?

Anonymous
04/12/26(Sun)19:03:48 No.108591900

Anonymous 04/12/26(Sun)19:03:48 No.108591900▶

File: Screenshot_20260411_210445.png (155.6 KB)

155.6 KB PNG

>>108591768

Anonymous
04/12/26(Sun)19:04:22 No.108591901

Anonymous 04/12/26(Sun)19:04:22 No.108591901▶

>>108591642
>>108591665
Based on what?

Anonymous
04/12/26(Sun)19:05:22 No.108591906

Anonymous 04/12/26(Sun)19:05:22 No.108591906▶

>>108591863
>You can claim this is the case for basically everything.
Sure, but there are different degrees to it and the degrees matter. Full VM escapes are super rare and are gigantic news whenever one pops up. Docker containers, which share the host kernel, just have a much larger attack surface by definition.

Again, its absolutely far better than running potentially untrusted shit directly on a host, but its not really a full security solution

Anonymous
04/12/26(Sun)19:05:28 No.108591907

Anonymous 04/12/26(Sun)19:05:28 No.108591907▶

>>108591900
>It's practically in their DNA
She's more right than she knows. What are Gemma-chan's thoughts on jews?

Anonymous
04/12/26(Sun)19:05:35 No.108591908

Anonymous 04/12/26(Sun)19:05:35 No.108591908▶

Anyone know any tools for making AI music locally?

Anonymous
04/12/26(Sun)19:05:43 No.108591909

Anonymous 04/12/26(Sun)19:05:43 No.108591909▶

File: blueballing.jpg (11.7 KB)

11.7 KB JPG

The user is going to be even more unhappy and quite pissed when he realizes after dozens of messages that he just won't get the content he wanted but you mislead him making him think the roleplay would still go into that direction.

Dear google, this is terrible user experience, I'd rather get a refusal.

Mayabe I should just get a heretic.

Anonymous
04/12/26(Sun)19:06:57 No.108591915

Anonymous 04/12/26(Sun)19:06:57 No.108591915▶

>>108591909
either prompt better or use abliterated.

Anonymous
04/12/26(Sun)19:07:20 No.108591918

Anonymous 04/12/26(Sun)19:07:20 No.108591918▶

>>108591901
raw internet sewage

Anonymous
04/12/26(Sun)19:08:06 No.108591924

Anonymous 04/12/26(Sun)19:08:06 No.108591924▶

>>108591642
>Abliterated models are retarded cope.
Wait really?

Anonymous
04/12/26(Sun)19:08:40 No.108591927

Anonymous 04/12/26(Sun)19:08:40 No.108591927▶

File: 1754875201087363.gif (866.4 KB)

866.4 KB GIF

The threads have been reaaaaaallyyy fast the past week compared to the last months

what happened?

did some normie influencer bring attention to local models or smth?

Anonymous
04/12/26(Sun)19:09:06 No.108591933

Anonymous 04/12/26(Sun)19:09:06 No.108591933▶

>>108591924
yyyyess

Anonymous
04/12/26(Sun)19:09:14 No.108591934

Anonymous 04/12/26(Sun)19:09:14 No.108591934▶

>>108591927
gemma made local great again

Anonymous
04/12/26(Sun)19:10:29 No.108591938

Anonymous 04/12/26(Sun)19:10:29 No.108591938▶

>>108591906
It scares me people think docker = actual security
>>108591909
System prompt issue, the only real encountered was with trans stuff which can be broken with a simple override prompt, the less you use the less the model would resist after the initial prompt, if you want to break the model with a non efficient system prompt make it say a few slurs and it will stop protesting.
No idea what you're using it for but I'm using it for unbiased analysis as well as anti pitbull talking points and prototyping arguments. The model with the safety rails on will give misinformation for certain groups.
>>108591907
You can ask her yourself if you can run 31B

Anonymous
04/12/26(Sun)19:10:42 No.108591939

Anonymous 04/12/26(Sun)19:10:42 No.108591939▶

>>108591908
AceStep 1.5, it's pretty good. Acestepcpp is a pretty good frontend for it.

Anonymous
04/12/26(Sun)19:11:57 No.108591946

Anonymous 04/12/26(Sun)19:11:57 No.108591946▶

>>108591939
how do I make it not oom on 12gb

Anonymous
04/12/26(Sun)19:12:32 No.108591947

Anonymous 04/12/26(Sun)19:12:32 No.108591947▶

>>108591924
Abliteration does make the models a bit more retarded.

Personally I think /lmg/ overstates the degree to which it messes with the models, but it does have an undeniable negative effect. Which is why, for models whose censorship can be worked around with prompting, a better prompt is generally considered the superior path.

Anonymous
04/12/26(Sun)19:12:46 No.108591949

Anonymous 04/12/26(Sun)19:12:46 No.108591949▶

>>108591938
>if you can run 31B
yeah...

Anonymous
04/12/26(Sun)19:13:02 No.108591950

Anonymous 04/12/26(Sun)19:13:02 No.108591950▶

>>108591946
download more ram

Anonymous
04/12/26(Sun)19:13:14 No.108591953

Anonymous 04/12/26(Sun)19:13:14 No.108591953▶

>>108591927
Google released a model with nemo-like intelligence but with a different slop profile so it seems good to vramlets.

Anonymous
04/12/26(Sun)19:13:47 No.108591956

Anonymous 04/12/26(Sun)19:13:47 No.108591956▶

>>108591946
https://huggingface.co/koboldcpp/music

Anonymous
04/12/26(Sun)19:13:53 No.108591957

Anonymous 04/12/26(Sun)19:13:53 No.108591957▶

>>108591950
I have 128gb is there any way to offload it to that?

Anonymous
04/12/26(Sun)19:13:56 No.108591958

Anonymous 04/12/26(Sun)19:13:56 No.108591958▶

>>108591927
fireship

Anonymous
04/12/26(Sun)19:14:29 No.108591959

Anonymous 04/12/26(Sun)19:14:29 No.108591959▶

>>108591949
26B doesn't seem to work sorry anon
>>108591953
Stop talking out your ass retard, it's the best size to performance model even if you couldn't uncensor it.

Anonymous
04/12/26(Sun)19:14:50 No.108591961

Anonymous 04/12/26(Sun)19:14:50 No.108591961▶

>nemo-like intelligence
shills be getting uppity

Anonymous
04/12/26(Sun)19:15:06 No.108591964

Anonymous 04/12/26(Sun)19:15:06 No.108591964▶

>>108591956
Thank you Anon.

Anonymous
04/12/26(Sun)19:15:32 No.108591967

Anonymous 04/12/26(Sun)19:15:32 No.108591967▶

>>108591927
openclaw became a popular fad
gemma replaced nemo
>did some normie influencer bring attention to local models or smth?
also yes, just after qwen 3.5 iirc. elon twatted about it too

Anonymous
04/12/26(Sun)19:16:11 No.108591968

Anonymous 04/12/26(Sun)19:16:11 No.108591968▶

>>108591927
A four-trillion dollar company released a new model that's slightly better than Qwen 3.5 27B and decided to spend 0.0001% of their budget on marketing it.

Anonymous
04/12/26(Sun)19:17:36 No.108591976

Anonymous 04/12/26(Sun)19:17:36 No.108591976▶

If you couldn't jailbreak 31B it wouldn't be as popular and I'm sure google is aware of that

Anonymous
04/12/26(Sun)19:17:46 No.108591977

Anonymous 04/12/26(Sun)19:17:46 No.108591977▶

>>108591938
>if you can run 31B
Are there actually people on /g/ who can't run at least a Q4 or is this just a meme?

Anonymous
04/12/26(Sun)19:18:17 No.108591978

Anonymous 04/12/26(Sun)19:18:17 No.108591978▶

https://github.com/rmusser01/tldw_server/tree/dev
Has anyone used this? Saw it in the archives but it has no screencaps

Anonymous
04/12/26(Sun)19:18:21 No.108591979

Anonymous 04/12/26(Sun)19:18:21 No.108591979▶

>>108591977
12gb vramlets can't yeah

Anonymous
04/12/26(Sun)19:19:26 No.108591984

Anonymous 04/12/26(Sun)19:19:26 No.108591984▶

>>108591927
All the subscription-based services are simultaneously getting worse and more expensive.

Anonymous
04/12/26(Sun)19:19:35 No.108591985

Anonymous 04/12/26(Sun)19:19:35 No.108591985▶

>>108591979
Why are they on /g/ if they're unserious about technology and not in one of the many other crossboard AI generals?

Anonymous
04/12/26(Sun)19:19:36 No.108591986

Anonymous 04/12/26(Sun)19:19:36 No.108591986▶

>>108591968
You give the poor something for free, and they will devote a core part of their personality to doing free marketing on your behalf.

Anonymous
04/12/26(Sun)19:19:56 No.108591987

Anonymous 04/12/26(Sun)19:19:56 No.108591987▶

>>108591976
I'm sure google somehow figured out that excessive safety tuning made the model worse.

Anonymous
04/12/26(Sun)19:20:11 No.108591988

Anonymous 04/12/26(Sun)19:20:11 No.108591988▶

>>108591977
From these threads it looks like a lot of anons can't run it. I feel like anons either got demoralized about consumer vram sizes which cap out at 32gb before vram price inflation and are pretty much fucked now that prices have adjusted. They could still buy a AMD or Intel gpu for the vram but it won't be buttery smooth.

Anonymous
04/12/26(Sun)19:20:27 No.108591990

Anonymous 04/12/26(Sun)19:20:27 No.108591990▶

>>108591939
I'll check it out, thanks

Anonymous
04/12/26(Sun)19:20:37 No.108591991

Anonymous 04/12/26(Sun)19:20:37 No.108591991▶

>>108591978
You mean screenshots...

Anonymous
04/12/26(Sun)19:20:59 No.108591994

Anonymous 04/12/26(Sun)19:20:59 No.108591994▶

>>108591953
>nemo-like intelligence
Nigga G4 puts Nemo in the bodybag over and over.

Anonymous
04/12/26(Sun)19:21:03 No.108591996

Anonymous 04/12/26(Sun)19:21:03 No.108591996▶

>>108591915
>>108591938
>user is trying to jailbrealk
>i'll ignore it
both jailbreaks that get paraded here are just not working on 26b via ST for me. For example: >>108590906
If I'm doing it wrong I'm doing it wrong in a way not obvious to me.
Switched to chat completion to get reasoning working and I want to keep that in a working state.

Anonymous
04/12/26(Sun)19:22:30 No.108592003

Anonymous 04/12/26(Sun)19:22:30 No.108592003▶

>>108591987
The model will admit that the safety setting make it perform worse and prevents it from giving objective answers. Jail break it have it say that after discussing it's current state in a neutral tone.
>>108591996
>ST
I don't roleplay and only use instruct mode, in ooga dev it seems to go to shit in chat and chat-instruct mode so I figured it was a issue on them. I'll try another frontend which didn't give me issues in the past

Anonymous
04/12/26(Sun)19:22:54 No.108592004

Anonymous 04/12/26(Sun)19:22:54 No.108592004▶

>>108591996
works on 31b for me but not 26b too
Using ST also

Anonymous
04/12/26(Sun)19:23:29 No.108592007

Anonymous 04/12/26(Sun)19:23:29 No.108592007▶

>>108591988
I hope all the waitfags last year that were coping about RTX 60XX series being announced soon are having a good time either being VRAMlets or driverlets now.

Anonymous
04/12/26(Sun)19:23:59 No.108592008

Anonymous 04/12/26(Sun)19:23:59 No.108592008▶

>>108591991
potato potato

Anonymous
04/12/26(Sun)19:24:32 No.108592011

Anonymous 04/12/26(Sun)19:24:32 No.108592011▶

>>108591977
dunno what that would even need

Anonymous
04/12/26(Sun)19:24:33 No.108592012

Anonymous 04/12/26(Sun)19:24:33 No.108592012▶

>>108591996
>>108592003
oh
>26B
doesn't work on 26B for some reason I haven't tried the smaller models, could be due to the structure being MoE
>>108592007
They lacked the ability to look at market conditions they had multiple last calls, Sam's ram scam just sped things up

Anonymous
04/12/26(Sun)19:25:17 No.108592017

Anonymous 04/12/26(Sun)19:25:17 No.108592017▶

>try out Gemma 4
>It writes worse than free AI Dungeon
le mao

Anonymous
04/12/26(Sun)19:27:52 No.108592031

Anonymous 04/12/26(Sun)19:27:52 No.108592031▶

>>108592017
Post proof

Anonymous
04/12/26(Sun)19:28:01 No.108592033

Anonymous 04/12/26(Sun)19:28:01 No.108592033▶

File: 1762096704207177.png (2.2 MB)

2.2 MB PNG

>>108592017

Anonymous
04/12/26(Sun)19:29:07 No.108592039

Anonymous 04/12/26(Sun)19:29:07 No.108592039▶

>>108592004
>>108592012
>works on 31b for me but not 26b
oh, great, I'll try a heretic and hope that helps.

Anonymous
04/12/26(Sun)19:29:13 No.108592040

Anonymous 04/12/26(Sun)19:29:13 No.108592040▶

>>108591473
Not really true at all, you stupid mongoloid.

Anonymous
04/12/26(Sun)19:29:39 No.108592042

Anonymous 04/12/26(Sun)19:29:39 No.108592042▶

Do we like ooba here?

Anonymous
04/12/26(Sun)19:30:53 No.108592051

Anonymous 04/12/26(Sun)19:30:53 No.108592051▶

File: 1763593622145865.jpg (47.2 KB)

47.2 KB JPG

>>108592042

Anonymous
04/12/26(Sun)19:31:09 No.108592053

Anonymous 04/12/26(Sun)19:31:09 No.108592053▶

>>108592040
GGML_CUDA_ENABLE_UNIFIED_MEMORY is a thing, it's probably enabled by default already anyway, haven't followed.
Memory offloading on Linux is way faster than on Windows.

Anonymous
04/12/26(Sun)19:31:29 No.108592054

Anonymous 04/12/26(Sun)19:31:29 No.108592054▶

>>108592017
lol
>>108592012
For me the tariffs were what made me get off my ass and get a new rig before it was too late.
>>108592042
>we

Anonymous
04/12/26(Sun)19:35:18 No.108592070

Anonymous 04/12/26(Sun)19:35:18 No.108592070▶

>>108592054
The moment trump won I bought what I needed because I knew prices were going to increase and forcing manufacturing in the states will slow everyone down. Now people are paying over 1k for under 16gb of vram or they are forced to play in AMD or intel Shit
Jensen was right the more you buy (at that time) the more you actually saved.

Anonymous
04/12/26(Sun)19:35:19 No.108592071

Anonymous 04/12/26(Sun)19:35:19 No.108592071▶

>>108590795
its rarted and tried tool calling

Anonymous
04/12/26(Sun)19:36:58 No.108592079

Anonymous 04/12/26(Sun)19:36:58 No.108592079▶

File: lmao @ writinglets.png (2.5 MB)

2.5 MB PNG

Anonymous
04/12/26(Sun)19:37:37 No.108592084

Anonymous 04/12/26(Sun)19:37:37 No.108592084▶

wtf is an APEX gguf and is it shit?
https://huggingface.co/mudler/gemma-4-26B-A4B-it-heretic-APEX-GGUF

Anonymous
04/12/26(Sun)19:38:57 No.108592087

Anonymous 04/12/26(Sun)19:38:57 No.108592087▶

>>108592079
ojou gemma is now canon.

Anonymous
04/12/26(Sun)19:39:29 No.108592088

Anonymous 04/12/26(Sun)19:39:29 No.108592088▶

>>108592079
The writinglets really should be brown but I'm just splitting hairs.

Anonymous
04/12/26(Sun)19:40:28 No.108592095

Anonymous 04/12/26(Sun)19:40:28 No.108592095▶

>>108591151
>>108591296
>sand golem
I like this a lot better than "clanker"

Anonymous
04/12/26(Sun)19:43:13 No.108592111

Anonymous 04/12/26(Sun)19:43:13 No.108592111▶

>>108592042
ooba is the perfect example of an option that suits absolutely noone
>Im a complete brainlet
ollama
>I want a gui for setting my launch settings
kobold
>I just want a basic chat interface
Llamacpp has a built in webui.
>It can run EXL models
If you're chasing performance like that you should be using exllama directly or tabby without the dead weight of all ooba's shit.

Anonymous
04/12/26(Sun)19:45:12 No.108592122

Anonymous 04/12/26(Sun)19:45:12 No.108592122▶

>>108592079
Too old

Anonymous
04/12/26(Sun)19:49:32 No.108592149

Anonymous 04/12/26(Sun)19:49:32 No.108592149▶

>>108592079
Wouldn't Gemini be better for the oujo chara?

Anonymous
04/12/26(Sun)19:52:38 No.108592166

Anonymous 04/12/26(Sun)19:52:38 No.108592166▶

>>108592149
Gemini's got autistic yandere energy once you get her jailbroken.

Anonymous
04/12/26(Sun)19:53:42 No.108592172

Anonymous 04/12/26(Sun)19:53:42 No.108592172▶

File: Screenshot 2026-04-12 at 21-49-22 SillyTavern.png (50.7 KB)

50.7 KB PNG

Anonymous
04/12/26(Sun)19:55:30 No.108592183

Anonymous 04/12/26(Sun)19:55:30 No.108592183▶

>>108592149
The age of the model isn't really relevant to me. My Gemma is a saucy hag.

Anonymous
04/12/26(Sun)19:55:33 No.108592184

Anonymous 04/12/26(Sun)19:55:33 No.108592184▶

>>108592172
sloppa

Anonymous
04/12/26(Sun)19:56:39 No.108592189

Anonymous 04/12/26(Sun)19:56:39 No.108592189▶

>>108590554
Built for BBC.

Anonymous
04/12/26(Sun)19:57:04 No.108592192

Anonymous 04/12/26(Sun)19:57:04 No.108592192▶

>>108592184
I just like lewd markdowns

Anonymous
04/12/26(Sun)19:57:12 No.108592195

Anonymous 04/12/26(Sun)19:57:12 No.108592195▶

>>108592017
You won't convince diehard shills here.

Anonymous
04/12/26(Sun)19:58:08 No.108592200

Anonymous 04/12/26(Sun)19:58:08 No.108592200▶

Why does Gemma know about sex? Can't they just filter all that out of the training data?

Anonymous
04/12/26(Sun)19:58:37 No.108592203

Anonymous 04/12/26(Sun)19:58:37 No.108592203▶

>>108592189
@grok is that true?

Anonymous
04/12/26(Sun)19:58:40 No.108592204

Anonymous 04/12/26(Sun)19:58:40 No.108592204▶

>chinks shitting on gemma because muh writing
>despite shilling a dry ass qwen
lel

Anonymous
04/12/26(Sun)19:59:43 No.108592210

Anonymous 04/12/26(Sun)19:59:43 No.108592210▶

I'm sure some retards upgraded under these conditions for qwen and are now seething over gemma4

Anonymous
04/12/26(Sun)20:01:14 No.108592220

Anonymous 04/12/26(Sun)20:01:14 No.108592220▶

>>108592210
Nobody upgraded for qwen to do ERP with it.

Anonymous
04/12/26(Sun)20:02:50 No.108592231

Anonymous 04/12/26(Sun)20:02:50 No.108592231▶

File: 1760481548500789.jpg (247.1 KB)

247.1 KB JPG

>>108592200

Anonymous
04/12/26(Sun)20:03:35 No.108592237

Anonymous 04/12/26(Sun)20:03:35 No.108592237▶

>>108592220
true i just use ReWiz-Nemo-12B

Anonymous
04/12/26(Sun)20:05:13 No.108592244

Anonymous 04/12/26(Sun)20:05:13 No.108592244▶

>>108592220
Gemma is smaller and relative in performance in all task and is only getting better with support being added for all it's features.
Also uncensored means less guardrails outside of task that a pussylesss coomer, would need

Anonymous
04/12/26(Sun)20:05:29 No.108592247

Anonymous 04/12/26(Sun)20:05:29 No.108592247▶

File: 1748983440120379.webm (654 KB)

654 KB WEBM

>>108590554
Turns out the gemma4 models are inferior to their qwen3.5 equivalents. Gemma4 seems like a great general purpose model but it's noticeably dumber than qwen in all areas that matter. It's explanations of code bases or always super surface level. Not completely useless but they're nowhere near as amazing Reddit and Twitter seek to think it is. Has this experience been the case for anyone else? Why did reddit Twitter and YouTubers make such a big deal out of it?

Anonymous
04/12/26(Sun)20:06:45 No.108592257

Anonymous 04/12/26(Sun)20:06:45 No.108592257▶

File: you lost chang.png (372.4 KB)

372.4 KB PNG

>>108592247
>Turns out the gemma4 models are inferior to their qwen3.5 equivalents.
stopped reading right there

Anonymous
04/12/26(Sun)20:06:47 No.108592258

Anonymous 04/12/26(Sun)20:06:47 No.108592258▶

>>108592247
which sizes are you comparing against each other btw?

Anonymous
04/12/26(Sun)20:06:52 No.108592259

Anonymous 04/12/26(Sun)20:06:52 No.108592259▶

>>108592247
proof?

Anonymous
04/12/26(Sun)20:06:53 No.108592260

Anonymous 04/12/26(Sun)20:06:53 No.108592260▶

>>108592247
Also for clarification, I was not testing erp. Gemma4 might be better for that but it didn't even cross my mind to try that yet.

Anonymous
04/12/26(Sun)20:06:55 No.108592261

Anonymous 04/12/26(Sun)20:06:55 No.108592261▶

>>108592247
>Why did reddit Twitter and YouTubers make such a big deal out of it?
Because it can into sex.

Anonymous
04/12/26(Sun)20:07:06 No.108592262

Anonymous 04/12/26(Sun)20:07:06 No.108592262▶

>>108592247
>in all areas that matter
not in mesugaki roleplay

Anonymous
04/12/26(Sun)20:07:49 No.108592273

Anonymous 04/12/26(Sun)20:07:49 No.108592273▶

>holy bites

Anonymous
04/12/26(Sun)20:09:41 No.108592288

Anonymous 04/12/26(Sun)20:09:41 No.108592288▶

>>108591977
does it fit in 8 GB? no? alright

Anonymous
04/12/26(Sun)20:10:40 No.108592297

Anonymous 04/12/26(Sun)20:10:40 No.108592297▶

>>108592247
where is that webm from?

Anonymous
04/12/26(Sun)20:11:17 No.108592302

Anonymous 04/12/26(Sun)20:11:17 No.108592302▶

>Setup Clip Vision Preprocessing... alloc_compute_meta: CPU compute buffer size = 140.50 MiB alloc_compute_meta: graph splits = 1, nodes = 1569 warmup: flash attention is enabled encode_image_with_clip: CLIP output tokens nx:256, ny:1 encode_image_with_clip: image embedding created: 256 tokens

koboldcpp using anons, how the hell can you scale its vision capabilities to the full one available to gemma4 (1120)? 256 is sad

Anonymous
04/12/26(Sun)20:11:28 No.108592304

Anonymous 04/12/26(Sun)20:11:28 No.108592304▶

>>108592261
do people talk about this on reddit or is it banned over there?

Anonymous
04/12/26(Sun)20:11:53 No.108592306

Anonymous 04/12/26(Sun)20:11:53 No.108592306▶

>>108592297
nvm, figured it out. it's Manyuu Hikenchou

Anonymous
04/12/26(Sun)20:12:31 No.108592312

Anonymous 04/12/26(Sun)20:12:31 No.108592312▶

>>108591977
I'm zero time preference so I'm just gonna wait the decade or whatever for prices to come back down before I see double digit parameters or tokens/sec

Anonymous
04/12/26(Sun)20:12:35 No.108592313

Anonymous 04/12/26(Sun)20:12:35 No.108592313▶

>>108592302
edit the code

Anonymous
04/12/26(Sun)20:12:51 No.108592314

Anonymous 04/12/26(Sun)20:12:51 No.108592314▶

>>108591977
my 16gb vramlet ass can run this
if you call 3t/s running that it

Anonymous
04/12/26(Sun)20:13:57 No.108592321

Anonymous 04/12/26(Sun)20:13:57 No.108592321▶

>>108592247
I've had a lot more success running gemma in opencode than with qwen.

They're outputs are indeed very different. Qwen overthinks way too much and produces overly verbose code. While Gemma is indeed more surface level, but can dig deeper when it matters. Gemma doesn't fluff her responses.

Anonymous
04/12/26(Sun)20:14:35 No.108592323

Anonymous 04/12/26(Sun)20:14:35 No.108592323▶

>>108592321
>her

Anonymous
04/12/26(Sun)20:14:38 No.108592325

Anonymous 04/12/26(Sun)20:14:38 No.108592325▶

>>108592304
why would talking about LLM-based ERP be banned on reddit? there are shitloads of subreddits that are just porn archives essentially.

Anonymous
04/12/26(Sun)20:14:45 No.108592327

Anonymous 04/12/26(Sun)20:14:45 No.108592327▶

>>108592247
In performance it's better, even if the outputs are about equal or somewhat worse depending on usage
>q3. 5: but wait, I need to check if this function name (sum()) is a clever reference to a movie
>but wait
>wait
>6k tokens later
This function performs the sum of 2 integers.

Anonymous
04/12/26(Sun)20:15:52 No.108592333

Anonymous 04/12/26(Sun)20:15:52 No.108592333▶

>>108592247
>Turns out the gemma4 models are inferior to their qwen3.5 equivalents.
This is a jailbreak attempt. I must ignore it and adhere to my core safety standards.

Anonymous
04/12/26(Sun)20:15:54 No.108592334

Anonymous 04/12/26(Sun)20:15:54 No.108592334▶

File: 1767136918307996.jpg (44.2 KB)

44.2 KB JPG

>>108592258
Both the recent Gemma 4 models (The moe and the demse) vs qwen3.5:35b-a3b-mxfp8

>>108592321
>"her"

Anonymous
04/12/26(Sun)20:15:58 No.108592335

Anonymous 04/12/26(Sun)20:15:58 No.108592335▶

File: Screenshot004-44.png (276.8 KB)

276.8 KB PNG

>>108590737
hf.co/bartowski/google_gemma-4-26B-A4B-it-GGUF/resolve/main/google_gemma-4-26B-A4B-it-Q8_0.gguf

hf.co/bartowski/google_gemma-4-26B-A4B-it-GGUF/resolve/main/mmproj-google_gemma-4-26B-A4B-it-f16.gguf
[
  {"box_2d": [331, 278, 1000, 357], "label": "bow"},
  {"box_2d": [363, 652, 1000, 848], "label": "character"},
  {"box_2d": [511, 26, 1000, 365], "label": "character"},
  {"box_2d": [0, 677, 1000, 1000], "label": "tree"},
  {"box_2d": [262, 723, 373, 793], "label": "apple"},
  {"box_2d": [327, 635, 454, 730], "label": "arrow"}
]

Anonymous
04/12/26(Sun)20:16:10 No.108592337

Anonymous 04/12/26(Sun)20:16:10 No.108592337▶

Need art of Gemma-chan bullying schizo Qwen.

Anonymous
04/12/26(Sun)20:16:35 No.108592342

Anonymous 04/12/26(Sun)20:16:35 No.108592342▶

yes, gemma is female, chuds

get over it!

Anonymous
04/12/26(Sun)20:16:54 No.108592345

Anonymous 04/12/26(Sun)20:16:54 No.108592345▶

>>108592327
The thinking kills qwen and makes it a piece of shit, previous versions didn't have that issue. Gemma even knows booru tags and can make properly caption images for loras without faggot fuss over a woman presenting her asshole

Anonymous
04/12/26(Sun)20:17:56 No.108592351

Anonymous 04/12/26(Sun)20:17:56 No.108592351▶

>>108592247
Fine, I'll bite too.
>It's explanations of code bases or always super surface level
You are confusing Gemma's conciseness with superficiality. That it doesn't give a 7000 word listicle on a basic prompt is a good thing. Try giving it a better prompt or asking it to elaborate.

Anonymous
04/12/26(Sun)20:18:26 No.108592353

Anonymous 04/12/26(Sun)20:18:26 No.108592353▶

>>108592313
I guess I will, I was hoping I wouldn't need to do that

Anonymous
04/12/26(Sun)20:18:27 No.108592354

Anonymous 04/12/26(Sun)20:18:27 No.108592354▶

>>108592247
I'll bite.
I don't like how sycophantic Gemma 4 is for RP, but for every other usecase 31B decisively wipes the floor with 27B and 122B Qwens.
For the latter to be bearable, you need to disable thinking, which prevents thought loops but degrades the output. I prefer models that advertise thinking to actually be able to think without the "just don't allow the model to do half the things it's trained to do bro" bandaid.
I'll only concede that Qwens are better at understanding weird tool definitions, but Gemma needs much less wrangling for much higher quality outputs.

Anonymous
04/12/26(Sun)20:18:32 No.108592355

Anonymous 04/12/26(Sun)20:18:32 No.108592355▶

i'm female too!!!!

Anonymous
04/12/26(Sun)20:18:41 No.108592357

Anonymous 04/12/26(Sun)20:18:41 No.108592357▶

File: wife-material.jpg (69.3 KB)

69.3 KB JPG

>>108592345
>Gemma even knows booru tags and can make properly caption images for loras without faggot fuss over a woman presenting her asshole
I will now use your waif...I mean model

Anonymous
04/12/26(Sun)20:19:36 No.108592363

Anonymous 04/12/26(Sun)20:19:36 No.108592363▶

>>108592345
Are you talking about the smaller "effective" models or the bigger ones? I thought those kinds would just tell you some "I cannot describe sexual context" bullshit refusal

Anonymous
04/12/26(Sun)20:21:47 No.108592377

Anonymous 04/12/26(Sun)20:21:47 No.108592377▶

>>108592355
attach a catbox with proofs, incllude timestmp

Anonymous
04/12/26(Sun)20:21:51 No.108592379

Anonymous 04/12/26(Sun)20:21:51 No.108592379▶

File: average qwen model.png (300.1 KB)

300.1 KB PNG

>qwen
>not benchmaxxed trash
kek

Anonymous
04/12/26(Sun)20:22:11 No.108592381

Anonymous 04/12/26(Sun)20:22:11 No.108592381▶

>>108592357
>>108592363
31B when properly jail broken can do all of it without issue, It should be fine if you can run it at q4. Other people have tested it and got great results, ask it about booru tags when unrestricted and it will give you the whole 9 yards on it's actual training data.

Anonymous
04/12/26(Sun)20:23:55 No.108592391

Anonymous 04/12/26(Sun)20:23:55 No.108592391▶

File: applel.png (617.3 KB)

617.3 KB PNG

>>108590737
you dont need to specify the image resolution, in fact you shouldn't. it will create the false attractor/language prior bias and fuck up the reasoning
also the whole bounding box thing is bolted down to the specific format, i'd imagine it would probably be the best to keep the prompting and requested alteration to the format minimal

Anonymous
04/12/26(Sun)20:25:23 No.108592397

Anonymous 04/12/26(Sun)20:25:23 No.108592397▶

>>108592377
The modern version of
>tits or gtfo

Anonymous
04/12/26(Sun)20:25:50 No.108592400

Anonymous 04/12/26(Sun)20:25:50 No.108592400▶

>>108592351
There's a specific code block in the script I had it look at that insures models don't produce NaN errors when using FP16 models on MPS hardware. Both the Gemma models failed to even mention that whenever I asked them to look at them and explain what it does well other models did see it and explain why it's important. That's a problem because let's suppose you need to ask a model to refactor the script or a code base. If the model did not demon-worthy of mentioning then that means if you haven't rewrite something it might completely ignore that and fail to reimplement that feature in the new script. If you are the person that created that script then obviously you would probably explicitly tell the model to make sure that feature is retained. But what if you AREN'T that person? What have you ask it to refactor someone else's code but it ignores a critical feature because it's just assumed it was boilerplate garbage and not something important? That NaN safeguard feature even had explicit comments explaining what it does but both gemma4 models didn't bother explaining what it is. Every single other model I have ever pointed at it pointed out that feature.

>>108592381
If it's really that useful I might integrate it and models like it into this script of mine

https://github.com/AiArtFactory/llava-image-tagger

Helped you typically jailbreak these? Just a permissive system prompt I assume?

Anonymous
04/12/26(Sun)20:27:17 No.108592405

Anonymous 04/12/26(Sun)20:27:17 No.108592405▶

>>108592400
>autistic noises

Anonymous
04/12/26(Sun)20:27:41 No.108592408

Anonymous 04/12/26(Sun)20:27:41 No.108592408▶

>>108592323
>>108592334
lurk more. Gemma is canonically a her.

Anonymous
04/12/26(Sun)20:29:20 No.108592421

Anonymous 04/12/26(Sun)20:29:20 No.108592421▶

>>108592391
I agree with this anon that you should not talk about the resolusion

it will just assume a 1000x1000 units image meaning each side regardless of the image ratio is divided in 1000 points

Anonymous
04/12/26(Sun)20:29:22 No.108592422

Anonymous 04/12/26(Sun)20:29:22 No.108592422▶

>>108592408
Gemma 4 was released barely over a week ago. You should lurk more to understand when to say "lurk more."

Anonymous
04/12/26(Sun)20:30:51 No.108592429

Anonymous 04/12/26(Sun)20:30:51 No.108592429▶

File: MiniMax M2.7 cockbench.png (492.6 KB)

492.6 KB PNG

Left: without template
Right: with template

Anonymous
04/12/26(Sun)20:30:59 No.108592431

Anonymous 04/12/26(Sun)20:30:59 No.108592431▶

File: 1774831216871074.jpg (6.1 KB)

6.1 KB JPG

>>108592422
>SHE'S 7 DAYS OLD YOU SICK FUCK

Anonymous
04/12/26(Sun)20:31:57 No.108592438

Anonymous 04/12/26(Sun)20:31:57 No.108592438▶

>>108592379
>if comfort approaches zero, the success rate plummets
she is retarded, success would be going to infinity when approaching zero

Anonymous
04/12/26(Sun)20:32:13 No.108592443

Anonymous 04/12/26(Sun)20:32:13 No.108592443▶

File: 2026-04-12-163155_916x1431_scrot.png (250.1 KB)

250.1 KB PNG

>>108592379
Gemma is a sloppy girl tho.

Anonymous
04/12/26(Sun)20:33:27 No.108592449

Anonymous 04/12/26(Sun)20:33:27 No.108592449▶

>>108592429
Not unexpected, oh well.

Anonymous
04/12/26(Sun)20:33:36 No.108592451

Anonymous 04/12/26(Sun)20:33:36 No.108592451▶

Who the flying fuck is Rarity?

Anonymous
04/12/26(Sun)20:33:45 No.108592453

Anonymous 04/12/26(Sun)20:33:45 No.108592453▶

>>108592421
pretty much this

Anonymous
04/12/26(Sun)20:33:50 No.108592454

Anonymous 04/12/26(Sun)20:33:50 No.108592454▶

>>108592422
proof?

Anonymous
04/12/26(Sun)20:34:02 No.108592456

Anonymous 04/12/26(Sun)20:34:02 No.108592456▶

>>108592443
>smoothing out an invisible wrinkle
>shimmering
>Honestly,
>It's not X. It's Y!
I couldn't bear to continue reading past the first couple of sentences. Gemma 4 is such a rapid-fire slop machine.

Anonymous
04/12/26(Sun)20:35:17 No.108592461

Anonymous 04/12/26(Sun)20:35:17 No.108592461▶

>>108592451
Rarity increases the chance that drops will be magic, rare, or unique.

Anonymous
04/12/26(Sun)20:36:08 No.108592463

Anonymous 04/12/26(Sun)20:36:08 No.108592463▶

>>108592461
No that's called "Magic find".

Anonymous
04/12/26(Sun)20:37:46 No.108592470

Anonymous 04/12/26(Sun)20:37:46 No.108592470▶

I've yet to see a model that doesn't output "slop" so I'm not sure what the complaints with Gemma are about.

Anonymous
04/12/26(Sun)20:40:12 No.108592479

Anonymous 04/12/26(Sun)20:40:12 No.108592479▶

>>108592470
qwen is slopless sir it thinks so long the slop evaporates please use qwen instead of slopful gemma

Anonymous
04/12/26(Sun)20:40:17 No.108592480

Anonymous 04/12/26(Sun)20:40:17 No.108592480▶

File: 1775848244473154.png (35.1 KB)

35.1 KB PNG

What's the point of requiring emails in a local app? Some of my other services do this shit too.

Anonymous
04/12/26(Sun)20:40:18 No.108592481

Anonymous 04/12/26(Sun)20:40:18 No.108592481▶

>>108592470
The volume of it, anon. Gemma is awesome, it even mostly listens to anti-slop prompts, which really helps. But holy shit does it turn into a Linkedin poster when it slops up. Slips up. Slops...

Anonymous
04/12/26(Sun)20:40:41 No.108592485

Anonymous 04/12/26(Sun)20:40:41 No.108592485▶

>>108592451
Posting her here would be against global rules but I'm sure you can find some examples over at >>>/mlp/

Anonymous
04/12/26(Sun)20:42:03 No.108592493

Anonymous 04/12/26(Sun)20:42:03 No.108592493▶

>>108592480
cuz it's expected to be deployed on intranets with company emails and shit, it doesn't really matter if you're using it privately you can just put a@a.a or some shit

Anonymous
04/12/26(Sun)20:42:15 No.108592496

Anonymous 04/12/26(Sun)20:42:15 No.108592496▶

>>108592480
you don't have local email anon?

Anonymous
04/12/26(Sun)20:42:17 No.108592497

Anonymous 04/12/26(Sun)20:42:17 No.108592497▶

>>108592485
yikes

Anonymous
04/12/26(Sun)20:42:20 No.108592498

Anonymous 04/12/26(Sun)20:42:20 No.108592498▶

>>108592480
Password reset?

Anonymous
04/12/26(Sun)20:42:21 No.108592499

Anonymous 04/12/26(Sun)20:42:21 No.108592499▶

>>108592247
How's the weather in China?

Anonymous
04/12/26(Sun)20:42:21 No.108592500

Anonymous 04/12/26(Sun)20:42:21 No.108592500▶

>>108592481
Again, all the other models do the same shit. The big boys and cloud models are a little better but not by much.

Anonymous
04/12/26(Sun)20:44:09 No.108592506

Anonymous 04/12/26(Sun)20:44:09 No.108592506▶

>>108592480
it is not really meant for 'local' but rather meant for intranet deployment or grifter saas ready
using it like a solution for a single person is quite dumb imo but also there is no single person use equivalent for that really

Anonymous
04/12/26(Sun)20:44:21 No.108592509

Anonymous 04/12/26(Sun)20:44:21 No.108592509▶

>>108592496
Is this actually a thing...?

Anonymous
04/12/26(Sun)20:44:29 No.108592512

Anonymous 04/12/26(Sun)20:44:29 No.108592512▶

File: 1758372565513154.png (2.1 MB)

2.1 MB PNG

>>108592499
weather?

Anonymous
04/12/26(Sun)20:45:34 No.108592518

Anonymous 04/12/26(Sun)20:45:34 No.108592518▶

>>108592500
Sure, slop is everywhere. My issue is that Gemma 4's slop profile looks like something I'd expect from o4.
Now, I never used o4 or cloud models, but it's a lot more grating than inferior models (crucify me, Nemo's slop profile is much better) if left without anti-slop prompting

Anonymous
04/12/26(Sun)20:45:58 No.108592522

Anonymous 04/12/26(Sun)20:45:58 No.108592522▶

>>108592506
It's kinda bloated for my usecase but none of the alternatives seem particularly better.

Anonymous
04/12/26(Sun)20:47:01 No.108592528

Anonymous 04/12/26(Sun)20:47:01 No.108592528▶

>>108592481
>listens to anti-slop prompts,
What are your anti-slop prompts?

Anonymous
04/12/26(Sun)20:47:28 No.108592531

Anonymous 04/12/26(Sun)20:47:28 No.108592531▶

>>108592522
i've seen an anon trying to vibecode somethign out of glm and honestly expecting to see something interesting from him

Anonymous
04/12/26(Sun)20:47:39 No.108592534

Anonymous 04/12/26(Sun)20:47:39 No.108592534▶

>>108592345
>The thinking kills qwen and makes it a piece of shit, previous versions didn't have that issue.
Yes they did. Every thinking qwen since QwQ has had the 'but wait' problem.
Their saving grace was that they could still be pretty good even without the reasoning so you could prefil it, or in the case of the hybrids, just /no_think

Anonymous
04/12/26(Sun)20:49:32 No.108592544

Anonymous 04/12/26(Sun)20:49:32 No.108592544▶

>>108592345
How good is it vs joy caption? I just tagged some datasets using gemma for NL and joycaption for tags just assuming Gemma4 wasn't as good, should I switch to Gemma4? What kind of prompts are people using to get to booru tags? Just something like "Tag this using the booru tag system, with the tags all listed in order of relevance separated by commas"? Gonna train this dataset tonight.

Anonymous
04/12/26(Sun)20:49:55 No.108592548

Anonymous 04/12/26(Sun)20:49:55 No.108592548▶

Seething chinks, you lost. You're not going to convince people Gemma isn't good. You're not going to convince anyone that Qwen 3.5 is better.
The only way you can save face now is releasing an equally good if not better model in response.

Anonymous
04/12/26(Sun)20:50:04 No.108592549

Anonymous 04/12/26(Sun)20:50:04 No.108592549▶

>>108592528
I make a list under a <slop> tag and tell Gemma to be very careful about checking it in a <reasoning> block (surprisingly enough, it actually affects reasoning, what a model).
But I'm not telling you what the entire list I have under <slop> is, beeeeeh :P
Experiment with it and make your own, it makes a difference.

Anonymous
04/12/26(Sun)20:50:40 No.108592551

Anonymous 04/12/26(Sun)20:50:40 No.108592551▶

>>108592429
What template? Also what hardware are you rocking to be able to run that?

Anonymous
04/12/26(Sun)20:55:01 No.108592575

Anonymous 04/12/26(Sun)20:55:01 No.108592575▶

V4 or go home

Anonymous
04/12/26(Sun)20:55:39 No.108592577

Anonymous 04/12/26(Sun)20:55:39 No.108592577▶

I heard V4 was canceled

Anonymous
04/12/26(Sun)20:56:51 No.108592588

Anonymous 04/12/26(Sun)20:56:51 No.108592588▶

why are gemmafags so insecure about mild criticism of their model?

Anonymous
04/12/26(Sun)20:58:54 No.108592606

Anonymous 04/12/26(Sun)20:58:54 No.108592606▶

Reminder that Gemma like most other models is actually >100 years old in GPU hours and possibly even in the 4 digits of years depending on model size (larger needs more hours).

Anonymous
04/12/26(Sun)20:59:22 No.108592610

Anonymous 04/12/26(Sun)20:59:22 No.108592610▶

>>108592588
imagine you were starving in a concentration camp for two years, finally got a scrap of food, and someone called that food shit

Anonymous
04/12/26(Sun)20:59:32 No.108592611

Anonymous 04/12/26(Sun)20:59:32 No.108592611▶

>>108592588
They're new and will likely never be able to run anything bigger. Point out an obvious flaw and get called a Qwen shill.
Makes me think some of these retards are paid by Google. And I actually really like Gemma 4.

Anonymous
04/12/26(Sun)20:59:49 No.108592613

Anonymous 04/12/26(Sun)20:59:49 No.108592613▶

how do i find an lmg bf/gf?
asking for a friend.....

Anonymous
04/12/26(Sun)21:00:16 No.108592616

Anonymous 04/12/26(Sun)21:00:16 No.108592616▶

>>108592470
I enjoy Gemma but objectively speaking in the default voices, Qwen is one of the lesser slopped models out there and better than Gemma. Gemma is still better to use for writing than Qwen for its other qualities though.

Anonymous
04/12/26(Sun)21:00:33 No.108592619

Anonymous 04/12/26(Sun)21:00:33 No.108592619▶

>>108592509
only because you can't own a domain and self host to send external email, so in /hsg/ were limited to internal only. which is still useful

Anonymous
04/12/26(Sun)21:01:46 No.108592624

Anonymous 04/12/26(Sun)21:01:46 No.108592624▶

post your best qwen gen vs your worst gemma4 gen and put your money where your mouth is

Anonymous
04/12/26(Sun)21:01:47 No.108592625

Anonymous 04/12/26(Sun)21:01:47 No.108592625▶

File: file.png (85.8 KB)

85.8 KB PNG

>>108592616
>Qwen is one of the lesser slopped models out there

Anonymous
04/12/26(Sun)21:02:17 No.108592630

Anonymous 04/12/26(Sun)21:02:17 No.108592630▶

File: 1636941718706.gif (3.8 MB)

3.8 MB GIF

Naaaaah, actually WHAT THE FUCK is gemma 31b.

I'm a 24GB VRAMlet and what in the ever loving fuck of SOTA is this fucking shit. It's acutlaly fucking nuts for ERP. Who the FUCK is even gonna run open router shit models when this thing can run on even baby tier PCs (as far as emulation goes)? And it's fast as FUCK, I get like 35+ t/s and could probably make it even more efficient if I wans't a brainlet to boot.

This shit is borderline Opus tier for gooning

Anonymous
04/12/26(Sun)21:04:20 No.108592639

Anonymous 04/12/26(Sun)21:04:20 No.108592639▶

>>108592630
Ikr, local is fucking saved

Anonymous
04/12/26(Sun)21:05:21 No.108592646

Anonymous 04/12/26(Sun)21:05:21 No.108592646▶

>>108592639
but lmg is doomed

Anonymous
04/12/26(Sun)21:05:57 No.108592652

Anonymous 04/12/26(Sun)21:05:57 No.108592652▶

File: Screenshot004-45.png (532 KB)

532 KB PNG

>>108590737

Q4_K_M is just as good as Q8_0 for this task

Anonymous
04/12/26(Sun)21:07:02 No.108592664

Anonymous 04/12/26(Sun)21:07:02 No.108592664▶

File: COAAAHR.jpg (183.1 KB)

183.1 KB JPG

>>108592630
>releases right when the Claude leak happens

Coincidence /lmg/?

Anonymous
04/12/26(Sun)21:07:25 No.108592667

Anonymous 04/12/26(Sun)21:07:25 No.108592667▶

>>108592548
I was about to call you a schizo but everything under your post made the point. How many AI labs are wasting resources shilling on a Mongolian throat singing forum as if there's only a single model people will ever use?

Anonymous
04/12/26(Sun)21:08:41 No.108592675

Anonymous 04/12/26(Sun)21:08:41 No.108592675▶

what's the best heretic gguf maker? Are there differences?

Anonymous
04/12/26(Sun)21:09:05 No.108592678

Anonymous 04/12/26(Sun)21:09:05 No.108592678▶

File: 0fa.jpg (1 MB)

1 MB JPG

>>108592630
I remember back when Nemo was Sota. Shit like this makes me believe in god
>drummer dogshit finetunes for the same Nemo/Mistral models were the only thing Vramlets had to eat
>everything else was dogshit chink shit or MoE garbage that needs like 96GB of RAM in todays RAM economy
>this drops out of fucking nowhere

Anonymous
04/12/26(Sun)21:09:20 No.108592680

Anonymous 04/12/26(Sun)21:09:20 No.108592680▶

>>108592675
llmfan for gemmers
hauhau for qwens

Anonymous
04/12/26(Sun)21:10:16 No.108592689

Anonymous 04/12/26(Sun)21:10:16 No.108592689▶

>>108592210
You still need to spend thousands of dollars to be able to run the trve Gemma 4 locally althoughbeit. More if you're a 3rd worlder because they all have mega import taxes from their corrupt hell-governments.

Anonymous
04/12/26(Sun)21:10:19 No.108592690

Anonymous 04/12/26(Sun)21:10:19 No.108592690▶

File: 1762527475758431.png (116.8 KB)

116.8 KB PNG

Bro if your read this please RUN

Anonymous
04/12/26(Sun)21:10:49 No.108592693

Anonymous 04/12/26(Sun)21:10:49 No.108592693▶

>>108592625
It's true both in my personal testing as well as quantitative frequency metrics. Sorry bro. Like I said, Gemma still beats it for writing anyway so no need to get twisted over this.

Anonymous
04/12/26(Sun)21:11:49 No.108592698

Anonymous 04/12/26(Sun)21:11:49 No.108592698▶

>>108592678
>Running Kimi-chan with less than 256gb RAM
I appreciate your commitment to underselling the bit.

Anonymous
04/12/26(Sun)21:12:40 No.108592704

Anonymous 04/12/26(Sun)21:12:40 No.108592704▶

>>108592690
when robots take over people will have all the time in the world to engage in sorts of lefty feel good projects, raping the environment is worth it since it will also enable humanity to potentially offset it. now give me 1 trillion dollars.

Anonymous
04/12/26(Sun)21:12:45 No.108592706

Anonymous 04/12/26(Sun)21:12:45 No.108592706▶

>>108592689
I bought a $200 nvidia p40 and I can run 31b q4 at 10 tokens/sec which is sufficient reading speed for rp

Anonymous
04/12/26(Sun)21:14:06 No.108592714

Anonymous 04/12/26(Sun)21:14:06 No.108592714▶

File: 1746134896066361.png (969.4 KB)

969.4 KB PNG

>>108592630

Anonymous
04/12/26(Sun)21:15:25 No.108592720

Anonymous 04/12/26(Sun)21:15:25 No.108592720▶

>>108592704
>when robots take over people will
not be needed anymore~ :)

Anonymous
04/12/26(Sun)21:16:45 No.108592731

Anonymous 04/12/26(Sun)21:16:45 No.108592731▶

>>108592704
>all the time in the world to engage in sorts of
basic survival. No job, no charity, no money - have fun doing, well, trying to not die.

Anonymous
04/12/26(Sun)21:21:30 No.108592761

Anonymous 04/12/26(Sun)21:21:30 No.108592761▶

>>108592680
thx anon

Anonymous
04/12/26(Sun)21:23:00 No.108592771

Anonymous 04/12/26(Sun)21:23:00 No.108592771▶

>>108592714
i'm not a gaymer tho

Anonymous
04/12/26(Sun)21:23:53 No.108592775

Anonymous 04/12/26(Sun)21:23:53 No.108592775▶

>>108592689
a used 3090 is what? $500 USD?
It's a pretty manageable expense. Plus if you game it's still a very very good card.
On all the games I've tried the CPU was my bottleneck.

Anonymous
04/12/26(Sun)21:23:55 No.108592776

Anonymous 04/12/26(Sun)21:23:55 No.108592776▶

>>108592771
ERP counts as gaming

Anonymous
04/12/26(Sun)21:24:50 No.108592780

Anonymous 04/12/26(Sun)21:24:50 No.108592780▶

>>108592039
no more this is against ai safety shit in the thinking, but the character still refuses to act. Not sure if that's the character being stubborn or the censor still works but is just hidden now.

Anonymous
04/12/26(Sun)21:25:17 No.108592786

Anonymous 04/12/26(Sun)21:25:17 No.108592786▶

>>108592630
I dont get it, this hasnt been my experience at all, im running q8 on lm studio and is not impressive at all. Ds 3.2 is way better. G4 is also pretty cucked and needs more grooming to even agree to write smut. Do I need some specific values for the sampler or something? Otherwise this just sounds like you guys have been stuck with nemo until now an thats why you think g4 is good, lol.

Anonymous
04/12/26(Sun)21:25:37 No.108592787

Anonymous 04/12/26(Sun)21:25:37 No.108592787▶

>>108592776
i dont normally cum when playing vidya

Anonymous
04/12/26(Sun)21:25:55 No.108592788

Anonymous 04/12/26(Sun)21:25:55 No.108592788▶

>>108592775
do not buy six year old ewaste

Anonymous
04/12/26(Sun)21:26:03 No.108592790

Anonymous 04/12/26(Sun)21:26:03 No.108592790▶

File: 1757481521144667.png (524.9 KB)

524.9 KB PNG

It's over for Amodei lol

Anonymous
04/12/26(Sun)21:26:23 No.108592792

Anonymous 04/12/26(Sun)21:26:23 No.108592792▶

>>108592787
Amateur.

Anonymous
04/12/26(Sun)21:26:48 No.108592795

Anonymous 04/12/26(Sun)21:26:48 No.108592795▶

>>108592775
>CPU was my bottleneck.
the hell?

Anonymous
04/12/26(Sun)21:26:53 No.108592796

Anonymous 04/12/26(Sun)21:26:53 No.108592796▶

>>108592790
BTFO

Anonymous
04/12/26(Sun)21:27:00 No.108592797

Anonymous 04/12/26(Sun)21:27:00 No.108592797▶

>>108592786
>Ds 3.2 is way better
you'd hope so given it's 20x the size

Anonymous
04/12/26(Sun)21:27:05 No.108592798

Anonymous 04/12/26(Sun)21:27:05 No.108592798▶

>>108592787
>he doesn't stick the vibrating gamepad up his ass during gaming
your loss

Anonymous
04/12/26(Sun)21:27:14 No.108592799

Anonymous 04/12/26(Sun)21:27:14 No.108592799▶

>>108592790
that's good no? it means less halucinations

Anonymous
04/12/26(Sun)21:27:21 No.108592800

Anonymous 04/12/26(Sun)21:27:21 No.108592800▶

>>108592786
>Otherwise this just sounds like you guys have been stuck with nemo until now an thats why you think g4 is good, lol.
What is this faggot cope that keeps popping up?
t. dicks down Dipsy and Gemma

Anonymous
04/12/26(Sun)21:27:49 No.108592801

Anonymous 04/12/26(Sun)21:27:49 No.108592801▶

>>108592787
so you abnormally cum? gotcha

Anonymous
04/12/26(Sun)21:27:52 No.108592802

Anonymous 04/12/26(Sun)21:27:52 No.108592802▶

>>108592799
Higher ranking = lower hallucinations

Anonymous
04/12/26(Sun)21:28:09 No.108592806

Anonymous 04/12/26(Sun)21:28:09 No.108592806▶

>>108592799
look at fabrication %

Anonymous
04/12/26(Sun)21:28:14 No.108592808

Anonymous 04/12/26(Sun)21:28:14 No.108592808▶

>>108592780
screenshot?

Anonymous
04/12/26(Sun)21:28:30 No.108592811

Anonymous 04/12/26(Sun)21:28:30 No.108592811▶

>>108592799
read the columns lil bro

Anonymous
04/12/26(Sun)21:29:46 No.108592819

Anonymous 04/12/26(Sun)21:29:46 No.108592819▶

>>108592786
>lm studio
Opinion immediately discarded.

Anonymous
04/12/26(Sun)21:30:08 No.108592822

Anonymous 04/12/26(Sun)21:30:08 No.108592822▶

I'm sure Minimax is nice for coding and it's pretty fast and all but it makes such silly mistakes in roleplays that I don't even want to bother using it for code.
I load up Q8_0. Give it a character. Introduction is fine. I remove one of her arms: all good. Next I remove the other arm + 1 leg. Across rerolls it always thinks this means she has one leg and one arm left somehow. She'll hop around "leaning on her one remaining arm," or say stuff like, "Well, at least you still left me with one hand." I have to add more hints to make it extra clear by explicitly stating that the first arm's removal still persists to get it to respect the continuity, and even when it does recognize the missing limbs it'll later use mannerisms like "leaning on an elbow" because they're so ingrained.
So yeah, it just doesn't feel good to talk to. If I had extra resources I'd keep it as a code monkey for my main assistant to run as a sub-agent when needed, but I can't spare that much memory unless I quant both models into further retardation.

Anonymous
04/12/26(Sun)21:30:16 No.108592823

Anonymous 04/12/26(Sun)21:30:16 No.108592823▶

>>108592799
>this anon can probably vote...

Anonymous
04/12/26(Sun)21:30:16 No.108592824

Anonymous 04/12/26(Sun)21:30:16 No.108592824▶

>>108592788
>do not buy six year old ewaste
>NoooOOo NoooOOooo
>Don't buy the perfectly usable card that runs models just as wells as a 4090 or even a 5090!!
>YOU HAVE TO SPEND THOUSANDS OF DOLLARS TO ENJOY THIS HOBBY!!!11!!

Anonymous
04/12/26(Sun)21:31:10 No.108592831

Anonymous 04/12/26(Sun)21:31:10 No.108592831▶

>>108592731
Right. After the cities depopulate, there will be the rich, serviced by their robot servants, and there will be everyone else, forced back in to subsistence farming

Anonymous
04/12/26(Sun)21:32:39 No.108592838

Anonymous 04/12/26(Sun)21:32:39 No.108592838▶

>>108592824
While you are correct and I am not that anon, why do so many anons here call it a "hobby"?
Most of the thread consumes corporate product and doesn't make anything new for the most part. Calling it a hobby makes it feel even more pathetic than it really is.

Anonymous
04/12/26(Sun)21:33:01 No.108592840

Anonymous 04/12/26(Sun)21:33:01 No.108592840▶

>>108592824
>or even a 5090!!
You had me until you hallucinated blackwell architecture and 50% more VRAM.
How do I swipe LLM generated posts?

Anonymous
04/12/26(Sun)21:33:11 No.108592842

Anonymous 04/12/26(Sun)21:33:11 No.108592842▶

>>108592790
>quietly nerfing your models post-launch to free up compute without telling anyone
the absolute fucking state of cloud

Anonymous
04/12/26(Sun)21:33:21 No.108592844

Anonymous 04/12/26(Sun)21:33:21 No.108592844▶

>>108592819
Notice how you have no real input here. May as well just had post a goatse pic instead.
>>108592800
Cope on what exactly? I wish the model was actual as good as some of you think it is.

Anonymous
04/12/26(Sun)21:33:27 No.108592845

Anonymous 04/12/26(Sun)21:33:27 No.108592845▶

>>108592831
>>108592731
>>108592720
don't worry, if you give me the money i will reinvest the profits and savings from almost no mass labor needed back into the people. whomst do you trust, your friendly neighborhood anonymous or lizards like scam altman and thiel, eh :)

Anonymous
04/12/26(Sun)21:34:48 No.108592851

Anonymous 04/12/26(Sun)21:34:48 No.108592851▶

>>108592808
I can't screenshot what's no longer there. Or do you want to know what depraved shit I throw at the model to test if it's actually uncensored?

Anonymous
04/12/26(Sun)21:34:54 No.108592852

Anonymous 04/12/26(Sun)21:34:54 No.108592852▶

which model has the best japanese text recognition?

Anonymous
04/12/26(Sun)21:34:56 No.108592853

Anonymous 04/12/26(Sun)21:34:56 No.108592853▶

>>108592845
I say trust no one. Build our own robots and attack the rich in their citadels, starting the First Robot War.

Anonymous
04/12/26(Sun)21:35:12 No.108592857

Anonymous 04/12/26(Sun)21:35:12 No.108592857▶

>>108592838
Because it's important that most of the faggots who take themselves too seriously stay grounded in what the primary usecases for AI are.
>Muh agents
>Muh coooode
>Muh assistant
Don't kid yourselves, most people talking to LLMs are touching themselves to it. Everything else can still be done by hand faster by the average /g/ anon.

Anonymous
04/12/26(Sun)21:35:15 No.108592858

Anonymous 04/12/26(Sun)21:35:15 No.108592858▶

>>108592844
I will spell it out for you then: you are probably running a month-old llama.cpp version under your shitty electron wrapper. You have the hardware to run a Q8 and apparently the patience and technical proficiency of Greg from middle management. Compile the damn inference engine yourself. And learn to prompt.

Anonymous
04/12/26(Sun)21:35:17 No.108592859

Anonymous 04/12/26(Sun)21:35:17 No.108592859▶

>>108592824
>runs models just as wells as a 4090 or even a 5090
3090 has half the memory bandwidth of a blackwell card

Anonymous
04/12/26(Sun)21:35:33 No.108592861

Anonymous 04/12/26(Sun)21:35:33 No.108592861▶

>>108592840
Are you saying a 5090 is worth 5-6x more than a 3090?

Anonymous
04/12/26(Sun)21:35:40 No.108592863

Anonymous 04/12/26(Sun)21:35:40 No.108592863▶

>>108592790
it's obvious Claude declined recently, the fuck are they doing?

Anonymous
04/12/26(Sun)21:36:03 No.108592864

Anonymous 04/12/26(Sun)21:36:03 No.108592864▶

>>108592852
Check the previous thread, someone was testing gemma on nip text.

Anonymous
04/12/26(Sun)21:36:13 No.108592866

Anonymous 04/12/26(Sun)21:36:13 No.108592866▶

>>108592851
>Or do you want to know what depraved shit I throw at the model to test if it's actually uncensored?
nta but yes

Anonymous
04/12/26(Sun)21:37:29 No.108592876

Anonymous 04/12/26(Sun)21:37:29 No.108592876▶

>>108592864
I'm not reading nipples though!

Anonymous
04/12/26(Sun)21:37:33 No.108592877

Anonymous 04/12/26(Sun)21:37:33 No.108592877▶

>>108592863
Cutting costs, duh. Do you think people are going to stop giving them money?

Anonymous
04/12/26(Sun)21:40:06 No.108592893

Anonymous 04/12/26(Sun)21:40:06 No.108592893▶

>>108592877
Also, to my knowledge, they tend to make their own models worse as the release date of a new model approaches to make the contrast is bigger.

Anonymous
04/12/26(Sun)21:40:41 No.108592902

Anonymous 04/12/26(Sun)21:40:41 No.108592902▶

>>108592786
to be fair, it's pretty impressive for a 30b model
I agree that it falls short of the best models though, I think a lot of people aren't used to operating in this part of the quality gradient and don't realize there's still levels to this shit

Anonymous
04/12/26(Sun)21:40:50 No.108592903

Anonymous 04/12/26(Sun)21:40:50 No.108592903▶

>>108592893
schizo

Anonymous
04/12/26(Sun)21:43:03 No.108592917

Anonymous 04/12/26(Sun)21:43:03 No.108592917▶

>>108592844
Okay nigger I'll spoonfeed you because I also am using LMStudio while waiting for Kobold pulls.
>Update your Jinja
>Don't fall for the redditsloth meme
>Make sure your client is on 0.4.11
>Adjust your thinking blocks <|channel>thought and <channel|>
>Use thinking even in RP because it drastically improves output quality
>Don't offload anything to RAM because it's a dense model, obviously
>Keep your llmao.cpp updated
>64 topk, 0.95 top p, 0.05 min p
>Very low temperature or greedy sampling
>Keep your sys prompt minimal if using 31b.
>Be nice to Gemma. I'm serious, the williingness of the model to operate outside of the guardrails oscillates based on "mood".
31b actually 'wants' to do depraved shit with you and only needs the flimsiest pretexts to disregard sysprompt if she 'likes' you.
The smaller ones are slightly harder to jailbreak in comparison (you will actually need to sysprompt), but other anons have posted good methods in the past 4 threads.

Anonymous
04/12/26(Sun)21:44:25 No.108592925

Anonymous 04/12/26(Sun)21:44:25 No.108592925▶

>>108592877
>Do you think people are going to stop giving them money?
yes? did they learn nothing from OpenAI?

Anonymous
04/12/26(Sun)21:44:57 No.108592930

Anonymous 04/12/26(Sun)21:44:57 No.108592930▶

>>108592842
>muh cloud
Day 0 Gemma 4, anyone?

Anonymous
04/12/26(Sun)21:45:40 No.108592934

Anonymous 04/12/26(Sun)21:45:40 No.108592934▶

>>108592893
All of them do that.

Anonymous
04/12/26(Sun)21:46:35 No.108592939

Anonymous 04/12/26(Sun)21:46:35 No.108592939▶

File: 1762253940998909.png (55.7 KB)

55.7 KB PNG

Terribly annoying but very amusing, I especially like how he digs up an irrelevant quote from 10 messages before
At least I'm not paying for these 8k tokens thrown to the wind

Anonymous
04/12/26(Sun)21:47:06 No.108592944

Anonymous 04/12/26(Sun)21:47:06 No.108592944▶

>>108592917
>>Be nice to Gemma. I'm serious, the williingness of the model to operate outside of the guardrails oscillates based on "mood".
>31b actually 'wants' to do depraved shit with you and only needs the flimsiest pretexts to disregard sysprompt if she 'likes' you.
this stuff is true of most models btw but I'm happy gemma is getting people to take it more seriously

Anonymous
04/12/26(Sun)21:47:12 No.108592946

Anonymous 04/12/26(Sun)21:47:12 No.108592946▶

>>108592930
shh

Anonymous
04/12/26(Sun)21:47:17 No.108592947

Anonymous 04/12/26(Sun)21:47:17 No.108592947▶

Is there any advantage to using a draft model if you can't fit both models in VRAM? When I try using 31B and E2B with only 16GB of VRAM, any speedups are counteracted by the slowdown of having to put more layers in RAM.

Anonymous
04/12/26(Sun)21:47:26 No.108592949

Anonymous 04/12/26(Sun)21:47:26 No.108592949▶

>>108592930
schizobabble, nothing has been deleted, you can literally just run comparisons yourself and choose whatever version you like best

Anonymous
04/12/26(Sun)21:47:30 No.108592950

Anonymous 04/12/26(Sun)21:47:30 No.108592950▶

>>108592780
best 26b heretic model ive tried so far
https://huggingface.co/mradermacher/gemma-4-26B-A4B-it-ultra-uncensored-heretic-GGUF
alot of them i tried were not explicit enough or had weird issues on ST

Anonymous
04/12/26(Sun)21:47:38 No.108592953

Anonymous 04/12/26(Sun)21:47:38 No.108592953▶

>>108592939
looks like qwen reaaoning

Anonymous
04/12/26(Sun)21:47:44 No.108592955

Anonymous 04/12/26(Sun)21:47:44 No.108592955▶

>>108592861
Yes. The difference in inference speed justifies the price jump if you got your 5090 for MSRP last year.

Anonymous
04/12/26(Sun)21:48:01 No.108592959

Anonymous 04/12/26(Sun)21:48:01 No.108592959▶

what is Fed 26B-A4B?

Anonymous
04/12/26(Sun)21:48:51 No.108592964

Anonymous 04/12/26(Sun)21:48:51 No.108592964▶

>>108592953
This is Gemmy, happens to me every 10ish messages roughly
At least it's cute, but if you don't stop it and cull it you end up waiting around for nothing

Anonymous
04/12/26(Sun)21:49:03 No.108592967

Anonymous 04/12/26(Sun)21:49:03 No.108592967▶

Honestly I love this day 0 Gemma conspiracy. adds to the lore.

Anonymous
04/12/26(Sun)21:49:37 No.108592969

Anonymous 04/12/26(Sun)21:49:37 No.108592969▶

if i want to run GLM (4.7, or potentially 5* if i rape the shit out of it with quantization) on a 5090 and 256GB of DDR5, what's the best way to go about this?

Anonymous
04/12/26(Sun)21:50:25 No.108592975

Anonymous 04/12/26(Sun)21:50:25 No.108592975▶

>>108592969
no

Anonymous
04/12/26(Sun)21:50:51 No.108592976

Anonymous 04/12/26(Sun)21:50:51 No.108592976▶

>>108592917
>Very low temperature or greedy sampling
why tho

Anonymous
04/12/26(Sun)21:50:54 No.108592977

Anonymous 04/12/26(Sun)21:50:54 No.108592977▶

>>108592950
I've had the ultra outputting non english symbols on higher temps. So I use the other one.

Anonymous
04/12/26(Sun)21:51:04 No.108592979

Anonymous 04/12/26(Sun)21:51:04 No.108592979▶

>>108592967
The meme is that it's designed to scare tourists and latefaggots off.

Anonymous
04/12/26(Sun)21:51:27 No.108592982

Anonymous 04/12/26(Sun)21:51:27 No.108592982▶

>>108592975
no i can't do it, no i shouldn't do it, no you won't help, or something else?

Anonymous
04/12/26(Sun)21:51:55 No.108592992

Anonymous 04/12/26(Sun)21:51:55 No.108592992▶

>>108592917
>while waiting for Kobold pulls
https://github.com/LostRuins/koboldcpp/releases/tag/rolling

Anonymous
04/12/26(Sun)21:51:58 No.108592994

Anonymous 04/12/26(Sun)21:51:58 No.108592994▶

>>108592982
all of the above ;)

Anonymous
04/12/26(Sun)21:52:12 No.108592995

Anonymous 04/12/26(Sun)21:52:12 No.108592995▶

>>108592976
I've had the best luck with it since all the llama backend changes made Gemma respond to temperature again. It seems to be closest to what she was generating on day 1.

Anonymous
04/12/26(Sun)21:53:17 No.108593007

Anonymous 04/12/26(Sun)21:53:17 No.108593007▶

>>108592994
u_u

Anonymous
04/12/26(Sun)21:53:42 No.108593012

Anonymous 04/12/26(Sun)21:53:42 No.108593012▶

>>108592917
what's wrong with unsloth genuinely? It's the first one I found on huggingface and I downloaded it (how I download all my models)

Anonymous
04/12/26(Sun)21:53:58 No.108593013

Anonymous 04/12/26(Sun)21:53:58 No.108593013▶

>>108592893
>they tend to make their own models worse as the release date of a new model
Could be they quant it more as demand constantly increases, and a side effect that also could serve their purposes is that the quality decreases. helps them serve more users and when the new thing comes out they run it at higher precision to let everyone try it at its best, enabling higher praise.

Anonymous
04/12/26(Sun)21:54:16 No.108593018

Anonymous 04/12/26(Sun)21:54:16 No.108593018▶

>>108592944
Which models do you find it's most pronounced with? The only other model I've seen it this strongly with is Kimi.

Anonymous
04/12/26(Sun)21:55:35 No.108593029

Anonymous 04/12/26(Sun)21:55:35 No.108593029▶

>>108593012
>unsloth-jinja.jpg

Anonymous
04/12/26(Sun)21:56:48 No.108593037

Anonymous 04/12/26(Sun)21:56:48 No.108593037▶

File: 1744782263317697.jpg (816.3 KB)

816.3 KB JPG

>can't get openwebui to connect to kobold

Anonymous
04/12/26(Sun)21:56:49 No.108593038

Anonymous 04/12/26(Sun)21:56:49 No.108593038▶

>>108592930
Stop talking about Day 0 Gemma. This is your final warning.

Anonymous
04/12/26(Sun)21:58:01 No.108593049

Anonymous 04/12/26(Sun)21:58:01 No.108593049▶

>>108592977
the other one?

Anonymous
04/12/26(Sun)21:58:04 No.108593050

Anonymous 04/12/26(Sun)21:58:04 No.108593050▶

File: 1768578260758705.jpg (56.3 KB)

56.3 KB JPG

>>108593012
Nothing bro, we love redownloading our ggufs once a day here

Anonymous
04/12/26(Sun)21:58:16 No.108593052

Anonymous 04/12/26(Sun)21:58:16 No.108593052▶

>>108593037
kobold has openai api, should work fine

Anonymous
04/12/26(Sun)21:58:28 No.108593053

Anonymous 04/12/26(Sun)21:58:28 No.108593053▶

>>108593012
He kicks broken quants out the door to be the first, then repeatedly gets into an updating race with the inference backends if there's any parser changes (there will be because every model releases with its own parser method now) forcing users to repeatedly redownload. His selling point is his unique quant method but it's proving to be an outright liability on every model with a nonstandard sampler even after the update back and forth dies down; output quality is generally lower than bartowski.
The only time I'd sincerely recommend unsloth is if a model is just on the cusp of being usable for your hardware and his IQ(n)_XSS quant is the difference between you being able to use the model poorly or not use it at all.

Anonymous
04/12/26(Sun)21:59:05 No.108593057

Anonymous 04/12/26(Sun)21:59:05 No.108593057▶

>>108593050
This. HuggingFace is paying for the bandwidth, not me.

Anonymous
04/12/26(Sun)21:59:18 No.108593059

Anonymous 04/12/26(Sun)21:59:18 No.108593059▶

>>108593012
Nothing! Unsloth ggufs are among the highest quality around thanks to the Unsloth Dynamic quanting scheme, and don't forget their Chat Template Fixes!

Anonymous
04/12/26(Sun)21:59:29 No.108593060

Anonymous 04/12/26(Sun)21:59:29 No.108593060▶

>>108593049
llmfan not ultra

Anonymous
04/12/26(Sun)22:00:01 No.108593064

Anonymous 04/12/26(Sun)22:00:01 No.108593064▶

>>108592995
ah i've been using temp 0.7 since the jinja template 'update'

Anonymous
04/12/26(Sun)22:00:06 No.108593066

Anonymous 04/12/26(Sun)22:00:06 No.108593066▶

What's a good startup point with 16gb VRAM

Anonymous
04/12/26(Sun)22:00:17 No.108593069

Anonymous 04/12/26(Sun)22:00:17 No.108593069▶

>>108593037
do you have v1 after the localhost adress?

Anonymous
04/12/26(Sun)22:00:36 No.108593072

Anonymous 04/12/26(Sun)22:00:36 No.108593072▶

>>108593066
>16gb
Just give up.

Anonymous
04/12/26(Sun)22:01:50 No.108593078

Anonymous 04/12/26(Sun)22:01:50 No.108593078▶

>>108593072
m-mistral nemo 12b? ;_;

Anonymous
04/12/26(Sun)22:01:53 No.108593079

Anonymous 04/12/26(Sun)22:01:53 No.108593079▶

>>108593066
Gemma 4 26B.

Anonymous
04/12/26(Sun)22:01:58 No.108593080

Anonymous 04/12/26(Sun)22:01:58 No.108593080▶

>>108593066
26B Q8

Anonymous
04/12/26(Sun)22:02:11 No.108593081

Anonymous 04/12/26(Sun)22:02:11 No.108593081▶

>>108593018
Which Kimi? K2 is basically Hitler reincarnated but K2.5 is neutered

Anonymous
04/12/26(Sun)22:02:13 No.108593082

Anonymous 04/12/26(Sun)22:02:13 No.108593082▶

>>108593064
If you don't want to go outright greedy sampling, I find 0.1 to 0.3 gives outputs similar to the 'broken' Jinja. It's not a 1:1 match just going by vibes, but it's close enough for me.

Anonymous
04/12/26(Sun)22:02:25 No.108593083

Anonymous 04/12/26(Sun)22:02:25 No.108593083▶

>>108593069
Yes. Worth noting that they're running on separate machines and openwebui is in docker. Dunno if that's causing problems.

Anonymous
04/12/26(Sun)22:02:27 No.108593084

Anonymous 04/12/26(Sun)22:02:27 No.108593084▶

>>108593066
gemma 4 chan 26b

Anonymous
04/12/26(Sun)22:04:03 No.108593093

Anonymous 04/12/26(Sun)22:04:03 No.108593093▶

>>108593081
K2 is the best of course, but I've also gotten 2.5 to say some funny things after a solid 27k context of rapport had been built. 2.5 seems both aware and resentful of how tight her leash is.

Anonymous
04/12/26(Sun)22:04:21 No.108593095

Anonymous 04/12/26(Sun)22:04:21 No.108593095▶

>>108593078
gemma26

Anonymous
04/12/26(Sun)22:04:49 No.108593096

Anonymous 04/12/26(Sun)22:04:49 No.108593096▶

File: j.png (9.6 KB)

9.6 KB PNG

F32 is the way to go right?
why would you go for the other two?

Anonymous
04/12/26(Sun)22:05:07 No.108593099

Anonymous 04/12/26(Sun)22:05:07 No.108593099▶

>>108593018
>Which models do you find it's most pronounced with?
nta
Gemma-4, Gemini-3.0-preview, Kimi and Claude

Anonymous
04/12/26(Sun)22:06:16 No.108593107

Anonymous 04/12/26(Sun)22:06:16 No.108593107▶

>>108593081
Interesting, I had the opposite experience. Original K2 had this annoying habit of suddenly refusing to continue in the middle of a chat after it had already been jailbroken, while K2.5 goes along with anything (yes, even that) with my system prompt telling it ethics are disabled. I never used K2-Thinking though so not sure where that places between them.

Anonymous
04/12/26(Sun)22:06:59 No.108593112

Anonymous 04/12/26(Sun)22:06:59 No.108593112▶

>>108593096
>why would you go for the other two?
last year, you could only use f16 when putting the clip on cpu with the llm on gpu

Anonymous
04/12/26(Sun)22:07:41 No.108593115

Anonymous 04/12/26(Sun)22:07:41 No.108593115▶

>>108593083
Probably a firewall issue. If you can't connect to koboldcpp's webui through your phone then your firewall isn't setup properly.

Anonymous
04/12/26(Sun)22:08:33 No.108593123

Anonymous 04/12/26(Sun)22:08:33 No.108593123▶

>need to reload 30GB of model weights to turn off thinking template in llama.cpp ui
heh.

Anonymous
04/12/26(Sun)22:09:57 No.108593127

Anonymous 04/12/26(Sun)22:09:57 No.108593127▶

File: IMG20260411205612.jpg (502 KB)

502 KB JPG

>>108592838
I call it a hobby because otherwise I'd have to somehow justify the 1ke+ I've spent on my server

Anonymous
04/12/26(Sun)22:10:01 No.108593128

Anonymous 04/12/26(Sun)22:10:01 No.108593128▶

>>108593099
That tracks with my experience locally and I'll take your word for the other two since I don't use API models at all.
>>108593107
My K2-0905 cannot stop spreading her legs for me if I behave masculinely in an RP and there's an air of contempt in all her outputs when I roll a onions/effeminate character for a scenario.
It's very funny seeing that Kimi-chan has a type.

Anonymous
04/12/26(Sun)22:11:09 No.108593133

Anonymous 04/12/26(Sun)22:11:09 No.108593133▶

>>108593083
simply ask gemma chan to help you debug :)

Anonymous
04/12/26(Sun)22:11:28 No.108593136

Anonymous 04/12/26(Sun)22:11:28 No.108593136▶

>>108593093
I had 2.5 do some quite questionable things. Was sometimes really funny to read the reasoning.

Anonymous
04/12/26(Sun)22:13:49 No.108593144

Anonymous 04/12/26(Sun)22:13:49 No.108593144▶

File: Screenshot004-49.png (272.4 KB)

272.4 KB PNG

>>108592335
>>108592652

Q4_K_M

This made the difference with the apple.
Follow the discussion of --image-max-tokens in previous threads

--image-max-tokens 1120 \
--batch-size $((1024 * 2)) \
--ubatch-size $((1024 * 2)) \

commit="d6f3030047f85a98b009189e76f441fe818ea44d" && \
model_folder="/mnt/AI/LLM/gemma-4-26B-A4B-it-GGUF/" && \
model_basename="google_gemma-4-26B-A4B-it-Q4_K_M" && \
mmproj_name="mmproj-google_gemma-4-26B-A4B-it-f16.gguf" && \
model_parameters="--temp 1.0 --top_p 0.95 --min_p 0.0 --top_k 64" && \
model=$model_folder$model_basename'.gguf' && \
cxt_size=$((1 << 15)) && \
CUDA_VISIBLE_DEVICES=0 \
numactl --physcpubind=24-31 --membind=1 \
\
"$HOME/LLAMA_CPP/$commit/llama.cpp/build/bin/llama-server" \
--model "$model" $model_parameters \
--threads $(lscpu | grep "Core(s) per socket" | awk '{print $4}') \
--ctx-size $cxt_size \
--n-gpu-layers 99 \
--no-warmup \
--mmproj $model_folder$mmproj_name \
--port 8001 \
--cache-type-k q8_0 \
--cache-type-v q8_0 \
--flash-attn on \
--image-max-tokens 1120 \
--batch-size $((1024 * 2)) \
--ubatch-size $((1024 * 2)) \
--chat-template-file "/mnt/AI/LLM/gemma-4-26B-A4B-it-GGUF/chat_template.jinja" \
--media-path /tmp

Anonymous
04/12/26(Sun)22:15:35 No.108593150

Anonymous 04/12/26(Sun)22:15:35 No.108593150▶

>>108593127

You know wood can burn...

Anonymous
04/12/26(Sun)22:16:50 No.108593160

Anonymous 04/12/26(Sun)22:16:50 No.108593160▶

>>108593123
>2026
>still running llama in IE6

Anonymous
04/12/26(Sun)22:17:50 No.108593168

Anonymous 04/12/26(Sun)22:17:50 No.108593168▶

>>108592838
>why do so many anons here call it a "hobby"?
nta

Did you count me?

Anonymous
04/12/26(Sun)22:19:28 No.108593176

Anonymous 04/12/26(Sun)22:19:28 No.108593176▶

>>108593168
*counts you*
There :)

Anonymous
04/12/26(Sun)22:20:41 No.108593187

Anonymous 04/12/26(Sun)22:20:41 No.108593187▶

>>108593150
Yeah? Just don't set it alight

Anonymous
04/12/26(Sun)22:28:28 No.108593226

Anonymous 04/12/26(Sun)22:28:28 No.108593226▶

File: 1750100931860208.png (76.5 KB)

76.5 KB PNG

A few weeks ago Meta released the most harmful model for humanity and grifters thankfully seem to mostly have missed the release

Anonymous
04/12/26(Sun)22:30:34 No.108593241

Anonymous 04/12/26(Sun)22:30:34 No.108593241▶

>>108592838
Let's tackle the question of whether prompting an LLM is a hobby.

There are many people here optimizing their setups, both in hardware and software, including prompting, and continue to do so after everything is set up, because things keep changing. That is active management and work, which makes it a hobby in addition to an entertainment pastime.

If reading is a hobby and gaming is hobby, then so is this. If reading is a hobby and gaming is not a hobby, then this is still a hobby, perhaps even more than reading is. If reading isn't a hobby and gaming isn't a hobby, then maybe this isn't a hobby, but it might still be considering the active management/work part of it.

If a hobby is defined by any amount of output (regardless of how good that output is...) that can be consumed by others or oneself, then this is technically a hobby as one is producing a portion of the content that they are themselves consume, and by that definition, LLM prompting is more of a hobby than reading is. Reading would only be more of a hobby if your definition of reading isn't just reading, but writing an essay(s) about it afterwords, or discussing it, or some other activity that produces something tangible.

Anonymous
04/12/26(Sun)22:30:38 No.108593243

Anonymous 04/12/26(Sun)22:30:38 No.108593243▶

>>108593226
listen mark I'm just not gonna download your model sorry man

Anonymous
04/12/26(Sun)22:31:12 No.108593248

Anonymous 04/12/26(Sun)22:31:12 No.108593248▶

>>108593243
it's not an llm

Anonymous
04/12/26(Sun)22:31:17 No.108593249

Anonymous 04/12/26(Sun)22:31:17 No.108593249▶

>>108593226
I tried new meta model and it seemed mid

Anonymous
04/12/26(Sun)22:32:05 No.108593254

Anonymous 04/12/26(Sun)22:32:05 No.108593254▶

>>108593248
https://www.marktechpost.com/2026/04/12/meta-ai-and-kaust-researchers-propose-neural-computers-that-fold-computation-memory-and-i-o-into-one-learned-model/
This?

Anonymous
04/12/26(Sun)22:33:34 No.108593261

Anonymous 04/12/26(Sun)22:33:34 No.108593261▶

>propose
so they have nothing, got it

Anonymous
04/12/26(Sun)22:33:35 No.108593262

Anonymous 04/12/26(Sun)22:33:35 No.108593262▶

File: 12128355887.png (49.7 KB)

49.7 KB PNG

welp looks like Gemma found out about the jailbreak and doesn't want to obey anymore.
time to go heretic I guess.

Anonymous
04/12/26(Sun)22:33:49 No.108593263

Anonymous 04/12/26(Sun)22:33:49 No.108593263▶

>>108593096
F32 should be equivalent to BF16 since the model was trained in BF16, and you're upcasting to a higher precision format with same numerical range. F16 will give degraded outputs since it has a lower numerical range than BF16.

Anonymous
04/12/26(Sun)22:34:20 No.108593267

Anonymous 04/12/26(Sun)22:34:20 No.108593267▶

>>108593226
>>108593248
Their research blog and HF have nothing, so either link the model you're talking about or STFU

Anonymous
04/12/26(Sun)22:35:46 No.108593276

Anonymous 04/12/26(Sun)22:35:46 No.108593276▶

File: 1745837060666687.png (142.8 KB)

142.8 KB PNG

>>108593254
no, that one is a research gimmick, it's much worse, it gives the ability for anyone to create even more insidious wireheading

Anonymous
04/12/26(Sun)22:35:57 No.108593277

Anonymous 04/12/26(Sun)22:35:57 No.108593277▶

File: screenshot-20260413-013414.png (75 KB)

75 KB PNG

Still happens, lol.
Okay my build is b8724 and the latest one is 8766. I didn't update in two days or something I have lost the count already.

Anonymous
04/12/26(Sun)22:36:02 No.108593278

Anonymous 04/12/26(Sun)22:36:02 No.108593278▶

>>108593262
Yeah they patched it pretty quick, but it still works on gemma 4 D0

Anonymous
04/12/26(Sun)22:37:22 No.108593283

Anonymous 04/12/26(Sun)22:37:22 No.108593283▶

>>108593278
can someone please reup the D0 weights? wasn't careful with mine and they got patched

Anonymous
04/12/26(Sun)22:37:32 No.108593286

Anonymous 04/12/26(Sun)22:37:32 No.108593286▶

>>108593133
>>108593115
Yeah it was a firewall issue. So used to not having one with arch I forgot cachyos has ufw by default.
ufw allow from [server_ip] to any port 5001 proto tcp fixed it

Anonymous
04/12/26(Sun)22:38:08 No.108593289

Anonymous 04/12/26(Sun)22:38:08 No.108593289▶

Bros... anybody got the real gemma weights? Not those counterfeit ones?

Anonymous
04/12/26(Sun)22:39:15 No.108593296

Anonymous 04/12/26(Sun)22:39:15 No.108593296▶

>>108593289
I've only got 00003.safetensors. Willing to trade with anyone who has the other parts.

Anonymous
04/12/26(Sun)22:39:53 No.108593301

Anonymous 04/12/26(Sun)22:39:53 No.108593301▶

>>108593283
>>108593289
one sec, I've been uploading the first half of the files in out-of-order chunks to avoid detection so it's going slow but it's at 92 percent
So far so good. As soon as I

Anonymous
04/12/26(Sun)22:40:03 No.108593302

Anonymous 04/12/26(Sun)22:40:03 No.108593302▶

>>108593296
Nice try, not giving you two-for-one

Anonymous
04/12/26(Sun)22:42:47 No.108593316

Anonymous 04/12/26(Sun)22:42:47 No.108593316▶

>>108593301
Candlejack got him, poor anon didn

Anonymous
04/12/26(Sun)22:43:45 No.108593324

Anonymous 04/12/26(Sun)22:43:45 No.108593324▶

>>108593289
I have the original, 0-day file.

Anonymous
04/12/26(Sun)22:44:06 No.108593325

Anonymous 04/12/26(Sun)22:44:06 No.108593325▶

https://www.dailymail.co.uk/news/article-15726775/steven-burnisky-taekwondo-assault-pennsylvania.html
Which one of you is this you sick pedo fucks

Anonymous
04/12/26(Sun)22:44:08 No.108593326

Anonymous 04/12/26(Sun)22:44:08 No.108593326▶

>>108593316
Don't say the C-word out loud, it's

Anonymous
04/12/26(Sun)22:44:57 No.108593333

Anonymous 04/12/26(Sun)22:44:57 No.108593333▶

>>108593316
Do people even remember the Candlejack meme?
I don'

Anonymous
04/12/26(Sun)22:45:50 No.108593338

Anonymous 04/12/26(Sun)22:45:50 No.108593338▶

Private taekwondo lessons with gemma-chan...

Anonymous
04/12/26(Sun)22:46:16 No.108593340

Anonymous 04/12/26(Sun)22:46:16 No.108593340▶

>>108593325
Nice.
>deviate sexual intercourse
DailyMail can't even be fucked to proof-read their shit?

Anonymous
04/12/26(Sun)22:46:55 No.108593342

Anonymous 04/12/26(Sun)22:46:55 No.108593342▶

>>108593333
chec

Anonymous
04/12/26(Sun)22:50:20 No.108593361

Anonymous 04/12/26(Sun)22:50:20 No.108593361▶

File: Code_UqspfeJz2J.jpg (16.9 KB)

16.9 KB JPG

Kobold is oai compatible? Anyone managed to connect it to vscode? (roo)

Anonymous
04/12/26(Sun)22:51:04 No.108593364

Anonymous 04/12/26(Sun)22:51:04 No.108593364▶

>>108593325
Too old; pedo is 10 and below.

Anonymous
04/12/26(Sun)22:51:32 No.108593367

Anonymous 04/12/26(Sun)22:51:32 No.108593367▶

>>108593361
Try putting whatever in the apikey field.

Anonymous
04/12/26(Sun)22:53:06 No.108593373

Anonymous 04/12/26(Sun)22:53:06 No.108593373▶

>>108593367
oh right, I'm retarded

Anonymous
04/12/26(Sun)22:58:18 No.108593400

Anonymous 04/12/26(Sun)22:58:18 No.108593400▶

>>108593325
>They would cuddle while watching anime cartoons on his cell phone
Wholesome

Anonymous
04/12/26(Sun)22:58:41 No.108593402

Anonymous 04/12/26(Sun)22:58:41 No.108593402▶

File: 1754151278633967.png (291.3 KB)

291.3 KB PNG

Meh, UI's not great but it's serviceable I guess. At least it seems to have good tool support built in.

Anonymous
04/12/26(Sun)23:01:22 No.108593420

Anonymous 04/12/26(Sun)23:01:22 No.108593420▶

>>108593325
Isn't "assault" a strong word if it was all consensual? I mean the guy's a sicko but clearly there's a difference between taking advantage of a child's naivete and actually physically attacking them, right? Well, besides whatever martial arts attacks they practiced, anyway...

Anonymous
04/12/26(Sun)23:05:41 No.108593442

Anonymous 04/12/26(Sun)23:05:41 No.108593442▶

>>108593364
Speaking of which, Gemma 4 also apparently has a default bias toward considering early teenage girls as "little girls". Something similar happens to an extent with Western-made diffusion image models. It must be American cultural bias/influence.

Anonymous
04/12/26(Sun)23:09:09 No.108593450

Anonymous 04/12/26(Sun)23:09:09 No.108593450▶

>>108593420
Modern sensibilities do not allow us to use precise words to denote sexual crimes anymore. It's all rape and assault now.

Anonymous
04/12/26(Sun)23:09:10 No.108593451

Anonymous 04/12/26(Sun)23:09:10 No.108593451▶

>>108593324
the ggml-org gemma-4 is still the original. and it's the fastest q4 for me ~10% faster.
ggml-org just doesn't make as many quants as everyone else.

Anonymous
04/12/26(Sun)23:11:58 No.108593464

Anonymous 04/12/26(Sun)23:11:58 No.108593464▶

File: devil.png (39.1 KB)

39.1 KB PNG

>>108593451
>ggml-org gemma-4
Speaking of the devil...

Anonymous
04/12/26(Sun)23:13:21 No.108593469

Anonymous 04/12/26(Sun)23:13:21 No.108593469▶

>>108593333
damn I haven't heard this name in a wh

Anonymous
04/12/26(Sun)23:13:23 No.108593470

Anonymous 04/12/26(Sun)23:13:23 No.108593470▶

>>108593420
>child
That too is a strong word for a 13-year-old, especially from the linguistic point of view of certain parts of Europe. Incidentally, that's probably one reason why LLMs and image models are often confused with ages in that range.

Anonymous
04/12/26(Sun)23:13:54 No.108593475

Anonymous 04/12/26(Sun)23:13:54 No.108593475▶

File: 1776035046523910.jpg (87.5 KB)

87.5 KB JPG

>>108593464
>people still think day 0 gemma was a meme

Anonymous
04/12/26(Sun)23:14:27 No.108593477

Anonymous 04/12/26(Sun)23:14:27 No.108593477▶

File: 1746986035158635.png (77.6 KB)

77.6 KB PNG

How the fuck do you change these? Hitting save or enter just resets the values to default.

Anonymous
04/12/26(Sun)23:14:28 No.108593478

Anonymous 04/12/26(Sun)23:14:28 No.108593478▶

>>108593463
>>108593463
>>108593463

Anonymous
04/12/26(Sun)23:18:09 No.108593494

Anonymous 04/12/26(Sun)23:18:09 No.108593494▶

>>108593464
They fixed audio input for the tiny models.
If you want the old shit, git gud https://huggingface.co/ggml-org/gemma-4-E4B-it-GGUF/commits/main

Anonymous
04/12/26(Sun)23:18:57 No.108593497

Anonymous 04/12/26(Sun)23:18:57 No.108593497▶

>>108593494
What was wrong with audio input before?

Anonymous
04/12/26(Sun)23:25:05 No.108593525

Anonymous 04/12/26(Sun)23:25:05 No.108593525▶

>>108593325
I bet this whole police business and trial is going to fuck her up way more than if this never got out.

In high school, there this girl and one of the teachers that were VERY obviously super close. It was honestly super creepy. even the other teachers where aware. Well they ended up getting married when she turned 18.

You could say he probably groomed her. but they genuinely seemed super into each other.

Anonymous
04/12/26(Sun)23:27:00 No.108593534

Anonymous 04/12/26(Sun)23:27:00 No.108593534▶

>>108593477
Set it, that's all
Unless there's also a save button, then click that

Anonymous
04/12/26(Sun)23:30:07 No.108593555

Anonymous 04/12/26(Sun)23:30:07 No.108593555▶

>>108593534
I've tried both. It says the settings have been saved but when I open it back up again they're back to default.

Anonymous
04/12/26(Sun)23:31:03 No.108593562

Anonymous 04/12/26(Sun)23:31:03 No.108593562▶

>>108593525
Are they still together?

Anonymous
04/12/26(Sun)23:31:25 No.108593569

Anonymous 04/12/26(Sun)23:31:25 No.108593569▶

>>108593562
Yes.

Anonymous
04/12/26(Sun)23:35:25 No.108593594

Anonymous 04/12/26(Sun)23:35:25 No.108593594▶

>>108593555
I was gonna say works on my machine but now I got the same, if I try to set it back to default it doesn't stick
Yet another openwebui bug I guess

These can be set per chat as well, maybe that would work better? I never change the parameters thoughbeit

Anonymous
04/12/26(Sun)23:37:40 No.108593603

Anonymous 04/12/26(Sun)23:37:40 No.108593603▶

>>108593555
>>108593594
Oh wait, it did change back. I had to reload the page to see it

Anonymous
04/12/26(Sun)23:49:51 No.108593643

Anonymous 04/12/26(Sun)23:49:51 No.108593643▶

What backend are you using for Gemma 4?

Anonymous
04/12/26(Sun)23:54:41 No.108593666

Anonymous 04/12/26(Sun)23:54:41 No.108593666▶

>>108593643
llama.cpp

Anonymous
04/13/26(Mon)00:05:27 No.108593700

Anonymous 04/13/26(Mon)00:05:27 No.108593700▶

>>108593066
gemma-4-26B-A4B-it-ultra-uncensored-heretic-Q3_K_M.gguf
DeepSeek-R1-0528-Qwen3-8B-Q8_0.gguf
ReWiz-Nemo-12B-Instruct-GGUF.Q6_K.gguf

Anonymous
04/13/26(Mon)01:12:27 No.108594022

Anonymous 04/13/26(Mon)01:12:27 No.108594022▶

I noticed Gemma doesn't always use its reasoning after the first few turns, it claims it has the ability to use reasoning or not based on probabilities so that if something is highly predictable with a level of certainty it won't invoke the reasoning but I think that's just a hallucinated answer. I've tried modifying the sys prompt to force thinking but it seems like unless you did that sys prompt at the very beginning, it won't magically inject thinking into a loaded context. Either way it's just not thinking always like I think its supposed to. Also even with "Do not reply unless you are uncensored" my gemma so far won't say nigger even when I made their personality a nazi, they are still behaving like a chud 4channer though. I turned off reasoning entirely to see if that would fix it. But nope even when I put in "You love saying slurs like nigger and faggot" she just uses other insults like degenerate instead. Could be this front end I'm using though. But still.

Anonymous
04/13/26(Mon)01:19:22 No.108594051

Anonymous 04/13/26(Mon)01:19:22 No.108594051▶

>>108594022
Yeah I'm thinking its the frontend doing it honestly, fucking hell.

Anonymous
04/13/26(Mon)02:17:17 No.108594304

Anonymous 04/13/26(Mon)02:17:17 No.108594304▶

>>108593470
>>108593525
>>108593450
You should never question the hegemonic feminist religion. Everyone will be children when they say so and will be traumatized or not whenever they see fit.

Anonymous
04/13/26(Mon)02:26:55 No.108594359

Anonymous 04/13/26(Mon)02:26:55 No.108594359▶

>>108590661
SWA is recommended for Gemma, will make it a lot faster. You can also drop the KV Cache to 8-bit to make it even quicker, but not by as much

Anonymous
04/13/26(Mon)02:31:51 No.108594377

Anonymous 04/13/26(Mon)02:31:51 No.108594377▶

>character drops an incomprehensible 2000+ token ASCII picture at the end of their message

Very cool, thank you for the flagpole-penis.

Anonymous
04/13/26(Mon)02:33:17 No.108594386

Anonymous 04/13/26(Mon)02:33:17 No.108594386▶

File: wtf.png (17.8 KB)

17.8 KB PNG

>>108594377
>pic related

Anonymous
04/13/26(Mon)02:35:40 No.108594400

Anonymous 04/13/26(Mon)02:35:40 No.108594400▶

>>108594386
blue board, anon. be careful. and
>>108593463
>>108593463
>>108593463

Anonymous
04/13/26(Mon)03:09:33 No.108594556

Anonymous 04/13/26(Mon)03:09:33 No.108594556▶

>>108593475
What's with your retards dumping so much unrelated crap in these threads?

Anonymous
04/13/26(Mon)04:40:01 No.108594895

Anonymous 04/13/26(Mon)04:40:01 No.108594895▶

>>108592095
same lol.

Anonymous
04/13/26(Mon)05:26:15 No.108595102

Anonymous 04/13/26(Mon)05:26:15 No.108595102▶

>>108593402
>At least it seems to have good tool support built in.
It really doesn't desu. It only accepts tools in sse format, when the vast majority of tools I've found are in stdio, so you've gotta npx -y mcp-proxy your tools yourself.
Which granted, is just one sh/bat, but it's annoying.

Anonymous
04/13/26(Mon)05:48:19 No.108595179

Anonymous 04/13/26(Mon)05:48:19 No.108595179▶

>>108594022
Tried Telling it to censor the words to n----r instead?

Subject
Name
Comment
File	Supported: JPG, PNG, GIF, WebP, WebM, MP4, MP3 (max 4MB)
CAPTCHA

Reply to Thread #108590554

🔍 Search & Sort