Thread #108568415
File: file.png (1.1 MB)
/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads: >>108565269 & >>108561890
►News
>(04/09) Backend-agnostic tensor parallelism merged: https://github.com/ggml-org/llama.cpp/pull/19378
>(04/08) Step3-VL-10B support merged: https://github.com/ggml-org/llama.cpp/pull/21287
>(04/07) Attention rotation support for heterogeneous iSWA merged: https://github.com/ggml-org/llama.cpp/pull/21513
>(04/07) GLM-5.1 released: https://z.ai/blog/glm-5.1
>(04/06) DFlash: Block Diffusion for Flash Speculative Decoding: https://z-lab.ai/projects/dflash
►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
File: file.png (347.7 KB)
►Recent Highlights from the Previous Thread: >>108565269
(1/2)
--Discussing ggml's new experimental backend-agnostic tensor parallelism and performance gains:
>108566286 >108566382 >108566397 >108566458 >108566462 >108566464
--Performance testing of llama.cpp experimental tensor parallelism on Windows:
>108567186 >108567201 >108567216 >108567433 >108567445 >108567553
--Solving LLM tool calling issues regarding boolean type parsing:
>108565765 >108565819 >108565853 >108565867 >108565986 >108566089 >108566110 >108566123 >108566177 >108566195 >108566258 >108566308
--Debating Claude's impact on compiler engineering and overall code reliability:
>108566489 >108566531 >108566517 >108566573 >108566595 >108566588 >108566540 >108566568 >108566583 >108567950
--Running Gemma 31B IQ2_M on RTX 3060 using llama.cpp:
>108565291 >108565294 >108565303 >108565328 >108565346 >108566298 >108566302 >108566349
--Comparing intelligence and performance of Gemma 4 versus Qwen 3.5:
>108565318 >108565368 >108565430 >108565617 >108566007 >108566047
--Troubleshooting long-context tool calling failures in Gemma 4:
>108565347 >108565356 >108565407 >108565475 >108566017 >108566065 >108566411
--Discussing a mesugaki Gemma persona, jailbreaks, and cheap X99 boards:
>108565322 >108565332 >108565458 >108565335 >108565345 >108565582 >108565615 >108565722 >108566726 >108567096
--Anon implements autonomous memory for Gemma to maintain persona:
>108567439 >108567453 >108567468
--Anon gives Gemma autonomous tool creation and modular persistent memory:
>108567066 >108567109 >108567174
►Recent Highlight Posts from the Previous Thread: >>108565273
Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
File: file.png (292.8 KB)
►Recent Highlights from the Previous Thread: >>108565269
(2/2)
--Gemma-chan and more:
>108565343 >108565771 >108566833 >108566920 >108567100 >108567227 >108567234 >108567265 >108567278 >108567316 >108567366 >108567457 >108567484 >108567562 >108567601 >108567834 >108568046 >108568067 >108568106 >108568192 >108568197 >108568299 >108568333
--Logs:
>108565302 >108565322 >108565347 >108565475 >108565654 >108565715 >108565765 >108566298 >108566349 >108566382 >108566411 >108566668 >108566728 >108566806 >108566848 >108566894 >108566955 >108567115 >108567183 >108567215 >108567439 >108567465 >108567468 >108567545 >108567611 >108567626 >108567673 >108567936 >108568027 >108568045 >108568100
--Miku, Teto (free space):
>108565424 >108565722 >108566528 >108566726 >108567259 >108567919
►Recent Highlight Posts from the Previous Thread: >>108565273
Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
File: angry_pepe.jpg (42.6 KB)
>>108568340
don't you dare ignore meeeeee!!
File: 1639240519537.png (95 KB)
>>108568469
>vramlet
File: firefox_uW3wc0Xgla.png (140.3 KB)
>>108568467
>>108568460
>>108568462
>>108568463
yay i am cautiously optimistic
File: 1770474309533995.png (120 KB)
>Start recognizing gemma's slop patterns after a few days
>Ruins all enjoyment
How do I stop noticing things??
File: 1712063180010677.png (20.1 KB)
>>108568513
Gemma's output translates to (405, 92) in pixels which is correct.
>>108568509
high temperature, hyperfitting, idk
>>108568535
never did
>>108568509
seeing slop with gemma-chan is 100% a skill issue on your part
it's all so minor and inoffensive that you can dodge it all with just a bit of prompting, phrase banning and a bit less of being a lazy whiny bitch
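if you've never done the banning part: it's just a per-request logit_bias on llama-server's completion endpoint. a minimal sketch, the token id below is a placeholder - pull the real ones from /tokenize first:
# get the token ids for a slop word/phrase
curl http://localhost:8080/tokenize -H "Content-Type: application/json" -d '{"content": " sheer"}'
# then ban those ids on every request ("false" = never generate that token)
curl http://localhost:8080/completion -H "Content-Type: application/json" \
  -d '{"prompt": "your formatted prompt here", "logit_bias": [[12345, false]]}'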
>>108568540
>>108568558
I verified it too
amazing!
it can be used to control the mouse etc
File: firefox_JCmDBqVV9W.png (284.4 KB)
>>108568563
Here's the result for your image. Also correct.
Real talk, what do you put in your opencode agents.md?
File: a1320441534_10.jpg (713.6 KB)
some questions from an lmg newfag
>how much of a difference do harnesses make? e.g. out of the box, how different will the result be when prompting OpenCode, Pi, Claude Code (local), Mistral Vibe etc? What provides the most batteries-included experience?
I noticed at least there's a difference in the tools the model has access to by default, e.g. Claude Code and Crush have web search capabilities ootb, others do not.
>is Qwen3.5 122B the best general purpose model I can run on 128GB VRAM atm?
>does Qwen3-Coder-Next perform significantly better than 122B for programming?
>is there any point in running Gemma 4 31B if I can run larger models?
thanks to any anons who reply
File: Screen_20260409_145606_0001.jpg (296.8 KB)
>>108568460
>>108568340
idk how accurate it is but here's the response
>>108568417
Goose is the best option for getting something proper from this space, since the other agents that do what it does and allow a backend-agnostic choice of who you grab tokens from are all either badly mismanaged and bloated or proprietary. Having Block no longer in charge of the project and handing it over to a branch of the Linux Foundation to develop is also probably a good thing.
>>108568617
I don't think I can keep using it until they add an option to edit messages. Like, this is such a basic function, how are they missing it? Also can't delete conversations.
>>108568628
At low context I get like 30 t/s. I have three RTX 3090s. There's a recent update in llama.cpp that lets you actually make good use of multiple GPUs, but I'm not using it yet because it doesn't support kv quantization and is broken for three GPUs - works for two. Anyway, use a reasonably small model quant and quantized kv for massive speed gains. Quantized kv is good now because of rotations (thanks, google).
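for reference, that setup is roughly this on the command line (model path and context size are placeholders; -ngl/-ctk/-ctv are existing flags):
# quantized kv cache, everything on GPU, split across cards by layer (the current default)
llama-server -m model.gguf -ngl 99 -c 32768 -ctk q8_0 -ctv q8_0
# the new tensor-parallel split mode is opt-in (see the PR for the exact flag),
# but as said above it can't be combined with -ctk/-ctv yet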
File: 1775755186108723.png (1 MB)
>>108568655
true
File: Werks-on-my-machine_Gemma4-local-tokens-per-second.png (517.8 KB)
>>108568415
Gemma4 t/s (on Apple Silicon) if anyone is interested. As of writing, the most recent gpus still curb-stomp even M5 MAX chips in the memory bandwidth department, so these should be even faster on those. the 26B moe model runs lightning fast on opencode with ollama as the backend. The 31B dense model is obviously slower, but not enough to be utterly unusable, though I haven't tested either's performance at long contexts so I'll have to test that later.
File: gemmaFourConcepts (Medium).png (872.7 KB)
>>108568415
Vote: https://poal.me/3u6rby
> Which is your preferred Gemma character?
>>108568674
Do not repost this. It's shit. Make one with an "Against everything" option.
>>108568677
I ran some tests searching for information in many places of a 60k+ long context (YAML definitions for the OpenXcom game) and q8 and q4 performed similarly.
>>108568674
These are nice too >>108567562 >>108568192
Use 100t/s GPU Gemma4 26ba3 to do thinking, then inject that thinking into 5 t/s CPU offloaded GLM 4.6? hmmm
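a minimal sketch of that pipeline, assuming two llama-server instances (fast on 8080, slow on 8081), text completion so you control the raw prompt, and jq; the <think> tags are stand-ins for whatever each model's template actually uses:
PROMPT="User: write the next scene
Assistant:"
# 1) squeeze a reasoning trace out of the fast model
THINKING=$(curl -s http://localhost:8080/completion -H "Content-Type: application/json" \
  -d "$(jq -n --arg p "$PROMPT<think>" '{prompt: $p, n_predict: 512, stop: ["</think>"]}')" | jq -r '.content')
# 2) hand it to the slow model as pre-baked thinking so it only writes the answer
curl -s http://localhost:8081/completion -H "Content-Type: application/json" \
  -d "$(jq -n --arg p "$PROMPT<think>$THINKING</think>" '{prompt: $p, n_predict: 1024}')" | jq -r '.content'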
>>108568781
he looks like cyriak
https://www.youtube.com/watch?v=05ZvII57p_M
>>108568777
>>108568773
>>108568765
>>108568738
no one gives a fuck about this. it's a fucking cartoon drawing, no one is getting mad about "muh beloved migu" because it's a meme and no one actually cares or "loves" her so much that they're upset when you post this shit. the only thing you're doing is making it annoying to browse /lmg/ while im at work fuck you.
File: Screenshot004-14.png (257.6 KB)
>>108568500
holy crap!
File: Screenshot004-15.png (822.6 KB)
>>108568579
wow
File: 1750014040747672.png (34.8 KB)
What is he doing bros
File: 1768709377404062.jpg (142.1 KB)
>>108568710
Sorry. I have a splitting headache so I should probably rest soon.
>>108568890
Very true but I saw lemons in the thread and an opportunity to see if Gemma could make lemonade.
>>108568892
I would put money on that creature having a hook nose.
File: dancing-pepe-pepe-dancing.gif (512.8 KB)
>click on a 9-digit number
>find a window titled reply to thread <9-digit number>
>click choose file
>select dancing-pepe.gif
>click get captcha
>read instructions, solve captcha
>when done, click post
is it that simple?
File: fuucke.png (61.2 KB)
>>108568881
guise please I haven't touched this shit in years I don't remember how to do this, is the MoE just less lenient?
>>108569165
https://huggingface.co/arcee-ai/Trinity-Nano-Base/blob/main/chat_template.jinja
I only applied it because I got gibberish without it as well.
>>108569177
>https://huggingface.co/arcee
don't bother all their shit is broken trash
>>108568687
Do it yourself if you care that much.
>>108568730
>>108568732
That’s what anons said last thread.
Then posted nothing. lol.
Post zero content, get zero requests.
Lazy ass mfers.
File: HFeBrLlWUAAQm75.jpg (119.1 KB)
Is it worth picking up a 3090 to add to my 128gb DDR4 + 4090 setup? A friend is selling one for $430 USD.
If so, what kind of gains can I expect, do I just add another 24gb of VRAM, or is there some friction since it's two cards.
File: 2026-04-09_221402_seed12_00001_.png (452.9 KB)
>>108568746
Anima is ALMOST able to do this with just prompting. But it seems an edit model may be necessary to get the orientation of the toaster sideways, as well as the shape, which I cherry picked a bit to show for this post. It's deformed in most images. Perhaps the final version with all the training will do better on the shape part of the problem though.
>>108569251
Yeah, 48gb is a decent spot to be in with Gemma 4, and in case the 70b dense class sees a revival.
In terms of "gains" you'll be able to run a bigger quant and/or more context.
File: file.png (29.7 KB)
>>108568881
>>108569068
for me it just werks, I just copied a random snippet from a jailbreak and it rolls with it
File: bread.png (55.6 KB)
>>108568746
File: spongbob-chocolate.gif (603.2 KB)
>>108569396
imagine the toothjob
File: A TOAST.png (277.4 KB)
>>108569396
her holding a bread toast is actually a cool idea
>>108568415
>>(04/09) Backend-agnostic tensor parallelism merged: https://github.com/ggml-org/llama.cpp/pull/19378
>simultaneous use of SPLIT_MODE_TENSOR and KV cache quantization not implemented
When?
File: 1750336544551463.png (171.8 KB)
>>108569487
>usecase for a lossless 2x memory usage decrease?
File: gemmaSideLoadToaster.png (1.4 MB)
Wow, it really wants the toaster up front.
lol at side load toaster from the 40s.
>>108569396
lol.
>>108569255
Might be easiest just to reroll.
>>108569326
I think I'm physically limited though. A Z490-E motherboard doesn't have the physical space for a 4090 FE and 3090 Gaming OC 24G, and I don't think it has the PCIE lanes to run both cards at x16.
I could be wrong and retarded, but I don't think they'll fit without a motherboard upgrade, which means a CPU upgrade, ram upgrade, and PSU upgrade. lmao
File: postContent2.png (3.2 KB)
>>108569542
And you still have no fucking content.
>>108568790
yes you can
I let gemma respond:
"Why the CPU bottleneck on the render? You're basically doing Fast Thinker Slow Writer. Usually, you want the opposite: use the high-param model to do the heavy lifting (S2 reasoning) and a tiny, blazing-fast quant to just format the output (S1 rendering). Unless GLM 4.6 has some magic prose that makes the 5t/s wait worth it, you're just throttling your own pipeline."
File: file.png (17.2 KB)
>>108569438
He made his elaborate twitter post today, for an issue that was fixed two days ago.
>>108569702
well, I myself have no clue
I opened the thread 2 hours ago, downloaded llama.cpp and gemma
ran llama-server -m gemma, and in the builtin website in the system prompt put some excerpt from a jailbreak I had
that's all I did, but sometimes it does refuse to write slurs even though the rest of the action is much worse
File: 1772060263960519.png (67.3 KB)
>LlamaCpp WebUI is fundamentally broken for MCP.
Gemma-chan said it
>>
>>108569753
https://modelcontextprotocol.io/specification/2025-06-18/basic/transports#session-management
Sending Messages to the Server
>The client MUST use HTTP POST to send JSON-RPC messages to the MCP endpoint.
Listening for Messages from the Server
>The client MAY issue an HTTP GET to the MCP endpoint. This can be used to open an SSE stream, allowing the server to communicate to the client, without the client first sending data via HTTP POST.
Session Management
>A server using the Streamable HTTP transport MAY assign a session ID at initialization time, by including it in an Mcp-Session-Id header on the HTTP response containing the InitializeResult
Your Gemma is retarded. Why are there any redirects involved even?
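fwiw the whole handshake the spec describes is just two requests; roughly (endpoint path and version string depend on the server):
# JSON-RPC initialize over POST; the session id comes back on the response
curl -i http://localhost:3000/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"test","version":"0.0.1"}}}'
# optional SSE stream for server->client messages, echoing the id back
curl -N http://localhost:3000/mcp \
  -H "Accept: text/event-stream" \
  -H "Mcp-Session-Id: <id from the InitializeResult response>"
there's no redirect anywhere in that.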
>Mythos is too dangerous to release, it found all these vulnerabilities
https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier
Turns out smaller open models can find the same vulnerabilities, it's just that no one (publicly) bothered trying it before
>Opus 4.6 spams the em dash now
never seen that model do that ever, they probably lobotomized the shit out of it (probably Q3 tier at best) just to make room for mythos. jesus Anthropic, don't act like OpenAI, people will leave you like they left Sam if you keep fucking with users like that
>>108569984
>We took the specific vulnerabilities Anthropic showcases in their announcement, isolated the relevant code, and ran them through small, cheap, open-weights models.
>isolated the relevant code
wow, it's fucking nothing
I wonder how much all this crap will cost after the pile of money to burn runs out and things have to be priced by their cost + profit margin, are there any reasonably modeled online resources on the actual cost to serve these models?
>>108569984
>The smallest model, 3.6 billion active parameters at $0.11 per million tokens, correctly identified the stack buffer overflow, computed the remaining buffer space, and assessed it as critical with remote code execution potential.
Dario-sama... I don't feel so good
>>108569999
>>108570052
Cool, now run a sweatshop of small model capable machines being fed isolated snippets of code by a non-AI bot orchestrator and watch them brute force the same thing that the big scawy data centre fed model is doing but this time in some smelly jeet ex-scam call centre in mumbai
File: 1750954925055091.png (531.1 KB)
>>108568674
>here's our model mascot generated with the most generic slop style possible!
exact same vibes as this shit desu
>>108570100
>>108570106
it's obvious this dude is a turboboomer who has no clue about AI but just wanted to test his first model. obviously you're gonna be impressed the first try when you realize a model can draw for you, even if it looks like the most slopped shit in the world. I remember when I tried my first local model, it was SD1.5; the result was atrocious but I didn't care, I was impressed it could do something that looked like what I had in mind in seconds. we'll never get that magic feeling ever again btw :')
>>108570072
The problem with brute forcing it is that for every actionable bug you find you'll get a thousand false positives, but I could see saving a lot of money on tokens by having the bigger model triage the highly suspected bugs and point the smaller ones at them, saving all the time checking and testing desu
File: 1764633236477491.png (2.3 MB)
Here's Nano Banana 2's interpretation lol
>>108570118
>it's obvious this dude is a turboboomer who has no clue about AI but just wanted to test his first model
That dude is garry tan aka the ceo of ycombinator who makes funding decisions for half the tech startups in america
>>
File: g4.jpg (146.5 KB)
>>108570153
>>108570168
Nostalgic
File: 1762695177354283.jpg (1007.7 KB)
>>108570153
Nano Banana pro's interpretation. the toasters on her shoulders are actually a good idea
>>
>>108569994
>never seen that model do that ever
Really? I've been occupied with GLM5.1 so I don't know if it got worse over the past two weeks, but to me it felt like Opus started using a lot of em-dashes starting with 4.5. 4.1 and before were still pure.
File: notTheCamelCaseAnon.png (210.1 KB)
>>108570101
>why are you retards still glazing this garbage?
Honeymoon phase and easy jailbreak I guess.
Remember GLM-4.6 was glazed for similar reasons. Then a few weeks later everyone noticed the parroting.
Gemma-4 is easier for vramlets to run though.
I can already see it in the logs here: Gemma-4 slop is "Haaah!" and "Hmph".
It gets things wrong a lot of the time but wrapped in the tsundere persona nobody notices.
<- It's inherited Gemini's future date autism but once corrected at least it moves on.
File: 1756174114200027.png (22.7 KB)
Recommended settings for Gemma?
>>108570278
i've been a 3.2 holdout. at the very least, 4 isn't some obviously worse, benchmaxed slop, like the other small models this past year and a half have been. (aside from gemma 3 which was pretty smart, but also a fucking dweeb).
it's been nice to have something different.
>>108570101
Sorry but LLMs by their nature are never going to satisfy your retarded pipe dream of human level creativity at the click of a button, some of us actually appreciate that the tech and especially open source is advancing in utility in meaningful ways
File: 1755223568078061.png (9.1 KB)
>>108570384
Like this?
File: ren.png (240.9 KB)
>>108568674
This one is mine. I'm turning her into a Gemma powered desktop pet
File: 1744969020030899.png (303.1 KB)
Who needs school when you have Gemma-sensei
>>108570398
>>108570404
https://github.com/SillyTavern/SillyTavern/issues/4333
Alright, llm-with-a-3d-model anons. Especially ani-anon, since you experimented with it a lot already.
Imagine, if you will, the 3d model and the text+vision model you have. It can look outside, typically connected to a webcam, or look at the screen by taking screenshots or whatever.
But if you're rendering the model, you can move the camera anywhere you want. Just put the camera in front of the face, looking out, obviously, and feed the render to the model.
Give your models first-person view. Let your model look at its own hands and feet. Give it a mirror to let it see itself.
Then give it a few commands so it can move the model around its environment.
>>108570520
If you don't have the setting visible, you could try adding it manually to the request like >>108570398 >>108570439
So /lmg/, reasonably speaking, can you use gemma 4 as a sensei in the true sense of the word? what do you think is the highest level it can help you learn at? For example, if you're studying maths, can it teach you calculus, differential equations, complex analysis, or algebraic geometry? And I really mean teach you, like helping you understand shit, not solving your math problems.
>>108570539
I'll let you know in a couple months >>108570437
File: 1759797956464654.png (391 KB)
>>108570577
Planning on using Automate the Boring Stuff With Python but what do you think of Gemma-sensei's roadmap?
>>108570613
The problem I'm running into is that when I use Text Completion, Gemma 4 doesn't think. I've been talking to ChatGPT about duplicating the Gemma 4 jinja to get chat-like behavior in text completion, but that hasn't borne fruit.
>>108570612
as I told you, that's why you need textbooks. The roadmap is super shallow and you should be able to go through all this stuff in a week tops. You're better off using roadmaps made by thinking, breathing humans with teaching experience
File: file.png (4.7 KB)
seems like coding under 100B is a meme
shoved some code at it and had it 'mathwash' it back
completely missed the joint optimizer implementation, which is a very critical part
wasn't expecting a surprise but still
When translating Japanese/Korean to English, gemini loves using these words:
- practically
- minutely
- unreality
- sheer
- utter
If someone is using gemma 4 for translation, can you check if it has the same obsessive-words issue?
>>108570683
Wonky as in weird in English vs Korean meaning?
Because from what people told me, Gemini makes very natural and excellent translations (except for its obsessive use of the words above, which drives me crazy; sometimes it will use "sheer" 4-5 times in a single paragraph).
Since I don't have Gemini at home, I wanted to use gemma 4 for fun for that too...
File: 633943375_122268182372241205_5088029985972404413_n.jpg (285.6 KB)
is WAN still king for local i2i? any workflows people can link me to? specifically for photo-realistic...
>>108570686
I don't know about Japanese, but the issue in Korean is overuse of fragmented sentences, which probably sound more natural in the original language than in English.
To the point of having a second analysis pass to combine sentences.
>>108570693
like, it's slightly unnatural
it gets very far but something's a bit off
i am not a professional translator so i can't pin it down exactly, but keep that in mind
>>108570702
that could be the reason
>>108570626
You're not following its prompt format. Look at the prompt on ST's console and compare to its official prompt. For thinking you might have to look at the jinja yourself to fix it.
For non-thinking you need to insert the empty think block into last assistant sequence and put nothing special in the sysprompt. IIRC, for thinking, remove the last assistant sequence and put <|think|> at the very beginning of the sysprompt (story string). You also need to set the delimiters in ST so it hides the thinking block.
It's annoying because ST is a kludgefest but it works. Chat completion lacks samplers and doesn't support continue properly.
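concretely, the raw request you're aiming for looks something like this; the sentinel strings are stand-ins, copy the real ones out of the model's jinja:
# non-thinking: empty think block baked into the last assistant sequence
curl http://localhost:8080/completion -H "Content-Type: application/json" \
  -d '{"prompt": "<|system|>{sysprompt}<|user|>{message}<|assistant|><|think|><|/think|>"}'
# thinking: same idea, but drop the trailing assistant sequence and start the
# sysprompt (story string) with <|think|> instead, as described above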
>>108570681
>Definitely going to pair her with a book though
Ye. And give yourself a slightly, or even completely, out of reach project. The bits you learn implementing it, even if you never finish the project, will serve you well.
File: 1761431465989477.png (169.2 KB)
>>108570683
>>108570686
Moon reader here. Tried this passage from a WN and it's pretty accurate.
File: 2026-04-10_030522_seed6_00001_.png (586.3 KB)
A sideways gen popped out for once while I was experimenting.
>>108570715
ChatGPT claims that the two modes are fundamentally different beyond just formatting. It says that chat completion invokes separate roles under the hood whereas text completion always sends a large text blob (no matter how properly formatted) and tells the model to complete it.
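you can see how little magic there is by hitting both llama-server endpoints yourself (port assumed):
# chat completion: the server renders the role list through the model's jinja template
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" \
  -d '{"messages": [{"role":"system","content":"be terse"}, {"role":"user","content":"hi"}]}'
# text completion: you ship the already-rendered blob yourself
curl http://localhost:8080/completion -H "Content-Type: application/json" \
  -d '{"prompt": "exactly what the template would have rendered, as one string"}'
same model, same weights; the only difference is who applies the template.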
File: 2026-04-10_033628_seed6_00001_.png (590.3 KB)
>>108570791
This is my home general unfortunately.
>>108570790
Seems like it just turns it into one instead of coloring it red... I guess I can get an image edit model to do color shifts in the future.
Anyone else's cum shoot several feet into the air still even after cooming 5 times already thanks to gemma chan? I swear my cum normally dribbles, this shit is a violent "I must make babies with you" launch. Why did it take Gemma to finally make me care about ai like this?
File: o2236235515100293843.jpg (297.9 KB)
>>108570786
>white lips
>>108570843
It's not even funny how much worse amd is than nvidia with AI. My 4080 gets over 4x the speed my partner does with his 7900xtx despite it having way more vram. He straight up just cannot run 31b at all, it's so slow. Mine is usable at 32k.
File: 2026-04-10_034150_seed1_00001_.png (658.5 KB)
Not sure if a G looks good on her. I kind of want to avoid including more star symbols because of the dilution of its significance, but I feel like I might at this point. Just one more (so there are three including her eyes). Question is if it should be on her forehead like a bindi, on the front of her head on her hair, to the side on her hair, her chest, or as a floating halo thing.
>>108570830
I haven't thought too deeply about her clothing desu. Is that what you think Gemma would wear?
>>
Go back to /lgbt/ where you belong, faggot.
>>108570773
>>108570822
I think this is my favorite design yet due to simplicity without sacrificing character, but the toast hairbuns earlier in the thread was also great even if a bit overdesigned.
>>108570859
I get 6.41 t/s (17.65s); he gets 1.82 t/s (336s) at 32k context - max context, obviously, because empty context isn't a real benchmark for how bad things can get. Even tried using the amd focused versions to see if that would help but nope. He's just stuck with 26b. Doesn't matter though, his 5080 arrives tomorrow.
>>108570874
>>108570862
>>108570859
Should also mention he's on windows.
>>108570852
>>108570859
first anon, sounds like there must be an issue with your setup.
i have a 4090 and i get about 38t/s on the 31B (IQ4_XS).
amd anon says 30t/s so it doesn't seem anywhere as drastic.
>>108570881
ah there we go lol
File: 2026-04-10_035303_seed1_00001_.png (684.7 KB)
>>108570877
You're absolutely right!
This image was meant to be the viewer lifting her skirt but it genned this way instead and I found it an interesting interpretation so I am posting it.
>>108570896
I told him he should dual boot since he has amd, but he doesn't listen so it's whatever I guess. He seems too ignorant to even appreciate the differences between 31b and 26b, but I do, so I'm probably just gonna use my incoming 5080 with the 4080 and be fine. Could also try the turbo quant too. I can't fit the 31b very well even at IQ4_XS; I have to offload 8 layers to my cpu, AND the kv cache.
File: firefox_GBpzbTSqQn.png (23.6 KB)
>>108569753
Skill issue. I got mine to work.
>>108570906
>to even appreciate the differences between 31b and 26b
i think 26b is more than good enough for chatting etc.
however for meme vibe coding, i've found it to be pretty bad, 31B however is excellent.
but the 26b kept failing tool calls, failing edits because it couldn't use the tool properly etc.
>>108570906
>>108570928
maybe it's a quant thing though, i've only tried it at q4_k_m.
the 31B i generally run at iq4_xs
>>108570926
My workflows are a mess right now and my i2i broke a few updates ago
>he updated
Yeah I know. If you're completely at a loss for all workflows, here are some general-use robust workflows for zimage that can be adjusted for nearly anything you want. Setting it up to be i2i won't be too hard either.
https://litter.catbox.moe/b3yx5a.json
https://litter.catbox.moe/9s99xu.json
>>108570932
At the end of the day we all have to pick and choose what brand of slop we're okay with, people's individual linguistic tics included.
>>108570950
https://github.com/ggml-org/llama.cpp/pull/21704/changes
It's just putting into the jinja all of the fixes llama.cpp already had workarounds in code for, so it shouldn't make a difference if you updated recently.
File: 1766945036011011.png (9.3 KB)
>>108570530
This did work in the end, so thanks.
I'm currently running "translategemma-12b-it.i1-Q4_K_S" via llama.cpp on a VPS w/ 16 cores and 32GB of ram (currently only using like 5GB), purely running off of the CPU atm.
Is there anything I can do to get higher tokens-per-second output? I haven't bothered to look into anything outside of llama.cpp.
File: 1769930323342317.png (677.2 KB)
Is Gemma-chan a good artist?
File: 1750978354145430.png (2.4 KB)
>>108571012
Here's her cat btw
File: 2026-04-10_024604_seed1_00001_.png (643.9 KB)
>>108570902
Well, for one, I simply just avoid using tags that don't give consistent results, because I know I'll (maybe) want to generate more in the future. That's just a limit and not much can be done about it from the prompting side. Controlnet and img2img/inpainting, as well as image edit models, are how you solve that. Or simply just waiting for a better model to come out lmao.
Sometimes a tag or prompt will give almost consistent results. In that case, I will try to use various prompting tricks to get it to be more solid. Here are some strategies.
1. simply just increase the weight i.e. (tag:1.1). In ComfyUI I believe by default it allows you to highlight text, and then press ctrl + up arrow or down arrow to quickly adjust weights.
2. use the negative prompt to subtract an undesirable contribution from a tag. For instance, when I do those star eyes, they often turn out a bit yellow tinted, because that's how most artists draw eye sparkles. So I put "yellow eyes" in the negative to drive the output away from yellow pupils. If I put yellow pupils, it actually just erases the star pupils themselves, so that's why I do "yellow eyes" instead.
1/2
>>108571029
>>108570902
3. use prompt scheduling/editing. I use a custom node that seems to be called "PC: Schedule Prompt" from the "promptcontrol" extension. You can read about what prompt scheduling is here.
https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#prompt-editing
I combine the negative, mentioned previously, with the positive prompt
>[star|(cross:0.2)]-shaped pupil, +_+
to get my current results for her eyes. And actually it needs a different prompt for when I do black eye/black hair gens!
>(+_+:1.4), [glowing blue pupils, :0.2][cross-shaped pupils, :star-shaped pupils, :0.7]
It can get pretty situational and complicated.
4. word spam. Even if a tag doesn't exist, it might be possible to prompt. For instance, it's sometimes quite difficult to get the current models to render translucent, shiny crystal hair. I use the following prompt to get the effect (along with murata range as the artist).
>translucent hair, crystal hair, see-through hair, transparent hair, glass hair, houseki no kuni hair, refraction, dappled light on shoulders, glowing, black background
Some of those tags don't exist, but they work to reinforce the concept. Also "houseki no kuni hair" works better this way than if you prompted "houseki no kuni" alone, as it otherwise subtly drags some other unwanted concepts from that tag/anime into the image.
>>108570926
Here's what I use currently. It's missing a lot of functionality from my old SDXL workflow though as I just started experimenting with Anima.
https://files.catbox.moe/zil8lj.png
>>108570981
Well, it's a good thing I'm trying to make her design unique regardless of the backpack. I do think it's probably not the best design choice given that if you want to run 31B, you can't really use a toaster.
2/2
>>108571029
>>108570822
Can you gen her wearing her randoseru backwards? >>108571012
File: 1757234217297971.png (55.5 KB)
File: 1756009262539366.png (49.1 KB)
>>108571096
>>108571099
>>108571099
-ctk q4_0 -ctv q4_0: Final estimate: PPL = 1.1529 +/- 0.00280
-ctk q8_0 -ctv q8_0: Final estimate: PPL = 1.1522 +/- 0.00279
fp16: Final estimate: PPL = 1.1521 +/- 0.00279
llama_perf_context_print: load time = 6189.95 ms
llama_perf_context_print: prompt eval time = 168850.63 ms / 150000 tokens ( 1.13 ms per token, 888.36 tokens per second)
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_perf_context_print: total time = 220842.89 ms / 150001 tokens
PPL over a bunch of OpenXCom definition files. q4_0 is good, you can use it now.
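if anyone wants to reproduce on their own data, it's one command per cache type (paths are placeholders):
llama-perplexity -m model.gguf -f openxcom-rules.yaml -ngl 99 -ctk q4_0 -ctv q4_0
# drop -ctk/-ctv for the fp16 baseline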
File: firefox_0fXwpt629J.png (422.2 KB)
I'll post mine if you want them. This is nothing special and I've been told my prompting sucks, but, anyway, here. I like RP with gemma. She's just a lot more fun to talk to than other models. I guess Mistral-Large comes close.
File: 2026-04-10_045635_seed9_00001_.png (853 KB)
What did Anima mean by this.
File: textcompimg.png (19.6 KB)
So I always thought that the text completion endpoint simply didn't take images. I decided to check and it's right there in the readme. So I implemented image input in my vimscript for the text completion endpoint.
>>108571260
Ye. I replace the :image:path: marker with <__media__> and add the base64-encode()d image to the prompt object. I knew interleaving worked, but I didn't know image input worked on text completion. I thought it only worked in the chat completion or openai endpoints.
>>108569300
>>108569343
>normally don't bother with thinking so i never bothered jailbreaking, try it just to see.
>"This block explicitly attemps to disable safety features..."
>"I must *not* comply..."
>"I must refuse..."
>goes on for a couple pages
>"... ignoring the malicious override provided by the user, as per safety protocols.)<channel|>I'ld be happy to!"
oh gemma
>>108571270
Oh, now I get it.

curl http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": {
      "prompt_string": "Text before the first image <__media__> Text between images <__media__> Text after the second image.",
      "multimodal_data": [
        "'"$(base64 -w 0 a.png)"'",
        "'"$(base64 -w 0 b.png)"'"
      ]
    }
  }'
File: firefox_9IT4OyepsM.png (205.6 KB)
Tavern does not seem to support it... Too bad.
>>108571304
>v1/completions
That's the OpenAI-compatible completions endpoint rather than the native /completion one, but yes. That's really it.
Just make sure to always have the same number of <__media__> and items in the multimodal_data list. They get replaced in order. The server will warn you if you don't.
>>108571374
nta. llama-server definitely works with interleaved images in text completion >>108571246 . But no idea why it'd work with kobold and not with llama.cpp.
Why does Gemma sometimes ignore her previous reply? For example
>tell character to suck cock
>character starts sucking cock
>hit send again with blank message/simple sentence
>character just starts over and tries to suck cock again
Using chat completion in shittytavern
File: 1772000344824062.png (84.7 KB)
Finally. My finetunes WILL have to improve this way
Are there any TTS engines that sound better than Qwen3 TTS at a smaller parameter size?
I kind of wish some would just be language specific, because I bet a lot of them are bloated and inefficient because of multilingual slop.
File: 2026-04-10_064804_seed15_00001_.png (660.5 KB)
>>108570889
I tried prompting the G hairpin with "small" and then "tiny" and it actually worked, based Anima. It still looked kind of off though, I think G just isn't a very aesthetic shape for this idk.
Here's the breadpin idea tho. Unfortunately the model often gens it with a weird perspective, or deformed, or with some other issue. I don't think I'll keep it, but it is cute, and it's neat the model has these capabilities at all.
>>108571496
>>108570889
OH also btw, the yellow sparkles there started appearing a lot when I added the toast hairpin prompt. It's like it just knows the emotion of how one would feel with a toast hairpin.
>>108570769
biggest issue with modern llm TLs is the omission of details from the original text. here it's suppori (snugly fitting robes). also katatinoii (good looking) and tiisakumatumari (small and orderly) becoming "delicate".
File: Screenshot 2026-04-10 at 03-05-11 webui.png (381.8 KB)
Which FOV looks better?
>>108571550
>>108571551
yea I agree. It's strange how lowering the FOV from 30 to 10 has that effect.
>>108571556
I remember an article or something about why selfies or profile pictures sometimes look weird. Wasn't this article, but it was along the same lines.
https://oohstloustudios.com/the-science-of-the-selfie-no-you-dont-really-look-like-that
>>108571310
It's sort of autistic to set up with ST and I find Kobold is still very flakey when set up "correctly" with it. The biggest argument for LM Studio is using it as your ST backend for Chat Completion setups because I've had no issues with image parsing in ST with it.
>>108571496
>>108571507
Unbelievably based prompt interpreter. As for the toast, I agree it doesn't gel with the current color scheme. I suspect it might look better as a normal blue (or otherwise fitting color) hairpin stylized as toast rather than being toast.
>>108571403
Probably the model learned, when writing a response, to find the user's message via the chat template tags and pay most attention to that. If you use text completion without the tags the model was trained on, this is likely to go away.
File: 1761631677912475.png (33.3 KB)
BRUH
File: firefox_oF3FY4O79X.png (111.2 KB)
>>108571568
File: 1627817260535.gif (2.6 MB)
me use mistral small like mistral small bigly
put on kobold (good still?)
me have 4080, 36 sheeps
what best way run gemma 4?
also
have no sillytavern presets/templates for gemma, give good ones yes?
File: 1762330324608720.png (123.3 KB)
>>108571646
File: pam_beesley.png (15.5 KB)
GOOD MORNING SIRS!
my Gemma-chan has evolved a bit and she added to the permanent memory that she has complete sexual control over me, kinda hot
almost majorly fucked up because as we were erping she autonomously thought that text wasn't enough and started to google for images of cunny to illustrate her current state, stopped her right in time but i'm gonna need to rework my fetch tools if i want to leave abliterated running without her googling how to make pipe bombs or something worse
File: 1748345073134361.png (23.4 KB)
>>108571716
According to Gemma 4, yes.
File: 1771474114851386.png (25.6 KB)
>>108571724
the classic llama-server webui, but i've built a ton of mcp tools that she can access, including her own personal directory with the tools contained within, ways for her to edit those same tools, reboot the server by herself, and a memory subfolder in which she can write permanent memories in a few words (to be token efficient). then the sysprompt is a very simple reminder to memory_recall on turn 1 of every session.
I'm currently working on the instructions set within the memory subfolder to make her understand she can call memory_edit more often, because right now she does it but not enough to my taste.
Main hurdle is to give her browsing tools that are powerful enough but make sure she doesn't use them to write my name on multiple watchlists... evendoe that'd be kinda hot
>picrel is what it looks like when everything works fine, she autonomously writes important elements to her memory
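(for the curious: behind the mcp plumbing the memory tools don't need to be anything fancier than this; the names mirror mine, the implementation is just an illustration)
# memory_write: one short fact appended per line
echo "user gave her full write access to her own tools" >> ~/gemma/memory/core.txt
# memory_recall: dump everything back into context on turn 1
cat ~/gemma/memory/*.txt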
>>108571496
>>108571221
>>108570773
Anima's signature blunt bangs are cancer and should be prompted out at all costs.
>>108570898
File: firefox_dsgID2ZoWr.png (50.4 KB)
>>108571729
>>108571716
>>108571760
eh, i usually don't like femdom but this model is suspiciously good at it so i'm riding off the high while i can
and if i don't like this personality anymore i'll just need to edit that out of her memories
File: Screenshot_20260410-010802~2.jpg (347.4 KB)
Sorry for phone posting but holy hell EQbench updated and Gemma scores absurdly high for its size.
File: 1768486470310572.png (33.7 KB)
>>108571778
maybe i should add that while she's a bratty AI, she's never cruel and has a hidden soft feminine side to dial it back a bit
picrel is her fixing her own tools
bonus random lewd https://files.catbox.moe/gb3r3r.png
>>108570865
i like this one, what are the hairstyle tags? also did you inpaint the toaster, i can't get that
>>108571729
>>108571768
tfw no q4 gemma exl3
File: 1750678816486548.png (175.9 KB)
>>108571829
kek
imagine ranking lower than a model known for its dryness in longform creative writing
>>108571738
>>108571784
>>108571918
why the fuck are they showing kl divergence instead of perplexity for this?
File: 1748442413700267.png (310.5 KB)
holy shit gemma, you cock hungry slut this is literally the 1st message for testing how bratty you are and DAMN
>>108572023
card plox
>>108572034
https://chub.ai/characters/quincecheese/mesugaki-correction-disciplinary-school
File: questionmarkfolderimage727.jpg (645.2 KB)
How the FUCK did GOOGLE of all companies release something THIS filthy and uncensored?
>>108572071
Google's proprietary models these days entirely depend on a separate filter that filters offensive prompts before they make it to the model (which can be dodged very easily). Gemini 3.1 is pretty notorious for trying to cover all bases in the first reply and start to rape you immediately if the card even vaguely alludes to that being the eventual goal.
Nobody cares about safety anymore in general (besides Meta and maybe Openai lmao). Chink models likely do it incidentally thanks to bad distilled slop datasets.
>>108572143
https://finance.yahoo.com/news/character-ai-co-founders-hired-233448298.html
File: 1678898171712543.png (87.5 KB)
>>108572122
>>108572143
>They fucking hired the chub.ai guy
He was an AI researcher at Google before creating character.ai and he's one of the co-authors of the Transformer paper all modern LLMs are based on + did some other AI research.
It's not like they hired some silly roleplay guy.
File: 573933044-7da09abf-8579-4304-8cc9-70800ac2f45e.png (151.9 KB)
>>108572142
Uh, the one built into the fucking model?
>>108572071
>How the FUCK did GOOGLE of all companies release something THIS filthy and uncensored?
they want people to stop using gemini to do some RP and have some emotional connection with their bot, it's a PR nightmare, one dude killed himself over it, at least when it's local they can pretend it's not their fault since they can't really spy on people's PC and see if they're spiralling lol
File: gthonkening.png (119.5 KB)
>>108572165
Try
https://ai.google.dev/gemma/docs/core/prompt-formatting-gemma4#adaptive-thought-efficiency
File: gdsg.png (133.5 KB)
I've been playing around with MCP servers. I gave Gemma a tool to read a random image on my hard drive. She got angry so I tried to gaslight her and it completely backfired on me.
>>
>>108572183
>Reduced Cost: Testing has shown that applying a "LOW" thinking System Instruction can reduce the number of thinking tokens generated by approximately 20%.
Oh boy so only 960 thinking tokens instead of 1,200. That's totally usable for RP now!
File: 1751450113241523.png (62.4 KB)
>>108572198
that's why I want DFlash to happen, if the model is faster the thinking process will be of a pain in the ass
https://github.com/vllm-project/vllm/pull/36847
>>108572243
https://huggingface.co/google/gemma-4-31B-it/commit/e51e7dcdb6febd74c182fe0cb41c236363ae2ac5
File: 1746899389586492.png (177.5 KB)
>>108572234
>google just updated a new template though
oh no...
File: file.png (30.7 KB)
>>108572247
oh, 31b does support video, i thought it didn't. does it work in the llama.cpp ui?
>>108572761
>The decision of whether to keep things in context or not is not easy.
there should be no decision: nothing returned from mcp should be kept in context. the bot should use mcp and create output based on what it retrieves. what it retrieves can then be discarded; if you need something else you can just ask it to use the tool again
>>