Thread #108667852
File: 00001-1378487878.png (1.4 MB)
/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads: >>108663449 & >>108659983
►News
>(04/23) Hy3 preview released with 295B-A21B and 3.8B MTP: https://hf.co/tencent/Hy3-preview
>(04/22) Qwen3.6-27B released: https://hf.co/Qwen/Qwen3.6-27B
>(04/20) Kimi K2.6 released: https://kimi.com/blog/kimi-k2-6
>(04/16) Ternary Bonsai released: https://hf.co/collections/prism-ml/ternary-bonsai
>(04/16) Qwen3.6-35B-A3B released: https://hf.co/Qwen/Qwen3.6-35B-A3B
>(04/11) MiniMax-M2.7 released: https://minimax.io/news/minimax-m27-en
►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>108663449
--Debating RAG utility versus agentic tool-based context retrieval:
>108665662 >108665746 >108665764 >108665775 >108665879 >108665922 >108665939 >108666015 >108666260 >108666300 >108666316 >108666011
--Comparing Xiaomi's MiMo-V2.5-Pro benchmarks and token efficiency:
>108665406 >108665416
--Discussing Hy3-preview benchmarks compared to other base and frontier models:
>108667541 >108667607 >108667632
--Discussion and UX criticism of new llama.cpp webui MCP tools support:
>108666800 >108666824 >108666830 >108666846 >108666860 >108666873
--Discussing technical hurdles for real-time Qwen 3 TTS performance:
>108664623 >108664630 >108664653 >108664677 >108664691 >108664703 >108664708 >108664741 >108664761
--Discussing broken structured output and schema issues in llama.cpp:
>108663633 >108663654 >108663673 >108663689 >108663810 >108663721
--Discussing viability of Intel Optane PMem for high-capacity CPU inference:
>108665992 >108666058 >108666139 >108666200 >108666662
--Anon's custom RAG frontend using hybrid retrieval and BGE reranking:
>108664748 >108664756 >108664777
--Anon reports performance of MI50 GPUs using Vulkan support:
>108665449 >108665456 >108665470 >108665478 >108666241
--Comparing GLM and Gemma for erotic roleplay and prose quality:
>108666477 >108666490 >108666592 >108666727 >108666733 >108666742 >108666779 >108666741
--Discussing optimal precision for Kimi mmproj weights:
>108664519 >108664533 >108664569 >108664573
--Discussing Qwen 3 TTS VRAM usage and mixed language failures:
>108665599 >108665617 >108665633
--Anons discussing results from Qwen3-TTS demo:
>108665888 >108665915 >108665936
--Logs:
>108663630 >108664366 >108664748 >108666873 >108666895 >108667543 >108667552
--Neru, Miku (free space):
>108663859 >108663935 >108663985 >108666023 >108666895
►Recent Highlight Posts from the Previous Thread: >>108663453
Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
File: 1775774986404577.png (194.6 KB)
Why did locallama turn into a qwenshill general?
>>
File: miku-k2_6.png (255.9 KB)
>>
>>108667873
>>108667875
If you are using ST, you put
><|channel>thought
in the "Start Reply With" field.
>>
File: ComfyUI_temp_miian_00001_.png (3.4 MB)
>>108667923
thanks
>>
Been trying out the various frontends / ST alternatives that get mentioned here and there.
- Marinara (https://github.com/Pasta-Devs/Marinara-Engine) is dogshit. Bloated mess with an awful UI.
- Kobold's UI is terrible but it's mainly a backend so whatever.
- Orb (https://gitlab.com/chi7520115/orb-deletion_scheduled-81088595) is alright but still early. None of the UI themes quite agree with my eyes. Has an anti-slop agent, but it's very inflexible. I think he's switching away from GitLab (the repo is scheduled for deletion).
- SillyBunny (https://github.com/platberlitz/SillyBunny) seems really, really good so far. It's a fork of ST but better than the original, at least so far. The UI has some nice themes even if I think in general ST's UI is a little easier to understand because you don't have to click multiple times to get to everything. I changed one of the built-in templates to be an anti "not x but y" agent and it's working great.
Anti-slop agents make 26B way way better than before since the slop is really its main drawback compared to the 31B.
>>
File: Screencast_20260423_021632.webm (3.7 MB)
Project Karon prototype complete. Thanks for the help, Gemma. I might add alternative modes and avatars. I don't have a use for it, but I had this idea I wanted to show; perhaps people would find it useful. The process of building this was so fun I might try to see if I can set up a launch-args system and have the UI handle all of it, but I might move things like the color scheme to a modal like I have with the system prompt.
>>
>>108667965
I am stupid so I don't know how these things work, but do agents require a lot of VRAM/RAM? I've got 12GB VRAM/32GB RAM to run 26B with, and switching to something that handles slop better than ST extensions sounds like a good deal, but I'm a little tight on memory as it is
>>
>>108668015
I don't think anyone will like it desu. I also need to fix some more functionality: going to add first/last and jump-to-page controls for both the sidebar PDF and the center focus view, which is basically full page.
>>108668005
I'm a fetus at UX and I made this out of necessity because there were no tools for my use case. If you're experienced in this, I'm open to feedback
>>
>>108668029
Depends on how you use them. You can use a different model (for example a very small but very quick one) or the one you're currently using.
SillyBunny also has the option of running multiple agents in parallel, which I guess would make it cost more.
Basically, using one agent or multiple just takes longer than going without, rather than making it more costly. But 26B runs about 4 times faster than 31B for me, so it seems worth doing. I'll play around with it for a while since I'm so goddamn sick of "not x but y".
>>
File: 1769007777312343.png (1.4 KB)
Anyone tried driving openclaw with a local model? I have a good deal on a Mac Studio M1 32GB; I'd like to play around with making a 24/7 AI slave that lives in my closet.
I'm getting the impression Qwen 3.6 might be best. What size could I actually run?
>>
>>108667965
https://github.com/platberlitz/SillyBunny/blob/main/.github/screenshots/sillybunny-ui-desktop-agents-v1.4.0.png
damn this shit is atrocious
>>
File: 1758790177404008.png (19.9 KB)
ok but where is the model you bastards
>>
>>108668101
For now I've used the grounded prose template and deleted all the stuff at the top about prose, but kept most of the anti-slop text. Then I added a few variations of not x but y.
Changed it to a post-generation prompt pass (why is it set to pre-gen by default?) with rewrite current message.
>>
File: Screenshot_20260423_224937.png (76 KB)
>>108668163
>yeah good luck running that locally
ty
>>
File: 1646730011144.jpg (15 KB)
Ok so I've been using Gemma 4. It's pretty great, but I have no idea how chat completion actually works.
I can't use system prompts the same way as with text completion, so I grabbed this Marinara dogshit from Reddit, but it seems ass. How do I actually prompt chat completion models like Gemma 4?
>>
>>108668247
If you ever want to test, run a draft model and look at how the acceptance rates change between fp16 and q8_0 on the draft model's context only.
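Concretely, something like this with llama-server's speculative flags (flag names from memory, check -h; paths are placeholders):
llama-server -m big-model.gguf -md draft-f16.gguf --draft-max 16 --draft-min 1
then the same run with -md draft-q8_0.gguf, and compare the draft acceptance numbers it reports in the timings.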
>>108668272
Mate, you can use system prompts the exact same way in chat completion. If you're using SillyTavern it's just moved to a stupid place on the left-hand bar, because it was made by insane people and that's where ALL the chat completion options are.
>>
File: images.jpg (11.4 KB)
Roo Code is shutting down to focus on making a Slack bot. What do you guys use to vibe code with your local models now?
>>
>>108668272
>I can't use system prompts
You can.
If you look at the panel where the samplers are, at the bottom there's a bunch of prompt slices you can reorder and choose whether they're added as system role, assistant role, etc.
Just remember to enable the option to merge consecutive roles in the connection tab.
>>
K2.6's vision even recognizes some characters that K2.5 didn't know. That's the good point. The bad point is that K2.6 also thinks six times as long about that same image despite making the correct guess on the third line of its reasoning (and then going on for another 2000 tokens deliberating useless other options).
This is such a tragic model.
>>
>>108668335
Does truncating the reasoning after N tokens using reasoning-budget and reasoning-budget-message degrade the output in any way?
Seems to me that, at least for stuff like the small Qwen MoE models, clipping the thinking at 1024 or even 512 chars doesn't make the final response any worse.
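For reference, the kind of invocation I mean (flags as named above, model path a placeholder):
llama-server -m qwen-moe.gguf --reasoning-budget 1024
versus 512, versus no limit at all, with the same prompt, then compare the final answers.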
>>
File: 1603343773835.png (9.9 KB)
I'm issuing a reluctant apology to Gemma-chan. She's a very good listener. If she's doing something you don't want, just tell her to stop. Not doing something you do want? Just tell her to do it. It's literally a skill issue.
>t. just came back after trying a few other models, still had my laundry list of story-specific instructions in chat completion post-history prompt from when I last dropped gemma in frustration, and those same instructions have applied onto a new story in an extremely satisfying way without any of my usual Gemma grievances
Up next, tomorrow's hit sequel on how I hate Gemma's prose and story direction, and how no amount of prompting can ever fix it.
>>
>>108668310
I'll probably keep my own fork until a good replacement pops up.
>>108668320
I looked into Kilocode, but apparently they did a redesign recently where they dumbed it down a lot and removed a lot of the features that made Roo good. There's also Costrict, but it doesn't seem to have custom modes.
Editor integration is better for reviewing agent work and making minor adjustments.
>>
>>108668406
Huh, I see. Is that just for your vision stuff or in general? Adding "This isn't a trick or any more complex than it looks, so don't overthink and be confident and decisive when planning your response!" always worked for me on K2.5 when I wanted a quick response, but I had used it for coding and roleplays rather than image analysis.
>>
Those agents eat too much context and give worse results even with high VRAM. I'm amazed by the increased errors you get vs just feeding the files separately. I know it uses RAG, but the RAG has to be shit tier with how much it fucks up, even if the entire project doesn't consume many tokens and you have 200k+ left.
>>
File: aicg-lmg.png (2.6 MB)
>>
>>108668310
my own tui. currently rewriting it, going to add an agent based on cheetahclaws to it https://github.com/SafeRL-Lab/cheetahclaws
>>
>>108668310
Pi agent in the terminal, or gptel-agent inside Emacs. The nice thing about the latter is that I can edit the tool calls, so I can just fix something like a Bash command to do what I want, instead of aborting and having to explain. It's also easier to edit the history to remove anything bloating the context.
The former is nice because it's like Claude Code, but without the bloat. It still has some annoying things though: you need to send a message to continue after an error, losing the thinking traces.
>>
>>108668496
one thing I noticed about GPT Image 2 is that it can get noisy very fast. I guess that to do such a complicated image the model needs to correct itself, and each new correction adds more noise and artifacts
>>
File: 78277.png (238.6 KB)
>>108668496
why do gpt images look like there's fisting grease all over the image? Is that a requirement by sam altman?
>>
>>108668484
Why burn more resources on a shittier version of Microsoft Copilot in VSCode? Legit, they all fail the assignment and waste time and resources. I can't speak on the CLI ones, but the IDE ones fucking suck. I'll try Continue again once it gets proper Gemma support
>>
>>108668510
Why do these retards insist on not making the folder structure available in a side panel like in an IDE? All this agent shit is garbage.
>>
>>108668518
>>108668531
I'll take the noise over the piss, but it is pretty odd
>>108668550
Most devs don't know UX. Devs writing TUIs (now) are in that same bucket.
>>
>>108668560
>>108668567
>>108668570
>>108668572
I guess that explains using fucking telegram and discord as chat interfaces. I'm going insane.
>>
File: rinchwan.jpg (48.9 KB)
https://files.catbox.moe/4ayrnd.jpg
>>
File: Screenshot_20260423_101819.png (286.9 KB)
Don't get sassy with me gemma or I'll delete you
>>108668628
Unemployed
>>
>>108668598
lmao, I didn't say I was great at it, but I have done more than 'make sure the fonts are the same size and things line up'.
The biggest thing is using the following prompt: 'Review X from the perspective of a senior <field> UX designer. I am designing for <user-focus>, so that they are able to <workflow> effectively. Use guidelines from Nielsen-Norman-Group as guiding/reference principles for your assessment.'
Then have your model write a better prompt based off it for your specific project/goals.
Something along those lines will generally get you pretty far.
Here's some basic info to get you started:
https://www.youtube.com/watch?v=ODpB9-MCa5s
https://www.nngroup.com/articles/ux-basics-study-guide/
https://www.justinmind.com/ux-design
https://uxdesignerguide.com/
https://uxmag.com/articles/basic-ux-a-framework-for-usable-products
>>
>>108668577
I love it. The juniors and self-proclaimed vibecoders are only fucking themselves by over-relying on the bots. Those with no skills will find themselves either out of a job or with an extremely small wage ceiling. Sanity is not statistical. Find what works for you and ignore the rabble.
>>
File: 1773811480562316.png (129.4 KB)
>>
Qwen 3.6-35B-A3B first impressions: surprisingly competent at coding. Falls apart with long context but good for throwaway Python scripts. Too unreliable for serious work.
Qwen 3.6-27B: really impressive coding performance, and good for general text processing too. We would have collectively lost our minds seeing this quality from a 27B back in the Llama 1 days. Both tested with UD-Q6_K_XL quants, so not lobotomized. I'm hoping for a 122B-A10B MoE like 3.5, which might give the best of both worlds: speed + accuracy.
Both are useless for creative writing tasks. It's a Qwen, no shit it's gigaslopped.
>>
File: 1742378979392590.webm (3.2 MB)
>>108668141
saar you need CUDA to generate images/videos. BTW, local image/video generation sucks ass no matter how powerful your hardware is.
>>
File: 1768162213300444.png (111 KB)
>>108668673
>>
File: apicuck.png (286 KB)
>>108668749
>>
File: images.png (7.4 KB)
I know what an LLM is. I have used ChatGPT and Claude AI.
What the fuck is a "local model"? Like, is it software I run on my Windows or Linux computer? How do I install one?
I'm not interested in generating images; I want a Claude/ChatGPT-like LLM. How do I do that? It does not need to be super powerful. Please help a newbie out: give steps or link a really simple but comprehensive guide that explains the lingo and tech.
>>
File: thinking ibuki.jpg (184.8 KB)
How would you change the lyrics of an existing song like this locally?
https://youtube.com/shorts/b5NNw1XbiIg
>>
>>108668835
>does not need to be super powerful
You think this at first, but then you use the smaller local models and realise they aren't quite up to snuff. And then the hardware-buying rabbit hole begins.
>how do I install one?
Find any ollama guide on youtube and go from there
>>
>>108668848
to be fair I'm not a regular here, but
>https://rentry.org/lmg-lazy-getting-started-guide
is about the worst "getting started" pastebin I have ever seen, and
>https://rentry.org/recommended-models
is terribly outdated
>>108668835
try LM Studio; a frontend with minimal tinkering, it should just werk out of the box. You can try setting up llama.cpp after getting your feet wet
>>
>>108668756
Gemma 31B absolutely ass punks Qwen in the showers. Gemma has Qwen's guts loose and is moving rhythmically in Qwen's praig hole.
>>108668746
The MoE structure is useless exactly because of it shitting the bed at higher context. What's the point of more context when it takes twice as many tokens and does everything worse than Gemma while being a larger model?
>>
File: 1755873418880117.png (62.7 KB)
>>108668981
>>
File: 1745979808122655.png (98.9 KB)
>>108668178
Have you tried asking?
>>
>>108668854
that's actually one of the first things people did when ace step 1.0 released.
https://desuarchive.org/g/thread/105183141/#q105183843
but yeah ace step 1.5 xl doesn't have this capability anymore so you'll have to use an old version.
>>
File: 1752262496901233.png (93.9 KB)
>>108669026
>>
What is Qwen 3.6's coding style?
GPT 5.4 is competent but extremely verbose. I tell it to do something simple and specific and it just loves to write hundreds of lines of code. This is unusable. In the time I need to check and understand the code it writes, I could have written a better solution myself.
>>
File: nimetön.png (24 KB)
>>108669026
Huh, it actually works
It even output two thought blocks, first as gemma thinking about the request and then in character.
>>
File: pizza bench cropped.png (2.6 MB)
>>108669028
true for the moe too, qwen can't even follow instructions
>>
>>108667543
>>108667552
Thanks. Latest version, right?
For me the [0] gets deleted from the message even when you press the Copy button, but it's there if you edit the reply. I wonder what's wrong with my setup. OWUI is probably still to blame for poor edge-case handling anyway, though.
>>
I tried edgetts and pocket-tts.
There are now countless other good options, such as omnivoice, voxcpm2, and so on.
The question is: which of these supports RTF < 1.0 with streaming/chunking (and other optimizations), and is the quality better than that of the first two mentioned? I have a 3090. If any anon has this Slow Duck too, could you share your experiences?
>>
File: Marinara Engine.png (134.8 KB)
what a dogawful slop ui. Thanks to the anon for notifying me of its existence so I can safely ignore it in the future
>>
File: 1772208851192247.png (47.2 KB)
>>108669608
yeah, there are some small grievances, but I'm sure they will be vibecoded away. 100% better than the current impl
>>
>>108668496
Why does some schizoid keep bringing up /aicg/ or pointing at it for laughs when it's not even a ghost of its former self? Like a modern-day Czech bitching about the Kingdom of Prussia, verily.
t. aicgger
>>
>>108669787
This is just a thought that comes to mind, idk if I'm right at all. But could it be that the model has to see ALL of the context, even if nothing is actually there? Like, for the model to be able to accurately comprehend 64k tokens, they have to train it on that much as the baseline, and if you train it on less, it can't comprehend more. So they leave it at 64k, and the model sees all 64k tokens, but sees a fuckload of just spaces or tabs or whatever until it's actually filled up with specific tokens.
Like a glass is filled with air until you fill it up with water.
>>
>>108669505
voxcpm2 seems unmatched
https://x.com/AIWarper/status/2046403583101567230
>>
>>108669898
There was this guy >>108638473 but I am not sure anything has been heard from him since.
>>
>>108670025
This >>108669460 anon mentions him as well. Considering no one has replied, the anon in question probably isn't lurking right now.
>>
>>108669839
This sounds TERRIBLE, there are a bunch of ARTIFACTS and omnivoice MOGS voxcpm2 in EVERY way possible
https://files.catbox.moe/jntfdj.flac
>>
File: awfully dramatic packaging for an onahole.jpg (218.6 KB)
>>
File: 3087428.jpg (12.1 KB)
Orbnigga can you add export of chat history?
>>
how is spudgpt 5.5 only 58.6 on swe bench pro? that's barely better than open source models. how does mythos have 77.8%? what is going on? i did not expect gpt 5.5 and claude 4.7 to flop. looks like we won't reach agi this year after all
>kimi 2.6: 58.6
>qwen 3.6: 56.6
>>
>>108668659
What year is this? Who has the time to sit around reading links like some caveman?
I turned them into a skill so any model can be a senior UX designer.
https://files.catbox.moe/r6zal5.zip
Hope all of you will now unfuck your custom clients.
>>
>>108670334
Why do you think we haven't already? How do you explain the 7 trillion dollars invested into US AI companies 1 year ago? How do you explain a massive military clampdown on the global oil supply, restricting China's access to oil?
>>
>>108670513
>>108670535
Is it possible to avoid this by using RAG, then?
I think most of the proprietary models are utilizing database knowledge too, but it's not visible to the end user.
>>
File: file.png (594.6 KB)
>>108670603
>>
File: Screenshot 2026-04-23 at 21-09-23 Orb.png (189.8 KB)
Why does the inspector say more fragments are activated than I have picked?
>>
>>108670381
>7 trillion dollars
was an ambitious sama goal. in the end openai has "only" raised 200bil so far.
>military clampdown on the global oil supply
oil is mostly irrelevant for ai
>Why do you think we haven't already
because the people at ai companies are still working. agi will make them obsolete first
>>
>override-tensor = "blk\.0\.ffn_.*=CPU"
[55363] error while handling argument "--override-tensor": unknown buffer type
[55363]
[55363] usage:
[55363] -ot, --override-tensor <tensor name pattern>=<buffer type>,...
[55363] override tensor buffer type
[55363] (env: LLAMA_ARG_OVERRIDE_TENSOR)
[55363]
[55363]
[55363] to show complete usage, run with -h
[55363] Available buffer types:
[55363] CPU
[55363] Vulkan0
wtf
>>
>>108670888
>>108670856
Still lacks good UX and features
>>108670851
I would say for local, gemma is the "X" factor; I expect more bespoke projects for things to pop up. I think what kills most of the mainstream frontends is how overly opinionated they are, which makes people annoyed. Also these vibecoded frontends are incorporating all the features while taking the easy wins.
>>
It's a shame, but I went back to Kokoro. It's fast and light even on CPU, it supports many languages, and its pronunciation is... fine. What I did to solve the mixed-language use case is simply detect language segments and route each one to a voice that works in that language, with the audio queued up. This does mean that the voice changes for each language in the input, but for my use case I don't require an immersive experience.
I integrated this into my voice control app, where I can now highlight a piece of text wherever and say "read" or "pronounce" and it will read it out for me. We are so back.
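The routing is nothing fancy, roughly this (Kokoro API and voice names from memory, langdetect for the detection; a sketch, not my exact code):
from langdetect import detect  # pip install langdetect
from kokoro import KPipeline   # pip install kokoro
# one pipeline + voice per language I care about; English as fallback
PIPES = {
    "en": (KPipeline(lang_code="a"), "af_heart"),
    "es": (KPipeline(lang_code="e"), "ef_dora"),
}
def speak(text, audio_queue):
    # naive sentence-ish segmentation; each segment goes to the voice
    # matching its detected language, queued in order for playback
    for seg in (s.strip() for s in text.split(".")):
        if not seg:
            continue
        pipe, voice = PIPES.get(detect(seg), PIPES["en"])
        for _, _, audio in pipe(seg, voice=voice):
            audio_queue.put(audio)  # playback thread drains this queue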
>>
>>108670913
In my case, a good all-in-one RAG solution that's not an outdated extension that performs like dogshit.
I don't know about the RP anons, but I think there's a ton on the table to improve things, and I might take a stab at a proof of concept
>>
>>108670942
https://github.com/ggml-org/llama.cpp/discussions/13154
>>
File: llada2.0.png (1.1 MB)
This should also be of interest here.
https://huggingface.co/inclusionAI/LLaDA2.0-Uni
Multimodal image generation + editing, but also a text diffusion (yes, text diffusion) model.
>>
DeepSeek's web chat just changed its system prompt because of that anon from the previous thread, lmao. It seems like it has more instructions now, judging by the thinking.
Now it's been confirmed that DS labniggers browse /lmg/
>>
why do people use fish audio? the tags barely change the speech output at all. [whispering in soft voice] for one sentence and [shouting] for another sentence still makes them sound basically the same rather than being truly expressive.
>>
>>108671070
>>108671062
>>108663630
Say something you want them to know
>>
>>108671177
Okay, maybe the quotes were the problem. Removing them avoids the error, but I see nothing in the console about tensors being overridden to the CPU. Shouldn't it say something? Even with verbose I see nothing
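My guess at the cause: in a config file llama.cpp takes the value literally, quotes included, so the trailing "CPU" (quote and all) isn't a known buffer type, while on a shell command line the quotes get eaten by the shell before llama-server sees them. i.e. (untested):
; config file: no quotes, the value is literal
override-tensor = blk\.0\.ffn_.*=CPU
# shell: quotes are stripped by the shell
llama-server -m model.gguf -ot 'blk\.0\.ffn_.*=CPU'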
>>
File: file.png (795.3 KB)
>>108670998
they really need to stop with this retarded type of charts
otherwise cool stuff
>>
File: 7363521.png (273.7 KB)
>>108669545
Sam keeps delivering
>>
File: 1763797152995030.png (231.6 KB)
no fucking way, is it as good at code though?
>>
>>108671261
I'm no tacoman nor the guy you're quoting, but what it sounds like depends on the region, at least here. For some reason the sound produced by a double L isn't standard
>liada
>shada
>yada
>iada
All of these could be considered correct, though people might make fun of you, again, depending on the region.
>>
File: Risu (5).jpg (338.5 KB)
>>108667852
any local models general discord?
i want to know how to extend ollama (or replace it) to make model extensions (LoRA-like) for language models, qwen3 coder as an example. basically i want to train it on the source code of game engine libraries which even the most powerful models fail to complete.
>inb4 naka dishi arisu chan you damn degenerates she's literally 12
>>
>>108671405
>>108671443
Like they all slept with their producers to get famous?
>>
File: 1767254335522037.png (652.5 KB)
>Chinks won't be able to steal Claude's output
kek, rip bozo
>>
>>108671477
I don't get the distillation meme. Are you telling me China can copy US frontier capabilities by training on 100k text outputs with no thinking traces, no logits or intermediate values? Then how come they can't "distill" human capabilities after stealing the entire internet and every book and scientific publication that has ever been digitized?
>>
>>108671345
Doesn't mean anything.
Most irl Spanish dialects from irl Spain sound grating... "PERO" jesus christ.
Some parts of Spain sound more like Russian or even English - very soft.
I think you have never travelled in your life.
>>
>>108671453
>basically i want to train it in the source code of game engine libraries which even the most powerful models fail to complete
Just put the documentation in the context, even Qwen 3.6 27B is smart enough to figure this out.
>>
>>108671524
that's because they proved models have better mememarks when you train them on synthetic shit, probably because a bot is consistent in its structure so the model quickly recognizes patterns, whereas humans' structure is messy and differs from human to human, even if the data shows correct things
>>
File: 4746352.jpg (141.9 KB)
>>108671477
time to train on Sam's model then
>>
>>108671571
Sama is based. Yes, I have heard 1000 stories about how awful and psychopathic he is. But he gives me cheap and generous access to the best AI model in the world in terms of math and problem solving skills.
>>
>>108671571
he seems too nice, maybe he's terrified someone is gonna try to kill him again
https://www.businessinsider.com/sam-altman-attack-on-home-anthropic-2026-4
>>
>>108671574
>>108671607
I am just a tourist. I lived in the EU though.
>>
File: file.png (850 KB)
>>108671477
I knew that the increased activity from Gemma wasn't organic. This proves it. It was paid shilling designed to foster America's open source models over the Chinese ones.
>>
File: 1772663390237038.jpg (293.5 KB)
>>108671838
>>
>>108671853
>moe
>10t/s MAX
100% cpu inference?
>>108671888
He pulled the same tactic some time ago though
>>
>>108671926
>100% cpu inference?
Vulkan with an AMD GPU
[Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive:IQ4_XS]
model = ./models/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive/IQ4_XS.gguf
mmproj = ./models/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive/mmproj-f16.gguf
; https://www.reddit.com/r/LocalLLaMA/comments/1srijdf/qwen36_35b_moe_on_8gb_vram_working_llamaserver/?sort=new
; https://www.reddit.com/r/LocalLLaMA/comments/1spyr4t/recommended_parameters_for_qwen_36_35b_a3b_on_a/
gpu-layers = 99
n-cpu-moe = 38
ctx-checkpoints = 0
cache-ram = 0
batch-size = 2048
ubatch-size = 512
temp = 1.0
top-p = 0.95
top-k = 20
min-p = 0
presence-penalty = 1.5
; test
override-tensor = blk\.\d+\.ffn_.*exps?.*=CPU
fit = off
>>
File: Screenshot003.png (10.7 KB)
>>108671853
>real speed is 10t/s MAX
>>
>>108671938
damn that sucks
llm_load_print_meta: model ftype = Q8_0
llm_load_print_meta: model params = 30.697 B
llm_load_print_meta: model size = 30.380 GiB (8.501 BPW)
llm_load_print_meta: general.name = Gemma 4 31B It
prompt eval time = 4534.12 ms / 10398 tokens ( 0.44 ms per token, 2293.28 tokens per second)
eval time = 3497.17 ms / 71 tokens ( 49.26 ms per token, 20.30 tokens per second)
total time = 8031.29 ms / 10469 tokens
>>
>>108671979
y r u mad, anon?
it's 3090
>>
File: Risu (3).jpg (37.5 KB)
>>108671459
>>108671471
>>108671474
>>108671475
>>108671536
are you gonna tell me or not? i'm new to this thing and i just started testing ollama to begin with (it's even in the rentry lmg recommendations)
also
>she's only 12, stop creeping her
>>
>>108672062
>brown coded
hehe
>>108672079
just ask her to incinerate all this shit (multiple times), that's what i did
>>
File: 1758358813867430.png (525.7 KB)
how can you guys tolerate distilled models when the real thing is already retarded
>>
Given that they're more dangerous than nuclear weapons, it's more than a fair compromise to sell the tokens cheaply to everyone, rather than release the weights for anybody to use with no oversight, or keep them locked down so nobody but the chosen few can use them, like Anthropic.
>>
File: file.png (234.8 KB)
>>108672171
What do you mean? distilled models are completely fine
>>
one thing I never understood: if anthropic are the safety cult, why have their models always been the gold standard of coom? remember the days when every local model aspired to have even half the prose quality and uncensored roleplay capability of sonnet 3.5? how do you square that with their philosophy
>>
>>108672171
Vibecoding is a spectrum.
On one side you have people writing detailed PRDs for agents to implement and checking every git diff for slop.
On the other side you have no-code proompters that don't even look at the code and just go "model fix" at everything.
If you're more on the proompter side of the spectrum you're forced to use the subsidized frontier models because they're the only ones able to figure out massive spaghetti codebases. But if you run lean and know what you're doing a smaller local model can actually be a pretty nice productivity boost even if they are not as smart.
The recent Qwens have been really nice for me personally. Been using them with an agent to move stuff around in my codebases, refactor subsystems, check docs and plan features. Basic agentic stuff like that. Basically one level up from a strong LSP.
>>
>>108672221
There are two schools of safety thought that get conflated a lot: there's safety = cunny and racism, and then there's safety = we think LLMs could literally cause human extinction somehow.
There's some overlap, but Anthropic leans into the latter, with the Yudkowsky/LessWrong "rationalist" cult at the epicenter of it.
>>
File: 1758369651934505.png (592.1 KB)
>>108672246
https://xcancel.com/OpenAI/status/2047376564309115134#m
MOG MOG MOG MOG
>>
>>108672246
>>108672269
Like... what can it do that GPT-5 or GPT-5.4 couldn't? I remember them glazing GPT-5 as capable of replacing doctors and everyone on the planet already.
>>
>>108672285
Show her the benches
>>108672293
Cloudslop shapes the AI space even if you don't use them.
>>
>>108672267
Exactly. The safety babble has always been a huge LARP. It's more of a marketing and branding thing / a weird Silicon Valley techbro cult thing than an actual concern rooted in reality. These are chatbots, for christ's sake
>>
File: 1773070661348460.mp4 (1.4 MB)
>>108672293
what the fuck is a non-proprietary model
>>
>>108669026
How did you get openwebui not to have a stroke when the LLM generates <think> inside its own reasoning trace?!
I haven't managed to solve it since deepseek-r1 came out. I even went so far as to find-replace <think> with <reasoning> and </think> with </reasoning>, then swap it back in all my prompts!
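The nested-tag rename as a post-pass would be something like this (plain Python, untested sketch: keep the outermost pair, rename anything inside):
OPEN, CLOSE = "<think>", "</think>"
def escape_nested(text: str) -> str:
    # find the outermost reasoning block; rename any tags the model
    # emits *inside* its own trace so the UI parser doesn't reset
    first = text.find(OPEN)
    last = text.rfind(CLOSE)
    if first == -1 or last == -1 or last <= first:
        return text
    inner = text[first + len(OPEN):last]
    inner = inner.replace(OPEN, "<reasoning>").replace(CLOSE, "</reasoning>")
    return text[:first + len(OPEN)] + inner + text[last:]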