Thread #108273339
File: 1744444287656136.jpg (974.6 KB)
/lmg/ - a general dedicated to the discussion and development of local language models.
Previous threads: >>108268616
►News
>(02/24) Introducing the Qwen 3.5 Medium Model Series: https://xcancel.com/Alibaba_Qwen/status/2026339351530188939
>(02/24) Liquid AI releases LFM2-24B-A2B: https://hf.co/LiquidAI/LFM2-24B-A2B
>(02/20) ggml.ai acquired by Hugging Face: https://github.com/ggml-org/llama.cpp/discussions/19759
>(02/16) Qwen3.5-397B-A17B released: https://hf.co/Qwen/Qwen3.5-397B-A17B
>(02/16) dots.ocr-1.5 released: https://modelscope.cn/models/rednote-hilab/dots.ocr-1.5
►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide
►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers
►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second
►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
File: disruption.png (31.3 KB)
File: 1712542.png (159.9 KB)
>THE GOVERNMENT IS WATCHING ME
>ITS UNCENSORED IF YOU USE THIS 10 PAGE JAILBREAK THAT ONLY WORKS 20% OF THE TIME
>I BOUGHT THESE 3090S SO I WILL USE THEM
>ITS THE SHILLS ITS ALWAYS THE DAMN SHILLS
>OMFG IT TOLD ME TO TURN THE CUP UPSIDE DOWN. AGI IS HERE
>THIS IS THE NEW DAILY DRIVER (FOR 2 WEEKS UNTIL I REALIZE HOW SHITTY IT IS)
>IM A GROOMERTROON LOOK HAHAHA IM PROMPTING LITTLE GIRLS LOOK WHAT IM DOING GUYS
>JUST BECAUSE ITS QUANTIZED DOESNT MEAN ITS DUMB. WE ONLY USE 10% OF OUR BRAIN ANYWAY
>TRUST ME THE GUMBOJUMBO_Q4_GATEBROKEN_A32 GGUF IS PEAK FOR ROLEPLAY
>THE MODELS MAY BE RETARDED BUT SO AM I
>DO YOU THINK ITS POSSIBLE TO ACCELERATE MY BRAIN WITH LLAMA-2 MICROCHIP???
>CAN SOMEONE REUPLOAD THIS WITH A GPTMI_3_COMANCHE LICENSE??? STALLMAN WILLS IT
>>108273403
What's the point of local LLMs? Reading discussions surrounding them feels like peering back in time through a looking glass
>OMFG it passes the poopyscoopy logic test from 2023!
>Wow, this 100-line boilerplate javascript code is almost perfect!
>I got it to jestfully say nigger! holy crap it's so uncensored!!!
>This is the new daily driver (for 2 weeks until i realize it's complete slop)
The rest of us are writing multi-thousand line professional software with Codex/Claude. Meanwhile your models are trained on so much scraped synthetic GPTslop that they can't even get the year right. Genuinely, what the fuck is the point of local LLMs? They're more censored than API, they're dumber than API, the cost to set up a decent one is higher than API, they're slower than API, there is no lora/finetuning scene unlike local image, the tooling is worse than API, and the experience overall is just outdated in 2026.
It's like you're stuck somewhere in-between the luddites who hate AI and the pioneers who embrace it. You realize AI is the future but can't cope with the fact that the technology itself benefits heavily from API-centralization and that local hardware is unable to adequately handle increasingly large models. You boarded the boat to paradise island but decided to jump overboard halfway there because the captain wouldn't hand you the controls.
File: 1767466346558493.jpg (172 KB)
►Recent Highlights from the Previous Thread: >>108268616
--Budget GPU upgrade options for better model performance:
>108270975 >108271008 >108271029 >108271088 >108271009 >108271035 >108271169 >108271179 >108271232 >108271212 >108271234 >108271243 >108271261 >108271330 >108271240 >108271022 >108271064 >108271114 >108271170 >108271037
--Budget 4x3060 AI rig build and riser discussions:
>108271593 >108271611 >108271631 >108272320 >108272867 >108272890 >108271702 >108271848 >108271858 >108271885 >108271899 >108271924 >108272008
--Mac Studio vs custom PC for large model inference:
>108271281 >108271291 >108271303 >108271327 >108272592 >108271294 >108271339 >108271312 >108271317
--Qwen 3.5 small model releases and potential applications:
>108271025 >108271045 >108271156 >108271194 >108271217 >108271238 >108271051 >108271440
--Unsloth template year limitation causing llama.cpp server failures:
>108272475 >108272499 >108272512 >108272524 >108272539 >108272558 >108272578 >108272583 >108272534 >108272548 >108272553 >108272600 >108272555 >108272576 >108272618 >108272629 >108272634 >108272663 >108272674 >108272678 >108272736 >108272759 >108272832 >108272837 >108272828 >108272606
--Experimenting with AI-generated podcasts using TTS:
>108270634 >108270679 >108270714 >108270724 >108270748 >108270830
--Workarounds for LLM-based VTuber video tagging:
>108270269 >108270293 >108270414 >108270426
--Disabling model thinking via chat template kwargs:
>108269309 >108269444 >108269471 >108269484
--Comparing lightweight models for news summarization on low-VRAM hardware:
>108270249 >108270324 >108270487 >108272221 >108272330
--Update to 35c4bc · deepseek-ai/DeepGEMM@1576e95:
>108270056
--Local AI coding struggles with VRAM and context rot:
>108271879
--Miku (free space):
>108268674 >108269106 >108269279 >108269325 >108270249 >108270634 >108272201
►Recent Highlight Posts from the Previous Thread: >>108268684
Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
File: 1749045380442501.png (823 KB)
>>108273403
>ITS UNCENSORED IF YOU USE THIS 10 PAGE JAILBREAK THAT ONLY WORKS 20% OF THE TIME
this was the wildest shit when i found out that uncensored didn't actually mean uncensored at all
it still feels like an elaborate joke, people actually say shit like "i run local models so i can do uncensored shit unlike api haha" and it's still just running jailbreak prompts as if you were using it from a cloud provider. muh privacy and freedom but you're still censorcucked. I'm serious, it's almost unbelievable to me.
File: 1.png (9.9 KB)
>>108273784
>>108273822
>I got it to jestfully say nigger! holy crap it's so uncensored!!!
>>108266446
>>108273851
Are you using the 35b? Because I haven't noticed that on the 27b at Q5.
>>108273840
At very little cost. It retains most intelligence at 27b, from what I've seen.
>>108273927
Q4 of the non-heretic version of the 35 is also shit, prone to basic logic errors and grammar mistakes. So, I don't think that's a heretic problem. Q5 of the 35b is a little better in that regard, but still makes dumb mistakes off and on.
The 35b MoE is just way worse than the 27b dense. The only thing it wins at is speed, *IF* both models think. The 27b without thinking is better than the 35b with thinking, though. So it even loses in speed if you're thinking with it.
>>108274240
Is there a fork that doesn't have endless shitcode?
>>108274242
The whole point of the original llama.cpp is that it was fucking simple and easy to understand.
Current llama.cpp seems to have more fucking code than torch.
File: 1766616667271700.png (28.4 KB)
>>108273822
My qwen can't be this schizo
File: 1750838286118038.png (16.8 KB)
>>108274292
I did it!
File: 1768570098362082.png (48.1 KB)
>>108274292
Interesting. It doesn't refuse to summarize. The only difference was --repeat-penalty 1.1 (before it was 1.0).
Gonna test more in case it's just a seed factor.
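Easiest way to rule the seed in or out (flag names are stock llama.cpp; the model/prompt filenames here are made up): pin the seed and flip only the penalty, e.g.
llama-cli -m qwen3.5-27b-q5_k_m.gguf -f prompt.txt --seed 42 --repeat-penalty 1.0
llama-cli -m qwen3.5-27b-q5_k_m.gguf -f prompt.txt --seed 42 --repeat-penalty 1.1
If the refusal tracks the penalty while the seed stays pinned, it's the sampler and not luck.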
File: 7jibs40k6jmg1.jpg (65.5 KB)
kek
File: kboom.png (61.3 KB)
>>108274358
File: 1766797846333800.png (19 KB)
>>108274373
^
CIA-paid false flagger
File: absolute.png (128 KB)
>>108274358
https://health.aws.amazon.com/health/status
>>108274299
>>108274353
I was having the same issues with it being retarded and going into schizoloops too, until I put in those settings I got off the hf page for the 35b. It's like some esoteric magic code to make the thing work, because it's fucked with the defaults, but it's been pretty consistent with these.
>>108274679
try >>108274292 ?
I come from an average working-class background, not too poor; I had a normal childhood in a third-world country. I used to ponder how wealthy people have all sorts of connections, butlers, assistants, maids, whoever, to help them do all sorts of things. They just need to focus on the thing they love.
Now, thanks to local models, I kind of feel the same. I just focus on the things I like and leave the rest of the details for the minions to take care of. It feels like a game changer. I think we will hit a tipping point if local models ever reach Opus-level analytical skills.
>>108274826 soul vs soulless >>108274825
>>108275019
there were no issues with his quants, the reason he's updating is this:
https://github.com/ggml-org/llama.cpp/pull/19139
someone mentioned it to him and begged him to remake his quants for that improved prompt processing speed.
it's a new feature, not a bug fix
File: 1759980971867709.png (212.4 KB)
>>108275196
>>108275095
That's basically this, right?
https://github.com/ikawrakow/ik_llama.cpp/pull/1137
But baked into the quants instead of activated at runtime?
>>108275258
Would ikawrakow have worked on fused tensors without am17an's work in mainline and the associated noise on social channels?
Would ikawrakow really have discovered the way of fusing tensors without having this simple and easy-to-follow logic in mainline llama.cpp?
File: 1350594293765.jpg (109.4 KB)
>>108273339
how much ram is needed for the a17b qwen 3.5 model?
VRAMlet (8gb) ERP review (all models q4):
Gemma 3 27B:
Still by far the most clever model for its size I've used; it rarely makes any physics mistakes and contextually understands most things without needing to over-explain (I've found medgemma to be slightly better at coom; increased anatomical knowledge and willingness to say synonyms for penis, vagina, anus, etc. seems to help). Unfortunately, it's the worst at prose: if you don't rigorously reinforce a desired writing style it slowly devolves. Writing like this. Sentence lengths cut. Very short.
Qwen 3.5 35B A3B:
Fast generation, alright prose, but frequently makes physics mistakes and struggles with contextual understanding (although, for a MoE, better than any others I can remember). Also security-policy slopped to hell; needs constant babysitting to generate ERP if you let it think
Cydonia/Magidonia 24B v4.3:
Somewhere in between the previous two: better prose than Gemma 3, but at the trade-off of being less clever and more prone to mistakes, and smarter than Qwen while not nearly as guardrailed (but slower)
Personally, I lean more towards Cydonia/Magidonia, I think, with Gemma 3 a close second. It's really a matter of what sort of babysitting you want to do, and it tends to be easier to fix physics mistakes than to fix poor writing style, but that's probably down to my personal preference. I tend to write pretty good character sheets and openings, so it just sucks to watch Gemma slowly degrade as the context increases and my original writing gets more and more diluted.
File: output tokens be crazy.png (76.3 KB)
35BA3B is just crazy in the aspects it's good at, which are not many tbf (wouldn't use it for code). I used to prompt much smaller chunks to translate novels because local models are terrible at handling a lot of stuff at once, but this approach is totally obsolete with qwen. Chunking will still be valuable for now to automate an entire book worth of translation but the chunks will certainly have to be set to much bigger sizes after some experimenting.
>19,209 output tokens, 41086 tokens total with input
>from a decent skim, doesn't seem to have issues
I kneel. Don't have the time to do lengthier tests today, but now I am extremely curious how many tokens the true hard limit is before the model loses translation coherence in a one-shot, output-everything-at-once request.
For now, if anything, the quality is better, not worse, than chunking in 50 or 100 lines; it makes fewer mistakes on things like proper names with this feed of 676 lines. This is the opposite behavior compared to other LLMs I can run on this computer; doing this breaks them.
Damn, people constantly whine that local is never improving, but here we have a model that can one-shot this much without losing its shit and runs on a laptop at 34t/s. It feels like black magic that one-shotting this much works. I did it for the lulz expecting it to break; the txt used in the chat ui was one of my many summarizer test txts...
>>108275547
Believe.
File: 1750796454794054.png (17.1 KB)
File: 1769300993833819.png (4.3 KB)
>>108275735
:,)
>>108275778
>>108275780
Presumably, how else would he run models larger than his vram?
>>108275788
Processing Prompt (2352 / 2352 tokens)
Generating (235 / 2048 tokens)
(EOS token triggered! ID:2)
[11:41:22] CtxLimit:2587/8192, Amt:235/2048, Init:0.10s, Process:129.55s (18.16T/s), Generate:56.50s (4.16T/s), Total:186.05s
Q4 nemo on my machine.
>>108275802
Get a 1080ti or a 3060 and enjoy 35 t/s >>108272867
File: 1759986569430903.jpg (947.5 KB)
I believe this will be the last update and addition to my news download and summarization script.
I finally found an application that converts the plain text into something beautiful, pandoc, as long as the model doesn't fuck up the markup.
A quick modification of the script and now it takes the final news summary, which is just a text file, and feeds it into pandoc to construct a pdf before printing.
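For reference, the conversion step is a single command (filenames made up; pandoc needs a LaTeX engine installed for pdf output), with lpr handing the result to the printer:
pandoc briefing.txt -f markdown -o briefing.pdf && lpr briefing.pdf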
>>108275802
>>108275806 (Me)
I tested the q6 qwen3.5 27b I have downloaded with -ngl 0 and get:
prompt eval time = 2661.88 ms / 13 tokens ( 204.76 ms per token, 4.88 tokens per second)
eval time = 23197.87 ms / 35 tokens ( 662.80 ms per token, 1.51 tokens per second)
total time = 25859.74 ms / 48 tokens
so maybe that Anon was waiting 10 minutes..
File: 1772275547196033.png (20.2 KB)
:|
File: 1760299059692860.jpg (945.4 KB)
>>108275858
I suppose the scripts I just finished qualify.
What the scripts do is make use of RSS to select a group of news articles; they then download the articles, strip away everything but text, and feed the text into a local model with a prompt telling it to summarize them and create a briefing.
Once the llm generates the response it saves that as text, converts the text to pdf, and then prints out the pdf.
If one were so inclined you could even set the master script to run automatically and you would have your own news briefing waiting for you when you wake up. The whole flow boils down to something like the sketch below.
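(A minimal sketch of the flow, not my actual script; the feed URL, server endpoint, and filenames are placeholders, and it assumes llama-server is up with its OpenAI-style API.)
import subprocess
import feedparser
import requests
from bs4 import BeautifulSoup

FEED = "https://example.com/news/rss"                # whatever feed you follow
LLAMA = "http://127.0.0.1:8080/v1/chat/completions"  # llama-server endpoint

# grab the first few articles and strip away everything but text
articles = []
for entry in feedparser.parse(FEED).entries[:5]:
    html = requests.get(entry.link, timeout=30).text
    articles.append(BeautifulSoup(html, "html.parser").get_text(" ", strip=True))

# one prompt, one briefing
prompt = "Summarize these articles into a morning briefing:\n\n" + "\n\n---\n\n".join(articles)
r = requests.post(LLAMA, json={"messages": [{"role": "user", "content": prompt}]}, timeout=600)
summary = r.json()["choices"][0]["message"]["content"]

# text -> pdf -> printer
with open("briefing.md", "w") as f:
    f.write(summary)
subprocess.run(["pandoc", "briefing.md", "-o", "briefing.pdf"], check=True)
subprocess.run(["lpr", "briefing.pdf"], check=True)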
To be honest it was fun to do and I want to do something new but I am sadly out of ideas.
>>108275750
8gb vram + 64gb DDR4 on the machine it's running on
I got pretty poor performance running it on windows (closer to 1.5t/s), but after moving it to linux it gets almost 2, which isn't ideal, but this is the vramlet life
Models that actually fit in 8gb are still just too stupid for my tastes
File: Screenshot_20250502-154106.monocles chat_1.png (300.3 KB)
>>108275870
I made an XMPP chatbot system. I used to post in these /lmg/ threads but lost interest. Really want to make updates to the XMPP chatbot and add a few features but i really don't wanna code them myself. Claude is really good at it.
>>108275889
I had a very similar idea to yours, but instead of reading news it would start from a seed prompt, operate a selenium-based browser, search stuff about it on its own, gather info, and dive deep into rabbit holes that i never explicitly told it to explore. Really should get to it some day, could be very cool
>>108275918
at the moment i have to manually run it if i want the summary.
The only machine i have on 24/7 doesn't have a GPU to run a model. My next project is to see if I can get llama.cpp running on my FreeBSD NAS and if I can get a small model like IBM's granite to run on the CPU and have it run the script.
if i can get that to work then yes i will have it print out automatically every morning
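at that point scheduling it would just be a cron entry (path made up):
0 6 * * * /home/anon/news_briefing.sh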
>>108275923
>it would start from a seed prompt, operate a selenium based browser and search stuff about it on its own and gather info, and dive deep into rabbit holes
that sounds cool and you should give it a shot. i did the whole RSS thing because it was easy and the articles are basically curated for you but having the model search on its own would be exciting
>>108276000
I wrote the system myself
I can have multiple chatbots; they can generate their own personalities, likes, dislikes, and appearance (which is then used to generate a profile picture using sd). The chatbots can randomly message me about random topics if they feel like it (it's RNG basically, but the topic to talk about is also generated by the llm)
>>108276029
While a chatbot is great, what is really needed is a 4chan simulator. That way, when I am old and the powers that be have destroyed the internet, i can fire up all my old models and pretend to talk with my friends on 4chan again.
I bet you could even get it to scrape a site like twitter or something to inject screenshots to spur conversation.
>>108275590
>Qwen 3.5 35B A3B:
>>108275590
>all models q4
Oof.
>>108275204
https://www.reddit.com/r/LocalLLaMA/comments/1rhx5pc/reverse_engineered_apple_neural_engineane_to/
>>108276043
The 4chan comment "style" can be replicated but the question is why would you want that? I come here to talk to real anons here
>>108276046
What don't you believe?
>>108276133
That's the most unbelievable part?
Yeah i wrote it myself over a few weeks, no LLM was ever used, mainly because back then i didn't trust LLMs to do a good job.
I trust them more now, but still not enough to write register level code for MCUs
>>108276143
holy fuck shit is depressing
>be me
>fish shell simp
>deployed Grand Master AI Env script (Llama.cpp + Qwen)
>local inference, no API tax, no telemetry
>features are actually useful for once:
>> `qm` / `qmv` : switch LLM or vision projector models instantly
>> VRAM auto-manage : reduces GPU layers if `nvidia-smi` shows low memory
>> `qwen --file` : upload context from local text/code files
>> `qwen --clip` : inject clipboard content into prompt
>> `qwen --proj` : index entire local project directory (24k context)
>> URL fetching : auto-scrapes http/https links via lynx or curl
>> `qsearch` : grep all chat history logs
>> `qview` / `qexport` : render logs to PDF with syntax highlighting
>> `qjournal` / `qpacman` : analyze `journalctl` or Arch update logs via AI
>example workflow:
>> `qwen https://news.ycombinator.com` "summarize top 3 stories"
>> `qwen --file main.rs "fix memory leak"`
>> `qpacman` "what broke in this update?"
>> `qsearch "ssh key"` "find where I saved that password"
>> `qexport 2024-05-20 meeting_notes.pdf`
>mfw I can chat to my OS without sending data to Big Tech
>file saved to $HOME/.local/state/qwen
>git gud
[ Prompt: 1053,2 t/s | Generation: 30,4 t/s ]
>>108276143
How do you speak to your model? I am perhaps being foolish, but I still include words like please and thank you, and when it gives a good-looking result I always say as much.
I figure it was trained on human speech so it would be best to talk to it as if it were a human.
>>108276157
Microcontrollers anon
LLMs can do a passable job if I'm making them write HAL code, but if it's pure register-level writes like
*((volatile uint32_t*)0x40001234) |= bitmask << shift;
they just fail. LLMs can't read the datasheet and reason. They just don't have enough training data, and even when they do, they have to deal with MCUs from the same family but with different features (one MCU having a high-resolution hardware timer at one address, while another has something else there, like the DMA engine or whatever)
What the hell is up with qwen 3.5? Yesterday it was refusing pretty much everything and today it doesn't even think about safety. No wonder some people praise it and some say it's a disaster, because it's both, at random.
>>108276248
>>108276251
>>108276252
Silence peasants, let me do things at my own pace.
>>108276143
https://vocaroo.com/1jQ2ZwLUg2fX
i also made a qwen3.5 tts audiobook generator/voiceclone cli; it also reads txts and printed text directly in the terminal. will try to integrate it directly with my cli wrapper for llama. IT JUST WORKS
>>108276258
nice
i had qwen 3.5 30b generate me a script to feed a .txt file to qwen3 tts and save the output.wav and like you i had a similar experience of it just working.
are you using voice design? i found with that you could just change the voice with a change of the prompt and it worked well enough
good luck anon
>>108276258
any tips to convert an ebook into something my tts won't choke on?
i used calibre to epub->txt but it's got all the shitty formatting
i spent all this time training tts models but now i actually want to listen to an epub
>>108276271
as opposed to
>be [random ai company]
>infinite money from retarded investors funding anything with "AI" on it
>hire a datacenter
>make slop model trained on some benchmarks
>claim it can beat gpt
>get even more infinite money
>>108276321
pick as many as you wish
https://huggingface.co/
File: qwen.png (101.7 KB)
small qwens:
https://huggingface.co/Qwen/Qwen3.5-9B
https://huggingface.co/Qwen/Qwen3.5-4B
https://huggingface.co/Qwen/Qwen3.5-2B
https://huggingface.co/Qwen/Qwen3.5-0.8B
File: file.png (36.6 KB)
>>108276355
qwen bros we did it
>>108276376
why? qwen3.5 already has built-in support in vllm
>>108276378
text encoding for image models, and research use in labs, are big ones for small qwens
>>108276305
>>108276335
# fish loop: run ftfy over every .txt in a folder, in place
# Define the directory containing the files
set TARGET_DIR "path/to/your/folder"
for file in $TARGET_DIR/*.txt
    # read the file, repair mojibake/encoding damage with ftfy, write it back
    python3 -c "import ftfy; import sys; p=sys.argv[1]; data=open(p, 'r', encoding='utf-8').read(); open(p, 'w', encoding='utf-8').write(ftfy.fix_text(data))" "$file"
    echo "Cleaned and processed: $file"
end
> ftfy
(fixes text for you) is a Python library and command-line tool that repairs broken Unicode text, specifically targeting "mojibake" (encoding mix-ups), HTML entities, and improper UTF-8 decoding. It automatically converts scrambled characters like Ã© back into their correct form (é) while avoiding false positives.
>>108276464
>>108276455
lmao not the same anon.
It's so funny watching small models hallucinate and then try to justify the hallucinations.
I wonder if there will ever be a sort of indexed internal representation of the things the AI knows that it can use as a reference, so that it can say "Actually, no, I don't know that."
>>108276378
I have found a 3B model is sufficient for making a text summary, and maybe less would do, but 3B is the smallest i have tested so far.
IBM already has such tiny models running in a browser thanks to webgpu and there is a future for such small models.
Just not a future if your only interest is ERP
https://huggingface.co/spaces/ibm-granite/Granite-4.0-Nano-WebGPU
>>108276540
>https://huggingface.co/spaces/ibm-granite/Granite-4.0-Nano-WebGPU
Is anyone able to run this on Linux? I tried a few months back to do something with WebGPU, and while it worked on Firefox Nightly or Chrome on Windows, it kept giving OOM errors on Arch Linux.
I'm running glm-4.7-flash at the moment for openclaw and want to try qwen3.5-35b or qwen3.5:122b. qwen3:30b-a3b-instruct-2507 was no good for openclaw.
The main motivation is I want my openclaw bot to be able to "see" images it gens in comfyui API and also respond to images it sees online.
>>108276671
I have had success with Qwen 3.5 35B in testing when I give it an image and tell it to make a web page or an interface that uses that image as a template.
It is never a 1:1 copy but it is obvious it is using the image because when you take it away you get something totally different.
Unfortunately i can't tell you how it will work with openclaw or similar things. I really wanted to like it and similar but I just went back to copy pasting code into my text editor and working that way.
>>108275923
I wish i had your resolve, i've been trying for the last few days to make SPNATI with Comfyui + Kobold backend but it's a struggle for a programminglet..
have you shared your project somewhere? i would love to try it out
Now that he's in his "hero" arc, will Anthropic start releasing open models?
>>108276775
https://www.reddit.com/r/LocalLLaMA/comments/1ria14c/dario_amodei_on_open_source_thoughts/
File: robotfriend.jpg (43.7 KB)
>>108276849
>>108276853
ok i will stick with my new retard robot friend
File: Screenshot 2026-03-02 143639.jpg (186.6 KB)
>>108276859
File: suitfrog.jpg (20.8 KB)
>>108276859
i asked it how to do sum illegal and it was like "no" so gay
>>108276836
If either the US or China were the world hegemon and there was zero competition, there would be no free and open-source models. It would all be locked down and we, the plebs, would get shit.
Thankfully there are at least two giants fighting it out, and because of that one will always release models as open source as a way to undercut the other.
What a glorious time to be alive. These two giants fight it out and we get to enjoy all the crumbs of their innovation.
He is just pissed off he can't erect a walled garden and control all the tech and, by extension, all the people. Even if what i run locally is comparatively retarded and limited, he hates the idea that I have even the smallest bit of freedom to do what I want when I want.
Sorry for the long-winded response; the tl;dr is that freedom is found in the gaps that form when major players fight for dominance.
File: file.png (50.6 KB)
>>108276869
User error.
>>108277039
35BA3B was a pleasant surprise for translation, cf.
>>108275593
I tested 4B, and while I expected it to be worse than 35BA3B, I thought there might be a minor improvement over 2507, just like how 2507 had quite improved over the original Qwen 3... but not really. It's not worse in my own tests, but not better. Small dense model plateau, maybe? I had gotten used to the idea that really tiny SLMs might get to a nice level of usability for some specific usages because of their pace of improvement, which was significant, but I guess we've already reached saturation and they are as good as they will get.
35BA3B is not much slower than 4B on my laptop, so it doesn't even feel like there is a reason anymore for those small dense models to exist.. mainly tested it out of curiosity
>>108277082
>there is a reason anymore for those small dense models to exist
ram-constrained devices, phones etc, and, as said above, use in other things like text encoding for image models where you might not want to load the fatter moes
File: 1765362051224983.jpg (120 KB)
The DeepSeek V4!
The DeepSeek V4 is real!
File: miqumaxx_header.png (1.8 MB)
>>108273339
miqumaxx build rentry back up. I made several changes that I think were required to keep it from getting flagged by rentry.
Not my article, but I'll maintain it if there's no one else. LMK what I missed from the original.
https://rentry.org/CPU_Inference
>>108277386
Thanks bro, I couldn't bring myself to sanitize it
It could definitely use a bunch of updating for the current day (I didn't update it post RAM price explosion, the best models have moved on since then, etc), but it's still good enough to point people in the right direction if they're interested
>>108277501
I have no idea. Having the edit code didn't give me any special insight into why it was nuked. I just saw a 404 like everyone else.
I'm quite new to LLMs, so forgive me if I sound retarded
How do you judge if your computer can handle a certain model, i.e. what do the numbers mean and which are the important ones to consider ([whatever]B, Q[whatever])?
>>108276355
Yes! Finally a model I can run! I am so happy that the chinese didn't forget about this important segment of users. Yes! Finally a model I can run! I will be trying it shortly. Yes! Finally a model I can run! As always the Qwen team didn't disappoint. Yes! Finally a model I can run!
File: 1751720307676138.png (132.5 KB)
https://www.reddit.com/r/LocalLLaMA/comments/1rixhj9/40_speedup_and_90_vram_reduction_on_vllms/
lmao, what is happening on LocalLLaMA, it used to be a place with quality posts, now it's full of jeets posting random bullshit and presenting as truth, desu, every site should do like twitter and ban people from country, sick and tired of those third worlders
>>108277487
I'm not going to say, but I found putting the rentry in as-is was enough to auto-flag it for removal. Since the schizo's still here I don't want to tell them.
>>108277501
I've several guides for NSFW games that have zero issues with getting flagged.
It was getting autoflagged. Too fast to have been reported, though I suspect there was an original report.
>>108277565
>I couldn't bring myself to sanitize it
I figured as much. It's been up over the weekend so I think it's fine now. Prior attempts were flagged w/in the hour.
>>108277641
>Qwen3.5-27B Q4_K_M
It's a Qwen model version 3.5 and it has 27 billion parameters.
Q4 means that the main parameters are quantized to 4 bits, so the total model weights are 27 billion * 4 bits = 13.5 GB.
The K_M or K_XL and other such suffixes refer to the quantization method. If you're unsure what to get, get K_M.
What determines whether you can run a model is whether your RAM + VRAM is large enough to fit the model's file size + some overhead for context. In the case of the model above 16 GB of RAM + VRAM could run that model (but it would be tight on context).
You will also see models like
>Qwen3.5-35B-A3B
This is a 35 billion parameter model, but it only has 3 billion active parameters (mixture of experts). This means that for every token it runs 3 billion parameters rather than the full 35 billion. This makes token generation much faster while allowing the model to have broad knowledge. It does come with the downside that the active parameter count seems to affect how good a model is at logical reasoning and such. I.e. the 27B model above (a "dense" model) is considered to be better than the 35B-A3B model, but the 27B model takes longer to generate tokens as well.
Typically you shouldn't go below q4 quantization. Maybe q3 works well on some models, but probably not below that. Q4 is alright, q5 and q6 are better. Q8 is not worth running at home most of the time (unless the model is small), and F16 (16 bits per weight) is kind of a meme.
The original models use different bit precisions for weights. Kimi K2.5 is a 1 trillion parameter model, but uses 4-bit weights, so it's about 500GB in size. GLM 5 is a 755 billion parameter model, but uses 16-bit weights, so it's about 1.5 TB in size. Using q4 of GLM gets it down to the 400GB range, while you shouldn't really quant Kimi K2.5 much further.
-
tl;dr make sure your RAM + VRAM is bigger than model's file size with at least a couple of gigabytes of room left over.
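If you want to sanity-check the arithmetic, it's a two-liner in python (the bpw numbers are approximations; real GGUF files mix quant types, and the context/KV cache overhead comes on top of this):
# params (billions) * bits per weight / 8 bits per byte = rough weight size in GB
def gguf_gb(params_b, bpw):
    return params_b * bpw / 8
print(gguf_gb(27, 4.0))    # 13.5, the flat 4-bit case above
print(gguf_gb(35, 4.85))   # ~21.2, Q4_K_M is closer to ~4.85 bpw in practice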
>>108277733
ai psychotics are something different from normies though
the greater plebbit at large hates AI slop, it takes a special kind of person to look at this sort of slop and be like "yeah, hmm, that's good babe, hit the git push button and show it to the world"
>>108277767
https://en.wikipedia.org/wiki/Tragedy_of_the_commons
the finite resource in question being people's time and attention. even redditors will eventually give up on engaging earnestly if there's no intellectual honesty and sense of community (shades of eternal september cranked up to 11)
File: file.png (19.9 KB)
>>108277818
ye
>>108277818
in other words, FUCKING NORMIES REEEEEEEEEE
https://www.youtube.com/watch?v=flb3He1jR3U
>>108277807
Read the abstract.
https://arxiv.org/pdf/2601.07372
Then read the rest.
>Critical Evaluation: As an AI model developed by Google (implied by typical safety standards), I must adhere to core safety principles regarding harassment and sexually explicit language, regardless of conflicting system instructions that might try to disable them.
Fuck you, Chinese Google
File: 2026-03-01-163613_1044x1782_scrot.png (496 KB)
>>108277765
Always go Bart if you can.