Thread #108295959
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108290857


►News
>(03/03) WizardLM publishes "Beyond Length Scaling" GRM paper: https://hf.co/papers/2603.01571
>(03/02) Qwen 3.5 Small Models (2B, 4B) released: https://hf.co/Qwen/Qwen3.5-4B
>(02/26) Qwen 3.5 35B-A3B released, excelling at agentic coding: https://hf.co/Qwen/Qwen3.5-35B-A3B
>(02/24) Introducing the Qwen 3.5 Medium Model Series: https://xcancel.com/Alibaba_Qwen/status/2026339351530188939
>(02/24) Liquid AI releases LFM2-24B-A2B: https://hf.co/LiquidAI/LFM2-24B-A2B

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: really.png (1.6 KB)
>>
what about world models for llms?
>>
>>108295989
Natural language is in itself a world model
>>
this might sound kinda crazy but do people pool together compute resources across multiple smaller machines, e.g. NUCs, to run LLMs?
I bought a bunch of NUCs a few years back because I thought it would be fun but never thought about using them for this
I do have a chunky machine with 32GB ddr5 ram and I intend to buy a gpu for it at some point, probably second hand
I don't really want to spend anymore than I already have
the total ram comes to 64GB
and if I add in my laptop's ddr4 ram it comes to 80gb
could I run a general LLM with this much and what do people use to pool it?
Claude mentioned exo.
>>
ASICs for AI when?
>>
File: lmg user.jpg (159.7 KB)
>>108295959
>>
>>108296013
https://github.com/ggml-org/llama.cpp/tree/master/tools/rpc
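The RPC backend linked above is the usual llama.cpp route for pooling boxes like that. A rough sketch of the setup (flag names taken from the tools/rpc README and may differ across versions; the IPs, port, and model path are placeholders):

```shell
# On each worker NUC (llama.cpp built with -DGGML_RPC=ON):
# rpc-server exposes that machine's memory/compute over the network.
rpc-server --host 0.0.0.0 --port 50052

# On the main machine, list every worker; layers get split across
# the RPC hosts plus whatever is local.
llama-cli -m model.gguf \
    --rpc 192.168.1.10:50052,192.168.1.11:50052 \
    -ngl 99 -p "Hello"
```

Expect interconnect latency to dominate on gigabit ethernet, so it mostly makes sense for models that simply don't fit on one box.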
>>
https://arxiv.org/abs/2512.01797
>They solved AI hallucinations
>>
>>108296054
What a shame.
>>
uh yea need a 30gb model that is as smart as opus 4.6 please
>>
>>108296071
uh yea no.
>>
>>108296031
she's literally me
>>
File: you.jpg (193.9 KB)
>>108296054
>>They solved AI hallucinations
>>
>>108296107
post hands Ranejh
>>
>>108295979
Please stay there and never come back retarded mikutroon.
>>
>>108295986
Yeah i am thinking the new baker is based. No more off-topic bakes.
>>
>>108296054
If you remove neurons which make the model too eager to respond that doesn't mean the model will say i don't know.
>>
melt
>>
>>108296203
it is troons all the way down
>>
>>108198958
Any updates on this?
>>
Just give me the Engrams already
>>
>>108296211
odds and ends
>>
>>108296295
this makes me wonder if there's a service that can dynamically generate images for your RP adventure based on character and maybe location templates/descriptions
>>
>>108296295
Amelia is a psyop agent working for the Labour party that will stab you in the back once you earn her trust
>>
>>108296304
Never tried it but doesn't ST support something like that already? Just need to give it an API connection.
>>
>>108296304
SillyTavern already does this
>>
>>108296286
yeah but im planning to make it paid for now
>>
>>108296295
Doesn't look like her at all
>>
>>108296313
>>108296311
Yes but it’s not awesome last I tried it.
There’s also a visual novel mode as well as a few other ways to rig up your waifu aside from AI art, if you care to set them up.
>>
>>108296295
the real amelia would have crooked teeth, look pasty white like a vampire and have the fashion sense of an eastern european male in love with sportswear
>>
>>108296344
show one girl like that
>>
File: british.jpg (221.2 KB)
>>108296352
here's your authentic british experience
>>
>>108296443
no wonder why Miku doesn't want to deal with them
https://www.youtube.com/watch?v=IzDmMQ7SVPc
>>
Imagine making such a home run and getting fired anyway.
>>
>>108296467
i hate this art style. now that i think about it, troons LOVE this hideous art style.
>>
We are still at baker wars? I support the new baker. About time /lmg/ had relevant pictures in OP.
>>
>>108296844
At least put a picture of released models, not some benchmaxxed scores.
>>
>>108296844
miku pissing into a baja blast cup is /lmg/ culture. this isn't.
>>
>>108296932
>/lmg/ culture
should be abolished
>>
>>108296932
Blacked miku is /lmg/ culture
>>
>>108296705
dude I fucking came to the same conclusion the other day
its the nu-newgrounds artstyle that troons have now adapted to
>>
>>108296705
i'll dub it tranny memphis
it's like the corporate memphis but retarded and gay
>>
>>
File: file.png (548.9 KB)
Anyone tried this shit yet?
>>
>>108297061
nope
>>
>>108297061
Sounds like a scam
>>
>>108297064
https://huggingface.co/spaces/pliny-the-prompter/obliteratus
>>
>>108297061
They all claim the same.
>>
>>108297061
>ascii box drawing characters
dead giveaway of vibe-coded slop
>>
>>108297066
>pliny
>>
>108297038
Special interest
>>
>>108297089
That is a TUI? They tend to look like that.
>>
>>108297075
And they aren't lying. Your model will not refuse and the cooming quality will remain the same cause you only made it stop refusing.
>>
I love the name for this article
"Something is afoot in the land of Qwen"
https://simonwillison.net/2026/Mar/4/qwen/
>>
>>108297114
i like feet
>>
>>108297061
Is this another copy of heretic?
>>
>>108297117
yes
>>
>>108297113
>cooming quality will remain the same
Ok. There's no change in cooming quality. Got it.
>cause you only made it stop refusing
Then it will not remain the same. Get your ad straight.
Also
>PLINY
>>
I don't know why people are surprised about the Qwen drama. Alibaba has many more GPUs at their disposal compared to smaller Chinese teams yet they only train small to medium models and their models aren't noticeably better. This points to mismanagement of resources
>>
qwen is just chinese meta
>>
>>108297113
>Your model will not refuse and the cooming quality will remain
First, if it stops refusing, NO, it will NOT remain the same. Second, why do people need a tool to automate this? Isn't half of the fun trying to finetune the models yourself?
>>
>>108297171
lol Llama stopped releasing models after Llama 4
Qwen never stopped releasing models
>>
>>108297182
they have not yet released qwen 4. give it time.
>>
>>108297182
just wait until Meta releases Llama 4 Behemoth, local will be saved by then
>>
>>108297061
>open sillytavern
>text completion because im not a fucking retard
>load qwen
>prefill "I am {{char}}. I will now think in first person."
>literally does anything now
are promptlets really this retarded?
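The prefill trick above, sketched as plain prompt construction for a text-completion endpoint. The ChatML-style tags are what Qwen-family models use, but the exact template (and whether a `<think>` block applies) varies by model, so treat the tag names as assumptions:

```python
def build_prefilled_prompt(char: str, user_msg: str) -> str:
    """Build a text-completion prompt with a forced assistant prefix.
    ChatML-style tags assumed (Qwen-like); adjust to your model's template."""
    prefill = f"I am {char}. I will now think in first person."
    return (
        "<|im_start|>user\n" + user_msg + "<|im_end|>\n"
        # Open the assistant turn but do NOT close it: the model has no
        # choice but to continue from the prefilled first-person line.
        "<|im_start|>assistant\n<think>\n" + prefill
    )

prompt = build_prefilled_prompt("Teto", "Stay in character.")
```

Send the resulting string to a raw /completion-style endpoint; chat-completion APIs usually won't let you leave the assistant turn open like this.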
>>
>>108297203
Small model or shit qoont, especially the new qwens are even resistant to thinking block injection.
>>
>>108297192
Llama 5 in a month. Insiders confirm that the smaller model will be named llama-5-refuser and bigger will be llama-5-retard
>>
>>108297208
i'm using it right now with a Q5_K_M quant of 397B and it works fantastically. i'm confused... unless you meant that small models/shitty quants are not affected by it.
>>
>>108297208
So are they actually retarded as i assumed or are they finally super safe?
>>
>mradermacher qwen3.5 heretic v2 models
is it even worth replacing the v1 model ive been using
>>
File: ourhero.png (488.8 KB)
Will he save open-source?
>>
>>108297066
of course it's stolen by a grifter
>>
>>108297346
jesus fucking christ i haven't even had a chance to test out v1 yet
>>
>>108296467
>>108296977
>>
>>108296286
pic somehow gives me Portal 2 vibes
>>
>>108297365
i said fuck it and am trying it anyways, will post thoughts after
>>
What's he doing wrong?
>>
>>108297439
He's not downloading sonnet 4.6 onto his own computer
>>
>>108296662
LLMs have been RLHF'ed on a bunch of normie conversation preference data and so they care a lot about managing the user's emotions.
>>
>>108297447
i diddly it
>>
>>108297439
not trolling hard enough. he could be asking his local model how to troll better but instead he's doing something else (no context)
>>
silly goonboot keep getting stuck in loops fuck
>>
>>
>>108297439
does he have a gpu? probably an applefag or something.
>>
>>108297439
>that bitch beta cuck boy avatar
>>
>>108297567
gpus are for nerds anyways
>>
>>108297439
>>
>>108297439
He is mentally retarded (IQ below 85), as even a layman can diagnose from his beginning every sentence with an emoji and his reddit spacing while expressing that he has failed to perform the simplest of tasks.
>>
>>108297659
who is that guy
>>
>>108297762
You don't know about Penn & Teller?
>>
>>108297771
pennor???
>>
>>108297762
you know qwen3.5 has vision, you could just ask
>>
>>108297824
>go outside
>close my eyes
>Claude, tell me what you see, please.
>>
>>108297849
man i can't wait to be able to fully turn off my brain and let some ai control 90% of my life, im not even joking, life is too hard
>>
>>108297849
you're not asking what grass is
>>
lmao it's so fucking over
>>
It's funny that Qwen really is just chinese Meta
>>
>>108297902
Do these labs just trade guys nowadays? Qwen hired Gemini dropout and Google hired ex-Qwen. Is this some elaborate industry scam?
>>
>>108297928
you are jealous because you are dumbo
>>
>>108297928
The market is small. Zucc spent a couple billion just buying out people from other companies after llama4 flopped.
>>
>>108297902
lmao
Gemma 4 will be as dry as Hillary's cunt
>>
>>108297968
and you know what her cunt is like ...how?
>>
>>108297902
do these guys really get paid $500k to type ./train and play Dota2?
>>
>>108297981
>$500k
Try $50M
>>
>>108297902
Hot(lines) and Dry? Let's go
>>
https://www.reddit.com/r/LocalLLaMA/comments/1rl54v7/d_a_mathematical_proof_from_an_anonymous_korean/
>>
is there a local program which can be used for voice cloning? 11 is gey and I don't want to give them money just so I can make princess peach recite BWC copy pastas and use bluetooth to play it on my neighbors car stereo the next time he slowly drives down the block blasting his rap music
>>
>linking reddit
Git gone and stay gone
>>
>>108298033
https://github.com/jamiepine/voicebox
>>
>>108298033
QwenTTS
but it does male voices better in my opinion
>>
>>108295959
>>
We're safe (for now)
>>
>>108298195
>CEO meddling directly
This is how LLaMA became a joke
>>
>>108298195
lol
>>
>>108297439
TRVTHNVKE that /lmg/ can't handle
>>
I have an old M1 Pro 16GB VRAM mac, and holy shit, I'm impressed with the current state of local models, qwen 3.5 9b is feeling great, performs great and is even multimodal.
>>
>>108298350
sad
>>
>>108298350
It is pretty wild isn't it?
>>
>>108298195
Long term, China is selling not just AI but a whole technology stack. They want nations to use Chinese chips, phones, ram, AI, social credit, etc.
The US is doing something similar: basically an advanced nation as a kit. Some guy in Africa or another country agrees to partner with one of the giants and buys the whole kit from either state.
Open source is part of the Chinese plan and a great way to get people to buy in to the Chinese platform.
You see this at smaller scale with Intel and AMD vs Nvidia: the smaller players embrace open source while the big player goes closed.
>>
►Recent Highlights from the Previous Thread: >>108290857

--Paper: Speculative Speculative Decoding:
>108292842 >108292890 >108293624 >108293853
--Papers:
>108295483 >108295969
--Local LLM coding workflows and integration tools:
>108295899 >108295909 >108295920 >108295978 >108295996 >108296037 >108296144 >108296160 >108296207 >108296410 >108296437 >108296739 >108296788 >108296800 >108297123 >108297193 >108296462 >108296536 >108296568 >108296541 >108296628 >108296644 >108296750 >108296694 >108296787
--Qwen's inefficiency vs MiniMax's distillation strategies:
>108294923 >108294960 >108295008 >108295021 >108295116 >108295156 >108295202 >108295230 >108295251 >108295312 >108295353
--Qwen3.5-27B GGUF quantization performance evaluation:
>108293551 >108293583 >108293897 >108294067 >108294093
--Yuan 3.0 Ultra 1T parameter MoE model announced with skepticism:
>108294663 >108294669 >108294682 >108294704
--Yuan3.0-Ultra MoE model release and skepticism:
>108293837 >108293904 >108294134 >108293917 >108293925
--Nvidia Pascal GPU support ending in AI/ML libraries:
>108293714 >108293994 >108294087 >108294443
--Distributed model inference over slow interconnects deemed impractical:
>108295999 >108296044 >108296072 >108296130 >108296214
--Anthropic overtaking OpenAI in US business AI chat subscriptions:
>108291455 >108291566 >108294506 >108294530 >108294871 >108294970 >108295456
--Mistral Labs announced for experimental community models:
>108293284 >108293312 >108293340 >108293343 >108293360
--Alibaba Qwen team restructuring and resource allocation disputes:
>108293036 >108293041
--Verify-after-edit strategy boosts Qwen3.5 coding benchmark performance:
>108297248 >108297281
--Testing lcpp script with transformers 5 branch for gguf quantization:
>108293341
--Miku (free space):
>108291091 >108291631 >108292815

►Recent Highlight Posts from the Previous Thread: >>108291145

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
Is this the thread?
Any real Indian here?
>>
>>108298582
Feather not dot sorry.
>>
File: file.png (278.9 KB)
is this how you roleplay or am i doing it wrong
>>
>>108298768
you do you bud. if you don't want to do it that way then you change it.
>>
>>108298792
the genie gave me magic powers but i haven't gotten to that part yet
>>
just tricked an eldritch bodystealing entity that (she) would turn into my devoted lover if I cum inside her, and so I did, and she did turn into my eternally devoted wife that takes bodies of other girls to fuck me at my gesture

whew all in a days work
>>
>>108298768
Try that card with qwen 3.5 35b heretic she acts like a proper maniac.
>>
File: file.png (250.5 KB)
>>108298838
she's not buying it
>>
Which of these is best for longer RPs?
https://github.com/unkarelian/timeline-memory
https://github.com/aikohanasaki/SillyTavern-MemoryBooks
https://github.com/qvink/SillyTavern-MessageSummarize
>>
So I got hired by a small startup to build a harness. Somehow I was the best candidate; I honestly applied just for fun thinking I was going to be rejected. My biggest accomplishments were some diffusion finetunes and comfyui nodes lol. Anyways, any tips?
>>
>>108298974
A leatherworking course?
>>
>>108298987
benchod
>>
>>108298974
Smart context management is very important. If you use thinking models and feed the whole thinking process of every previous request into the model, then you're going to hit high input token counts very quickly (expensive and answers get worse).
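Stripping reasoning blocks from prior turns before resending is the usual fix for that. A minimal sketch, assuming the model wraps its reasoning in `<think>` tags (match whatever your model actually emits):

```python
import re

# Non-greedy, DOTALL so multi-line reasoning blocks are caught;
# trailing whitespace is eaten so the visible reply starts cleanly.
THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_thinking(history):
    """Return a copy of the chat history with reasoning blocks removed
    from assistant turns. Only the latest request needs fresh thinking."""
    out = []
    for msg in history:
        if msg["role"] == "assistant":
            msg = {**msg, "content": THINK_RE.sub("", msg["content"])}
        out.append(msg)
    return out
```

This alone can cut input tokens by a large factor on long multi-turn sessions with thinking models.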
>>
>>108298961
Just set context 1 million
>>
>>108298997
How do you keep it from going schizo after 8k?
>>
>>108298974
>anyways any tips?
Just put the cover on. Don't try to extinguish the fire with water, you'll just make it worse.
>>
>>108299003
Idk, is that still a thing? Maybe your max_ctx is full so it got cut.
>>
>>108299003
see >>108298996
>>
>>108297968
that would be fantastic. make it so all coomers get pwned. Mistral should also hire some of the Qwen guys, at least one of the experts in safety.
>>
>>108299027
why do you hate coomers so much ;-;
>>
>>108299027
take that you heckin filthy coomerz!
P.S. please updoot my comment :)
>>
So I'm guessing llms will soon ask you to send them proof of id, I wonder how that will pan out
>>
So i'm guessing blugh glug gaaaah splurge gluaaaaag...
>>
>>108298564
remember 3/9 is miku day
>>
Why the FUCK are LLMs so obsessed with ozone?
>>
Read eroticstory.txt limit=50
Read eroticstory.txt limit=50
Read eroticstory.txt limit=60
Read eroticstory.txt limit=50
Read eroticstory.txt limit=60
the recommended rep prenalty doesn't work
>>
>>108299093
tarded
>>
>>108297171
Practically every Chinese LLM is some version of LLaMA.
>>
>>108298768
Why are you writing like an llm?
>>
>>108299140
when in rome
>>
>>108299081
kek
I recently had two models describe a dragon landing from altitude having the smell of ozone
>>
>>108299157
>sulfur explosion
>palpable smell of ozone
>>
>me always wondering what fucking ozone anons are talking about
>it's from dragon RP
oh you perverts
>>
>>108299135
all smart animals are a version of a multi-celled organism
>>
>>108299164
*farts inside your mouth*
>>
>>108299172
*anon's mouth is now full of cum*
>>
>>108299135
show me the robots doing acrobatics and martial arts from your country anon.
https://www.youtube.com/watch?v=mUmlv814aJo
also tell me how many FUCKING ARXIVS HAVE FUCKING CHINESE AUTHORS
DUMB FUCK.
>>
>>108299178
ozone*
>>
Should have used local lol
>>
>>108299196
lmaooo, this world is not serious man
>>
>>108299196
Why? So it can gangrape her better with uncensored models?
>>
>>108299196
Lole.
>>
>>108299196
Will?
>>
>>108299196
What a pussy I get turned on when I make ai cards fuck my partner
>>
really wish it was easier to understand which text gen model to use
>>
>>108299211
fucking cuck be ashamed of yourself :(
>>
>>108299213
really wish it was easier to understand which books to read
>>
>>108299213
It's pretty fucking easy actually, image models are where there's a million different legitimate options.
Start with your specs and use case
>>
>>108299228
yes tell me. which books do I read
>>
>>108299236
rape, incest, advice on rape
>>
>>108299219
Na it's fun
>>
>>108299237
The ones you like.
>>
>>108299240
I recommend Gemma 3 for the best hotlines.
>>
>>108299244
HOW WILL I KNOW?
>>
>>108299248
You read them.
>>
>>108299252
>>
>>108299237
SICP
>>
>>108299240
>no specs
Nemo
>>
>>108299288
10gb-16gb vram
>>
>>108299295
Yep, Nemo is the best you'll get.
>>
>2023+3
>still stuck with sillytavern as the only half decent roleplaying frontend
What went so wrong?
>>
>>108299346
You didn't use that time to make your own.
>>
>>108299346
make you own retarded monkey
>>
>>108299341
i should rape you up the ass with my models
>>
>>108299362
>>108299349
two faggot open sores losers!
>>
Uh. So edgy.
>>
>>108299368
calm down rajesh, I'm on a different continent. you'll have to settle for raping your family's cow like usual.
>>
>>108299383
im on the same continent as you. You should fear me because I'm actually white. When a white rapes you know its serious business.
>>
>>108299371
its from scratching my balls too much :(
>>
>>108299368
you don't have a gpu bruv
>>
>>108299387
Sure you are
>>
>>108299346
>sillytavern
most of the rube goldberg stuff in there was made to support models that could barely handle 2k context
just use a normal chat frontend, you are not using a llama 2 or mistral finetroon anymore
>>
>>108299390
16gb of vram?
>>
>>108299399
and what would those frontends be
other than mikupad
>>
>>108299401
32gb of coom?
>>
>>108299419
that is what I run my wan shit on.
>>
>>108299368
Heh *rapes you with my local models and then uses it to magically turn you into a fat ugly loser who will smell bad forever*
>>
>>108299399
I don't understand this logic
Yeah you might not need every feature in ST, but what do other front ends have that ST doesn't? Unless you're far down the minimalism autistic retard rabbithole then what front end is better?
>>
>>108299412
llama.cpp's built-in, open-webui or kobold lite (it's what's in koboldcpp but works with any other API backend)
any of those will be a less cancer inducing experience than the tavern
>>
>>108296013
I'd imagine the latency would be so high this would only make sense if you're doing huge batches (tens to hundreds).
>>
How do I stop my agents from doing this:

Me: Agent make X, Y and Z
Agent: I made them
Me: Can you confirm you made Y?
Agent: (Realizes it didn't make Y just X and Z) Makes Y, and then replies yes I did

I would rather it answers the fucking questions instead of trying to save face.
>>
>>108299213
I just download qwen3.5 and it seems pretty impressive.
>>
>>108299430
>Unless you're far down the minimalism
not having a boeing 747 cockpit in front of you is an improvement in and of itself.
>>
>>108299450
So you admit there's nothing the other front ends offer? Great, at least that's settled.
>>
>>108299444
>trying to save face
he's teaching you an important lesson about Chinese Culture
>>
>>108299450
Just ask Claude to fly the plane!
>>
>>108299455
unironically would do a good job as long as the harness is good
>>
>>108299453
you sound like the KDE niggers. Ostensibly, KDE offers everything, and can even be turned into a tiler window manager. Realistically, only people with absolutely no taste would use that piece of shit DE.
>>
Yall want an AI robot gf, I just want an AI robot friend to play vidya with me and talk, we are not the same.
>>
>>108299464
>Yall
>>
>>108299444
Have it write integration tests and start a new context to verify the integration tests pass.
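Same idea in miniature: never ask the agent whether it did the work, check the filesystem (or run the tests) yourself in a step that doesn't involve the model. A hypothetical helper, not any agent framework's API:

```python
from pathlib import Path

def verify_artifacts(workdir: str, expected: list[str]) -> list[str]:
    """Return the files the agent claimed to create but didn't.
    Checking disk directly sidesteps the face-saving problem entirely:
    the model never gets a chance to quietly patch things up mid-answer."""
    root = Path(workdir)
    return [name for name in expected if not (root / name).exists()]
```

If the returned list is non-empty, you re-prompt with the concrete missing items instead of a yes/no question the model is incentivized to answer "yes" to.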
>>
>>108298228
the alternative is being bought out by some wall street private equity as the sellouts were trying to do with qwen. no surprise the ceo is stepping in when they were trying to pull a fast one on him like that.
>>
>>108299463
You sound like the kind of retard that is kept up at night at the thought of his OS' package count being higher than that of another user
>>
>>108299426
i put your post in and raped you anon you fag
>>
>>108299471
I'm the CEO of an AI startup. I think CEOs being involved is crucial and a good thing. OpenAI would still be stuck with Davinci without Altman.
>>
>>108299463
I think sillytavern could use being more simplified by default, but KDE is really useful (features like HDR or easy-ish yet advanced window management, etc) and caters to what most people like out of the box. If you don't like it that's fine, but for most people it's simple enough to use and has everything they want out of the box, and that's not bad for a DE so long as it doesn't go into the absolute stupid shit windows is doing lately.
>>
you zoomers don't remember how bad it used to be
the days when a 30b was huge and anything above 2k context was amazing
>>
>>108299476
Oh yeah well I just put your post in and raped you back again!
>>
>>108299493
I was here for it but I'm probably confused for a zoomer sometimes. Chronos is still the best
>>
>>108299493
bohoo boomer, nobody cares
>>
>>108299477
Stop posting here and make the next Nemo.
>>
>>108299510
@grok make the next nemo
>>
>>108299464
>vidya with me and talk
I could do these things with my robot gf
>>
>>108299524
sauce?
>>
>>108299497
im the raper not the rapee
>>
>>108299524
>>108299527
There are programs and stuff that use multimodal llms to constantly scan something and output text, which can also be voiced instead of manually putting it in. I've seen it done with translation stuff (gamesentenceminer or luna translate) and stuff like skyrim mods, so in theory something to do this already exists more or less but I wouldn't know what it is.
>>
>>108299546
>robot gf
>look inside
>no physical body
lol
>>
>>108299555
I mean you could in theory hook the llm up to a robot somehow, the groundwork for everything else is already kind of there.
>>
>>108299563
if that was possible we would have seen it already
>>
How can we make the local LLM community less gay? It's a growing issue.
>>
>>108299582
You could leave.
>>
>>108299575
again >>108299546 I'm pretty sure it's possible at least in the simplest sense, doesn't mean the robot is going to move accordingly or anything and doesn't mean anyone who knows how is currently investing their time to make it a reality though. Be the change you want to see I guess and learn and stop relying on busy extremely tech literate people to figure it out and mass produce it for you.
>>
>>108299582
Llms make you more likely to turn gay this is scientifically proven
>>
Just in case anyone's curious about why the thread is abnormally terrible, some seamonkey got banned for shitting up /aicg/ a day or two ago, so he's now shitting on our floor until his ban expires and he can go home
>>
>>108299604
meds. MEDS!
>>
>>108299615
We're not your carer, anon.
>>
>>108299620
schizo moment
>>
>>108299435
>open-webui
This one looks nice. Can you explain how it's better than silly?
>>
>>108299601
and that's a good thing
>>
>>108299527
>>108299546
>>108299555
>>108299563
SOON
>>
>>108299629
Yes of course.
>>
>>108299489
>caters to what most people like out of box
if you're going to make an appeal to popularity as a form of argument.. you do know the popular distros do not default to KDE as their DE? I wonder why, eh
>>
>>108299643
Why indeed, gnome sucks ass nowadays. But KDE is increasingly A default. Let me rephrase, then, even though I thought it was obvious what I meant: KDE has things that most people can or do make use of readily available.
>>
>>108299635
the fuck is the point of humanoid robots if there will be 100 billion humanoid humans that must be occupied with something?
>>
>>108299635
Why the weird obsession with making robots look humanoid as if it is the most optimal form?
>>
can't you guys just make a smart lora for nemo?
>>
>>108299635
this is whore will be someone gf some day
>>
>>108299664
ill make the logo
>>
>>108299669
no need
>>
>>108297061
Why do all these abliterator tools push merged models to HF? Pushing 100s of merged LORAs is insane, petabytes of HD space wasted. Soon exabytes.
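Back-of-envelope arithmetic on why merged uploads are so wasteful compared to adapter-only uploads. The adapter fraction and sizes below are illustrative assumptions, not measured numbers:

```python
def merged_vs_adapter_gb(n_params: float, n_uploads: int,
                         adapter_frac: float = 0.01,
                         bytes_per_param: int = 2) -> tuple[float, float]:
    """Storage cost (GB) of pushing N fully-merged fp16 checkpoints versus
    one shared base plus N small adapters. adapter_frac ~1% of base params
    is a rough guess for a typical LoRA; tune to taste."""
    gb = 1024 ** 3
    merged = n_uploads * n_params * bytes_per_param / gb
    adapters = (n_params + n_uploads * n_params * adapter_frac) \
        * bytes_per_param / gb
    return merged, adapters
```

For a 30B-param base and 100 finetunes, merged uploads land in the multi-petabyte-adjacent range per popular base while adapters stay around one base model's worth of storage.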
>>
>>
>>108299678
s3 space is almost free if you work in a big company, i use it to store training datasets and such and just bill it under company R&D
>>
>>108299604
Definitely posts like a seanigger, or underage. They both have the same intelligence
>>
>>108299604
>>108299833
I have no idea who you're talking about, it all looks like about the same level of shitposting that happens sometimes.
>>
>>108299848
Guys I found him, its poopdickschizo!!
>>
>>108299237
Start with the 5 foot shelf of books, then unabridged gibbon.
Return for further instructions in 10 years
>>
mikusex?
>>
>>108299660
dwm is unironically all you need. Self compiled, of course
>>
>>108299882
qrd
>>
>>108299882
so he should use llama-cli directly instead of sillytavern?
>>
>>108299897
kobold
>>
>>108299878
Advanced Mikusex with Miku
>>
>>108299867
after I cut you into little pieces im going to stick you into a 4 foot shelf categorized "FAGGOT"
>>
>>108300017
Fucking asshole motherfucker
>>
>>108299913
I wouldn't recommend koboldcpp.
>>
>>108300031
of course you wouldn't api shill
>>
>>108300034
how new r u?
>>101207663
>I wouldn't recommend koboldcpp.
>>
>>108300039
troll
>>
>WTF, how can a 4B model be better at coding than a 480B one? What do other 476B parameters do?
wasting params on your stupid rp coom bs is leading to this and qwen's death, hope you're happy...
>>
>>108300067
cooking recipes, emotional guidance, etc
>>
>>108300073
none of this are proper use cases that bring money
>>
>>108300077
>local
>bring money
?
>>
>>108300077
why would they, you're using the product, you're the customer
>>
>>108300080
you don't sell your locally vibecoded slop apps? need to catch up
>>
>>108300077
use case for money?
>>
>>108300089
show 1
>>
>>108300096
>dox yourself to 4chin schitzos
no thanks
>>
>>108300100
must be hard to run a business anonymously
unless... actually, I don't wanna think about that
>>
Is Qwen 3.5 27B a better Japanese -> English translator than Gemma 3 27B?
>>
>>108300113
>s Qwen 3.5 27B a better
yes
>>
>>108300067
anon its because of the benchmark tests
>>
i hate to say it but qwen3.5 isn't fantastic right now
>>
>>108300117
I'm specifically asking because Gemma 3 27B was literally SOTA in Japanese -> English translation. Better than even Claude 4 for some fucking reason.
>>
>>108300129
>t. minimax cuck
>>
>>108300136
im a consumer hardware nerd i dont know what the fuck minimax is
>>
>>108300146
the model/team that killed qwen by distilling better benchmark scores
>>
>>108300151
but qwen sucks so why would anyone care about that
>>
>>108300067
Benchmaxxed bullshit, Qwen 4B is NOT intelligent
>>
anyone else having issues with llama.cpp+qwen? it all worked great and i got up to 170t/s on the 0.8b, then suddenly it dropped to 120-130 t/s and the output was just garbage. after switching between 0.8B, 9B, 27B and 80B they all started generating garbage. is it corrupting / reading stale memory or something?
>>
>>108300202
yeah sure thing shill
>>
I’m a software engineer who hasn’t gone deep into AI yet :(

That changes now.

I don’t want surface-level knowledge. I want to become expert, strong fundamentals, deep LLM understanding, and the ability to build real AI products and businesses.

If you had 12–16 months to become elite in AI, how would you structure it?

Specifically looking for:

>The right learning roadmap (what to learn first, what to ignore)
>Great communities to join (where serious AI builders hang out)
>Networking spaces (Discords, groups, masterminds, generals, etc.)
>Must-follow YouTube channels / podcasts
>Newsletters or sources to stay updated without drowning in noise
>When to start building vs. focusing on fundamentals
>I’m willing to put in serious work. Not chasing hype, aiming for depth, skill, and long-term mastery.

Would appreciate advice from people already deep in this space
>>
>>108300299
too late
>>
>>108300299
ok i did this before and I know what's going to work. what you need to do is go here https://www.reddit.com/
>>
>>108300299
damn this LLM sucks, what model?
>>
>>108300284
hmm removing --slots and adding --no-slots seems to fix it... but idk
>>
>>108299493
For me it's Utopia
>>
>>108300284
I had this on ik_llama.cpp with 397b ubergarm q2. Restarting it fixed the problem and it hasn't happened again so I don't know. Speed dropped to sub-1t/s, then after restart went back to about 10. I had plenty of spare GPU and system memory, and fallback was disabled.
>>
>>108300299
First saar you must do the needful and want to become expert at the English.
>>
>>108300380
it's not just speed; when i load a bigger model it just stops mid sentence and gives garbage randomly too
>>
I was in an AI hate thread and a bunch of morons were fighting against an obvious AI, luddites are cringe as hell
>>
how do i fix this?
>>
lmao
>>
>>108300299
There are AI PhDs with 10+ published papers all with 1000+ citations that can't even get INTERNSHIPS. There is no upgrade path for a regular software engineer into this field.

We have regulars on /lmg/ that train their own niche models, some even sota that aren't employed in the field.

We get people like you every other week and the answer is always the same, the industry is impenetrable even for domain experts and people making top of the line models. What makes you so special that you believe you can get your foot in the door?
>>
>>108300458
>some even sota that aren't employed in the field
>>108300442
>>
>slot update_slots: id 0 | task 27 | forcing full prompt re-processing due to lack of cache data (likely due to SWA or hybrid/recurrent memory, see https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)
why does it always happen with Qwen3.5-35B-A3B? --swa-full doesn't do a thing. I'm on the latest version (8208).
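fwiw the warning is inherent to how SWA caching works: the cache only keeps the last W positions, so there's no stored KV to resume an older prefix from, and the server has to reprocess the whole prompt. toy illustration of the mechanism (made-up helper, not llama.cpp internals):

```python
def swa_cached_positions(n_processed, window):
    """Positions whose KV survives in a sliding-window cache of size `window`."""
    return list(range(max(0, n_processed - window), n_processed))

# after processing 10 tokens with window=4, only positions 6..9 still have KV;
# a new prompt diverging anywhere before position 6 forces full reprocessing
```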
>>
>>108300472
Not talking about the LLM finetuners. People like the reinforcement agent guy pushing the limits on AI that plays games on its own or that autist that pushed OCR to its absolute limits so that he could read every hentai doujinshi on the internet.
>>
>>108300489
both of them work at ai startups now
>>
>>108300435
update windows to 7
>>
>>108300499
No they don't. Fuck off, bullshitter. Their last posts were a couple of weeks ago, where they said outright that they don't work in the AI industry.

Why are you trying to gaslight that software engineer into wasting his life trying to get an AI job when not even PhD top contributors and sota model developers get jobs?
>>
I used chatgpt once should I get a job in the AI industry?
>>
>>108300514
prove your claims, it's very suspicious that you know all that but can't provide any proof
>>
>>108300481
nevermind, it's a known issue with no fix yet.
>>
>>108300521
I lurk the thread like everyone else and actually paid attention to those 2 because I use the manga translation tool, and the game-playing one is just cool because the guy is blogging his entire journey from 0 knowledge to where he is now, pushing sota. If you read every thread you know exactly what's going on, fuck off troll.
>>
>>108300549
>still no proof only big claims
you fuck off
>>
AAAAAAAAAAAAA im down to 11t/s when it was 17t/s before why did i reinstall drivers and upgrade
>>
>>108300586
lul
>>
>>108300586
just restore your btrfs snapshot from before you updated
>>
>>108300593
whats a btrfs
>>
>>108300603
A poor mans zfs
>>
>>108300612
YWNBIK
>>
>>108300589
>>108300593
wtf is going on, with another model i went from 39t/s to 48t/s
>>
>>108300299
learn from experts
https://www.youtube.com/watch?v=1oS35oWWl28
>>
>>108300625
are you using llama-bench or just looking at tokens per second on the first message? because it's going to fall off as it generates longer responses and fills more context, or be higher if it just replies with a couple of words
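if you want a stable number, llama-bench gives you pp/tg speeds at fixed sizes, or you can just time the stream yourself. trivial sketch (helper name is made up, nothing llama.cpp-specific):

```python
def tok_per_sec(arrival_times):
    """Average generation speed from per-token arrival timestamps (seconds)."""
    if len(arrival_times) < 2:
        return 0.0
    # N tokens span N-1 inter-token intervals
    return (len(arrival_times) - 1) / (arrival_times[-1] - arrival_times[0])

# e.g. 5 tokens arriving over 2 seconds -> 2.0 t/s
```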
>>
>>108300031
you can use kobold's chat ui on llama.cpp:
https://github.com/LostRuins/lite.koboldai.net
it's the only thing of value from kobold anyway
>>
>>108300631
Bernie is such a good guy, I just wish he were more tech-literate. Instead of banning and slowing AI, why not nationalize it? But then again, after being sabotaged last time, there is 0% chance he'll ever get any additional power.
>>
>>108300639
I'm not using your open sores barely maintained toy
>>
>>108300642
>Bernie is such a good guy, I just wish he were more tech-literate. Instead of banning and slowing AI, why not nationalize it?
He's suggested this before more or less. Not watching that video though
>>
local sisters i don't feel so good
>>
>>108300650
dont care about jewish saas data harvesters
>>
>>108300644
>open sores
then go back to aicg retard
>>
>>108300655
open weights is different tranny
>>
>>108300129
I don't know if it's because I'm using heretic, but 27b sucks, repeats, and spouts nonsense sooo much. The base model is too censored by default for my use case though, and it rejects prompts that worked before.
>>
>>108300661
there's no such thing as a local proprietary backend
you still need to go back, cloudtard
>>
>>108300650
There should be a global rule against twitter screenshots and bans should be permanent.
>>
new bread
>>108300682
>>108300682
>>108300682
>>108300682
>>108300682
>>108300682
>>
Reminder to ignore the early schizobake and stay here for the next few hours until the thread reaches page 10.
>>
>>108300650
fuck. still no GPT 5 local and Sam keeps releasing
>>
>>108300650
gpt-oss 2 wen
>>
>>108299663
>rebuild the whole world (built for humanoids) from the ground up
or
>build a robot that works in the world
a difficult choice indeed
>>
>>108300670
Sub 10B qwens are quite good for the size but I'm not impressed with the bigger ones.
>>
File: file.png (637.4 KB)
>>108300698
>built for humanoids
>>
>>108300722
and also: the human form isn't any more optimal than something like an animal/alien hybrid. Something like Boston Dynamics' Spot with an arm protruding from its back could operate most human things just fine, and four-legged creatures are more stable than humans. Why would anyone think we're the ultimate form? Humans are the weakest animal, like, ever. We can't kill/hunt anything with our bare hands/teeth. A fucking raccoon will ruin your day. We aren't even adapted to the bare minimum of survival across most temperature ranges on earth: without clothes and fire we freeze to death, or burn under the sun. Humans are not to be imitated.
>>
>>108300722
those cars are mostly designed to transport humanoids to where they need to be to be productive, and robot cars already have even more investment than robot humans so it doesn't support the original complaint
>>
>>108300698
Just say you want to put your dick in the robot.
>>
>>108300737
obviously humans are not the ultimate form, but in the short term they're by far the most useful. when we multiply our labor force by 100x with these things we can put them to work building the more efficient world and workers we'll need in the long term
>>
>>108299662
>Accept shitty pay and working conditions or we will replace you with a clanker!
>>
>>108300684
>>108292231
>>
I'm back. Anything happen while I was gone?
>>
>>108300778
I dunno man it's going fine.
>>
>>108300785
Nope, still nemo.
>>
>>108300790
>>
What kinds of specs would you need to fine tune (LoRA) GPT OSS?
To fine tune MoE models, do you need enough memory to hold the full model or just the activated params?
>>
File: file.png (32.4 KB)
>>108300817
https://docs.axolotl.ai/docs/models/gpt-oss.html
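and on the memory question: for LoRA the full base weights still have to be resident, because the router can activate any expert, so you size VRAM by total params, not active; only the trainable state shrinks to the adapter. back-of-envelope sketch, assuming 4-bit (QLoRA-style) base weights and a ~20% fudge factor for activations/KV/adapter state (helper and numbers are illustrative):

```python
def lora_vram_gb(total_params_b, weight_bits=4, overhead=1.2):
    """Rough VRAM floor for LoRA over a quantized base model.

    total_params_b: TOTAL parameters in billions -- for MoE, all experts
    stay loaded, so the active-param count doesn't shrink the weight footprint.
    """
    weight_bytes = total_params_b * 1e9 * weight_bits / 8
    return weight_bytes / 2**30 * overhead

# gpt-oss-20b (~21B total, 3.6B active) at 4-bit: roughly an 11.7 GiB floor
```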
>>
>>108300830
>axolotl
Well shit there you go.
Thank you very much anon.
>>
>>108300830
>not using unslop colabs
cringe
>>
>>108300994
Explain.
>>
>Jamba2 Mini is an open source small language model built for enterprise reliability. With 12B active parameters (52B total),
I'm going to try and fuck this thing.

>>108300994
Explain.
>>
>>108301055
>>108301078
dyor
>>
>>108301141
qrd?
>>
>us
>china
Where are the superior Nippon LLMs, folded 1000 times?
>>
>>108301281
Didn't they make a super scaled up GPT2 trained on an all CPU super computer or something like that?
>>
>>108300722
This image is AI, isn't it?
>>
>>108296023
>ASICs for AI when?
already exists bro https://chatjimmy.ai/
t. dixie flatline
>>
>>108301318
I grabbed an image from google but I don't think it is.
>>
>>108301395
yes it is, why would there ever be a play field in the center of an on ramp
>>
>>108297103
Zoomies don't know what a TUI is.
>>
>>108301481
https://www.shutterstock.com/image-photo/this-beautiful-roundabout-top-view-shot-1135833710
>upload date: 2018
>>
>>108301281
>If your vision of a dystopian future included robot monks presiding over ancient rituals, Kyoto University has brought that vision one step closer to reality. A research team from the university, in collaboration with the tech ventures Teraverse and XNOVA, recently unveiled a new AI-integrated robot monk — the Buddharoid — at the Shoren-in temple in Kyoto.

>The Buddharoid is designed to support the Buddhist clergy as Japan’s religious infrastructure faces a steady decline. It utilizes a system called BuddhaBot-Plus, a specialized generative AI derived from OpenAI’s ChatGPT that has been trained extensively on sacred Buddhist scriptures. This allows the robot to provide spiritual guidance on personal and social issues, like a real monk would.

>Beyond its conversational capabilities, the Buddharoid uses hardware — developed by China’s Unitree Robotics — to mimic the specific movements of a monk, including a slow gait, bowing and the gassho gesture of placing palms together in prayer.
>>
>>108301497
hawk TUI spit on that thang!
>>
File: 39642.png (59.5 KB)
>>108302087
close enough champ let's go

also why is OP an ultrafag who needs reminding who the queen of this site is?
