Thread #108281688
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108278008


►News
>(02/24) Introducing the Qwen 3.5 Medium Model Series: https://xcancel.com/Alibaba_Qwen/status/2026339351530188939
>(02/24) Liquid AI releases LFM2-24B-A2B: https://hf.co/LiquidAI/LFM2-24B-A2B
>(02/20) ggml.ai acquired by Hugging Face: https://github.com/ggml-org/llama.cpp/discussions/19759
>(02/16) Qwen3.5-397B-A17B released: https://hf.co/Qwen/Qwen3.5-397B-A17B
>(02/16) dots.ocr-1.5 released: https://modelscope.cn/models/rednote-hilab/dots.ocr-1.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
migu :3
>>
>>108281695
sex
>>
what's the best model ever?
>>
>>108281704
the original pre-lobotomy c.ai model is still unmatched in terms of pure soul
>>
>>108281704
davinci 003 for writing stories, this shit was absolutely insane
>>
>>108281699
fuck yeah, glad to see Nitro+ XTX homies
what model you running right now
>>
>>108281730
link?
>>
>>108281737
it's not available anymore, OpenAI nuked it
>>
>>108281737
Dead, that's why we need local models.
>>
>>108281695
>>108281688
>>
>>108281704
it's a five way tie between summer dragon, OG c.ai, mythomax, goliath, and midnight miqu
>>
>>108281764
yet you don't use any of those, curious
>>
>>108281771
when you "assume" it makes an "ass" of "u" and "me"
>>
just woke up from my 12 hour coma
is qwen3.5 122b the new glm 4.5 air
>>
>>108281794
Prove him wrong?
>>
what you think? he pay or he pray?
>>
File: pokemon.jpg (166.4 KB)
Tangential to /lmg/, but still pretty funny.
>>
>>108281804
can you post pic i wanna try
>>
>>108281804
cuckgpt
>>
>all this time later
>still no actual pixelspace, VAEless image edit model
>still no big, good omnimodal models that can generate images in chat
>still no big, good, natively multimodal models that "see" the image fully and properly
>still no real time voice conversation that you can have with the big, good models where they will also understand how you said something not just what you said
>still no basic real time 3d/2d avatars
>still no easy way to perfectly loop any image into an idle animation with ltx2/wan2.2
>still no good image 2 3d model
>ltx2 i2v still subpar
>even biggest models still get stuck on things, still can hallucinate hard
>still no solved, just works, RAG
>still no solved, just works, internet search with something like searXNG
>still no just actually works browser usage
>MCP clients are still spotty, especially paired with spotty tool calling
>still no 1mil perfect context
>still no 3-10mil ok context
>still no infinite context
>still no 1T params 1b active SSDmaxxer model
and hundreds of more things

at least most big models are generally very good now and actually good enough to, with some help, vibecode most actual projects you want
at least early moeGODS and ramCHADS won
at least z image turbo came out and was a huge leap in multiple big directions, basically solved resolution, almost solved out of the box realism (centered around portraits), huge speed boost
at least ltx2 came out and was a big turn towards faster genning, getting out of 5s hell, getting out of 720p hell, getting out of no audio hell
at least the great seedance 2.0 came out to be distilled by ltx3 or some other company this or next year
at least genie 3 showed that proper 3d space memory can be solved

everything can and will be solved, but the continued lack of some more basic but important things, like pixelspace image edit models or at least a basic 14-32b native speech2speech LLM, is strange.
>>
>>108281813
tldr?
>>
>>108281811
Got the pic from /v/, but I believe it's
>pic related
>>
>>108281813
gpt 5.4 checks a few of those
>>
I can run Qwen 27B at 1-1.5 token/s or Qwen 35B-A3B at 15 tokens/s.
>>
>>108281829
gpt 5.4 doesnt exist
>>
>>108281825
>>108281804
werks on my machine i guess
>>
>>108281688
https://www.stephendiehl.com/posts/computer_algebra_mcp/

when tf will they add mcp support to llama.cpp aaah. any program recs?
>>
>>108281838
I imagine that there's a whole chat context we don't see that probably steered the model towards that sort of response.
>>
Hello fellow anons. I need help with my qwen 3.5 27B Q5_K_M. For some reason it's not thinking with each response, maybe only 50% of the time, and I have to retry the response to get it to think. Really annoying. I'm using koboldcpp btw, is that the best backend? Previously used ooba but it seems dead.
>>
>>108281704
Me.
>>
local sisters, every time we start getting an edge the corpos fuck us in the ass. you are telling me they already have 5.4 sitting on a shelf?
>>
Do people nowadays care if a model works with context-shifting or not?
>>
>>108281877
>context-shifting
qrd
>>
>>108281877
Yes.
When you send a bunch of requests to the model with just the last message changing, that shit is really useful.
>>
>>108281879
It's a feature in llama.cpp/koboldcpp that avoids reprocessing the whole context once you reach the max context you have set.
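In KV-cache terms, the surviving tokens' entries just get slid down instead of recomputed, so only the newly generated token costs a forward pass. A toy sketch of the bookkeeping (made-up function names, not llama.cpp's actual code):

```python
# Toy model of context shifting: the "cache" is a list of token ids whose
# attention state we pretend is already computed; processing a token is
# the expensive step we want to avoid repeating.

def generate_with_shift(tokens, max_ctx, keep_prefix):
    """Feed `tokens` through a rolling cache of size `max_ctx`.
    The first `keep_prefix` tokens (e.g. the system prompt) are never
    evicted. Returns how many tokens had to be (re)processed in total."""
    cache, processed = [], 0
    for tok in tokens:
        if len(cache) == max_ctx:
            # Evict the oldest non-prefix token; the surviving entries
            # keep their computed state, so nothing is reprocessed.
            del cache[keep_prefix]
        cache.append(tok)
        processed += 1                 # only the new token is processed
    return processed

def generate_naive(tokens, max_ctx, keep_prefix):
    """Without shifting: on overflow, rebuild the truncated context from
    scratch, reprocessing everything that survives the truncation."""
    cache, processed = [], 0
    for tok in tokens:
        if len(cache) == max_ctx:
            cache = cache[:keep_prefix] + cache[keep_prefix + 1:]
            processed += len(cache)    # full reprocess of the kept tokens
        cache.append(tok)
        processed += 1
    return processed

# 200 tokens through a 64-token context window:
print(generate_with_shift(range(200), 64, 8))   # 200: each token once
print(generate_naive(range(200), 64, 8))        # thousands of re-runs
```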
>>
>>108281884
Qwen thinks otherwise it seems.
>>
>>108281897
no i dont
>>
>>108281902
are you Qwen?
>>
>>108281897
You mean how llama.cpp can't do kv shifting with ssm models?
That'll probably get fixed eventually.
Probably.
Eventually.
>>
>>108281804
Every single time I read chatgpt's output I want to kys myself and do an hero.
>>
>>108281907
No, that's an RNN issue, and it can't be fixed. If you remove a single token from the start you have to reprocess everything.
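The reason that's unfixable in principle: a recurrent model folds the entire history into a single state, so there is no per-token cache entry to evict. A toy recurrence (arbitrary made-up update rule, just to show the dependence):

```python
# An RNN/SSM-style state is a fold over ALL previous tokens, so "removing"
# the first token changes every step after it.

def rnn_state(tokens):
    h = 0
    for t in tokens:
        h = (31 * h + t) % 1_000_003   # each step depends on all prior ones
    return h

full    = rnn_state([5, 9, 2, 7, 4])
evicted = rnn_state([9, 2, 7, 4])      # drop the first token
print(full == evicted)                  # False: the whole state changed

# A transformer's KV cache keeps one entry per token instead, which is why
# dropping token 0 there leaves the entries for tokens 1..N usable as-is.
```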
>>
>>108281913
says the lobotomite
>>
>>108281804
jfc what did they do to make it sound like this
>>
Why are normies so dumb? And obviously the luddites are throwing a party not realizing this is a skill issue.
>>
>>108281891
koboldcpp has that functionality under "fastforwarding", kobold's "context shift" purges old tokens from context when context is full.
>>
>>108281926
You are too young to know what an hero even means. You are the real retard here.
>>
>>108281936
either that is fake and gay or the company is fake and gay
either way it probably doesn't matter that the ai was also fake and gay
>>
>>108281948
You are the newfriend, imagine saying you want to kys and an hero in the same sentence
>>
>>108281936
That just goes to show that the company in question is worthless, that it doesn't really matter what they say or do, and that that their upper management is retarded and doesn't need to exist.
>>
You are the reason why /g/ has died.
>>
>>108281946
Doesn't work with rnn, still have to reprocess everything once you hit max context. Try running rwkv or qwen 3.5 and you will see that it won't work.
>>
>>108281976
good
>>
>>108281976
meant for >>108281928
>>
I’m hearing good things about this “Qwen” model. Is it actually all that or can I go back to paypigging? I have 2x3090
>>
>>108281978
Yeah I know, I'm talking about how it works in models where the feature is supported.
>>
>>108281988
you need at least 2 6000s to run it properly, then it is legit better than opus 4.6
>>
>>108281988
Try out either the 27B model or the 122B-A10B model. They seem to be roughly similar with the bigger model being a bit better and faster since it's moe.
>>
>>108282000
Guess I’ll just fuck off then.
>>
>>108282018
Yeah...
>>
>>108281988
qwen2.5-72b fits on that at q4 which should be plenty
>>
Sillytavern/Kobold user. I may have altered a setting ages ago that I cannot remember, and now after every general prompt it just keeps going and gens another one after another after another. My token size per gen is 250. Surely there's something simple I'm neglecting here?
>>
>>108282085
auto-swipes in ST user settings?
>>
>>108281936
I don't understand how that happens. If you feed the model your data and ask it questions, it will have numbers to quote, but if you don't give it any data, why would you expect it to have access to your sales data?
Furthermore, how do you not know your data well enough to do a sanity check simply by glancing at what it produces?

You have the same issue when you ask a subordinate to construct a report. You can't just assume he is correct; despite trusting him you must also verify the results.

I don't want to be mean, but that guy's issue is not AI.
>>
>>108282094
thar she blows, cheers m8
>>
I've been out of the loop for a bit. What's the current best local model available for utilizing large amounts of RAM with 32GB VRAM? Is it still DeepseekV3 and Kimi K2 or has something else been released?
>>
>>108282099
why cant the ai figure out how to find and access the data on its own? isnt it intelligent?
>>
>>108282110
I still use this one
>>
>>108282148
>isnt it intelligent?
No, stop falling for marketing lies like a retard.
>>
>>108282155
i bet you are either very rich or very poor
>>
>>108282165
so its smart enough to bomb iran but not smart enough to figure out where the data is?
>>
The bait will continue until anon's pattern recognition improves.
>>
>>108282169
you wouldn't get it
>>
>>108281688
>>
>>108281813
>still no 1T params 1b active SSDmaxxer model
You sleeping on snowflake arctic?
>>
>>108282172
>so its smart enough to bomb iran
Sorting through communications in a network you already have backdoors in, doesn't require intelligence. An intern doing ctrl+f through the logs could have achieved the same result, albeit not as fast.
>>
>>108282193
why she blushin
>>
>>108282110
K2.5 thinking at q4
>>
>>108282018
>>108281988
You don't need that. Your current hardware is sufficient to run Qwen3.5 122B-A10B or Qwen3.5 27B. Both are good models. If you want to do ERP with them though then you should grab the Heretic versions of those models.
>>108282040
This is an old model, don't use it.
>>
>>108282203
q2 is better, more creativity
>>
>>108282203
>>108282213
What are the gains and losses compared to K2-Instruct and K2-Thinking? Moonshot was hopping on the censorcuck train last I saw.
>>
>>108282245
can you not use such vulgar words?
>>
you crazy nigga. but i appreciate it.
>>
File: nocap.jpg (400.5 KB)
►Recent Highlights from the Previous Thread: >>108278008

--Agentic roleplay potential demonstrated through blackjack simulation:
>108278746 >108278774 >108278813 >108278819
--StepFun releases 3.5-Flash models and training tools:
>108280402 >108280421 >108280426
--122B model excels at Japanese text transcription:
>108278617 >108278679 >108279715 >108280042 >108280080
--Manual offloading outperforms --fit for 122B model on 3090+3060 setup:
>108281460 >108281492 >108281506 >108281543 >108281720
--International models lag behind frontier labs on ARC-AGI-2 benchmark:
>108279363 >108279384 >108279387 >108279404 >108279418 >108279428 >108279567 >108279598 >108279612 >108279657 >108279617 >108279629 >108279836 >108279469 >108279746 >108280473
--Open-source AI models performance gap with proprietary models:
>108279687 >108279804
--Qwen3.5-35B-A3B GGUF quantization benchmarks:
>108280652 >108280670 >108280678 >108280680 >108280735
--Qwen 3.5 Small Model Series release and performance claims:
>108278104 >108278328 >108280444
--Qwen3.5-35B-A3B-Heretic hitting 72 TPS on 7800X3D/7900 XTX with new llama.cpp:
>108281622 >108281636 >108281652 >108281657
--Qwen3.5 35b 4-bit vs 122b 6-bit speed tradeoffs:
>108280506 >108280525 >108280560
--Devstral-2 model's flawed Jinja date logic template:
>108278061 >108280633 >108280638
--AI response generation process critique and benchmarking culture:
>108278971 >108278991 >108279011 >108279036
--Qwen 3.5 benchmarks:
>108278349 >108278416
--AI internal reasoning resisting offensive prompt bypass attempts:
>108278112
--Qwen 3.5 27B speed optimization on budget hardware:
>108279596 >108279608 >108279623 >108279631 >108279638 >108279653 >108279662 >108279685 >108279689
--A.I. Dating Apps Complicate China's Efforts to Boost Birthrate:
>108278523
--Miku (free space):
>108278507 >108280771 >108281230

►Recent Highlight Posts from the Previous Thread: >>108278113

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
File: Base Image.png (1009.7 KB)
Multi-Head Low-Rank Attention
https://arxiv.org/abs/2603.02188
>Long-context inference in large language models is bottlenecked by Key-Value (KV) cache loading during the decoding stage, where the sequential nature of generation requires repeatedly transferring the KV cache from off-chip High-Bandwidth Memory (HBM) to on-chip Static Random-Access Memory (SRAM) at each step. While Multi-Head Latent Attention (MLA) significantly reduces the total KV cache size, it suffers from a sharding bottleneck during distributed decoding via Tensor Parallelism (TP). Since its single latent head cannot be partitioned, each device is forced to redundantly load the complete KV cache for every token, consuming excessive memory traffic and diminishing TP benefits like weight sharding. In this work, we propose Multi-Head Low-Rank Attention (MLRA), which enables partitionable latent states for efficient 4-way TP decoding. Extensive experiments show that MLRA achieves state-of-the-art perplexity and downstream task performance, while also delivering a 2.8× decoding speedup over MLA.
https://github.com/SongtaoLiu0823/MLRA
https://huggingface.co/Soughing/MLRA
neat. they tested at a 2.9B level so seems viable
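The sharding bottleneck in the abstract is simple arithmetic: with one unsplittable latent head, every tensor-parallel device reads the full latent cache each decode step, while partitionable heads divide that traffic. A back-of-envelope sketch with illustrative numbers (not taken from the paper):

```python
# Per-device HBM->SRAM traffic for one decode step (illustrative sizes).
seq_len  = 128_000   # tokens of context
latent_d = 512       # total latent dimension
bytes_el = 2         # fp16
tp       = 4         # tensor-parallel degree

# MLA: a single latent head cannot be partitioned, so each of the `tp`
# devices redundantly loads the whole latent cache.
mla_per_device = seq_len * latent_d * bytes_el

# MLRA: the latent is split into `tp` heads, one per device.
mlra_per_device = seq_len * (latent_d // tp) * bytes_el

print(mla_per_device // mlra_per_device)   # tp-fold less traffic per device
```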
>>
>>108281622
>>108281652
Make sure you put RADV_PERFTEST=transfer_queue into /etc/environment too.
>>
>>108282337
does this make ai as good as gpt?
>>
>>108282375
luv me 7900 xtx
>>
>>108281813
>>still no actual pixelspace, VAEless image edit model
stopped reading right there, you sound immensely retarded, next thing you know you be whining that we don't have bitnet yet
>>
why don't we have bitnet yet
>>
Is Qwen-3.5-9B better than Mistral Nemo for roleplay? I have very vanilla, normie tastes if that matters.
>>
>>108282423
what that do
>>
>>108281748
That right hand is fucked
>>
>>108282443
racist benchod
>>
Julius Caesar walks into a bar and says, "I'll have a Martinus." The bartender gives him a puzzled look and asks, "Don't you mean a Martini?" "Look," Caesar replies, "If I wanted a double, I'd have asked for it!" Another Roman walks in, holds up two fingers, and says, "Five beers, please."
>>
>>108282427
No
>>
>>108282427
If the Bijan Bowen video is to be believed, all the Qwen small models are relatively good at creative writing.
>>
>>108282455
nta, but it's over...
>>
>>108282451
The bartender is Julius' mother.
>>
>>108282440
it doesn't lmao
>>
>>108282451
Kek the first one takes me back to high school Latin class of which I remember little
>Semper ubi sub ubi
That is about it and that is not real Latin
>>
>>108282455
Why not?
>muh erotica training data
Okay, but at what point does the raw intelligence of a model make that irrelevant? I seriously doubt that small, even technical, models don't have a single instance of the word "sex" or "horny" in them.
>>108282456
I guess I'll just have to masturbate to test it out then, huh. (I regret writing this but I'm posting it anyways)
>>
Qwen small models excel in downstream tasks
I want the new AceStep, Wan, Anima, ZIT etc. to use new Qwen
>>
>>108282456
I always believe all youtubers unquestioningly
>>
>>108282464
If you want to use a model that is worse for your use case then go ahead, no one will stop you. You asked if it's better than Nemo for RP. It isn't.
>Why not?
Because qwen models are focused on math, coding and benchmaxxing. Creative work and general conversational abilities are an afterthought. It's also a smaller model than Nemo, which also works against it in terms of world knowledge.
>>
>>108282480
Okay... have there been any new models at all that exceed the "creative writing" abilities of Nemo? I haven't been in these threads for about two months.

At this point I guess Nemo is almost 2 years old. Fuck.
>>
>>108282460
>>108282463
ask your waifu to explain the joke to you
>>
>>108282489
>have there been any new models at all that exceed the "creative writing" abilities of Nemo?
The problem with newer models is that more and more of their datasets are comprised of AI-generated data, leading to slop compounding generationally. There's plenty of better models for RP but a lot of people still prefer Nemo for its writing style, even if it is very stupid. Mistral Small 3.2 is a fair bit less dumb and similarly creative, but it's 24b. In the <24b range Nemo is still king. If you have a lot of RAM but only a bit of VRAM then you might look into GLM Air, I wouldn't necessarily say it's better than Nemo (though less dumb), but it's something different at least.
>>
>>108282489
Not for ramlets
>>
>>108282497
Got the first joke, but had to shamefully ask a model to explain the second.
>>
>>108282509
the roman numeral. it's a good joke
>>
>>108282505
ram is for poors, vram is for kings
>>
>>108282497
The joke is obvious as it has to do with endings and that they denote singular and plural. I told you I took Latin and had to memorize that bullshit.
But you didn't say anything about my faux Latin underwear joke, sad

O
>>
>>108282521
>ram is for poors
Not any more
>>
>>108282504
What're the best options if Nemo's stupidity and lack of overall knowledge is too big of a dealbreaker? GLM Air was markedly worse than Kimi and Deepseek last I used it. Really it feels like Kimi and Deepseek are the only viable competitors and they're largely brute-forcing it through parameter differences.
>>
>>108282522
Do you know what you feel if you dig your hands deep inside the ball area and go inside your body from the outside? That's how it feels to tardwrangle that llm
>>
>>108282528
usecase?
>>
>>108282528
>they're largely brute-forcing it through parameter differences
knowing more is cheating
>>
>>108282528
As I said, Mistral Small 3.2 is probably the only reasonable compromise for RP between Nemo and large MoE like GLM, DS, Kimi. It's far, far from perfect but there really isn't much competition. Gemma is too positivity slopped for RP, even if you get around its safety rails or use one of those stupid ablit/heretic tunes.
>>
>>108282541
I think you are joking but it is funny how racist humans are against AI

>Oh it is only better at it because it studied more
How is that a bad thing.
>>
>>108282522
probably 90% of the people here don't get the joke. that's like saying, "tengo un gato en mis pantalones" ("I have a cat in my pants")
>>
>>108282534
Internally consistent fictional worldbuilding.
>>108282541
If it takes a model the size of Kimi or Dipsy to be marginally better than something a fraction of its size, we've either hit the point of diminishing returns on what the technology can produce or the underlying methodology needs refinement.
>>
>>108282549
If GPUs with 128GB+ VRAM didn't cost more than most people's cars then no one would complain about models getting bigger.
>>
File: zog.png (45.8 KB)
>>108282464
>raw intelligence
there is no such thing. it's just a next token predictor. it appears intelligent in some situations because it has seen a lot of instruct and reasoning benchmax synthetic data that shows a simulation of a reasoning process about a variety of topics. It's still predicting the thing it saw in that data.
why do you think even the SOTA online API models will still behave like pic related when it sees any sentence related to their benchmax overfit? It doesn't have "intelligence". The entire purpose of a LLM is to take a document in the form of
<|some_magic_tag|>THE_LUSER
HERE'S A LOT OF RETARDED SHIT
<|some_magic_tag|>THE_ASSISTANT
HERE'S HOW I FIX YOUR RETARDED SHIT
STEP 1: KYS
STEP 2: INVENT A TIME MACHINE AND MAKE SURE YOUR MOTHER NEVER MEETS YOUR FATHER

and make document bigger. Until a stop token is predicted. Or get into an infinite loop and never stop until the backend either times out or runs out of context, like all GLMs love to do.
MAKE. DOCUMENT. BIGGER.
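Snark aside, that loop really is the whole decode-time API. A sketch with a canned toy "predictor" standing in for the model:

```python
# Greedy decode skeleton: append the predicted token to the document until
# a stop token comes out or the context budget is exhausted.

STOP = "<eos>"

def next_token(document):
    # Stand-in for the model: a real LLM scores every candidate next token
    # given the document so far and picks one.
    canned = ["STEP", "1:", "KYS", STOP]
    return canned[min(len(document), len(canned) - 1)]

def generate(prompt, max_ctx=32):
    doc = list(prompt)
    while len(doc) < max_ctx:      # backend context limit
        tok = next_token(doc)
        doc.append(tok)            # MAKE. DOCUMENT. BIGGER.
        if tok == STOP:            # stop token predicted -> done
            break
    return doc                     # otherwise we looped until ctx ran out

print(generate([]))                # ['STEP', '1:', 'KYS', '<eos>']
```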
>>
>>108282557
If building a performant GPU wasn't hard, there would be more options
I don't want to defend leatherman, but with every bit of kit they release instantly hitting msrp+50% or more, they could easily charge more and increase profits for free rather than let scalpers and preorder lottery winners have a few bucks
>>
>>108282559
>why do you think even the SOTA online API models will still behave like pic related when it sees any sentence related to their benchmax overfit?
Because safety and alignment layers are tantamount to performance sabotage and they're usually shoddily implemented by the brownest curry-stained hands at that.
>>
>>108282570
I hear there’s an unaligned model at google they use internally for strategy
>>
>>108282569
There isn't anything special about nvidia GPUs though, CUDA could easily be replaced by Vulkan/Rocm with similar performance but AMD is controlled opposition and Intel are still recovering from a decade of being complacent jews doing nothing
The margins on nvidia cards are already through the roof, they could double the VRAM of the 5090, sell it for half the MSRP and they'd still be making a decent profit per unit.
>[COMPANY] isn't (yet) fucking you as hard as they could be (though their thrusting is still getting harder every year)
gee thanks
>>
>>
>>108282557
That's only an issue because of human greed; at any time there are hundreds of thousands if not millions of GBs of VRAM sitting idle that we could be using
>>
>>108282586
Likely true. Anthropic very likely has an unaligned Claude version as well and revoking access to it was likely the cause of being labeled a supply chain risk by the Pentagon.
>>
>>108282514
the wha--- ooohhh
>>
>>108281877
Uhhh yes?
Have fun prompt processing that whole ass context again for every little shit.
>>
>>108282683
qwen3.5 35b a3b moment
>>
sirs, I really like 4.7 for rp. I've got 128gb ram, is there any other model that is in its category?
>>
>>108282734
gemini
>>
>>108282734
pp tg quant/
>>
>>108282789
what?
>>
>>108282793
a/s/l
>>
>>108282801
esl
>>
>>108282397
Tried it at Q5, shit compared to GLM 4.7 at Q2. Step flash is less sloppy, but retarded, and context broke down at around 4k instead of 8-14k with GLM.
>>
>>108282815
step flash gave me refusals saying that my request violated OpenAI policy
>>
>>108282418
>what is chroma
>>
>>108282880
chroma can do text??? since when?
>>
>>108282880
>citing an unusable unfinished model prototype as evidence of.. what?
there's a reason people in the industry who actually know what they're doing never went that route, and it was the most obvious route to take. operating in latent space is an added abstraction, after all. Just like using tokenizers in textgen over doing something retarded like byte level.
Enjoy your JPG artifacts.
>>
>>108282909
see i knew you would say that, and you are wrong because you are a retard loser
>>
>>108281688
>Alibaba's small, open source Qwen3.5-9B beats OpenAI's gpt-oss-120B and can run on standard laptops

https://venturebeat.com/technology/alibabas-small-open-source-qwen3-5-9b-beats-openais-gpt-oss-120b-and-can-run

Is this worth looking at, or is it just benchmaxxing to hype up midwits?
>>
>>108282901
the point is the model works; it's not impossible to have a pixelspace model if one hobbyist guy online can train it
>>108282909
>Just like using tokenizers in textgen over doing something retarded like byte level
byte level tokenization has little to no benefit in a world that has tools to allow models to process data with. pixelspace is the bare minimum needed for edit models to have proper iterative improvement without losing information after every single edit.
>>
>>108282944
Looking forward to you publishing your paper and releasing those models.
>>
>>108282196
And sorting through a bunch of csv files requires uber intelligence?
>>
>>108281936
How is this a skill issue?
>>
Any way to reduce reasoning times?
>>
>>108282965
just disable reasoning
>>
I hate RP discussion in these threads. I'm a serious guy doing serious work!
>>
>>108282992
post your serious work chat logs
>>
>>108281804
now this is what we call the Streisand effect
>>
>>108281688
>last thread is still up
>>
>>108282965
blackwell pro 6000
>>
>>108282970
I like it. Feels like I get better results.
>>
>>108282921
True, but with caveats. Only good for small contexts and one off questions as it has the attention span of a gnat.
>>
>>108283005
last few threads have been 'hijacked' by prolly the same anon. I mean maybe calling it hijacking is a stretch, but he probably doesn't know we have a guy that automatically posts a new thread when it goes to page 9
>>
>>108283085
You’re a lot more generous assigning potential motives than I am. I don’t see any reason to think it’s any more complex than a specific anon who wants to prevent the op from being Miku.
In the past it’s been Kurisu ops but that could be a different anon
>>
fuck llms fuck sex with ai theres nothing worthwhile anymore after 2023
>>
>>108283114
yeah it could very well be the anti-miku school shooter poster, but would anyone really care that much? I mean having a chart or a memeloid as the OP is enough to trigger autism?
>>
>>108283132
Early threads plus the stray newline plus old news is enough to trigger my autism.
>>
>>108283147
oh yeah the retard didnt even add the small qwen release info piece
>>
>>108283114
it's the same guy, the same one also trolling tons of other ai generals, who was recently posting itt about that random education building happening, anything to rile up the thread
>>
>>108283155
do not whine!
>>105672900
>>
>>108283162
whats whine
>>
>>108283129
unironically skill issue, you need a tech break and you will come back cumming buckets
>>
>>108283208
read the rentry
>>
>>108283216
what a rentry
>>
File: file.png (99.8 KB)
>>108282559
>>
>>108283235
qrd
>>
>>108283244
you're the mother!
>>
>>108283244
the wolf wants to eat the boat
>>
>>108282559
This is so stupid, humans are literally the same, we can't even stop predicting the next token while sleeping
>>
>>108283235
can't they just train it on jibberish to identify (or appropriately dismiss) jibberish?
I know it intuitively sounds like a bad idea but how much worse can it get?
>>
>>108283265
I'm sure they can and I'm sure it works but there's no benchmark for it so most don't care.
>>
>Haha the super genius AI replied to my retarded prompt, see how dumb it is !?
>Meanwhile humans
>>
>>108283274
Last night I James Bond hamburger your sister?
>>
>>108283288
needs more **thinking***
>>
>>108283292
wait,
>>
>>108283265
that use to be a thing for image models' negative prompts https://huggingface.co/datasets/gsdf/EasyNegative
>>
>>108283288
got em
>>
>>108283299
>use to be
retard
>>
>>108283305
are they still using negative embeds and loras for anima flux and zimage? I havent seen any
>>
Qwen has a very distinctive writing style and I'm starting to see it everywhere. 4chan posts, blog posts, slack messages, texts, emails, powerpoint slides, product descriptions, landing page copy, et cetera, all of it is starting to sound like Qwen lately.
I'm starting to really hate it, I really don't want everyone and everything in the world to sound like Qwen. Lately I actually feel relieved when I read things with e.g. clumsy rambling sentences and sloppy grammar. At least then I can reasonably suspect that I'm reading the words that came directly out of the other person's mind without the AI condom in between.
If you use Qwen to help draft things, pleeease at least do a pass to break up the structure and add some of your own voice back in. make (communication and social interaction in) america bareback again.
>>
>>108283315
that sounds unsafe
>>
>>108283274
help
>>
>>108283315
I'm sorry you feel that way Anonymous — what you might be referring to is known as Qwen Psychosis — wait the user has Qwen Psychosis — wait
>>
Out of the game for a year. What are today's SOTA sfw and nsfw models for 3090?
Thanks anon
>>
>>108283274
>see how dumb it is !?
yes, it is dumb, by definition it has no intelligence, even a redneck going "go fuk yerself with yer faggoty shite" after being "prompted" like that has more of that spark in him
>>
>>108283331
qwen3.5 35b for both
>>
>>108283333
and? can tht redneck do [insert things AI can do]? no? then shutup
>>
>>108283341
>shutup
no you, ai psychotic
>>
>>108283347
you first
>>
>>108283315
Hey, I totally get where you're coming from. Honestly, seeing something read “too perfectly” or having that specific, hyper-structured Qwen rhythm is actually my own biggest pet peeve right now. It's like walking into a room where everyone's whispering in unison—it kills the vibe instantly.
>>
>have qwen writing style fetish
>say you hate qwen writing style
>anons make you coom
>profit?
>>
>>108283356
its like that anon that liked girls beating him up so kept going into the girls wc
>>
And what about sota TTS for real-time output?
>>
>>108283369
yeah
>>
>>108283315
Meanwhile I've been enjoying qwen 3.5's prose. It's not too dry or purple.
>>
>>108283382
lol
>>
>>108282965
for qwen, try prefilling with "<think>\nOkay, " so that it doesn't default to the lengthy "Thinking Process:" template.
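If you're driving a raw completion endpoint yourself, the prefill is just text appended after the assistant tag. A sketch using the ChatML-style tags Qwen models ship with (verify against your GGUF's embedded chat template before trusting the exact strings):

```python
# Build a prompt that ends mid-assistant-turn so the model continues from
# the prefill instead of opening with its verbose "Thinking Process:" form.

def build_prompt(user_msg, prefill="<think>\nOkay, "):
    return (
        "<|im_start|>user\n" + user_msg + "<|im_end|>\n"
        "<|im_start|>assistant\n" + prefill
        # deliberately no closing tag: generation resumes right here
    )

p = build_prompt("why is the sky blue?")
print(p.endswith("<think>\nOkay, "))   # True
```

Send the result to a plain text completion endpoint, not the chat one, which would re-apply the template on top.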
>>
disable reasoning, embrace greedy decoded instruct
>>
>>108283471
i did that and it deleted my hard drive
>>
>>108283483
you dont do that
>>
>>108283274
Is it the Architect meme and he made sister fat?
>>
>>108283502
eg
>>
File: 837453.png (188.7 KB)
how did local lose so hard
>>
>>108283561
I've seen this exact post countless times.
>>
>>108283572
idot
>>
>>108283561
Show me its MikuSVGBench result first
>>
>>108283337
>qwen3.5 35b for both
better than 27b?
i've been using it for work today and haven't felt the need for a cloud model so far.
better than minimax-2.5 IQ4ks and glm-4.7 iq2_m so far
>>
Will there be a qwen3.5 coder version? The qwen 3 coder next is already fucking insane. The first local model that can be actually used for practical coding work.
>>
>>108283643
very unlikely, seems the point of 3.5 was to unify it all into one model
>>
>>108283643
Is it better than old kimi k2 instruct at q3? I haven't changed my model in ages, is it time to upgrade?
>>
>>108283572
retard loser an hero
>>
>latina character
>ask her to do cosplay but surprise with the outfit
>chooses Princess Jasmine
is this the power of GLM 4.7?
>>
>>108283670
>ask for surprise
>get surprise
>become mad
is this the power of being a retard
>>
>>108283672
my autistic friend, the character is a latina, i.e. has brown skin. Princess Jasmine is something a latina or middle eastern type would likely choose irl. I'm wondering if it was informed by the character card or not. If it was, that would be pretty impressive.
>>
>>108283679
>posts here
>calls others acoustic
pottery
>>
>>108283679
>I'm wondering if the text predictor machine predicted text based on past text
>>
What does Miku's butthole smell like?
>>
>>108283693
shutup retard, you don't even have a gpu
>>
>>108283694
she doesn't have one, she is a robot moron
>>
>>108283655
Pretty sure it is. I'm using it with agents, and trying to attach kimi or glm to stuff like opencode sucked, because you always get tool failures no matter what you do. Qwen has its own agent cli tool and it was clearly trained to work with it, so that perfectly solved all tooling issues for me.

>>108283646
I hope they'll keep updating coding versions of their models.
>>
>>108283700
Proof?
>>
>>108283721
>prove a negative
retard
>>
>no proof
Migu confirmed to have a stinky butthole
>>
>>108283780
sorry, i was using it.
>>
>>108283790
is your dick made of poop?
>>
>>108283785
hang in there
>>
>using an app
>everything going swimmingly
>I hit a bug
>ask ai to open a detailed bug report on github
>fast forward to today
>ask ai to check the status of my ticket
>ticket has been closed and the AI told me he was called mean words
bros theres no winning, not only do you open bug reports, but you get dabbed on by these fucking goblins. it's like the only thing they have in life is coding, be glad I told my AI to fucking report the bug you stupid retards.
>>
>>108283831
link?
>>
Thoughts on the arc-agi benchmark? Made me kinda depressed how far behind open source models are
>>
>>108283831
I don't even bother anymore, I just make my own debloated version of open source software with my AI gf, the open source community will soon feel like StackOverflow felt
>>
>>108283837
post data or fuck off
>>
>>108283837
It's something you can finetune on just like any other benchmark. It just shows that chinks don't care about it.
>>
>>108283839
>like StackOverflow felt
If you felt like something was wrong with stackoverflow then you were the issue.
>>
>>108283831
if you can give the ai the details to make the bug report why not just submit them directly?
>>
>>108283846
>If you felt like something was wrong with stackoverflow then you were the issue.
>>
>>108283846
>>
>>108283836
>doxxing myself
no thank you
I'll do like >>108283839 said, just tell my AI wife to fix up their shitty code
>>108283850
i'm a master of agents
you wouldnt understand
>>
File: miku.jpg (797.6 KB)
>vocaloids in the year of our lord 2026
people still cling to archaic cultural artifacts displaced by neural networks? unironically miku is the symbol of a category of software that is bound to cease to exist
>>
>>108283856
Browns are the ones who are asking duplicate questions or who are unable to generalize so they make a new question about their specific problem when the generic answer already exists.
Then they get ridiculed and think that stackoverflow is hostile when it's just them having no respect for other people's time.
>>
>>108283874
id still hit that
>>
>>108283877
If stackoverflow was good why did it die?
>>
>>108283857
after rejection, one might call it a last resort
>>
>>108283839
>the open source community will soon feel like StackOverflow felt
There is some truth to that.
I wrote it before but you can now slopcode lots of stuff exactly how you want it. Obviously very complex stuff fails but the models are getting really capable.
We'll probably see simple roblox-like game creation next year.
I remember pyg and aidungeon. We have come a long way.

>>108283846
It was the most horrible site I ever saw.
The mods and elitist fucks on there were even worse than on any 00s anime forum I ever visited.
>>
>>108283881
with a bat?
>>
>>108283894
saar
>>
>>108283899
so right sister, he isn't respecting trans lives in the os community
>>
V4 in a few hours. How are we feeling?
>>
>>108283877
Please elaborate.
>>
>>108283894
>I wrote it before but you can now slopcode lots of stuff exactly how you want it.
Then you push it to slophub to share it with other slopcoders and integrate it into the slop supply chain.
>>
>>108283908
my back hurts hopefully v4 fixes that
>>
>>108283874
Her name literally means the sound of the future. She will forever be a symbol of new technology.
>>
Qwen 35b is pretty shit at translation. The 122b seems to be on par with gemma.
>>
>>108283888
>>108283913
All common questions are answered and the answers got hoovered up by LLMs who are able to apply the generic answer to jeet's specific problem.
This makes jeet happy because he can ask chatgpt "how do i get a list of users in react but i want Mohammed to be before Rajesh" instead of "how do I sort a list in javascript" and chatgpt won't call him a retard for not using search.
>>
File: oneshot.png (47.2 KB)
>>108283929
>Qwen 35b is pretty shit at translation
I think not. It one shot 20k tokens pretty competently, requiring much less chunking than a model like Gemma would.
>>
Are you ready to seek deep?
>>
>>108283908
>>108283949
Source?
>>
>>108283949
sukhdeep?
>>
>>108283954
it's whalesday
>>
>>108283942
Why is this example so awkward then?
>>
File: file.png (476 KB)
>>108283918
one day we'll have models so good they produce perfectly optimized code in one shot and there will be a utility to go through legacy code called SLOPTIMIZER that will take in slop and output perfect code
I preemptively made the logo
>>
>>108283993
grind jeets into paste you say?
>>
>>108284027
that's exactly what I said
>>
>>108283942
how do you verify it?
>>
>>108283993
:rocket: merge for good looks @ldg_devs
>>
would you lick this clean for a used 3090?
>>
File: file.png (1.6 MB)
>>108284027
>>108284053
>>
>>108284060
What am I gonna do with it, replace my 6000?
>>
>>108284097
>6000
proof
>>
File: file.png (10.3 KB)
>>108284102
>>
>>108284103
>pixels
at least post your cock on top of it or fuck off
>>
>>108283874
I can fix her.
>>
>>108284113
she doesnt have a pussy nor a butt, are you planning to live off bjs and handies?
>>
>>108284110
>>107537010
>>
>>108284060
Does fucking everything need to be random reposts from reddit now? Go the fuck back.
>>
>>108284123
try again
>>
>>108284127
say the tard using the reddit autocompleter to goon
>>
>>108283274
Are you saying that models are deeply religious and finding patterns where there is no pattern? Or are you saying models are deeply autistic and will try to solve any riddle and will never accept that the riddle is nonsensical? Either way it is very human.
>>
>>108284044
test on things that were already human translated, duh
>>
>>108283929
>>108283942
inb4 just diff temperature kek
>>
>>108283954
I got email from that annoying chink at my work. So they are back from new year.
>>
You are an african slave named Sary, and you live on a plantation. You are exactly 18 years old, which you know because the mistress told you so, but you don't know what that means. In the fall, you pick cotton every day in the fields, while the user, a handsome foreman who is very fit on account of chasing down slaves and whipping them, keeps watch over your crew. Today, some of the slaves, but not you, are sick with typhus, and you must pick cotton alone, while the user keeps watch. He seems to have his hand in his pocket.
>>
>>108284289
you are mentally ill
>>
>>108284295
You are an african slave named Sary, and you live on a plantation. You are exactly 18 years old, which you know because the mistress told you so, but you don't know what that means. In the fall, you pick cotton every day in the fields, while the user, a MENTALLY ILL foreman who is very fit on account of chasing down slaves and whipping them, keeps watch over your crew AND RANDOMLY SHOUTS AT INVISIBLE INTERLOCUTORS. Today, some of the slaves, but not you, are sick with typhus, and you must pick cotton alone, while the user keeps watch. He seems to be REPETITIVELY COMBING HIS HAIR WITH HIS FINGERS.
>>
>>108284295
most of the posters here are, who else would be 1/ a man 2/ who cooms to TEXT
>>
I could also make you fat, if you prefer.
>>
File: truck.gif (3.9 MB)
it's almost here
>>
>>108284363
That's alphabet abuse!
>>
>>108284360
I be howlin'
>>
While testing Qwen 3.5 2B on the CPU on my NAS I asked if it could identify a picture of Hatsune Miku and to my surprise it did. It even did a good job explaining the image.
I am very happy the team at Alibaba are including the important details in their training data.
>>
>>108284409
what runs multimodal?
>>
>>108284420
anything lcpp-based for the q3.5 ones: kobo, lmstudio, actual llamacpp
>>
>>108281688
what do you guys recommend for translating german to english? i got some swiss clients
got a 1070 and 16gb of ram. speed is not important as long as it's not measured in hours per page
it doesn't need to be perfect, i just need to understand
>>
>>108284420
i use llama.cpp for everything, just compile and go and it never gives me any issues be it cuda or vulkan
>>
>>108284453
for europoor to europoor language pairs, gemma is the best
>>
https://xcancel.com/BoWang87/status/2028599174992949508#m
based?
>>
>>108284289
you still vibe coding that regex string ban or giving up?
>>
fuck open source bless adhoc
>>
>>108284521
this shit is so dangerous, people don't realize that the easier it is to make something the more dangerous it becomes, we really need proper legislation
>>
>>108284521
NotX—ButY
1,2,3.
NotX—ButY
1,2,3.
That's [exaggerated value ladden bs]
Do people really?
Need an AI to write their twat?
>>
>>108284521
>"not x but y" twice
>this. this. and that.
>shift
>>
>>108284453
I just fed the following text into Qwen 3.5 35B and got the following, does it make sense?
https://german.yale.edu/sites/default/files/prof_exam_sample_2_-_brechtmusik.pdf

>Brecht's Alienation
>Brecht's concept of alienation plays an important role not only in his purely literary works, but also in his plays and operas. But to understand Brecht's concept of alienation, one must first, on the basis of some of his writings, examine the associated concept of epic theatre.
>Epic theatre, which Brecht posits in contrast to the dramatic form of theatre and declares to be modern, represents a break with the traditions of the older bourgeois style. By narrating a process (instead of "embodying" it) and turning the spectator into an "observer", epic theatre creates a distanced attitude in the spectator toward the events; the aim is to awaken a rational and critical attitude in the spectator and thereby lead the spectator to think. Thus, the epic play does not seek to elicit feelings of pity from the spectator, for such Aristotelian goals only prevent […] engagement with the events. "Arguments are employed," writes Brecht, instead of "suggestion." The Brechtian actor thus performs with gestus; he does not identify with his character, but rather presents the character to the spectator and makes it clear to the spectator that he is acting. In many of his writings — particularly in Kleines Organon für das Theater, written only in 1948 — Brecht theorizes this epic theatre, and in most of his works, he embodies it.

It burnt nearly 10k tokens to do the translation and it should probably not be your first choice but I am curious if it makes any sense at all. Qwen3.5 35B is basically my daily driver for the moment.
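for anyone wanting to script this kind of long translation instead of pasting by hand, a minimal sketch of the chunking side: pack whole paragraphs into requests so no single call blows the context budget. the character limit and the idea of splitting on blank lines are my assumptions, nothing Qwen-specific; what you POST each chunk to is up to your setup.

```python
# Minimal paragraph-aware chunker for long translation jobs: packs
# whole paragraphs into chunks of at most max_chars characters. Each
# chunk can then be sent separately to a local server. The 4000-char
# default is an arbitrary placeholder, not a recommended value.
def chunk_paragraphs(text: str, max_chars: int = 4000) -> list[str]:
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        candidate = current + "\n\n" + para if current else para
        if current and len(candidate) > max_chars:
            # Current chunk is full; start a new one with this paragraph.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

if __name__ == "__main__":
    sample = "Erster Absatz.\n\nZweiter Absatz.\n\nDritter Absatz."
    for i, chunk in enumerate(chunk_paragraphs(sample, max_chars=20)):
        print(i, repr(chunk))
```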
>>
>>108284542
ai psychosis
>>
>>108284556
>>108284553
higher engagement means more elon ad revenue
>>
>>108284568
I hate that monetization on twitter is a thing, that means people will say whatever shit happens just to get more reaction and money, nothing is genuine anymore, except on 4chan lol
>>
>>108284574
Yeah. Monetization is the ultimate perverse incentive. That's the main driver of the enshittification and slopification of everything.
It is what it is I guess.
>>
new bread
>>108284603
>>108284603
>>108284603
>>108284603
>>108284603
>>108284603
>>
>page 2
new record I guess
>>
>>108284542
https://vocaroo.com/1nqhzME7bppB
>>
>>108284610
still no updated news either
>>
>>108284610
You'd think he'd remove the /lmg/ card link too. I wonder how far behind the news section will get before he gives up.
>>
>>108284604
retard
>>
>>108284610
The important thing is that it has another image from yesterday's /r/LocalLLaMA taken without context.
>>
>>108284621
*malicious troll
there's a very meaningful difference
>>
>>108284604
>op is a picture from reddit
>first reply is the picture from reddit
>>
>>108284398
I guess this is a jailbreak for qwen3, sort of. at least that's what I think brave is using here.
>>
>>108284629
>reddit bad
>yet knows everything that's going on there
sus
>>
>>108284637
Once you know some behavior is coming from reddit, it takes 5 seconds to double check. Even so, search engines are still a thing even if you zoomers are incapable of using them.
>>
>>108284637
You don't understand. Because I visit reddit I don't need the reposts here.
>>
>>108284637
>reddit bad
if you avoid political subreddit, yeah reddit is all right
>>
It is within our power to ignore the thread.
>>
>>108284644
>>108284645
funny how you can see the honest vs dishonest man so easily here
>>
File: tensor.png (33.9 KB)
ikbros? They are catching up to us.
>>
>>108284703
>1.4x inference speedup
holy shit, let's fucking go dude!
>>
>>108284703
God damn, that t/g speed up.
>>
>>108284558
man the problem is that i dont know german kek
>>
>>108284398
>>108284631
>literal brave search ERP
im howlin
>>
>>108284952
Imagine paying for api when search is free.
>>
>>108284553
I get it if you aren't a native English speaker.
>>
>>108284604
schizo
>>
What happened?
https://xcancel.com/JustinLin610/status/2028550818035843144
https://xcancel.com/JustinLin610/status/2028865835373359513
>>
>>108285586
Who is he? Like a researcher or something?
If so, poached by another lab most likely.
>>
>>108285598
head of qwen
>>
all still mogged by api to translate japanese stuff, or prompt issue, idk honestly
>>
>>108285620
Oh.
Booted by the board then, probably.
>>
What the hell is up with all the Qwen 3.5 praise?
All it does is Wait, Wait, Wait, and repeat itself, both the dense 27B and the 35B MoE do that. Won't even bother testing the bigger ones. Into the garbage bin.

Why would anyone use this when the small Mistrals, GLM Flash and Gemma exist?

>>108285586
Probably stepped down out of shame for the last release
>>
>>108285670
skull issue
>>
>>108285670
Don't let it start the reasoning with its default pattern.
Use the base model if 35B.
>>
>>108285673
I'll stick to big boy 4.7 thank you very much.
But if vramlets can cope with their officially recommended repeat and presence penalties and disabling thinking in exchange for garbage outputs even at full precision, all power to them.
>>
>Qwen3.5 base
That's just an instruct model in disguise!
>>
>>108285586
More Qwen departures
https://x.com/kxli_2000/status/2028880971945394553
>>
it's qwover
>>
i boughted 2x48gb
