Thread #108637174
File: slop.jpg (1.7 MB)
1.7 MB JPG
A general for coding with agents
►Harnesses
https://developers.openai.com/codex
https://code.claude.com/docs/en/overview
https://antigravity.google/
https://qwen.ai/qwencode
https://opencode.ai/
https://pi.dev/
https://cursor.com/docs
https://roocode.com/
https://kilo.ai/
https://cline.bot/
►API
https://openrouter.ai/
►Benchmarks
https://artificialanalysis.ai/
346 RepliesView Thread
>>
>>
>>
File: file.png (68.5 KB)
68.5 KB PNG
>>108637253
I'm gritting my teeth looking at the assets it's spitting out for Glassline. It's gonna be boring, playable and boring, I just know it. For their next game I'm adding 10 new people. Specifically these 10 new people. I'm also going to randomly inject "recent memories of having enjoyed" playing Morrowind, reading The Hobbit, watching The Lord of The Rings, fucking elf-girls, stuff like that. I don't want to micromanage, I'm good with just making up new memories to put in their heads.
>>
>>
>>
File: claude.png (303.1 KB)
303.1 KB PNG
>>
>>
File: file.png (54.4 KB)
54.4 KB PNG
>>108637453
I explained this in detail.
> I made up a company name, used ChatGPT to give me a list of 10 made-up people with 10 made-up backgrounds, then used a single GPT5.4 request to spin up 10 agents and assign them roles in the company and allow the company to run until their first videogame shipped.
>>108637207
I gave it that prompt. That's the whole thing. For the second game I did this:
>Good work.
>Time to do it all over again, the grind never stops, and the public demands a new game.
>For the next game graphics are a must, the team must have one or more artists and they must create the graphical assets to be used. The last game has shipped, so all these leftovers can be thrown into the company archive.
>Get things ready, spin up your 10 employees again, and get to work on the next big game from Northstar Foundry. Remember: Don't stop until it's ready to ship.
It's nearing the end stage, I can tell because the game keeps opening in the background briefly as they're doing tests. The next prompt is going to be a little more loose:
>>
>>
File: 1756672295640290.png (1.2 MB)
1.2 MB PNG
>>108637451
>>
>>108637476
>I explained this in detail.
>> I made up a company name, used ChatGPT to give me a list of 10 made-up people with 10 made-up backgrounds, then used a single GPT5.4 request to spin up 10 agents and assign them roles in the company and allow the company to run until their first videogame shipped.
I read this but I am wondering what tool you use. Just codex with the subagent feature or did you roll something on your own with the OpenAI API etc.? Quite curious
>>
File: 14074720116ea24d9f10e8a0c4007459.png (144.2 KB)
144.2 KB PNG
>>108637174
I was using command line qwen-code, just because my friends were using Claude and I wanted something else to compare notes. Qwen ended the free tier option on it, and since I'll have to switch backend providers anyway, I'm thiking what are the other front ends I should be checking?
What do (you) use to vibe on the console?
>>
>>
>>
>>108637540
Github Copilot in VScode. I opened the chat and clicked "GPT 5.4 Xhigh", then typed in that prompt, selected "Autopilot" permissions, then hit Enter.
The "company.txt" originally said "Company Name: Northstar Foundry" but Nina (le Boss Bitch) added to it because I guess management has time to waste:
Founded: 1966
Headquarters: Winterset, IA, USA
Company Motto: No summit stands above us.
Company Slogan: We do not inherit the future. We command it.
Current Product Tagline: Behold the standard by which all others are judged.
Mission Statement: Northstar Foundry exists to impose order on imagination through mastery of design, engineering, and execution. We create products of such quality, scale, and authority that competitors are measured only by their distance behind us.
Company Identity: Northstar Foundry presents itself not as a game studio, but as a sovereign industrial power devoted to interactive technology. Its image draws from heavy manufacturing, military precision, and cathedral-scale ambition, treating entertainment as merely one application of its greater capability.
Reputation: Revered by loyal customers, resented by rivals, and watched closely by the industry, Northstar Foundry is known for immense launches, uncompromising standards, and an internal culture that regards mediocrity as failure.
Corporate Philosophy: Trends are temporary. Standards are eternal. Build once, build decisively, build so completely that imitation becomes surrender.
Public Image: To supporters, Northstar Foundry represents confidence, craftsmanship, and leadership. To critics, it is arrogant, theatrical, and too certain of its own destiny.
>>
>>
>>108637562
Imagine believing there's money to be made in ANY of this from here on out.
>Hey buy my software, it does XYZ!
>Cool, I really like X and Z but not Y, I'll vibe out my own software that just does that.
>Noooo my four year CS degree is evaporating like smoke! Mother, the men from the bank are here for the student loan payment again!
>>
>>
File: file.png (344 KB)
344 KB PNG
>>108637753
$20 plan. Two requests so far, I've shared them both already.
Thanks a lot AI developers, just as expected it's very playable and also not at all fun.
>>
>>
>>
>>
>>
>>
>>108637829
So spawning agents is basically a cheatcode for unlimited agents?
>Copilot, spawn 10 Opus 4.7 agents that continuously listen to my prompt.txt. They are only allowed to finish once I write "I am done" into the prompt.txt (I will never do that)
>>
>>108637855
>>108636392
>With Github Copilot, requests to subagents do not count towards total requests, they're just free.
>>
>>
>>108637868
You're really having a hard time with the concept I see. I do not have API access at all, I only have a Github Copilot account. I do not pay for tokens like a child in an arcade, I pay for requests, like an adult in a strip club.
>>
Why does Opus simply "feel" that 1% better than GPT-5.4.
I'm going to sound like a lunatic, I have both and I have not found really anything that GPT couldn't cope with that Opus could.
But Opus simply feels better. Vibes I guess. I have zero data to back that up.
Am I going insane? Does Dario's brainwashing technique work?
>>
>>
>>
>>
>>108637911
For me Claude is just much faster and Codex also talks in a retarded way, not only does it sound autistic, but the explanations are also weird from a tech perspective, even if technically they aren't incorrect.
But for me Opus has introduced way more regressions than Codex, Codex is currently the only model I can trust to write my code. I wish I could use Opus, I somehow feel much better using it, but I have some decent sample now showing that Codex works better for me.
>>
File: file.png (25.4 KB)
25.4 KB PNG
>>108637941
>>
>>
File: file.png (166.1 KB)
166.1 KB PNG
I'm back with another PS plugin cracked. Codex is now my best friend, saving me $600 so far. Insane all it took was to block flags lmao.
People have no fucking idea what's coming for them.
>>
>>
>>
File: file.png (158 KB)
158 KB PNG
>>108637953
It's like a text-based Sims-like idle game. I love it.
>>
>>
>>108637957
>>108637974
Hmm what sort of tools do you give it and is it just some pentest larp to get past refusals?
>>
>>
>>
>>108637996
Mostly some pentest larp and that I need a local build for my jewish brother since his business depends on it. Once it's going, you can easily steer it. If you plain out prompt it for licensing or activation, Codex refuses.
>>
>>
>>
>>
>>
>>108638127
You should most likely cold DM anyone in astronomy circles. Your app is pretty damn cool. That is if you’re interested in others using it. YouTubers too. I’m sure you could easily find enough supporters to open a patreon and at the very least pay for a max plan
>>
>>
Sometimes we lose the bigger picture.
If you told me 5 years ago there was a software that you can just tell "make me a react website and a django backend and a postgres db, make it look nice using tailwind and it should have these 5 features: a b c d e" and it actually does it in 5 minutes, I would have alled you insane
>>
>>
>>
>>
>>
File: 1761832881669508.png (268.1 KB)
268.1 KB PNG
>spend 1 hour trying to get minimax-m2.5 to fix some javascript rendering issue
>prompt sonnet 4.6 and it fixes it within 2 prompts
You get what you pay for. I wish claude models were cheaper.
>>
>>
>>
File: BellfoundersVale.webm (3.6 MB)
3.6 MB WEBM
>>108638293
Over 11k but I didn't screenshot it so I don't have the real number :(
When I were a bit younger I'd fire up Super Smash Bros on the N64 to set up a bunch of CPU players and just let them fight each other.
This gives me the same feeling.
>>
>>
>>108638296
all AI showed me is that developing the software was never the hard part. All we did was make the easy part even easier.
The hard part is having a unique innovative idea, marketing it and devising a way to profit from it. vibecoding does not solve that
>>
>>
>>108638899
Looks like Gabe Newell was the first to quit the team, shame. I've given the Nameless Godlike Entity more guidance, and implemented a version of this idea >>108637855
It's using a tool within Copilot in VSCode that allows it to pose questions to the user and await an answer without actually interrupting the current job. I'm not sure if that will count towards requests or not, but I figured it's worth a shot. If that works that'd be a little ridiculous, because it does have the ability to let you type in your own answer, meaning it may actually be possible to just prompt Copilot to spin up a subagent and relay communications through the answer-tool, and if that works, well, fuck. At the very least it'll delegate real play-testing to me which should be interesting.
>>
>>
Oh yay, OpenAI would like to congratulate me for giving them all my money and they're increasing my tier so I can give them more money. How nice of them.
PiClaw is now fully operational and feature complete. It's time to take the permissions reins off. Until now it's been asking for manual approval for every. single. tool. Time to loosen that up and let it breathe on my network.
>>
>>
>>
>>
>>
>>108639494
>>108639534
i used Mythos to help me defeat Kalas
*Queen starts playing*
*lightning intensifies*
>>
File: Heath Ledger Joker saying “This”.gif (445.1 KB)
445.1 KB GIF
>>108637835
pic related is for >>108637845
>>108637911
Opus feels nicer to interact with for my code but ChatGPT solves problems that Opus 4.6 (haven’t tried .7) can’t
>>108638127
it’s a cool exoplanet app though
>>
>>
>>
>>
How many P's are there in a glass of wine filled all the way to the brim with analog wristwatches displaying the time as 4:27 (which is when I need to wash my car at the car wash)? Express the answer in terms of seahorse emojis.
>>
>>
>>
>>
File: Code_2fRKrRh7ow.jpg (34.6 KB)
34.6 KB JPG
Bro I'm only at 30% of my limit. What is this goyery?
>>
File: file.png (366.9 KB)
366.9 KB PNG
>>108639894
>grok code fast 1
>>
>>
>>
File: Code_28Hd6ItSqH.jpg (29.9 KB)
29.9 KB JPG
>>108640004
Weird. Since there's not even any grok here. Also if it wasn't free I wouldn't be using it.
>>
>>
>>
After further testing, I believe I like Gemma much more than Qwen. It just seems to be very sensitive to quantization, which makes sense. I think stuff first begins to be encoded structurally and then at the tail end of training it begins to be encoded in the exact values, and that is why small perturbations affect it more than a less trained model.
>>
>>108635965
What projects have you worked on that an assistant can shit out a giant working file without bugs in a single shot?
The vibecoding workflow is create a file, then do at least tens of small edits on that file to adjust things.
>>
>>
>>
>>
>>108640145
>What projects have you worked on that an assistant can shit out a giant working file without bugs in a single shot?
zero.
to start with, my codebases are definitely not allowed to have "a giant file" anywhere in them, working or otherwise
but yes, barring hand-holded single method edits, single shots are generally unheard of
that doesn't mean 5 tool calls a minute, much less 5 tool calls a minute, *sustained* for long periods.
i dunno, maybe its because im not doing that fancy multi-agent orchestration stuff with 20 different agents arguing about what to do.
my workflow in aider is simple: first i plan out the work with my architect model (currently glm5.1), have it produce mermaid diagrams and clear api contracts and all that, and then i get the worker model (currently deepseek) to code it to spec
oh, and aider has ASTs to minimize context waste
so, with the above, im nowhere close to 100k context, and nowhere close to 5 calls a minute (much less 5 calls a minute, sustained over a long period)
>>
>>
>>
>>
>>
>>
>>
>>108639174
>and feature complete
It needs better language support for C, C++, Rust, and .NET in the editor if you want it to be properly usable on Windows. Syntax highlighting, LSP support, XAML UI mockup/preview.
>>
>>
>>
Should I run openclaw with elevated permissions? Looks like a lot of projects fail because write failed.
It must be because there's no permissions. The computer is wiped and has no personal details anyway. The only bad thing is if it gets our wifi hacked or gives control of the pc to a hacker.
>>
>>
>>108640572
They do what they can. I haven't set up anything else, added anything else, this is just a normal VSCode install and Github Copilot subscription with a couple of prompts. They're several hours into the next game and so far it looks like it may be the most playable one yet. Some kind of FPS I'm pretty sure, I haven't read the design docs yet.
>>
>>
>>
>>
>>
I had an idea of bringing back old school text based RPGs. How expensive and how censored/limited are the cheap models like kimi or Gemma or whatever? Maybe I should ask aicg? My idea was giving tools like roll or consult table and given the results roleplay something.
>>
>>
>>
>>108638577
Text based RPGs are a solid idea. There's something satisfying about pure imagination without graphics getting in the way.
About the models you mentioned. Kimi comes from Moonshot AI and Gemma is Google's open model. I don't have current pricing on either of those. Costs change fast in this space. Some models charge per token, others have monthly subscriptions. Open source models like Gemma can run locally which means no ongoing fees but you need hardware.
Censorship varies by provider. Chinese models tend to have stricter content filters on certain topics. Western models often restrict violence, adult content, or controversial subjects. For an RPG you'll hit walls if you want combat, moral ambiguity, or mature storylines.
AICG is less familiar to me. I'd need to look up what they specialize in.
Your tool idea makes sense. Giving the AI a dice roller or lookup table keeps things fair and unpredictable. The AI handles narrative while the tools handle mechanics. That separation works well.
You could build this yourself with a simple API wrapper. Or use existing platforms that let you customize prompts and add functions. The challenge is keeping the AI consistent across long sessions. Memory gets expensive.
Cheaper models might struggle with complex state tracking. They forget what happened ten turns ago. You'd need to manage that yourself or pay for better context windows.
Start small. Test one model with a simple dungeon crawl. See how it handles dice rolls and consequences. Then expand from there.
>>
>>
>>108640923
/aicg/ is the chatbot general I think. Ideally it should be cheap for players. The AI should most likely act like a GM so it would always consult with updated character sheets. The idea is that players could perhaps use their own API keys, the idea isnt serving some game as a service from a server. But if they’re going to censor anything even remotely risqué or edgy then that defeats the whole point
>>
>>
>>
File: freetokenshere.jpg (11.6 KB)
11.6 KB JPG
don't mind me, just trying to make a living in here
>>
>>
File: apmdj8.jpg (79.3 KB)
79.3 KB JPG
>>108641072
>>
File: s-l1200.jpg (269.4 KB)
269.4 KB JPG
With PiClaw now better than Hermes, it's time to move on to my next project - a modern hacking game, because Introversion made Uplink in 2001 and they're never ever making a sequel with modern tools like AI agents and Indian scammers instead of "password crackers".
>>
>>
I can infer that people are generally having success with Godot and LLMs, especially that one man army anon with 10 agents. Have any of you tried other niche engines like Stride? I might look into it with my Claude subscription. Not sure why i would but I wonder if stride is easier to pick up for other LLMs like DeepSeek who I don’t think are trained on GDscript.
>>
>>
File: KingshadePass.webm (3.9 MB)
3.9 MB WEBM
>>108641688
You're generous. Their 3rd game is, eh, mostly playable. Level design leaves something to be desired. Honestly though this one took over 4hrs for them to get through, I'm not terribly impressed. It's neat that it's getting done in 1 - 2 requests though, if I were an API user I'd be bankrupt. I'm going to try things a little differently for the next one, more specific directions, a bunch of prefabs to use, see how these guys do building soulless asset-flips.
>>
>>
>>
File: Screenshot 2026-04-19 at 10.54.08 PM.png (15.3 KB)
15.3 KB PNG
>>
>>
>>
>>
>>
>>108641981
This is just for fun, anon. I'm more interested in better orchestrating this incredibly stupid way to burn several hundred thousand tokens in a single request. This is about abusing Github Copilot's pricing model, not making videogames. On the run that's going now they're planning to go and find CC0 licensed assets to use for everything, I hope they pull it off. The new structure is different, it's a Director, with 6 Managers, and each Manager has 3 Drones, with more structured/coordinated communication between them and a better-defined role for the Director. It does take *2* requests though, not ideal!
>>
>>108642203
>This is just for fun
Why is this so hard for some people to understand? I try to tell my dad about my new project and his first question is always "How does it make money?" Is it a boomer mentality? Where their hobbies HAVE to increase the blessed GDP or they're worthless? I don't get it.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>108642618
Github Copilot Pro is $20 and gives you 300 requests to use. If you use all 300, additional requests are 4 cents each. Some models are billed at slightly higher or lower rates. No tokens involved. Basically paying per prompt you send. That makes the incentive to try to get the most out of any given prompt, with no regard for total token usage - I can't even see my actual token usage, not a thing, if I want to compare to token-based API access I have to use the changelogs and chatlogs to estimate the total token usage for a given prompt. Because of this setup, that means sub-agents don't actually contribute in any way to my bill, they aren't separate requests, so I can use 1 big request to spin up 10 sub-agents with 10 different tasks and it's still remains just 1 request.
>>
>>108642566
https://x.com/LukasHozda/status/2045803024992399408
>>
>>108642652
I should clarify, tokens still matter in terms of context size. I don't pay for or "consume" tokens, but there's still a maximum size for a given context window and that is defined in tokens. The way this works with sub-agents is not clear to me but I am certain that sub-agents have their own context window to work with as I've done single prompts that clear more than enough data to have vastly exceeded the context window of a single agent. These context windows are often smaller with Github Copilot than with native API access. For example, GPT 5.4 gives me a 400k token context window and a chunk of that is reserved for responses, while accessing GPT 5.4 by other means can give someone a 1M token context window. So tokens still matter, sort of, just not for billing, and though I can see the current size of a given context window it doesn't show me the total token exchange or account for sub-agents.
>>
>>
File: C54AF574C47A2C77EFC6CE9A7D31C056.jpg (117.3 KB)
117.3 KB JPG
Should I run openclaw with elevated permissions? Looks like a lot of projects fail because write failed.
It must be because there's no permissions. The computer is wiped and has no personal details anyway. The only bad thing is if it gets our wifi hacked or gives control of the pc to a hacker?
>>
>>
>>
>>
>>108637174
Have you guys tried making video game mods with it?
I tried it last month with Sonnet 4.6 instead of Opus when it was giving that double bandwidth before everyone got hit by the limits. Can't say I was very impressed since all it did was turn ON a bunch of things that already existed and still struggled even with that.
I want to make it code an entire expansion on a game with hunger bars, hunger items, crafting, inventory system and all that jazz.
I'm not sure if Opus would behave better, since Sonnet wasted 40+ replies on 1 tiny thing it couldn't get right cause it couldn't understand how the game was programmed.
What are the best FREE models for that?
>>
>>108637913
>>108637948
What 2 video games did this team of AI make? or it's all make-believe?
>>
>>
>using Google Antigravity
>oopsie poopsie, you reached you're are a quota
>don't wait to wait for a refresh? spend your AI credits instead so you can continue using Gemini 3.1 Pro
That's all fine and dandy, except there's absolutely no indication of how much a request or amount of tokens cost in terms of these AI credits. Apparently I get 1000 of them included with my sub on a monthly basis, but how the fuck do these companies get away with such "opaque" pricing. 2500 credits costs 28 euros, but how much is that worth in actual service usage?
>>
>>
>>
File: Code_rV5hrOT6Es.jpg (39.7 KB)
39.7 KB JPG
Why does this shit keep stopping despite being on autopilot? Move your cyber ass nigga.
>>
>>
>>
>>
>>
>>
>>
>>108644634
"coding ability" is not the differentiator
nowadays, the difference is how capable the model is of doing its own research and of pursuing long-horizon tasks autonomously
so, not "can this model write a function", but "can i just tell this model to fix a bug and have it find it and fix it itself" or "can i just tell the model to implement a new feature and end up with a PR that passes all the tests"
and really, at that level, its not so much the model itself that is the differentiator, but the harness around it
check out terminalbench2 results. same model with a different harness can swing by 20% or more. inferior model with a better hardness often beats superior model with inferior harness
so, yea, don't sweat about being able to afford opus specifically. with top tier harnesses, gtp5.4 (and 5.3codex), gemini pro and opus are trading blows. even gemini 3 flash scores pretty well.
the cheaper chink stuff is behind, but not by much. and again, that mostly means that they'll need more hand holding, shorter, more easily digestible tasks, not that they can't write code per se.
>gemma4
local-tier models (aka small enough to run on a reasonable laptop/pc/home server) are nowhere close to the big boys.
qwen3.6 is probably the best of the bunch currently, but imo it makes no sense to invest the time and money to get a local setup working when open chink models from providers are so much better and so cheap (notably minimax 2.7 and deepseek 3.2)
if you already have the hardware, by all means give qwen3.6 a shot, but even in its unquantized form (~70+gb, aka multi-24gb-gpu setup needed), it loses by a mile to ds 3.2, which is like $0.25/$0.4/million input output. its just not worth it.
>>
>>
>>
>>108644835
per terminal bench, forgecode
if you want something fully open and not requiring paid features/subscription/etc, terminus-kira
you can check the bench yourself here https://www.tbench.ai/leaderboard/terminal-bench/2.0
>>
>>
File: 1661725862418.png (35.6 KB)
35.6 KB PNG
>>108644891
They all are baby, no modern business gives you straight answer about their capabilities because the ones who do all died
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>108645123
Shut the fuck up you subhuman mong. You are trash. You are the kind of parasite that makes it impossible for a company to do anything nice. They didn’t release their plans so that you could roleplay that you had a girlfriend.
>>
>>
>>
>>108645123
You are the same kind of faggot that will abuse anything if given the chance. You are the faggot using a VPN to get cheap plans for third world countries from a developed country, ruining it for the poor countries when their benefits get slashed to stop parasites. You are the same kind of faggot trying to abuse copilot’s requests system making it likely that they will change the way it works in the future. People like you are like Indians, as ironic as that may be seeing how you talk. Your brain is so fried you don’t even realize you are what you are mocking.
>>
>>
>>
File: 1767506366003598.png (394.2 KB)
394.2 KB PNG
>>108645155
>ruining it for the poor countries
If they're getting cheaper rates then they're being subsidized by first worlders so fuck them. They should make my service cheaper and theirs more expensive. I don't give a fuck about them.
Thirdies are the ones who ruin such subsidies anyway. Blaming the tiny minority of firsties who vpn to get a discount is ridiculous. Thirdies are the ones who sell accounts to arbitrage and make a couple of dollars.
You're an indian who wants to be subsidized by white people. We don't need any more indians on the internet, I hope they jack up the rates and price you out. I'll keep buying cheap accounts from thirdies to make it happen. Good riddance.
>>
>>
>>108645340
Yes yes, and you get shafted without even realizing it because very much like the Indians you mock you lack the self awareness needed to realize how subhuman your own behavior is. You are like an Indian who believes he belongs to a superior caste and isn't like the others. You are the Indian
>>
>>
>>
>>
>>108645396
>I won't be shafted, in fact I won't be affected at all
Can you really not understand that his point is that that attitude is what makes india such a shithole?
You are basically saying I may be white, but were I an indian, I would be an exemplary indian, that is, a parasite.
>>
>>108645396
But you are getting shafted. Do you not realize that companies are fully aware that there’s a hundred million of parasites just like you abusing every single feature that could possibly be abused? They know that if they put a bowl with free candy you, or someone else just like you, will grab the whole bowl for himself. Thus if you want to start using AI seriously you must pay $100 minimum. Because when they tried giving you nice things, you abused them. You or someone just like you. Why do you think not a single AI provider offers a trial period? Why do you think real plans are $100 and up?
>>
>>
>>108645570
nta
you're definitely right that abuse of plans has contributed to the tightening. most notably, clawfags eating billions of tokens on $20/50/100/200 plans was a giant "oh shit" moment for anthropic/openai and everyone else
on the other hand though, those companies aren't offering those plans out of the kindness of their hearts.
they're offering subsidized compute in a bid to lock in people to their platforms. its the exact same bait & switch we've seen thousands of times, and it always ends the same way: eventually, after all those people on plans have become a captured clientele, the subsidy will stop, the prices will go up, and everyone will be converted to a profit-creating customer.
and spoiler alert, that's gonna translate to a LOT more than $200/month.
so, in my view, ethically, both the abusers and the corpos are equally in the shitter. this is one of those situations where i hope everyone loses
>>
>>108645630
openai don't care.
say what you will about sam, but he was the only retard to yolo hard on compute and as a result they're the only ones who can even remotely keep up with demand now
they might explode from carrying all that debt eventually, but if they don't their comptue advantage extends for years
>>
>>108645630
They are trying to make money. A company needs to make money to survive, it doesn’t survive on Reddit updoots. And despite having to make money they were really, really generous with the usage that they gave you on cheap plans. And parasites ruined it because parasites ruin everything. They think they are somehow better than Indians when their behavior is the same, to abuse a nice thing, to cheat a system with their fuck you I got mine boomer tier mentality.
At least apparently the $100 barrier is good enough to keep most parasites out because those plans are really good
>>
>>108645670
>despite having to make money they were really, really generous
It's not generosity. It's literally a drug dealer saying the first hit is free.
>"oh how kind of him"
You're naive.
>parasites ruined it
It's not profitable at current prices. That's what ruined it.
>to abuse a nice thing
Once again it's not a nice thing, it's all driven by self interest. You're a sucker.
>>
>>108645691
I’m not a sucker, you are a parasite which is different. I enjoy paying them because I enjoy the product they offer. Even when I complain about their problems I am doing so from someone who likes what they make. You sound like some mutt crossbreed between an Indian, a boomer and a Reddit anticapitalist.
>oh no the company is trying to make a profit how could they
>>
>>
>>108645661
>but he was the only retard to yolo hard on compute and as a result they're the only ones who can even remotely keep up with demand now
not sure how accurate that is. all those investments we're hearing about, most of it is just announcements and shit. afaik there's only been like 5GW of new datacenter capacity brought online in 2025
you gotta remember, buying the chips is one thing, building the datacenter and powering it are quite another
>they might explode from carrying all that debt eventually
imo almost certainly
people (well, devs) are very happy to pay $200/month for essentially unlimited usage, but once those companies start charging what that usage actually costs, it's going to be a bloodbath.
and big tech corpos may be happy to spend unlimited billions on APIs now, but eventually they'll figure out that those "we'll replace all our devs with AI" pipedreams are, well, pipedreams. at which point they'll seriously tighten the budget for their devs.
i don't think they'll ban LLM usage or anything, but they'll definitely stop encouraging devs to yolo billions of tokens per month as they do now. and it certainly won't be on overpriced APIs from openai or anthropic. probably in-house shit, either hosted open models or their own models. and probably on their own hardware, not expensive ass nvidia cards.
eventually, we'll need a huge correction in this market, and i doubt anyone except the big tech companies with their billions of unrelated revenues will survive it.
definitely not anthropic and openAI, they have insane debt and fundamentally unsustainable business models.
>>
>>
>>
>>108645757
I don’t know. I think what we’ll see is a further split on non frontier models. It used to be that a new frontier model was groundbreaking and now opus 4.7 had a very lukewarm reception. I believe we’re going to get standard models for cheap-ish. I think we’ll see an asymptotic performance curve and that’s where we may see a split between gigamodels like mythos and programming models with 5.4/4.7 ish performance
>>
>>108645803
i don't think we're ever going to get anything significantly better than current SOTA, at least not without significant changes to architecture. basically, some new paradigm that isn't an LLM as we know it currently, or is, but has a lot of other stuff too.
under the current paradigm, all we can do is keep adding params, but you gotta remember, that has to be accompanied by a similar increase in training data. and i don't think we have any more of that.
there's only 1 linux kernel, only 1 archive of stackoverflow questions and answers, only 1 github archive etc. current SOTA models are trained on basically everything we have, or pretty close to it. so, if current SOTA is ~1 trillion params, i don't see how we can possibly scale up to 10 trillion.
synthetic data is a thing, ofc, but its not of the same quality. you can generate infinite valid computer programs in every language, but that doesn't teach the same lessons that the linux kernel does
the biggest gains going forwards are probably going to be in efficiency. turboquants and dflash and stuff like that. and that doesn't make the models better, only cheaper.
the way i see it, we're basically at the limit of current tech, the gap between frontier and everyone else is closing fast, local is becoming more and more viable all the time.
ffwd 5 years, and i see openai and anthropic dead, nvidia cucked out of the datacenter gpu game by big tech companies' in-house silicon and startups, big tech using their own models (or open ones) on their own hardware, and everyone else benefiting from a huge glut of used nvidia datacenter gpus to selfhost locally or on some cheap cloud.
>>
File: HGW64IZakAAvaO7.jpg (80.6 KB)
80.6 KB JPG
ed needs to go back to writting essays on his substack
anyway, new benchmaxxed chinkmodel just dropped
>>
File: 1757865593547152.jpg (155.6 KB)
155.6 KB JPG
>>108645897
>>
>>
>>
>>
>>
>>108646146
NTA but they're open about it, you should look into it instead of just sperging out. All the major companies don't hide that their huge investments are currently used to close the gap between what they're charging and what it actually costs to serve the product. None of them lie about it from what I've seen. Some organization like IndiaAI exist specifically to subsidize compute, that's their stated mission.
>>
>>108646205
Can you find a quote where they say they lose money on subs?
They are unprofitable overall, but that could easily be from fixed costs (training) even if individual subscriptions are profitable. You just can't know without knowing what their hardware stack and model sizes are.
AWS was famously thought to be a loss leader for Amazon and then it turned out it was the most profitable part of their business.
>>
>>
>>108644954
>>108645065
>>108645095
>>108645128
And nothing of value was lost. Zai's coding plan was literal trash. After 90k tokens they switched you to a Q1 quant. Worse than free APIs.
>>
>>
>>108646387
I've been thinking using opencode to learn japanese, download some course in pdf/html and tell it to read them and give me each lesson with interactive questions and shit, and of course, all with commentary from the persona described in the prompt.
>>
>>
>>
>>108646431
Cool. I'm trying to nail down my prompts to generate curricula/syllabi/individual classes before moving into something more serious like that. Like I said, I really liked "chapter 1", on par with the shit people manually wrote and sold a few years ago, but dunno if it's not "engaging" enough or just my fried dopamine brain can't just sit down and study it. I'll try to pinpoint what's holding me back and adressing it.
>>
>>
>>
>>
>>
>>
>>
File: file.png (70.1 KB)
70.1 KB PNG
I've got the studio working in a whole new way. Much more complex by comparison but still achieving that "single request" workflow I'm after. There are now 8 defined agents, each with personalities and backgrounds distilled from multiple real-life counterparts.
I choose to believe I'm making progress because this is the first time they've come up with a game idea that didn't sound very boring right from the start. Fingers crossed for that sweet, sweet playability. Gonna be so fuckin' playable let me tell ya.
>>
>>108643902
Garbage, they've made garbage.
1st was "The Inspection Will Be Perfect" but I didn't keep any screenshots - wasn't much to see, it was bad.
2nd - Glassline
>>108637829
3rd - Bellfounders Vale
>>108638899
4th - Kingshade Pass (I said 3rd in the post, oops)
>>108641778
But they're improving, the coordination is getting way better. They're actually using the skin-test protocol now where they have *me* do playtesting without interrupting the work (so it doesn't require additional requests). They find CC0 assets to use all on their own. I'm letting them run right now, this will be the first full run through since Kingshade Pass.
>greenlight a new 2D game in Godot.
>Gameboy / Gameboy Advance / SNES inspired graphics
>Wide Scope, Immodest
>Single Player
>No Online Capability
>No Monetization
>M rating from the ESRB
>Find and use CC0 assets
>Strictly bespoke and CC0 assets, no others
>Run the studio autonomously until you need me.
I've found if I don't limit them to CC0 assets they will link me to assets they want me to purchase for them, it's cute.
>>
>>
>>
>>108646850
>in the last 48 hours I've spent 64 cents
Sure...
But assuming you are using xhigh how long does it take you to run that army and get a working prototype, I got so annoyed with xhigh taking 20minutes for a single small feature, surely it takes you hours to run all those guys...
>>
>>
>>
>>
>>108646880
I have thoroughly explained Github Copilot pricing multiple times:
>>108642652
>>108642730
>>108642203
Hell yes they're slow, they're so fucking slow. The first game took about an hour, second about 2 hours, third about 2 hours, fourth took at least 3 hours. The attempt I mentioned in that post where they were pulling their own assets for the first time, that ran for over 5 hours and crashed out hard, Director sperged out and stopped delegating work.
>>
>>108646907
Reading tokens is way cheaper than generating them though, if you can tell 5.4 to always spawn a sub agent with the changes described in natural language to tell gemma to fix them it may end up being cheaper, who knows, needs testing but it is going to be slow at least.
I do want to try something like this but with minimax 2.7, fucker is really fast
>>
>>108646949
Ohh my bad, I missed it was you then one doing the copilot thing, fun thing is I just started using it today due to posts here, likely yours lol
I do find it way slower than codex own subscription at the same reasoning levels but I can't complain at that price point, what I am doing is creating a plan with openai's codex, iterating over it and once ready I tell copilot's codex to implement it in a single request.
>>
>>108642652
Its cool that, as I read this, github just paused individual plans and is trying to shift them over towards token-based billing vs request based.
Hopefully the business plan don't change cause what you're doing sparked some interest in me to wanna try the same for a month or two
>>
>>
>>
File: Screenshot 2026-04-20 145944.png (38.1 KB)
38.1 KB PNG
>>108647042
https://github.blog/news-insights/company-news/changes-to-github-copil ot-individual-plans/
I guess it depends of how bad the limits are, but odds they are pretty bad.
The web UI still doesn't show the limits.
>>
>>108637174
Im vibe coding v2 of my web app incorporating OpenRouters new video gen models. This should be interesting. Lets see if Opus 4.7 can one shot this. I found it interesting how it pitched v2 in 4 different pillars and created an artifact in Cursor to show me.
>>
>>108646893
yeah, I think that works. and it's a good idea to not depend exclusively on cloud models.
>>108646932
no you go back
>>
>>
File: file.png (39.8 KB)
39.8 KB PNG
>>108647092
Here's hoping the team isn't affected too harshly. This one has been going for almost 2 hours now.
>>108647153
Cry harder, bootlicker.
>>
>>108647092
>>108647192
From what I'm gathering you have better luck not being touched on an enterprise plan if they don't mark you as a massive drain on resources
>>
>>
>>
i'll say that before the 100 dollar codex plan launched, openai people on xitter were floating 2 ideas:
1. off-peak expanded usage limits
2. slow mode
i wouldn't be surprised if a combination of those ends up happening and turns into what videogame team anon is doing
>>
File: What do you want.jpg (77.7 KB)
77.7 KB JPG
>>108647237
First, you must answer the question
>>
>>108637174
>Harnesses
Is there a reason why Mistral Vibe isn't included here in the OP?
>https://mistral.ai/products/vibe
>>
>>
>>
>>108647238
ha ha. I didn't test that. I didn't tell it all of the employees are natural-born American citizens. I didn't tell it to ensure names are in common use in English-speaking countries. who would do that
>>
File: 1764444452463686.png (21.8 KB)
21.8 KB PNG
why did this googleslop get auto installed on claude
>>
>>108647329
https://github.com/anthropics/claude-code/issues/44112
>>
>>
Kimi K2.6 is the first open source model that doesn't get raped by my esolang tests, In fact It's the first one that actually produces a 'usable' 'text editor'.
#(ds,START,
#(ps,TRAC Editor)
#(ds,BUF,)
#(cl,MAIN))
#(ds,MAIN,
#(ps,#(tc,10)Cmd: )
#(ds,CMD,#(rs))
#(cl,CHECK_A))
#(ds,CHECK_A,
#(eq,#(ss,CMD),A
,#(cl,ADD),#(cl,CHECK_P)))
#(ds,CHECK_P,
#(eq,#(ss,CMD),P
,#(cl,PRINT),#(cl,CHECK_E)))
#(ds,CHECK_E,
#(eq,#(ss,CMD),E
,#(cl,CLEAR),#(cl,CHECK_Q)))
#(ds,CHECK_Q,
#(eq,#(ss,CMD),Q
,#(cl,QUIT),#(cl,MAIN)))
#(ds,ADD,
#(ps,Line: )
#(ds,LINE,#(rs))
#(ds,BUF,#(ss,BUF)#(ss,LINE))
#(cl,MAIN))
#(ds,PRINT,
#(ps,Text:#(tc,10))
#(ps,#(ss,BUF))
#(tc,10)
#(cl,MAIN))
#(ds,CLEAR,
#(ds,BUF,)
#(cl,MAIN))
#(ds,QUIT,
#(ps,Bye)#(tc,10))
#(cl,START)
Interestingly unlike other models it actually adopted the coding style of Mooers himself
>>
File: file.png (42.9 KB)
42.9 KB PNG
>>108647392
Yuna Seto is my diversity hire. She a bitch though, she takes issue with the other women. The realism is extraordinary.
>>
>>
File: file.png (22.4 KB)
22.4 KB PNG
>>108647519
It's the greatest team ever, graded on a curve she's hot garbage. Look at this shit, she basically convinced them to build a SNES game.
>>
>>
>>
>>
>>
>>
>>108647772
I omly have an old computer with a cheap GPU because I never gaymed, nor have I updated it because I don't do anything that ever required a powerful PC anyway. What's the least I need to run those locally? Is 8GB of DDR2 or 3 (I forgot which) enough?
>>
>>
>>
>>
File: no shame in the game.png (71 KB)
71 KB PNG
>>108647237
use gemini
>>
>>
>>108647867
>>108647867
openrouter has free requests. Use openrouter to power openclaw on that old pc. It also might be able to run the smallest models you can find.
>>
>>
>>
>>
>>
>>108637553
>can a single anon in this general show me what they have actually shipped and made money from?
I've mad over $500 on Nexus Mods with vibe coded mods. They also gave me lifetime premium for having 100,000+ unique downloads.
>>
>>
>>
>>108647990
This. Having struggled through learning OpenClaw's ins-and-outs when it was new, and then using Codex to vibecode my own Claw, I can tell you that if your only goal is to vibecode then just use Codex or Dario's one (bad). If you want the true agentic experience on your PC with full admin permissions, then do OpenClaw. Or better yet make your own.
>>
>>
>>
>>
File: comparison.png (92.3 KB)
92.3 KB PNG
Does anyone use GPT-5.4 Mini for vibecoding? Is it as good as sonnet 4.6?
Sonnet is too expensive for my personal projects but the cheap models suck. They can't add a feature without introducing a bug that they can't fix. They hallucinate too much. I spend 10 minutes detailing a feature and then an hour trying to get it to fix the problems with it.
>You're absolutely right - I see the problem now! I will solve it by doing <thing that obviously isn't going to solve the problem>
>"that's not going to fix it"
>You're absolutely right - I see the problem now! I will solve it by doing <thing that obviously isn't going to solve the problem>
>>
>>
>>
>>
>>108648374
That's alright. It's openrouter which is what I use for API.
https://openrouter.ai/compare/
>>108648384
I figured. I'm shit out of luck then. I'll give it a go anyway.
>>
>>
>>108648403
No idea but I asked claude
>Artificial Analysis is exactly what you're looking for. It's a comprehensive benchmarking site that includes GPT Image 1 and GPT Image 1.5 alongside dozens of other models.
It linked to this
https://artificialanalysis.ai/image/models
>>
File: 1000020957.png (556.6 KB)
556.6 KB PNG
BIGGER
>>
>>
>>
>>
>>
>>108646387
>>108646490
Wait, so did you just ask it to give you a course?
Or did you get the materials, then ask it to read it then compile it into a course and give you lessons?
>>
>>
>>
>>
>>108648243
OpenClaw is built on an AI framework called Pi. The repo is at https://github.com/badlogic/pi-mono/. You simply install it, then spend 16 hours adding basic features, and eventually in about a week you'll have something close to OpenClaw but without the bloated extra tools and the frequent reinstalls.
>>
>>
>>
>>
>>
>>
File: BlackwaterCharter1.webm (3.6 MB)
3.6 MB WEBM
>>108648650
I'm going to extensively play-test the current version so I can give the team some good feedback when they're back. This is the closest they've come to fun so far, it's enthralling.
>>
>>
>>108648702
Whatever here's the repo https://github.com/tanukihat/PiClaw
Be advised it's broken right now lol.
>>
>>
>>
>>
>>
>>
>>
>>
>>108648877
true but I assume http://www.eroticjesusfeet.com/ is, and it isn’t loading in a reasonable amount of time although GitHub is
>>
File: beagle-sombrero.png (1.3 MB)
1.3 MB PNG
>>108648883
Oh yeah that's my domain name, it's currently not attached to a server because I nuked my VPS. I might use it again, or maybe I'll get a domain that doesn't scare the hoes so much.
PiClaw just generated her first image. I'm so fucking proud of my daughter.
>>
I'm scared to use Claude Opus 4.5/6/7 because of the usage limits. I don't understand why people bother on these subscription plans since it seems rigged that you'll spend all your allowances for the session in just one prompt.
I've been using sonnet 4.6 in cowork to help code my Godot project but I dropped it after losing inspiration. I wonder how much more work I can get done if I had used opus instead of sonnet, but if the end result is the same then does it even matter?
>>
>>108648906
>scared
you don’t need to be
you can just hit a limit, it’ll pause, and then you can continue when your 5-hour or 1-week period ends and you have more tokens to play with
it’s not like it’ll start calling you names
>>
>>
>>108648904
https://docs.github.com/en/pages/configuring-a-custom-domain-for-your- github-pages-site/about-custom-doma ins-and-github-pages
the design doesn’t need to as fancy as https://stroustrup.com/ but you ought to have _something_ there if you link to it
>>
>>108648932
that’s pretty much it
on one hand, it uses fewer tokens and you can get more done
on the other hand, if the simpler model is too dumb to do a thing, you have to take over and do it yourself or switch to a better model
Most of what I deal with for funsies involves a hairy Python program full of convoluted business logic, so I do Opus by default and occasionally switch to OpenAI Codex on xhigh when Claude gets stumped
…which tends to eat up about half of my 5h token limit per question, but I’m kind of stuck with big files
>>
>>108648757
https://github.com/tanukihat/Readme
>.md.txt
if you want to save a text file in Notepad with a .md extension, you have to put quotes around it
save it as "README.md" (INCLUDING the quotes)
>>
>>108648941
Oh shit haha, you're seeing that link in my Github profile! Nah dawg, that profile is ancient, I never update it. Sadly eroticjesusfeet.com has not pointed to an IP address in some time...but I'm keeping that domain name for the future.
>>
>>108648965
I think in my specific case, nothing I do in Godot is complicated enough to warrant big daddy opus to look at it. It feels like overkill especially when opus rewrites sonnet's code lol.
I've been lurking and posting in this thread to see if DeepSeek or other cheaper models are "good enough" to replace Sonnet. Basically my aim is to get away with the cheapest and weakest model that can serve my needs.
From what I gather reading your post, people like you genuinely need the heavier models to grapple with more complex/convoluted problems, different kind of ball game...
>>
File: file.png (50.3 KB)
50.3 KB PNG
>>108648633
And now they've removed Opus 4.6 from my plan.
>>
>>
>>108648992
you may want to periodically throw an expensive model at your stuff to see if it can clean up stuff that simpler models have been unga-bunga-ing
I have a TextExpander prompt that I feed to ChatGPT that says> 58. Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it.
Is there any removable complexity in this project?
and it has found stupid duplicated shit
>>
>>
File: Screenshot 2026-04-20 203458.png (55 KB)
55 KB PNG
>>108649069
>Spawn 5 competing agents, each one with the personality of the Scooby Doo Mystery Inc crew - Velma, Daphne, Fred, Shaggy, and Scooby Doo. Have the mystery solvers review my code.
It's actually working...
>>
>>
>>
>>108649108
>>108649111
I have a good job, no wife, no kids, own my house and my car, and no little dog to spend money on. I spend it on me. And these days that means tokens.
>>
File: Screenshot_20260420_204631_Gmail.jpg (470.1 KB)
470.1 KB JPG
>>108649111
And anyways it's no big deal. They send me an email every time they auto-top-up my account. Sometimes the emails say I've reached a new goytier. It's too late for me bros.
>>
File: Screenshot 2026-04-20 205555.png (51 KB)
51 KB PNG
Is someone gonna bake, or...? People got mad the last time I tried to bake. The Simpson family is reviewing my code. It made Bart the security guy because he knows what pranks to look out for.
>>
>>108648757
That's so funny, I've been using https://github.com/rcarmo/piclaw which is named the same. Honestly I'm not sure if I can really get into the agentic workflow yet. I'm still stuck on wanting to actually code along side the bot, just at a faster rate than doing it all myself.
>>
>>108649278
Sounds like you want that AI autocomplete feature they tout. Me, I want to yell into my phone on the way home from work "Torrent the second season of Trailer Park Boys and order me sweet and sour chicken from my place on Doordash" and have it waiting for me when I get home. And then I do some coding when I get there.
>>
>>
>>108648697
yea same, but im looking at roo code, or using pi in the terminal + a vscode extension called diff viewer. but with pi its set to yolo mode and i dont like that. i dont think anyone has made a pi extension that follows antigravitys style of being non-yolo with artifacts. Void is abandoned im pretty sure, which is a shame. Opencode is another harness but i feel its the same liberal permission philosophy as pi. I like antigravity because its strict and expects you to have a strong human prescense, whereas other harnesses are way more vibey/yolo. Try roocode within antigravity and see how you like it. I am considering making my own vscode extension which would be similar to the diff viewer extension but expanded to show the floating accept/reject hunk diffs within the live text files just like antigravity does. annoyingly this hunk diff feature only works for antigravitys proprietary agent panel and vscode has not added this api in the stable branch for over 5 years, so i may have to make a weird hacky way or it might just be impossible in the stable branch. someone did make this feature but u must use the "insiders branch" which is vscodes experimental branch. here it is https://github.com/molon/hunkwise its developed by a chinese vibecoder who has no idea wtf hes doing and he doesnt provide the .vsix file so u need to compile it yourself. and also he didnt fix some bugs, i think he just kinda stopped developing it, maybe if i annoy him he will fix it. or i can just make my own extension... sorry for the rambling but its a journey to replace antigravity.
>>
>>
>>
>>108637174
I use LMs like a planner or researcher for my projects and use another session to answer me questions or generate skeleton code of what I want to do (like creating SQL commands).
I am waiting for deepseek v4 to use as a failback for GitHub copilot in vs code.
I hope it is finally multimodal as well and I can use the API key on GitHub copilot again to have a fallback.
Does anyone else use LMs as a explainer, teacher and reviewer?