Thread #108601674
File: 1775600276128799.jpg (1.7 MB)
What are you working on, /g/?
Previous: >>108540878
>>
>>108601674
>>108601681
>new thread >>108601674
>old one reached bump limit
>>
>>108602286
>Plankalkul
>Not kalkül
Uh oh. https://i.4cdn.org/wsg/1776144561846003.webm
>>
>>108602823
No they shouldn't. But they should get rid of malloc and free and implement something a lot closer to HeapCreate/HeapDestroy (and a missing HeapReset), and introduce mreserve/mcommit to separate reservation from commitment semantics.
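A minimal sketch of the create/reset/destroy lifetime model asked for above, assuming a fixed-capacity bump arena. The names arena_create, arena_alloc, arena_reset and arena_destroy are hypothetical, and the backing block still comes from malloc purely to keep the example portable; a version with separate reservation and commitment would reserve address space up front and commit pages on demand instead (VirtualAlloc on Windows, mmap on POSIX).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump arena: one backing block, reset or freed as a whole. */
typedef struct {
    unsigned char *base;  /* start of the backing block */
    size_t capacity;      /* total bytes available */
    size_t used;          /* bytes handed out so far */
} arena;

/* Create an arena with a fixed capacity. Returns 0 on success, -1 on failure. */
static int arena_create(arena *a, size_t capacity)
{
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Hand out `size` bytes at an offset aligned to `align` (a power of two no
 * larger than malloc's own alignment), or NULL if the arena is full. */
static void *arena_alloc(arena *a, size_t size, size_t align)
{
    size_t start = (a->used + (align - 1)) & ~(align - 1);
    if (start > a->capacity || size > a->capacity - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Drop every allocation at once but keep the backing block (the "HeapReset"). */
static void arena_reset(arena *a)
{
    a->used = 0;
}

/* Release the backing block itself. */
static void arena_destroy(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

There is no per-allocation free in this model: objects that die together live in the same arena and are released by one reset or destroy.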
>>
File: 1775955725373587.png (10.4 KB)
>>108602874
>>
>>108602868
>>108602998
Ok now that I thought a tiny bit more about it, client/server makes absolutely no sense.
A server is usually a single entity that serves data it gets asked for.
Clients are usually multiple entities that ask for data.
A master is usually a single entity that asks slaves for data.
Slaves are usually multiple entities that serve data to masters
what the fuck, arduino?!
I am not THAT edgy and I have no problem if they just start using main/peripheral from now on with further explanation, but client/server is absolutely ass backwards lmao
>>
File: 1758530067685562.png (116.7 KB)
make sure you keep your documentation at the lowest common denominator so you don't offend some troon, nigger, or biological female 10 years from now
>>
File: cactus.png (10.2 KB)
>>108603477
>>
File: findspace.png (72.5 KB)
I rewrote finding space for save data to get non-contiguous and non-block-aligned sectors because the way I am redoing it will be able to make use of any available free space on the memory card.
It's also worth considering that because it is similar to a FAT filesystem, deleted data can be recovered if the card has not been written to since (in particular, if no NEW save slot has been written), provided you have a way to dump the entire card contents and recalculate the checksums after changes (something that wouldn't be too hard to do with the tools I am making).
>>
File: 1771559265799203.png (12.8 KB)
rusttroons btfo
>>
>>108604522
Most people think Factorio is a factory management game that is well optimized, but I have seen the face of its machine code and I can tell you they are wrong. Factorio is complete, 100% C++ boomer slop.
You may wonder who I am and why I say this; sit down and I will tell you a tale like none that you have ever heard!
>>
File: 1715455605887327.jpg (61.7 KB)
>you can't just execute the value of a register as an instruction
>the instruction has to be in memory, and you have to jump to it, and you have to jump away from it
I'm going to sleep.
>>
File: ebasedi.png (171.8 KB)
>>108605131
Literally why
>>
File: hmm.png (50.1 KB)
>>108605753
>>108605787
All you had to do was say that instead of dicking around.
Branches are essentially free if they are predictable, so if you want to do something like this then you should just have a spot near your code in memory that will be within the cache envelope where you can dump the instruction before jumping to it.
>>
>>108605131
https://www.felixcloutier.com/x86/call
> E8 cd CALL rel32 D Valid Valid Call near, relative, displacement relative to next instruction. 32-bit displacement sign extended to 64-bits in 64-bit mode.
stop chopping your balls off wtf
https://i.4cdn.org/wsg/1776206231376814.mp4
>>
>>108605856
The branch itself is predictable in this case, you are always jumping to where you store the instruction you want to run.
The variable length of the instruction doesn't matter. So long as you are within the cache line, the write will be cached.
>>
>>108605870
>call/ret
Just stop posting: >>108605734
>>
File: lua test2.png (1.3 MB)
New 'toss.
>>
I trust you guys more than chatgpt or google, what's a good book or online course that will get me up to speed with Python and OOP principles? I want to start using Godot but so much of the language in the documentation expects an understanding of OOP terms and concepts, I just feel out of my depth. I've done some basic coding in C before so I'm good with the fundamentals, but shit like extends class or inheritance or whatever just goes over my head.
>>
>>108606685
Don't really know, lol, but extending a class is basically making a copy of it and then adding things to the copy.
It doesn't affect objects created from the parent class, but objects of the child class have the extra functions or data.
In languages that support it, you can also have private functions/data in the parent class that the child class cannot touch.
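Since the question comes from someone with a C background, here is a rough sketch of the same idea in plain C, assuming "extends" is modelled by embedding the parent struct as the first member of the child. The entity and player names are made up for illustration; in an OOP language the extends keyword sets up this relationship for you.

```c
/* "Parent class": some data plus functions that work on it. */
typedef struct {
    int hp;
} entity;

static void entity_damage(entity *e, int amount)
{
    e->hp -= amount;
}

static int entity_is_alive(const entity *e)
{
    return e->hp > 0;
}

/* "Child class": everything the parent has, plus extras. */
typedef struct {
    entity base;  /* first member, so a player* can be viewed as an entity* */
    int score;
} player;

/* Only players have this function; plain entities are unaffected. */
static void player_add_score(player *p, int points)
{
    p->score += points;
}
```

Every function written for entity keeps working on the base part of a player, while player_add_score only exists for the child type.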
>>
>>108607848
It's a genuine bug, I just don't know if there is a workaround I can apply. For part of the issue there seems to be, but I'm not sure if there is for the letters.
https://github.com/neovim/neovim/discussions/38648
>>
>>108607031
>>108606685
I'll give Object-Oriented Python by Irv Kalb a shot, looks like it covers some game and GUI development as well so that should give me a good starting point to work with Godot later on.
>>
>>108607788
Found a fix, for anyone wondering. The error at the bottom can be fixed by going to
:e $VIMRUNTIME/lua/vim/_core/defaults.lua and commenting out lines [972,989] (interval notation), then calling nvim with nvim --luamod-dev as described in the link I posted in >>108607922
The letters issue seems to be plaguing more versions, but you can fix it by editing your init.lua with :e $MYVIMRC
on 0.10 add
vim.g.clipboard = false
on 0.11+ add
local termfeatures = vim.g.termfeatures or {}
termfeatures.osc52 = false
vim.g.termfeatures = termfeatures
as described on
https://github.com/neovim/neovim/issues/28776
>>
File: file.png (41.4 KB)
>>108611075
I guess that was fairly simple in retrospect
>>
>ADHD
>chaos in head
>write down two paragraphs in order to calm down the noise of differing ideas
>realize they all don't matter because compilers are retarded
>hold on
>are they as retarded as I think they are?
>test it
>it's even worse than I imagined
>go back to text
>too exhausted to finish the paragraph, this time with conviction
>force myself to read the last paragraph anyway
>become angry at compiler devs again
>suddenly find new energy to finish the paragraph
At this point in my life I am purely driven by ego.
>>
>>108612611
Callee-preserved. Y'know, like XMM6 and 7.
Let's say you have 8 registers, four of them read-only, and four of them only for temporary values. Obviously you want the temporary ones to be taken from the caller-preserved pool (0-5), since you know you won't need them after a call, and the callee won't preserve them. On the same note you want the read-only ones taken from the callee-preserved pool (6-15), since the callee will first take the volatile ones and only end up using the callee-preserved ones if it has to (because it also has to restore them).
Yet what ended up happening in that little test of mine was that the read-only ones ended up in XMM0-3, and the temporary ones in 4-7.
>>
>>108612683
Also the compiler restored the _read-only_ ones with vmovdqu, when it previously used vpbroadcastq - which, even if accounting for a vzeroupper, could've restored the values in the upper YMMs without memory references. Y'all ever wondered where we'd be in terms of software quality if gross inefficiency was punishable by death?
>>
File: 1775764554880263.png (333.9 KB)
I have gone full schizo and am building a telescoping wide-format printer with repurposed printheads and Plan 9 instead of firmware. It's pretty fucking sick - the computational guts are a Pi 5 plus an RP2040 running FreeRTOS for all the time-sensitive shit, and every single electromechanical device in the thing is scriptable as a result. I can carry this bastard anywhere and print anywhere that has up to a 24" roll of paper (or bring one myself), but it's discrete - it locks at the inch, so it handles every standard sized paper from letter to Arch D. The design itself is cool because it all folds up into a 18"x5"x5" lightweight device, which is how my penis is. The software itself is fucking cool too, because I print large sensitive documents for large sensitive people and this gives me 100% control of a mobile shop, essentially. I'm already a mobile notary, and since this is all proprietary and I'm an SP, it's all mine to go fill a niche with.
>what is the use case
Look up the yellow dots
>>
>>108612749
>Also the compiler restored the _read-only_ ones with vmovdqu, when it previously used vpbroadcastq - which, even if accounting for a vzeroupper, could've restored the values in the upper YMMs without memory references.
>Y'all ever wondered where we'd be in terms of software quality if gross inefficiency was punishable by death?
Post your register allocator that handles this or stfu.
>>
>>108601674
can anyone answer >>108612803
thread got archived because OP is a faggot
>>
File: tom-smykowski.jpg (173.4 KB)
>>108613056
>>
>>108612963
one important point about TDD is that you're supposed to test functionality, not implementation
if you test functionality and the implementation changes, the test itself is still valid, and will tell you if the changes broke anything
for example: if you wanted to unit test FizzBuzz, you would need just one test: run FizzBuzz from 1 to 100 and see if the output matches the sequence: 1, 2, Fizz, 4, Buzz, Fizz, ... 98, Fizz, Buzz
with such a test it doesn't matter if you implement it as the usual loop with modulo, or do the array thing and modulo the index, or blow it up into enterprise OOP with design patterns
if you wanted to extend the FizzBuzz to include more conditions (eg. append Pop for every 7th element), you can have a FizzBuzzPop test case as a second unit test while the first one validates the standard FizzBuzz still works - most you'll have to change is maybe add some parameters to define what patterns to execute, but definitely not rewrite the whole previous test case
most people who bitch about TDD being useless, a waste of time and whatnot, miss this point entirely and instead write tests directly for implementation details, which does nothing but run the code with no purpose or context
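A rough sketch of what that looks like in C, assuming a hypothetical interface where each implementation writes the line for one number into a buffer. The check only knows the expected output for 1 to 100, so two differently written implementations pass the same test unchanged. All function names here are made up for the example.

```c
#include <stdio.h>
#include <string.h>

/* Implementation A: the usual modulo chain. */
static void fizzbuzz_line(int n, char *out, size_t out_size)
{
    if (n % 15 == 0)     snprintf(out, out_size, "FizzBuzz");
    else if (n % 3 == 0) snprintf(out, out_size, "Fizz");
    else if (n % 5 == 0) snprintf(out, out_size, "Buzz");
    else                 snprintf(out, out_size, "%d", n);
}

/* Implementation B: build the word by concatenation, fall back to the number. */
static void fizzbuzz_line_concat(int n, char *out, size_t out_size)
{
    snprintf(out, out_size, "%s%s",
             n % 3 == 0 ? "Fizz" : "", n % 5 == 0 ? "Buzz" : "");
    if (out[0] == '\0')
        snprintf(out, out_size, "%d", n);
}

/* Functionality test: compares the output for 1..100 against the expected
 * sequence and knows nothing about how `impl` works. Returns 1 on a match. */
static int fizzbuzz_matches_spec(void (*impl)(int, char *, size_t))
{
    /* The expected words repeat every 15 numbers; NULL means "the number". */
    static const char *const words[15] = {
        NULL, NULL, "Fizz", NULL, "Buzz", "Fizz", NULL, NULL,
        "Fizz", "Buzz", NULL, "Fizz", NULL, NULL, "FizzBuzz"
    };
    char got[16], want[16];
    for (int n = 1; n <= 100; n++) {
        const char *w = words[(n - 1) % 15];
        if (w)
            snprintf(want, sizeof want, "%s", w);
        else
            snprintf(want, sizeof want, "%d", n);
        impl(n, got, sizeof got);
        if (strcmp(got, want) != 0)
            return 0;
    }
    return 1;
}
```

Calling fizzbuzz_matches_spec(fizzbuzz_line) and fizzbuzz_matches_spec(fizzbuzz_line_concat) runs the same check against both, which is the point: the implementation can be swapped without touching the test.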
>>108613056
"i'm using a hammer on screws and it sucks, but i've been doing it for 10 years now so don't tell me i'm doing it wrong"
>>
>>108613160
>most people who bitch about TDD being useless, a waste of time and whatnot, miss this point entirely and instead write tests directly for implementation details, which does nothing but run the code with no purpose or context
most of those people also think encapsulation = redirection.
anyways, this is more or less what I imagined to be true, because I haven't bothered to actually read testing code in literally any software suite, because I guessed this was the case in particular. according to asking a clanker about it, the reason is for decoupling, "You literally cannot write a unit test for 'spaghetti code.' If you TDD, you are forced to use Dependency Injection or modular interfaces just to get the test to run." this sort of thing really reminds me of using assertions or >>108611560 because nobody actually defines contracts.
for most people, functionality and implementation are not at all different unless someone starts talking about reference implementations or specifications in particular. I posted >>108612818 and have wondered for years if there's actually going to be any logical conclusion to development practices that aren't subgenius "get slack LMAO" shills besides AI-driven development. do you think it's irrelevant with openrouter/gastown?
>>
>>108612683
It doesn't actually matter, though.
Because if the callee will use preserved registers, it will always push them to the stack, regardless of whether you are "using" them.
It's only the non-preserved ones that matter because you have to preserve them through the call.
>since the callee will first take the volatile ones and only end up using the callee-preserved ones if it has to (because it also has to restore them).
The callee doesn't determine while it is running which registers it will use, the registers are all predetermined.
And seeing the callee may also function as a caller, it is not at all guaranteed that they will favor volatile registers. Though if the compiler is fully aware of the scope of the callee, then it can be free to use any registers it likes, even volatile ones, through the call because it knows they won't be changed.
>>
>>108613287
this
and it reminds me of this article
https://web.archive.org/web/20020815130006/http://www.bridgespublishing.com/articles/issues/0004/When_to_use___fastcall.htm
>>
File: hqdefault.jpg (24.7 KB)
>>108613369
https://wiki.osdev.org/X86-64_Instruction_Encoding
>>
>>108612940
>Post your register allocator
Nice try to get my DNA.
But the funny thing is that you're effectively admitting that dependency specifiers in assembly blocks are bullshit because the compiler cannot track registers across function calls (despite there being no technical reason not to), so in addition:
>I accept your concession.
>>108613287
>It doesn't actually matter, though.
Are you capable of reading the English language?
>since the callee will first take the volatile ones and only end up using the callee-preserved ones if it has to
The callee will clobber your caller-preserved registers without a second thought, which is why the caller has to preserve them unless it is no longer interested in them - which is _exactly_ the problem here.
>it is not at all guaranteed that they will favor volatile registers
That's a callee-problem, not a caller-problem. Why would the caller bother itself trying to guess which registers the callee ends up using? The only thing that matters to the caller - without access to the code of the callee - is which values it has to keep track of across a function call.
No, your explanation is just nonsense. The real reason is that gcc and clang (which both exhibit that behavior) were written for platforms that know absolutely nothing about callee-preserved SIMD registers, and so they can't actually distinguish between the two. No reason to sugarcoat that failure with your vague nonsense.
>>
>>108613569
>the compiler cannot track registers across function calls
nta there's a reason for this (and it "does" if you use -g) but it will anyways even though doing so is just an optimization path. at least gcc pretends to. MSVC and LLVM cucks need not reply. the technical reason for this is for -Os and I don't have a single citation for why.
>>
>>108613569
>>108613578
by the way I'm almost certain the linker does this.
>>
>>108613548
https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170
idk wtf you think instruction encoding has to do with calling conventions
>>
>>108613622
it's not well-documented that there are "outside" callings on x64. some of those things depend on VEX. anyways, I know that using sysV there's fastcall, then there's still int 0x80, and x64 syscall for conventional calling. I don't even know if the manual specifies this outside of mentioning fastcall.
>>
>>108613602
>>108613605
iirc the behavior is a (microcode) optimization and you might actually be able to find a reference for why this supports backwards compatibility without talking about x87 or management registers. I think the intention was for implementation simplicity.
>>
File: IMG_20260416_180902_783.jpg (130.8 KB)
>>108601674
>What are you working on, /g/?
Porting HolyC threading libraries to C for my UI toolkit libraries project, which is written in C
>>
>>108613673
>Are you capable of reading the English language?
>>since the callee will first take the volatile ones and only end up using the callee-preserved ones if it has to
>The callee will clobber your caller-preserved registers without a second thought, which is why the caller has to preserve them unless it is no longer interested in them - which is _exactly_ the problem here.
>>
>>108613716
>use-case for SIMD?
First that comes to mind: printing values, both in hexadecimal and decimal notation. Second: dumping binary data in hex. Third: copies.
>inb4 muh ERMSB
So they finally fixed the long startup costs?
>>
>>108613719
No? I want the _caller_ to use _caller-preserved registers_ for values that _don't_ have to survive a function call (saving it the effort to preserve and restore them), and to use _callee-preserved registers_ for values that _do_ (because the callee, or several layers of callees, will prefer the ones that don't require them having to preserve shit).
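To make the two kinds of values concrete, here is a small sketch in plain C, assuming an opaque callee reached through a function pointer. It only marks the lifetimes being argued about; which registers a given compiler actually picks is up to its allocator and the platform ABI, and the function names are made up.

```c
/* Stand-in for any function the compiler cannot see into. */
typedef double (*opaque_fn)(double);

static double halve(double x)
{
    return x * 0.5;
}

/*
 * `scale` and `total` are live across every call to `f`: these are the
 * values one would like to see in callee-preserved registers, where the
 * ABI has any for this type. `tmp` is dead once it has been passed as the
 * argument, so a caller-preserved (volatile) register costs nothing.
 */
static double sum_scaled(const double *in, int n, double scale, opaque_fn f)
{
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        double tmp = in[i] * scale;  /* temporary, consumed by the call */
        total += f(tmp);             /* scale and total must survive this */
    }
    return total;
}
```

On the Microsoft x64 convention XMM6-XMM15 are nonvolatile, so the choice exists there for floating-point and vector values; the System V x86-64 convention has no callee-preserved vector registers at all.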
>>
>>108613719
I'd say ideally, yes.
In the end, this whole thing about volatile and non-volatile registers is just register handling conventions.
And there is more than one of them.
The issue, of course, is interoperability of your code with other software on the system.
>>
>>108613779
>fucking finally
Yeah. Finally. About five hours ago already.
>>108612683
>>
>>108613805
>that's unreasonable for complexity reasons
>>108613569
>you're effectively admitting that dependency specifiers in assembly blocks are bullshit because the compiler cannot track registers across function calls (despite there being no technical reason not to), so in addition:
>>I accept your concession.
>>
File: hq720.jpg (55.3 KB)
>>108613807
right, because those are quietly marked as reserved, and I'm pretty sure that's because of x264. except you didn't want to talk about microcode.
>>108613815
I'm >>108613653 >>108613640
>>
>>108613823
Skill issue. Everyone else here can.
>>108613825
>right, because those are quietly marked as reserved
>quietly
https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170
>The x64 ABI considers registers [...] XMM6-XMM15 nonvolatile
>>
File: Screenshot From 2026-04-16 04-58-23.png (68.4 KB)
>>108613841
>>108613835
>Bytes 287:160 are used for the registers XMM0–XMM7.
>Bytes 415:288 are used for the registers XMM8–XMM15. These fields are used only in 64-bit mode. Executions of FXSAVE outside 64-bit mode do not write to these bytes; executions of FXRSTOR outside 64-bit mode do not read these bytes and do not update XMM8–XMM15.
>>
>>108613883
I will repeat myself: >>108613667
>>
>>108613901
>VEX
look here you chickenshit dogfucker you better start reading down the same instruction prefix rabbithole I spent weeks and months on or I will personally make sure you write TPU opcodes by hand with your fingers on a touchscreen for minimum wage in your country because you're pissing off the only person willing to spoonfeed you at 5 o'clock in the damn morning.
you see that mapping in the center column? ok, that's why.
>>
>>108613919
>a bit flips if you use higher SIMD registers
>which totally justifies poor register selection
I think you just want to talk about something you sunk way too much time into, regardless if it actually adds to the topic.
>>
>>108613945
I don't care about which bit actually flips on control registers. what you're looking at without specifying if you're using -mtune=native or -march=native without also specifying which processor and compiler version you're on is a guarantee you are generating generic machine code for one thing. for a second thing, if you aren't using asm() the compiler is just doing standard calling convention and there are sometimes ways to get it to do what you want here either through pragmas, linker scripts, etc, and this is because CISC is literal cancer where the compiler is not going to expect 1) what it generates 2) where it generates it 3) what registers are going to be manipulated. without running sandsifter and fuzzing your entire instruction set, you basically aren't going to find out, neither, whether or not there's a gotcha, and all because of virtual registers. that's why the prefixes exist. they're virtual machine instructions for processor microcode.
>>
File: homm_help_deserted_island.jpg (135.2 KB)
I spent an hour brute forcing trip codes to find a specific pattern I wanted.
My cpu hit 5.3ghz, 90c and went through 195m hashes.
nothing linked.
I'm trying to find a way to limit the process so it can run in the background for however long it takes. I don't want to go full throttle again and risk damaging my cpu.
process lasso looks promising, but it's only process level/affinity, it still goes berserk and pegs my cpu.
>>
>>108613960
>you're using -mtune=native or -march=native
Because it literally doesn't matter. None of it. Doesn't matter if I use mtune/march or mavx/mavx2, or if the processor chokes slightly on the frontend because XMM8-15 are being used, because you still end up with cacheline-evicting SIMD writes and reads. Anyone who argues otherwise would probably be taken care of in the utopia I'm proposing: >>108612749
>if you aren't using asm()
>>108613569
>dependency specifiers in assembly blocks are bullshit
I would suggest that I am using them.
>>
>>108614004
>>dependency specifiers in assembly blocks are bullshit
I'm fucking retarded
anyways, yes. you would need an AI-enabled compiler for that to work. I'm telling you I've already been down this road. the only real reason this happens is because the processor is software-defined, and the physical process that actually etches these definitions is limited by both the software and what is physically possible. it's why intel has multiple architectures per feature size and why flip-flop is mostly marketing.
I digress. it has to do that by default because register rewriting.
>>
>>108614025
>you would need an AI-enabled compiler for that to work
We're coming full circle now: >>108611883
>realize they all don't matter because compilers are retarded
>it's even worse than I imagined
Specifically because I may have even salvaged the situation if I had been able to provide register hints in structs:
typedef struct
{
register __m256i rw0 asm("ymm0");
register __m256i rw1 asm("ymm1");
register __m256i rw2 asm("ymm2");
register __m256i rw3 asm("ymm3");
register __m256i ro0 asm("ymm6");
register __m256i ro1 asm("ymm7");
register __m256i ro2 asm("ymm8");
register __m256i ro3 asm("ymm9");
}data;
Wouldn't have been perfect (thanks vzeroupper), but at least doable.
>>
File: file.png (3.4 MB)
>>108613990
>>
File: wojak_noticer.png (388 KB)
>>108614020
It would still peg, go idle, then peg again, over and over. I don't want it to peg.
>>108614108
read bottom right
>>
>>108614230
Not really how that works. Best you can do from the outside is limit the time slice the process receives, but even in that slice the code is going to run as fast as it can - i.e:
>peg, go idle, then peg again
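If the brute forcer is your own code, the other option is to throttle from the inside: hash in short batches and sleep in between. This is a sketch under that assumption, with hypothetical names throughout. Note it is still the same peg/idle pattern, only at a granularity of milliseconds, so the average load and heat go down while each individual batch still runs at full speed.

```c
#include <assert.h>

/* Idle time to add after `busy_ms` of work so the average load comes out
 * at `duty_percent` (1..100), from duty = busy / (busy + idle). */
static unsigned idle_ms_for_duty(unsigned busy_ms, unsigned duty_percent)
{
    assert(duty_percent >= 1 && duty_percent <= 100);
    return busy_ms * (100u - duty_percent) / duty_percent;
}

/* Run `batches` rounds of work with a pause after each one.
 * `hash_batch` should do roughly `busy_ms` worth of hashing; `sleep_ms`
 * would wrap Sleep() on Windows or nanosleep() on POSIX. Both are
 * hypothetical callbacks supplied by the caller. */
static void run_throttled(unsigned batches, unsigned busy_ms,
                          unsigned duty_percent,
                          void (*hash_batch)(void),
                          void (*sleep_ms)(unsigned))
{
    unsigned idle = idle_ms_for_duty(busy_ms, duty_percent);
    for (unsigned i = 0; i < batches; i++) {
        hash_batch();
        if (idle > 0)
            sleep_ms(idle);
    }
}
```

For example, 50 ms batches at a 25% duty cycle mean sleeping 150 ms after each batch.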
>>
File: 1776296997165584.gif (441.9 KB)
>>108614123
I'M the nooticer though.
>>
File: 1771776598742701.jpg (45.4 KB)
>>108614709
>The SDL project does not accept contributions that are in any part created by AI agents.
>>
File: main_menu.png (35.6 KB)
>>108614922
Know, first, that I have written a tracer, for various user and kernel functions. On my way to find out how many allocations and releases the game was doing during normal gameplay ...
Yeah, the joke breaks apart at that point. But the point is that it cannot even open a file without multiple memory allocations and copies, to say nothing of the data that the runtime *recreates* because fully qualified paths are discarded with wanton disregard (say, building an NT path to check if a file exists, only to never open the file in the first place).
Then there's the allocation placement issues ... https://factorio.com/blog/post/fff-215
>>
File: 1749716996003581.png (301.4 KB)
oopfags be like
>>
File: Tree_Of_Life_(with_horizontal_gene_transfer).png (156.3 KB)
>>108615157
how it's going
>>
>>108615225
Was going to say
>>108615157
is more like what OOPfags wished their programs looked like
>>
>>108615157
>>108615225
so you're just going to post "OOPfags be like" with any tree-like graph that you can find?
>>
File: oopfagsbelike.png (136.8 KB)
>>108615645
Yes.
>>
File: 1774194627445468.jpg (121.6 KB)
>Microsoft decided it was time for me to reboot my computer
>lost dozens of tabs and open files
>>
>>108616075
why would you torture yourself like that?
Currently I have 1 tab open and 4 applications.
I shutdown my pc every day.
Start with a fresh session every morning.
It just works. Purposely putting way too much cognitive load on yourself is so fucking stupid. I can't even
>>
File: 1774593473310350.jpg (72.6 KB)
>>108616088
I'm afraid that if I turn my computer off it won't turn back on again
>>
>>108616122
Good question: this is a common point of confusion with QSPI PSRAM.
Short answer: No, you cannot write just 4 bits (a nibble) to an address. The minimum write granularity is 1 byte (8 bits), even in Quad SPI mode.
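With byte granularity as the floor, changing a single nibble comes down to a read-modify-write of the byte that contains it. A minimal sketch, assuming the memory is visible as an ordinary byte array and that nibble index 0 is the low half of byte 0 (both assumptions of this example, not something the datasheet states):

```c
#include <stddef.h>
#include <stdint.h>

/* Replace one 4-bit value inside a byte-addressable buffer.
 * Even nibble indices are the low half of a byte, odd ones the high half. */
static void write_nibble(uint8_t *mem, size_t nibble_index, uint8_t value)
{
    size_t byte = nibble_index / 2;
    unsigned shift = (unsigned)(nibble_index % 2) * 4u;
    uint8_t old = mem[byte];                          /* read */
    uint8_t merged = (uint8_t)((old & ~(0x0Fu << shift))
                               | ((value & 0x0Fu) << shift)); /* modify */
    mem[byte] = merged;                               /* write a whole byte */
}

/* Fetch one 4-bit value using the same indexing. */
static uint8_t read_nibble(const uint8_t *mem, size_t nibble_index)
{
    unsigned shift = (unsigned)(nibble_index % 2) * 4u;
    return (uint8_t)((mem[nibble_index / 2] >> shift) & 0x0Fu);
}
```

The neighbouring nibble is rewritten with its old value on every update, so anything else touching that byte at the same time has to be kept out of the way.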
>>
>>108616331
The datasheet really says nothing
https://www.lcsc.com/datasheet/C261882.pdf
I was suspecting that and from my implementation testing I was quite sure about it, but maybe my implementation was just wrong.
But it's the chip
>>
>>108616536
>>108616631
Also you just keep on trying until it works.
simple r/w test...
testing uint16
*data_u32: 0xbeeebeee - (expected: 0xbeeebeee)
==================================================
*data_u16+8: 0x0000fefe - (expected: 0x0000fefe) - 0x90000010
*data_u16+9: 0x00001337 - (expected: 0x00001337) - 0x90000012
==================================================
*data_loc+2: 0x00000023 - (expected: 0x23)
*data_loc+3: 0x00000045 - (expected: 0x45)
*data_loc+4: 0x000000ab - (expected: 0xab)
*data_loc+5: 0x000000cd - (expected: 0xcd)
==================================================
*data_u32+0: 0x4523beee
*data_u32+1: 0x0000cdab
*data_u32+2: 0x00000000
*data_u32+3: 0x00000000
*data_u32+4: 0x1337fefe
*data_u32+5: 0x88440000
*data_u32+6: 0x00000000
*data_u32+7: 0x00000000
*data_u32+8: 0x00000000
*data_u32+9: 0x00000000
*data_u32+10: 0x00000000
*data_u32+11: 0x00000000
*data_u32+12: 0x00000000
*data_u32+13: 0x00000000
*data_u32+14: 0x00000000
*data_u32+15: 0x00000000
*data_loc+0: 0x00000031 - (expected: 0x0031)
*data_loc+2: 0x00000077 - (expected: 0x0071)
*data_u16+0: 0x00001131 - (expected: 0x1131)
*data_u16+1: 0x00001177 - (expected: 0x1177)
*data_u32: 0x11771131 - (expected: 0x??)
Let's see if I can run programs from psram now. At least it looks based
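For anyone reading the dump above: the first words are what you get when byte and halfword writes land inside little-endian 32-bit words. A small host-side sketch that reproduces those values, assuming a little-endian layout as the dump suggests; read_le32 and write_le16 are helper names made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble a 32-bit little-endian word from the four bytes at `offset`. */
static uint32_t read_le32(const uint8_t *mem, size_t offset)
{
    return (uint32_t)mem[offset]
         | (uint32_t)mem[offset + 1] << 8
         | (uint32_t)mem[offset + 2] << 16
         | (uint32_t)mem[offset + 3] << 24;
}

/* Store a 16-bit value as two little-endian bytes at `offset`. */
static void write_le16(uint8_t *mem, size_t offset, uint16_t value)
{
    mem[offset]     = (uint8_t)(value & 0xFFu);
    mem[offset + 1] = (uint8_t)(value >> 8);
}
```

Starting from 0xbeeebeee in word 0, writing bytes 0x23, 0x45, 0xab, 0xcd at offsets 2 to 5 gives 0x4523beee and 0x0000cdab, and halfwords 0xfefe and 0x1337 at byte offsets 16 and 18 give 0x1337fefe, matching the dump.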
>>
>>108616161
>rework code
>no longer in a struct, and instead use register variables implicitly like a retard
>callee-preserved registers are now properly reserved and preserved, like Baal intended
>but the compiler "forgets" to restore the upper YMM registers - not only because they're volatile, but also because the compiler itself generates vzeroupper
https://godbolt.org/z/4j4q7f63b
Can't make this shit up
>>
File: bluray.jpg (31.4 KB)
>>108614996
factorio looks like the most boring game on earth.
If you want to build logic circuits just use vhdl, verilog or a fucking breadboard.
I might just be too dumb to understand why people love this shit
>>
File: base.png (3.8 MB)
>>108617250
Guess what taught me that central buses and processing plants don't scale due to congestion.
>just like in real life
>>
>>108617315
If this video doesn't sell it, then I got nothing.
https://www.youtube.com/watch?v=9JUbCNt-tog
>>
File: PXL_20260416_195418078.jpg (2 MB)
>>108616772
WOW I LOVE YOUTUBE!!!
>>108617338
Well, I understand that it can be satisfying to just get it done. It's just not my cuppa
>>
File: nocall.png (70 KB)
>>108614634
I went to bed.
But your calls don't go anywhere, so the volatility of the registers doesn't matter.
>>
>>108618316
You'd have bigger problems then, because the last reference jumps rather than calls.
If you want to interrogate unlinked binaries it may be more worthwhile to build it locally than use godbolt. When I have some time later I'll try it myself.
>>
>>108618767
Yeah, I have noticed that both Gemini and Deepseek have no idea what I'm talking about either. Of course vibecoders would find boring something they neither understand nor can get explained properly.
>>
>>108619085
You've been talking about this for months, and in months you've written exactly one sentence explaining wtf you are even talking about: >>108613751.
Everything else is incoherent babble (muh tracking) where you moan that the compiler doesn't perform some kind of context-sensitive (call graph) register allocation and ABI-violating optimized calling convention where, in your imagination, callee spills are magically better than caller spills.
>>
>>108619285
I'm sorry, have you implemented an AVX2-accelerated multithreaded registry dumper that executes in seven seconds?
I haven't either, but someone who did and spams 4chan so much about it likely has a bunch of mental illnesses.
>>
>>108619295
>>108619308
non sequiturs
>>
>grug has food
>don't want shit on food
>can decide where store
>place where other grug is expected to shit
>in fact expected so much
>that even if other grug doesn't shit
>grug HAS to assume other grug will shit there
>rule of elder grug
>elder grug beat grug to death if not
>so grug will have to move food later again
>and every single time other grug comes in
>and takes a shit
>because that's other grug's shitting place
>but grug insists on putting food there
>then other place
>special place made for grug
>where other grug might still shit
>or not
>don't actually know
>not grug's shit, other grug's shit
>other grug's not telling
>but grug knows one thing
>even if other grug doesn't shit in first place
>grug still has to act as if other grug shits in first place
>even if other grug doesn't
>elder grug's rule
>but
>if other grug will shit
>in other place
>will move food away first
>and clean up after
>and move food back
>grug uses other place to store food
>because grug smarter than /dpt/ autists
>>
File: Screenshot_20260417-122934_1.png (37.8 KB)
>>108620044
Pushes contents of xmm6 and xmm7 to stack.
>>
File: no_push.png (3.6 KB)
>don't actually know
>not grug's shit, other grug's shit
>other grug's not telling
>>
File: 1704122102355709.jpg (49.6 KB)
>>108620007
>all that text written in 45s
>>
>>108617123
YMM registers are never used.
Since they need to be caller preserved, it literally doesn't matter if barrier dirties them cause foo never uses them again. It also doesn't matter that foo doesn't restore them cause that's the caller's (main's) problem.
>>
>>108620189
>Since they need to be caller preserved
https://learn.microsoft.com/en-us/cpp/build/x64-software-conventions?view=msvc-170
>XMM6:XMM15: Nonvolatile (XMM), must be preserved by callee.
>>
>>108618062
>But your calls don't go anywhere, so the volatility of the registers doesn't matter.
That's tantamount to saying "We can't discuss how this part of this function is translated to machine code without building an entire million line application around it!!!"
Take them as placeholders for something real and move on with your life.
Also bear in mind that the person you're talking to may be using a different calling convention in their code. There are several, and different platforms make different choices.
>>
>>108620305
But the lower portion is preserved by the callee, in pushing XMMs to the stack.
Seeing it is up to the caller to preserve the upper portions, but foo never uses them, it is free to blitz them and doesn't need to restore them.
>>
>>
>>108620506
>But the lower portion is preserved by the callee
Only if it actually needs the register. How many times do you need to hear this?
>where other grug might still shit
>or not
>don't actually know
>not grug's shit, other grug's shit
>other grug's not telling
>inb4 but how would it nuke the upper portions without pushing/popping the register
vzeroupper. Because register files are a mess.
>foo never uses them
What, do you think, do "v" and "ymm0" mean?
https://godbolt.org/z/f17sMvWor
>>
>>108620635
>What, do you think, do "v" and "ymm0" mean?
It sets the upper value in YMM with vpbroadcast, and clears it with vzeroupper, but foo never USES the contents of the upper portion, so it doesn't need to preserve them across the calls to barrier.
There doesn't seem to be anything wrong with the handling of the contents of the registers, only the choice of using XMM6-9/YMM6-9, which requires the callee to preserve the contents of the XMM registers, which it does.
Same with https://godbolt.org/z/f17sMvWor
There seems to be a strong preference to use XMM6-9/YMM6-9, but there's nothing wrong with the way it is handling them. foo is preserving XMM6-9 as it should, and when it does need to ( https://godbolt.org/z/GErvxYrWc ) it preserves YMM as the caller.
>>
>>
>>
>>
>>
>>
File: entryflags.png (30.3 KB)
Completely unset entries (the second page of a sector added to a directory listing) remain unmodified (so 0xFFFF on 1-fill cards), which means they have 0x8000 set, but bit 0x4000 is never set on real entries, whether they are in use or deleted.
So it seems likely that either a check for 0xFFFF is done alongside a check for 0x8000 to see if an entry is in use, or a check against 0xC000 is done (AND or XOR or whatever), which is what I am going to do.
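A minimal sketch of the 0xC000 check, going only off the observations above. It assumes 0x8000 means "in use" and that 0x4000 only ever shows up on never-written entries; the macro and function names are hypothetical, not from any real tool or SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag layout, inferred from dumps rather than from a spec:
 * 0x8000 is set on entries that are in use, and 0x4000 is never set on
 * a real entry (in use or deleted). A never-written entry reads back as
 * 0xFFFF on 1-fill cards, so it has both bits set. */
#define ENTRY_FLAG_IN_USE 0x8000u
#define ENTRY_FLAG_MASK   0xC000u

/* An entry counts as in use only if 0x8000 is set AND 0x4000 is clear.
 * That rejects 0xFFFF (unset entry) without a separate compare, and
 * also rejects anything with 0x8000 clear (deleted, or 0-filled). */
bool entry_in_use(uint16_t flags)
{
    return (flags & ENTRY_FLAG_MASK) == ENTRY_FLAG_IN_USE;
}
```

The single masked compare covers both cases from the post: `entry_in_use(0xFFFF)` is false because 0x4000 is set, and anything with 0x8000 clear is false as well.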
>>
>>108621104
I accept your concession; the registers are fully used, and the reason the compiler cannot properly preserve them is because it was originally designed for a system with no differing preservation rules per split.
>>
>>
>>
>>108621507
https://godbolt.org/z/M9v3ozTfx
wow, look at it preserving the values of the registers when they are actually going to be used after the calls
>>
>>
>>108621525
(I will admit this is a shitty demonstration, but it does show that the behavior changes, including pushing some values to stack, when the values are needed again after the calls. In some cases it is faster to load the values from memory instead, but that's because it is a shitty demonstration.)
>>
>>
>>
>>108621660
until you have to homebrew your own implementation of even one feature from the CRT, and now you've burned through the microseconds you saved at startup and are probably running orders of magnitude slower
>>
File: no_restoration.png (29.7 KB)
>>108621513
>What concession? I said nothing incorrect.
>foo never USES the contents of the upper portion
>>108621515
There is no contradiction.
>>108621525
>picrel
... you honestly cannot be that retarded. Please tell me you're just pretending.
>>
>>
>>108621751
>when it doesn't use them
BUT IT DOES! That's what the "v" is saying. Input dependency for the *entire* register, not just the lower portion.
>b-b-but it's not *actually* used
Nigger the compiler is not gonna look at whatever nonsense is being printed with the asm statement. It has no fucking clue what's actually inserting into the instruction stream. That's WHY you jump through the hoops of declaring output and input parameters and early clobbers and what kind of operands (memory, register, immediate, etc) you're accepting, Jesus Fruitcake Christ.
"v" is enough unless you can actively prove that it isn't, period.
>good luck
>>
>>108621764
This shit actively reminds me of >>108618062
>b-b-but you're not *actually* executing code here
Doesn't matter. What matters is that the compiler has to assume that the volatile registers are now smashed. It's just that simple.
>>
>>
>>108621764
Oh, and also: the argument that there's no output dependency is fucking retarded. That's what the "=m" is about. Makes sure the asm block isn't removed or reordered. After all, it's not an asm volatile block.
>>
>>
>>108621778
never did and never will be
https://www.youtube.com/watch?v=1RQBPs5Dkzg
>>
>>
>>
>>108621884
>still doesn't understand that the problem is the compiler being unable to see YMM registers as only partially preserved by the callee as per the rules defined by the ABI
>>108613835
>>
>>108621925
calling conventions exist so that a caller can call a function without knowing any detail of what the callee does exactly, outside of which arguments it takes and which values it returns, you dumb motherfucker
stop conflating function calls and function call graph analysis, nigger
>>
>>
>>
>>108621986
Oh, don't worry. I'm actively advocating to remove all useless autists from software development, and the upcoming worldwide energy crunch is going to be extremely beneficial. Autists have no place in software development. If we had killed them all thirty years ago we'd be in much, much better shape.
>>
>>108622012
Fuck, even Google admits that autists are unsuitable for the job. And it's usually glazing your ilk, saying that you deserve human rights and employment.
>>
File: 1694411001056574.jpg (6.5 KB)
>>108622012
>the function call autist is also the angst basedjack resident
oh no no no no no
>>
>>
>>
>>