Thread #108301545
File: 2026-01-07-stack-overflow1.png (279.2 KB)
What's going to happen when AI has nothing left to train on?
27 Replies
>>108301545
Someone made a good point about real-ID verification being implemented for the sake of keeping a clean data pool to train off of. It'd make sense if there weren't other more obvious "factors". Maybe it's all of the above; maybe everyone's guesses are right regarding the incentive for pushing this shit.
>>108301694
I don't know why people say this as if people don't learn the same way.
As if our teachers did anything other than reinterpret what they were taught by someone who, in turn, picked up the same unoriginal idea by observing something themselves.
>>108301545
There were over a trillion user-generated messages sent to ChatGPT in 2025. Across all data-collecting models we're looking at at least 2 trillion user-generated messages. If just 1 in a million includes high-quality information, that still dwarfs Stack Overflow.
You need to recalibrate your scale; Stack Overflow is small potatoes compared to the chatbot push. Many technical people didn't even use it.
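As a back-of-envelope check on the estimate above: the yield depends entirely on the assumed quality rate, which is a guess, not a measurement. A minimal sketch of the arithmetic:

```python
# Back-of-envelope yield from the estimate above: ~2 trillion user
# messages across all data-collecting models. The quality rates below
# are illustrative assumptions, not measurements.
total_messages = 2_000_000_000_000  # 2 trillion, per the post above

for rate in (1e-6, 1e-5, 1e-4):
    usable = int(total_messages * rate)
    print(f"quality rate {rate:g}: ~{usable:,} usable messages")
```

At the post's 1-in-a-million rate that works out to about 2 million usable messages; a tenfold or hundredfold higher rate changes the picture dramatically, which is why the assumed rate is doing all the work in the argument.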
>>108301731
The difference here is that teachers are people with real bodies who live in a world they perceive through more than just text someone else wrote. They can test whether the things they've been taught mesh with reality.
If some student is taught something shitty, like how to program in COBOL, they can use it, compare it to alternatives like C, and ask themselves which was the better experience. If they then become a teacher they're probably not gonna teach a COBOL course and will teach C instead. That kind of merit-based filtering is not something you're ever gonna get from an AI unless the AI has been TOLD, in its training data, that one thing is better than another.
Long story short, a person can make real decisions because they live a real life; AIs can't.
>>108301545
1. Incentivize the creation of fresh data in the wild, and find ways to identify ai-poisoned data
2. Directly employ people tasked with the creation of more training data
I really don't know how (1) would work, so I envision the creation of AI training gig-work farms. Such things already exist to a certain degree with labeling firms. What a time we live in.
>>108301545
the same thing that's been happening for the past two years after they used up all the data, you train with rl until it achieves its goals. the limiting factor is and always has been compute, not data
File: 1760363808208.jpg (1 MB)
>>108305095
Facebook and Yann LeCun win again