Thread #108667649
File: be witches.webm (2.9 MB)
It seems 4chan has started altering the webms that are uploaded here. It began sometime in the last month, meaning a webm that was downloaded more than a month ago will, if uploaded again now, come back with a different checksum. I'm going to test it by posting some webms I downloaded from 4chan more than a month ago.
Starting with this webm from /v/.
https://arch.b4k.dev/v/search/image/HiI3cP_nzmP_VtJTyg1kNw/
>>
File: be witches 2.webm (2.7 MB)
>>108667649
and this webm
https://arch.b4k.dev/v/search/image/4SF7KD2W1JofzILZgcb4TA/
>>
>>108667649
https://desuarchive.org/g/search/image/xVrSJmikwtbeS5icW0rszA/
>>108667655
https://desuarchive.org/g/search/image/tu6DvGAIzydOxsL51CLOkw/
Both webms were altered. This change might cause problems for archives and for your webm collection, because otherwise identical webms will now have different hashes.
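For anyone who wants to compare a local copy against the archive links: the 22-character strings in those search URLs look like MD5 digests in URL-safe base64 with the padding stripped. That is an assumption based on their length and alphabet, not something confirmed in this thread. A minimal Python sketch (archive_hash is a hypothetical helper name):

```python
import base64
import hashlib


def archive_hash(path):
    """Return the MD5 of a file as unpadded URL-safe base64 (assumed archive format)."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large videos do not have to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            md5.update(chunk)
    return base64.urlsafe_b64encode(md5.digest()).decode("ascii").rstrip("=")
```

If the assumption holds, the string returned for an unmodified local copy should match the last path component of the corresponding search URL.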
>>
>>108667679
I don't know how the webms are being modified. I don't think there was any metadata left to remove, because I think 4chan already stripped all metadata.
The modified webm is a few bytes smaller than the original.
>>
File: 1751963855773.mp4 (3.4 MB)
Now checking MP4 files.
https://arch.b4k.dev/_/search/image/oMjJ3eLUCACVXz-weQ3AFg/
>>
File: bloatlord.webm (1.7 MB)
Now the VP9 webm:
https://arch.b4k.dev/_/search/image/HGLGlyncNpDJXkrU_hjaGw/
>>
>>108667764
The VP9 webm was altered too. It seems only the WEBM videos get modified.
https://desuarchive.org/g/search/image/1j2V4K0riNyTfrIcBydiuw/
>>
>>108667649
4chan uses ffmpeg on all video uploads, first to analyze the number of streams, then to remux the video stream into a new file (to take out any possible metadata or embedded CP, yes, the latter is a problem).
The exact command according to the source leak is:
/usr/local/bin/ffmpeg-mp4 -f $format -i \"$file\" -map_metadata -1 -bitexact -c copy -f $format -y \"$out_file\"
where $format is webm or mp4, $file is the input file, and $out_file is $file."_tmpff".
You can find it in the function remux_webm() in imgboard.php.
If the resulting output is now different, they either:
- updated ffmpeg and it produces different files now, or
- updated the command line to do something else.
Either way, you can try running the command on the file yourself to see what it produces and whether it matches.
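A rough Python sketch of running that remux locally, using the same flags as the command quoted from the leak. The binary name "ffmpeg" and the helper names build_remux_cmd and remux are assumptions for illustration. Your local ffmpeg build is probably not the version the site runs, so a mismatch alone does not prove the command line changed.

```python
import subprocess


def build_remux_cmd(in_path, out_path, fmt, ffmpeg="ffmpeg"):
    """Build the argument list mirroring the remux command quoted above."""
    return [
        ffmpeg,
        "-f", fmt, "-i", in_path,   # force the input format
        "-map_metadata", "-1",      # do not copy global metadata
        "-bitexact",                # flag taken as-is from the quoted command
        "-c", "copy",               # copy streams without re-encoding
        "-f", fmt, "-y", out_path,  # force the output format, overwrite output
    ]


def remux(in_path, out_path, fmt):
    # Raises subprocess.CalledProcessError if ffmpeg exits with an error.
    subprocess.run(build_remux_cmd(in_path, out_path, fmt), check=True)
```

For example, remux("old.webm", "old_remuxed.webm", "webm") and then compare the checksum of the output with the checksum of the copy the site serves now.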
Also consider that Cloudflare can remux the file too.
Also 4chan actively bans site meta discussion, so this thread is going to be deleted.
>>
>>108669412
first -f defines/forces input format, second -f defines output format
may not be necessary, but it's not wrong
>>
>>108673087
>Why not just check if there is any metadata in the file instead of remuxing everything.
Because there are 1001 ways to add metadata or embed hidden data, including the simple "append to end of file" method. Remuxing works 100% no matter what.
You should look at the insanity the site does for checking images; there are things there like reading the IDAT block header, then comparing the listed block size with the full file size. If the file is larger than that (+/- a few %), it is presumed to have embedded data.
Jpeg files get run through jpegtran, which also removes some metadata needed by more advanced jpg compression schemes that rely on it, so some jpg files end up larger after upload (I forgot what that jpg format is called; someone on /g/ pointed out that 4chan does not support it, and it was because jpegtran makes a bitstream copy to remove all metadata from the file).
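To give an idea of how a size-based check like that can work, here is a simplified Python sketch. It is not the site's code and not the exact IDAT comparison described above. It just walks the PNG chunk list and counts the bytes left over after the IEND chunk, which is where "append to end of file" payloads end up. png_trailing_bytes is a hypothetical name.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_trailing_bytes(data):
    """Return the number of bytes after IEND, or None if the chunk walk fails."""
    if not data.startswith(PNG_SIGNATURE):
        return None
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk starts with a 4-byte big-endian length and a 4-byte type.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        pos += 8 + length + 4  # header + payload + CRC
        if pos > len(data):
            return None  # chunk claims more data than the file contains
        if ctype == b"IEND":
            return len(data) - pos
    return None  # ran out of data without seeing IEND
```

A clean file gives 0. Anything larger means extra bytes were stuck on after the image data.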
>>
>>108667649
Checksums will never be reliable.
If they are filtering the data at all, even something as simple as updating zlib could change all the checksums.
Not saying that specifically, I don't think webm uses zlib, but just saying broadly speaking compression itself can alter checksums, even just updating a shared library.
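A small Python illustration of the general point. Here a different compression level stands in for a library update (that substitution is an assumption made for the demo). The compressed bytes differ, so their checksums differ, while the decompressed payload is identical.

```python
import hashlib
import zlib

data = b"the same payload, repeated many times. " * 200

# Same input, two different compressor settings.
fast = zlib.compress(data, 1)
best = zlib.compress(data, 9)

# Checksums of the stored (compressed) bytes.
fast_md5 = hashlib.md5(fast).hexdigest()
best_md5 = hashlib.md5(best).hexdigest()

print("compressed checksums match:", fast_md5 == best_md5)
print("decompressed payloads match:", zlib.decompress(fast) == zlib.decompress(best) == data)
```

So a checksum of the stored file only tells you the container bytes are identical, not whether the content is.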