//g/
Discussion and Development of Local Image, Video, and Music Models

Previous: >>109001708

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
SDWebUI: https://rentry.org/ldg-lazy-getting-started-guide#the-stable-diffusion-web-ui-lineage
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/tdrussell/diffusion-pipe
https://github.com/kohya-ss/sd-scripts
https://github.com/kohya-ss/musubi-tuner

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
https://animadex.net

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>Wan
https://github.com/Wan-Video/Wan2.2

>LTX-2.3
https://huggingface.co/collections/Lightricks/ltx-23

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>mfw Resource news

06/07/2026

>Ideogram4 GGUF quantized files
https://huggingface.co/leejet/ideogram-4-GGUF

>‘A driver of political violence’: how the breakneck AI boom is fueling anti-tech extremism
https://www.theguardian.com/technology/2026/jun/07/anti-ai-tech-extremism-violence

>Ideogram 4 NF4 integration for Forge Neo with a visual JSON layout builder
https://github.com/Whatwhatio/forge-neo-ideogram4

>Huihui-gemma-4-12B-it-abliterated
https://huggingface.co/huihui-ai/Huihui-gemma-4-12B-it-abliterated

06/06/2026

>HugginFace VFS Plugin: Native Total Commander file system for Hugging Face models
https://github.com/mikinko/HuggingFace_WFX

>ComfyUI Lance AIO: Custom nodes to run Lance-3B
https://github.com/SteveImmanuel/comfyui-lance-aio

>Cube: Generative AI System for 3D
https://github.com/Roblox/cube

>The token bill comes due: Inside the industry scramble to manage AI’s runaway costs
https://techcrunch.com/2026/06/05/the-token-bill-comes-due-inside-the-industry-scramble-to-manage-ais-runaway-costs

06/05/2026

>RhymeFlow: Training-Free Acceleration for Video Generation with Asynchronous Denoising Flow Scheduling
https://simon-dcs.github.io/Website-of-RhymeFlow

>Complexity-Balanced Diffusion Splitting
https://noamissachar.github.io/CBS

>Can We Predict The Human Preference For Text-to-Image Content Prior To Generation And Is It Even Useful To Do So?
https://github.com/LSU-ATHENA/HPM-Predict

>SAM-Flow: Source-Anchored Masked Flow for Training-Free Image Editing
https://github.com/chwbob/Sam-Flow

>Geometry-Aware Dataset Condensation for Diffusion Model Training
https://github.com/2018cx/GADC

>StoryVideoQA
https://github.com/nercms-mmap/StoryVideoQA

>Lightricks to split into two companies as it cuts 75 jobs
https://www.calcalistech.com/ctechnews/article/r1dgjt5gmg

>Akium Sampler: Custom k-diffusion sampler for Stable Diffusion Forge / A1111
https://github.com/AkiumAI/akium-sampler
>>
lowest effort collage of all time lmao
>>
File: 1760481521858266.png (2.0 MB)
Astronaut playing violin on the moon, by Greg Rutkowski.
>>
>by Greg Rutkowski
lost technology unless you're midjourney (it's probably lost there too)
>>
>>109003994
i was gonna make the top right image real fore fun but its a literal chibi...

fuck it lets just get it done anyway and she how they like realistic loli...
>>
>>109004104
Do you think he's still mad about AI even though his name is basically associated with "good art"?

He's one of the largest beneficiaries of the technology.
>>
>>109004110
standing. looking at pooer
>>
>>109004026
hmmm
>>
1.4 MB
>>
>>
1.4 MB
>>109003927
there u go faggot is that what you meant and want? its OK you don't have to fucking hide it...

what you gonna do fucking mald and seethe about it for the next 24 hours?
>>
localmeltie
>>
>>109004188
what ever man what ever man what ever
>>
the problem is mine is artist and legit the other shit is i'm gonna hide it and its fucking cringe as fuck. people see right through it.
>>
reprompted the whole collage itself as a new image on Klein 9B with Gemini caption lmao, came out better than expected honestly

the prompt is very long so I put it here:
https://pastes.io/aywh1DMS
>>
File: Anima-00018-387790566.jpg (749.5 KB)
>>
1.3 MB
why not. its just a young girl with a bear suit in front of her computer.
>>
1.3 MB
>young girl wearing a cute bear suit in front of her computer in her messy bedroom, fast food, coke drink from typical fast food with clear plastic lid, disbelief and depression about what she is seeing on her computer monitor. Night time, dark, blue light from the screen, realistic, photo, high quality

dropped the reference and controlnet just to see how the prompt worked.

not bad.
>>
File: Ideogram_00082_.png (2.2 MB)
>>
Bro seems to believe that this image was an example of something only Nano Banana could do, impossible to reproduce without Le Ebin Json Ideogram (it is not and was not)

https://reddit.com/r/StableDiffusion/comments/1tzr6ci/an_experiment_recreate_jsonprompted_closed_model/
>>
>>109004220
this is what Neon Genesis Evangelion will look like in 2013
>>
Are most using the workflow provided by the makers of LTX 2.3 for training LoRAs, or is 3rd party stuff better now?
>>
>>109004220
>misato cosplaying as aska
cute hag
>>
>>109004351
Naomi Wu before boobjob
>>
>>109002285
Bot you don't talk about that stuff after sex? Do you never ask each other about random lore of things you don't know about the opposite sex?
>>
>>
>>109004405
I am fascinated by your world view
>>
>>109004424
AY BITCH WHERE DEM HORI NIP NIPS COME FROM HUH

- me, post coitus
>>
File: Ideogram_00087_.png (2.8 MB)
>>
Haven't touched SD since 2023
Is inpainting still required for editing a face, or are models today smart enough to handle requests like "give the woman on the right blonde hair"?
>>
>>109004479
>are models today smart enough to handle requests like "give the woman on the right blone hair"?

Short answer yes.

Slightly longer answer, not all models.
>>
File: nbp.png (1.6 MB)
>>109004351
>Three anime figures on or near the PC tower
there's four
>"stance": "standing in a slight contrapposto pose",
not really, she's just leaning
>negatives: Any appearance of pink/magenta anywhere
your troony keyboard and pride flag in the background??

not bad, a few more training runs on nano banana and gpt outputs and local should be there by 2027.
>>
>>109004502
??? wat? my picrel Klein one is closer overall to the original Gemini pic than his Ideogram one was. His had rando curtains on the wall and four figures on top with none inside, instead of three plus one inside.
>>
File: Ideogram_00092_.png (2.7 MB)
>>
>>109004533
why are Ideo gens so Ernied, like high contrasty grainy
>>
>Ideogram 4
fucking kill yourself... fucking brown skinned low iq cunt...
>>
buy a fucking ad for there is nothing that shit model can do that i can't with a simple controlnet faggot mouth breather
>>
>>109004532
the ideogram one is terrible. json prompting is a meme anyway, the only reason it works on nano banana is because their 3 trillion parameter LLM is re-writing the prompt in the background. jeets think it's some kind of 'computer language' that 'better represents how models think' but it's really just slop.
>>
1.5 MB
>>109004502
he really cares about this, its his whole life...
>>
wtf are horizontal nipples?
>>
>>109003927
Have an RTX 5080 with 16GB VRAM. Can I even run WAN?
>>
Why is this guy having a meltie?
>>
>>109004588
>json prompting
i told gemini to fuck off with that shit. it seems to be leaking more and more into mainstream cloud models, and it is fucking garbage. The more they move away from us the more they will self destruct, so it's win win.
>>
>>109004614
you'd know if you asked a woman about it immediately after railing her
>>
>>109004619
because you're a fucking faggot and i'm tired of pretending to be nice...
>>
>>109004619
I mean I just posted this:
>>109004351
where the ATTACHED pic here on le 4Chinz (made with Klein) was just showing that the Gemini pic (left side of the Reddit thread) was not in any way an example of something that was difficult to gen to begin with. IDK about anything past that lol
>>
>>109004600
why does this have a gemini watermark
>>
>>109004636
show me
>>
>>109004636
oh so it does, i guess they trained this model on some images from gemini, so fucking sue me? its not like those other companies didn't steal everyone else's shit and then charge money to use it...

"its okkay when we do it... "
>>
>>
>>109004636
KEEEEEK remember to thank your api overlords, localkeks. without google and openai, you wouldn't have any synthetic outputs to train your slop on.
>>
File: 56754.webm (3.1 MB)
would you look at the time
>>
>>109004657
sure its not like every real image wasn't already trained into the local models, its not like those datasets are magically gone either.
>>
>>109004671
pathetic man truly pathetic, you could even see the start point before the explosion.

Tip: start from empty white image, then make the prompt.
>>
>>109004619
Four hours ago someone posted a cute 1girl and he's been melting down ever since.
>>109003582

It's been a couple of months since this guy has had this particular brand of sperg out ITT. He used to do it like every other weekend.
>>
File: Ideogram_00099_.png (2.6 MB)
>>
>>109004689
i wonder why my glow nigger
>>
>>109004700
neat
>>
File: debo_s_fia_00016_.png (2.0 MB)
>>
we're going to hell and you're all coming with us.
>>
the darkness will consume you. our mission complete it was our purpose.
>>
>>109004700
Prompt? I'm close to wanting to use i4 even if just to say I have
>>
>>109004671
Nice ehh... vignette
>>
>>109004636
because he just i2i'd >>109004502
>>
https://www.youtube.com/watch?v=3wTl8DrC240
Amen
>>
>>109004728
https://pastebin.com/MMGXzBv0

It's llm slop and too big to post here
>>
>>109004743
the plot thickens
>>
>>109004760
Are you having it write that based on an existing image or are you just typing out ideas and having it expand upon them?
I'm really close to trying out i4.
>>
Japanese Folk Metal ACEStep XL LoRA. Trained on just 10 songs.
https://vocaroo.com/1hOnOf8ZWn71
https://vocaroo.com/18pRgXxfm3tj
https://vocaroo.com/15AMD9XrQ4Xl
>>
File: Ideogram_00096_.png (1.7 MB)
>>109004783

The latter. My pathetic human attempts ended up looking like shitty romance novel covers.
>>
in the end who really cared this much, it was all just a dream.

peace
>>
nightmarishly terrible thread, a new low for local. good job lads.
>>
File: Ideogram_00106_.png (2.6 MB)
>>
File: Ideogram_00107_.png (3.3 MB)
>>
>>109004838
>that photo on the wall behind migu and homer
kekk
>>
>>109004797
How's the speed, and what card are you running it on?
>>
>>109004871
3090. Slow as fuck desu. Like a minute 30 seconds per image.
I think there are ways to speed it up, but I haven't checked yet.
>>
>>109004789
easiest way to train acestep loras?
>>
>>
>>109004890
NTA, but Side-Step.
>>
>>109004816
Anon can make it up in the back half dw
>>
https://files.catbox.moe/ek5jwc.mp4
facebook ai content is so unhinged lmao
>>
>>109004890
Wrote a detailed guide here
https://rentry.co/s8fg8ber

But it uses custom scripts alongside Side-Step. By far the easiest way is to just use Side-Step's options to caption the dataset, but I use my own script since it lets me work around bugs once everything is fetched.

The most tedious part is the data curation. With doubledouble top and the web to curate lyrics, and Gemini's help to structure them, it should not be too bad (done manually), though I still use a script to reformat my lyrics.

For training, I use a cloud GPU (Modal) since it gives $30 in free monthly credits, good for a few runs (this one was just $5). I will update the guide later to include a Modal training script. It can be done locally, but since I have a 3090 it would have to be left running overnight to get the 800 epochs. I don't like to use 60 sec chunks on Side-Step because I think it converges better without them, so I train with the full songs, which takes far more time.
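
For anyone unsure what the 60 sec chunk option means in practice, here is a minimal sketch of fixed-length chunking, assuming it simply cuts each track into consecutive windows. This is an illustration only, not Side-Step's actual code; `chunk_bounds` is a made-up helper name.

```python
def chunk_bounds(duration_s: float, chunk_s: float = 60.0) -> list[tuple[float, float]]:
    """Split [0, duration_s) into consecutive windows of at most chunk_s seconds.

    Hypothetical helper for illustration; the last window is shorter when the
    duration is not a multiple of chunk_s.
    """
    if duration_s <= 0 or chunk_s <= 0:
        return []
    bounds = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        bounds.append((start, end))
        start = end
    return bounds


# A 150 second track becomes two full windows plus a 30 second remainder.
windows = chunk_bounds(150.0)
```

Training on full songs skips this step, so every sample is as long as the whole track instead of at most one window.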
>>
>>109004946
Thanks anon!
>>
File: Ideogram_00116_.png (3.4 MB)
>>
>>109004990
how nsfw can ideogram get? Can it do boobs? What about underwear, lingerie?
>>
>>109004946
nice ty anon
>>
>>109004996
It can do most things. It's that fucking filter that's the problem.
>>
>>109004996 >>109005000
try https://pastebin.com/xpYezwZp as workflow
>>
>>109004990
>>109005000
it has sd1 face

so it seems all three of these new ones, ernie, animu and ideogram, are fails then

cant beat zimage and klein and even qwen aint that bad
>>
there's one thing that isn't a failure
>>
>>109005019
Interestingly I just tried my regular workflow but wrote naked lady with no clothes in Japanese and it did it no issues.
>>
File: 1764654624217817.png (2.8 MB)
>>
>>109005000
ok, so it can do breasts
https://civitaiarchive.com/models/2679521?modelVersionId=3008701

I'm just not impressed with the weird grainy quality. I'm trying to find a reason to pull the trigger and start downloading the models but I just don't see how it's better than zit or k9 with some realism loras thrown in. Skin and faces look very, very, sloppy
>>
File: ig.jpg (268.0 KB)
>>109005043
good too.

anyhow if the censorship gives you trouble that's the least affected sensible workflow I've seen so far. makes the model usable.

i forgot to add "masterpiece" but it almost is
>>
>>109004671
I used to work with a girl who looked and acted like this. We both worked in a Japanese company so she was a rice hunter and a bit of a cunt. But your of reminds me of her.
>>
>>109005072
the 1girls are less pretty than in other models overall but not terrible

> I just don't see how it's better
you can prompt far more characters/objects in defined regions, that's the main thing IMO
>>
>>109005080
what like she craved yellow d lmao
>>
>>109005089
Yeah lol. I assume all white women I meet here are into Asian dick. It’s hell on earth for them here otherwise
>>
>>109005080
you should have fixed her
>>
File: ig.jpg (347.6 KB)
>>109005072
also the best model for text. maybe if you want to do some visual storytelling.
>>
>>109005078
have you tried it with an abliterated clip?
>>
File: ig.jpg (305.7 KB)
>>109005152
no. i just recommend trying >>109005019, it works quite well
>>
why does the comfyui desktop app run at 5 FPS? the gen speeds are fine, but the interface is so slow. i'm not even using one of those jeeted 500 node workflows either
>>
>>109005072
this is jeetslop

>This workflow uses an uncensored text encoder: Qwen3VL-8B-Uncensored-HauhauCS-Aggressive, plus a latent upscaler before the image sampler. Right now, it works successfully around 30% of the time

bro thinks the text encoder has jack diddly fucking squat to do with anything here (it does not, "uncensored" text encoders are NOT a thing that serves any purpose in the context of image models)
>>
anyone got ideogram gguf to work in comfy?
>>
File: ComfyUI_00746_.png (503.5 KB)
How can local diffusion be used to create a video game? How do I ensure my OC has a consistent face?
>>
File: file.png (1.9 MB)
>>
>>109005240
Why would you use gguf?
It's quantized from 8bit, it will have shit quality.
Just run the fp8 if you can, nf4 if you are a hyper vramlet with I dunno like 6 gigs of vram.
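
Rough sketch of why the bit width decides what fits in VRAM. It only counts the diffusion model weights and ignores activations, the text encoder, the VAE, and the per-block scale metadata that 4-bit formats carry, so real usage is higher. The 10B parameter count is a placeholder for illustration, not the size of any model mentioned here.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Hypothetical 10B parameter model, NOT a real model's size.
N_PARAMS = 10e9

for name, bits in [("fp16/bf16", 16), ("fp8", 8), ("4-bit (nf4-style)", 4)]:
    print(f"{name}: ~{weight_gib(N_PARAMS, bits):.1f} GiB")
```

Halving the bits halves the weight footprint, which is the whole reason to drop from fp8 to a 4-bit format on a small card.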
>>
>>109005198
i'm guessing by far most here (still) use the webui, it's probably also the most obvious workaround option if that desktop ui has some bug
>>
>>109005263
this shit so ass
>>
>>109005265
nvfp4 is blackwell only
>>
>>109005273
no saar, very good model saar. please watch the /r/stablediffusion postings
>>
>>109005259
many options. world models
https://github.com/Tencent-Hunyuan/HY-World-2.0
https://github.com/robbyant/lingbot-world
https://over.world/

might be the most direct way sooner or later

or try to use 3d object generation splat models, idk which is currently best maybe try https://github.com/VAST-AI-Research/TripoSplat https://github.com/IgorAherne/TRELLIS.2-stableprojectorz etc, obviously this is for 3d engines

or use blender/krita/whatever with plugins to work with 3d or 2d textures

it's not what most people here usually do tho.
>>
File: idiotgram.jpg (256.8 KB)
>>109005273
>>109005263
>>109005301
saar you only need to draw more bounding boxes
>>
>>109005263
anyone have a workflow/node to get around the salocy filter?
>>
File: ComfyUI_00747_.png (442.1 KB)
>>109005303
>>
File: ga.jpg (227.6 KB)
>>109005322
see >>109005019
it seems to mostly work for me. it may be that you do have to define some extra boxes. idk, add a /ldg/ logo
>>
>>109005291
nf4 and nvfp4 aren't related at all.
>>
Also nvfp4 will "work" on 4000 and 3000 series, but the speed will be ass due to lack of hardware acceleration, similar to how fp8 works on 3000.
But that is also true, and possibly worse for Q quants.
Regardless it's not nvfp4 anyway.
>>
File: 1778593395456.webm (1.6 MB)
>>
>>109005019
Error running sage attention: Unsupported head_dim: 256, using pytorch attention instead.

am I supposed to switch off sage on start-up?
>>
File: ComfyUI_00750_.png (538.3 KB)
>>
>>109005380
idk if anything is better than the fallback it already chose
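
The quoted log line reads like a per-call fallback rather than a crash. A minimal sketch of that pattern, assuming the fast kernel raises on an unsupported head size and the caller catches it. All names and the supported set below are hypothetical, not ComfyUI's or SageAttention's actual code.

```python
# Hypothetical set of head sizes the fast kernel accepts (assumption for illustration).
SUPPORTED_HEAD_DIMS = {64, 96, 128}


class UnsupportedHeadDim(ValueError):
    """Raised by the stand-in fast kernel for head sizes it cannot handle."""


def fast_attention(head_dim: int) -> str:
    # Stand-in for an accelerated kernel with a restricted set of head sizes.
    if head_dim not in SUPPORTED_HEAD_DIMS:
        raise UnsupportedHeadDim(f"Unsupported head_dim: {head_dim}")
    return "fast"


def reference_attention(head_dim: int) -> str:
    # Stand-in for the slower general-purpose implementation.
    return "reference"


def attention(head_dim: int) -> str:
    """Try the fast path first and fall back for this call only."""
    try:
        return fast_attention(head_dim)
    except UnsupportedHeadDim as err:
        print(f"Error running fast attention: {err}, using reference attention instead.")
        return reference_attention(head_dim)
```

Under that assumption the fast path is still used wherever it is supported, so turning it off globally would only slow down the layers that were fine.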
