Thread #108295929
File: highlights_g_108290374_1772660550_1.jpg (974.4 KB)
974.4 KB JPG
Discussion of Free and Open Source Diffusion Models
Prev: >>108290374
https://rentry.org/ldg-lazy-getting-started-guide
>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe
>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
>Klein
https://huggingface.co/collections/black-forest-labs/flux2
>LTX-2
https://huggingface.co/Lightricks/LTX-2
>Wan
https://github.com/Wan-Video/Wan2.2
>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46
>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage
>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
311 Replies
>>
File: 00267-3812189491.png (2.9 MB)
2.9 MB PNG
>>
File: ComfyUI_03282_.png (1.3 MB)
1.3 MB PNG
>>
>mfw Resource news
03/04/2026
>Helios: Real Real-Time Long Video Generation Model
https://pku-yuangroup.github.io/Helios-Page/
>Toward Early Quality Assessment of Text-to-Image Diffusion Models
https://github.com/Guhuary/ProbeSelect
>CFG-Ctrl: Control-Based Classifier-Free Diffusion Guidance
https://hanyang-21.github.io/CFG-Ctrl
>SIGMark: Scalable In-Generation Watermark with Blind Extraction for Video Diffusion
https://jeremyzhao1998.github.io/SIGMark-release
>Flimmer: Video LoRA training toolkit for diffusion transformer models
https://github.com/alvdansen/flimmer-trainer
03/03/2026
>Alibaba’s Qwen tech lead steps down after major AI push
https://techcrunch.com/2026/03/03/alibabas-qwen-tech-lead-steps-down-after-major-ai-push
>Adaptive Spectral Feature Forecasting for Diffusion Sampling Acceleration
https://hanjq17.github.io/Spectrum
>Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance
https://github.com/showlab/Kiwi-Edit
>Let Your Image Move with Your Motion! -- Implicit Multi-Object Multi-Motion Transfer
https://ethan-li123.github.io/FlexiMMT_page
>Neural Discrimination-Prompted Transformers for Efficient UHD Image Restoration and Enhancement
https://github.com/supersupercong/uhdpromer
>OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens
https://openvglab.github.io/OmniLottie
>Flow-Factory: A Unified Framework for Reinforcement Learning in Flow-Matching Models
https://github.com/X-GenGroup/Flow-Factory
03/02/2026
>Accelerating Masked Image Generation by Learning Latent Controlled Dynamics
https://github.com/Kaiwen-Zhu/MIGM-Shortcut
>Open-sourced a one-click ComfyUI setup for RTX 50-series on Windows
https://github.com/hiroki-abe-58/ComfyUI-Win-Blackwell
>stable-diffusion-webui-codex v0.2.0-alpha
https://github.com/sangoi-exe/stable-diffusion-webui-codex
>ComfyUI SeedVR2 Tiler
https://github.com/BacoHubo/ComfyUI_SeedVR2_Tiler
>>
>>108295980
why is this schizo lurking here now? go back to /sdg/ you fucking parasite
https://rentry.org/debo
>>
>>
File: xyz_grid-0041-1939773252.jpg (1.1 MB)
1.1 MB JPG
>>108295994
He has nobody to talk to in his thread
>>
>>108295929
Thank you for baking this thread, anon
>>108295944
>>108295957
Thank you for blessing this thread, anon
>>
>>
>>
File: 00119-3294099865.png (1.8 MB)
1.8 MB PNG
>>108296036
Odd thing to post but facts and statistics prove otherwise.
>>
>>
File: Seedream 4.5_In this eye-level, close (1).jpg (671 KB)
671 KB JPG
>>108296008
seedream 5.0 lite is absolute shit and even more censored than 4.0/4.5. 5.0 still produces that same-face seedream look that's baked into the previous models.
>>
>mfw Research news
03/04/2026
>BrandFusion: A Multi-Agent Framework for Seamless Brand Integration in Text-to-Video Generation
https://arxiv.org/abs/2603.02816
>From "What" to "How": Constrained Reasoning for Autoregressive Image Generation
https://arxiv.org/abs/2603.02712
>TC-Padé: Trajectory-Consistent Padé Approximation for Diffusion Acceleration
https://arxiv.org/abs/2603.02943
>DREAM: Where Visual Understanding Meets Text-to-Image Generation
https://arxiv.org/abs/2603.02667
>Generative Visual Chain-of-Thought for Image Editing
https://pris-cv.github.io/GVCoT
>SemanticDialect: Semantic-Aware Mixed-Format Quantization for Video Diffusion Transformers
https://arxiv.org/abs/2603.02883
>StepVAR: Structure-Texture Guided Pruning for Visual Autoregressive Models
https://arxiv.org/abs/2603.01757
>Conditioned Activation Transport for T2I Safety Steering
https://arxiv.org/abs/2603.03163
>NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing
https://arxiv.org/abs/2603.02802
>Beyond Language Modeling: An Exploration of Multimodal Pretraining
https://beyond-llms.github.io
>FiDeSR: High-Fidelity and Detail-Preserving One-Step Diffusion Super-Resolution
https://arxiv.org/abs/2603.02692
>Preconditioned Score and Flow Matching
https://arxiv.org/abs/2603.02337
>Kling-MotionControl Technical Report
https://arxiv.org/abs/2603.03160
>Cultural Counterfactuals: Evaluating Cultural Biases in Large Vision-Language Models with Counterfactual Examples
https://arxiv.org/abs/2603.02370
>Semantic Similarity is a Spurious Measure of Comic Understanding: Lessons Learned from Hallucinations in a Benchmarking Experiment
https://arxiv.org/abs/2603.01950
>ProGIC: Progressive and Lightweight Generative Image Compression with Residual Vector Quantization
https://arxiv.org/abs/2603.02897
>RealOSR: Latent Guidance Boosts Diffusion-based Real-world Omnidirectional Image Super-Resolutions
https://arxiv.org/abs/2412.09646
>>
>>108296008
>>108296094
>This post is off topic.
>>
File: make it stop.png (59.6 KB)
59.6 KB PNG
>>108296146
omg bruh another schizo? it never stops or what?
>>
>>108296143
Having a lot of LoRAs downloaded, along with using stuff like regional prompter at 1024x resolution though... it makes me consider getting rid of the 5070 and just getting a 3090 from someone reputable like I originally planned, just for peace of mind.
>>
File: kino department.png (1.5 MB)
1.5 MB PNG
>>108296146
>tamzaiy:https://www.pixiv.net/en/users/38224130
Hello, Kino department?
>>
File: 00291-1297487379-ad-before.png (3 MB)
3 MB PNG
I think the sad part is he gave up on being witty or even making gens. He lacks the cognitive ability to make anything halfway decent, so he only posts his garbage in his containment to cope with feeling like he belongs in his dead thread. I expect people to be able to use models to make interesting stuff at least, not resort to spamming and seething all day.
>>108296156
It's the same one, he just wants to go all out because he's been outshined by the dev, who to be honest is way better at being a schizo than he is because he can at least hide in the crowd better. He's not great at it, but still better than this guy.
>>108296166
Why not try to get a TI and sell the regular model?
I think a 5000 series card will be better for you if they start expanding on the new ai features that I keep forgetting about.
>>
File: 00294-1297487379.png (2.6 MB)
2.6 MB PNG
>>108296182
Damn, personally for me the 3090 seems like a hard sell outside of LLM stuff and stacking multiple.
>>108296194
OG schizo is trying to reclaim his throne and failing at it. He can't even make coherent gens or post anymore.
He's washed, I tell you, completely washed up.
>>
File: 140163730_p0_master1200.jpg (924.4 KB)
924.4 KB JPG
>>108296146
>BakerAnon:https://www.pixiv.net/en/users/110320313
Holy fuck this anon cooks anime!
>>
>>108296174
>I think a 5000 series card will be better for you if they start expanding on the new ai features that I keep forgetting about.
I figured even those could not make up for the VRAM deficit on the order of 12GB versus 24GB, something to do with network layers I was told.
>>
>>
>>
File: 00298-2590445290.png (2.8 MB)
2.8 MB PNG
>>108296225
I guess you can, but the newer cards are faster and you're less VRAM constrained with image gen compared to LLMs. You also can't stack GPUs for image generation like you can with LLMs. I think 24gb is the sweet spot, and if the 5000 super cards had actually launched, those would have been the best imo. Still, read up on VRAM for the models you use.
>>
>>
File: 00302-2590445290.png (2.9 MB)
2.9 MB PNG
>>108296262
I see you bitching but I never see any gens
I think this fail gen is still better than anything you can make btw
>>
>>108295943
>>108296048
>>108296174
>>108296213
>>108296261
>Avatar or Signature use
You know what to do sirs, avatarfaggots aren't welcome here
>>
File: deCD_zi_00025_.png (3.5 MB)
3.5 MB PNG
>>
File: Seedance 2.0.mp4 (2.1 MB)
2.1 MB MP4
When will local reach this level of kino?
>>
File: kino factory.png (731.2 KB)
731.2 KB PNG
>>108296327
>https://www.pixiv.net/en/users/76374114
This is what I call "good anime"! They surely deserve a general of their own!
>>
File: 00304-1604700477.png (2.8 MB)
2.8 MB PNG
You're really crashing out today
>>108296271
Part of why you failed is your misuse of terms. I'm not sure what to call your condition, but this inability to grasp basic concepts has been hurting you for years.
I'm protesting an entity that has been harassing people he dislikes for years and can't even enjoy having the thread he fought so hard for all to himself.
I amaze myself with how fast I can make these in 60 steps; I typically do 150. Might see how 300 looks with this model.
>>108296327
Really crashing out today I see
>>
>>108296348
>I'm protesting
spamming the same exact image is "signature use" so yeah. You also always start your filename with "00", which means everyone can easily recognize you. If you want to play the avatarfag, you have /sdg/ for that; this place is not made for self-centered drama queens such as you
>>
File: 00178-2758067635.png (1.9 MB)
1.9 MB PNG
>>108296420
Are you done crying yet?
>>
>>
>>108296348
>>108296441
/ldg/ has been getting dismembered by /adt/ since November, loser, but you're still clinging to Ani and Debo. You're incapable of seeing the present as it is, faggot.
>>
>>108296302
>>108296102
>>108296484
>you're still clinging to Ani and Debo
to be fair, debo is still trying to invade this place
>>108295980
>>
>>108296441
>>108296348
You are a pussy clinging to the past if you are incapable of seeing the present
>>
>>108296540
>>108296549
>exactly one minute apart
try to be more patient the next time you want to do some samefagging "anon"
>>
>>108296569
>He follows the same pattern
>"This is why /ldg/ is dead"
>>108296582
>thats why your general is dead.
LMAOOOO, the jokes write themselves
>>
File: 00319-391856034.png (2.6 MB)
2.6 MB PNG
>>108296575
No idea, I miss it desu
>>108296591
I know... I think what makes it genuinely hilarious is that he's resentful of his position and can't comprehend why, after what I think is almost 4 years now, things have gotten worse for him.
>>
>>
>>
File: ComfyUI_03319_.png (1.2 MB)
1.2 MB PNG
>>
>>
File: ComfyUI_03329_.png (1.3 MB)
1.3 MB PNG
>>
File: 00047-1082936407.png (3.5 MB)
3.5 MB PNG
>>
File: Capture.png (45.6 KB)
45.6 KB PNG
>>108295929
I'd like to use AI to assist me in comic production.
Like a "draw me a cocked HK P7 M13 9mm with its magazine, here's the rough sketch, and here's the callout sheet of the HK P7 M13 for your reference" way of usage.
Is there any way to achieve this in local UI without having to train new LoRA for every single type of gun?
>>
>>
File: 00347-17445002.png (2.2 MB)
2.2 MB PNG
>>108296886
most likely not, you'll need to make loras
>>
File: 00051-3250529753.png (3.8 MB)
3.8 MB PNG
>>
>>
File: 00353-1650878331.png (2.8 MB)
2.8 MB PNG
>>
>>
>>
File: 00053-3424678679.jpg (441.6 KB)
441.6 KB JPG
>>108296979
the beauty of qwen 2512.
>>
>>
File: 00358-1909569379.png (2.8 MB)
2.8 MB PNG
I struck a nerve
>>
File: 00360-1909569380.png (2.5 MB)
2.5 MB PNG
>>
File: Anima_00566_.png (441.7 KB)
441.7 KB PNG
bix nood
>>
File: Anima_00500_.png (735.5 KB)
735.5 KB PNG
>>
>>
File: 00059-4223595170.jpg (536.8 KB)
536.8 KB JPG
>>
>>
File: 00062-3433294168.jpg (529.8 KB)
529.8 KB JPG
>>
File: 00066-2723775481.jpg (485.9 KB)
485.9 KB JPG
>>
>>
>>
File: 00068-3238456423.jpg (407.5 KB)
407.5 KB JPG
>>108297394
it's literally just three schizoids spamming constantly day after day.
>>
File: 00074-822221418.png (3.6 MB)
3.6 MB PNG
>>
>>108297446
There's only a couple of actual anons posting here to keep them entertained and thus contained. All the old posters have moved to a different thread. You'd think they'd realize, seeing how slow the thread is without the spam.
>>
File: 00376-1484273914.png (2.3 MB)
2.3 MB PNG
He's still at it?
>>
File: Anima_00427_.png (852.5 KB)
852.5 KB PNG
yeah he's got TDS (tdrussell derangement syndrome)
>>
real question, not trolling: do newfrens actually fall for posts like >>108297394 and >>108297517? also related, why does anon make such posts? it seems like they have sour grapes or are upset by this thread existing, no?
>>
>>108297778
>outdated
It was only 2 months outdated when training started, and it's a huge pain in the ass to update it, especially since I would have to scrape everything including metadata myself; I don't think there's any public dataset more up to date than the stuff on HF.
>>
File: Anima_00612_.png (1 MB)
1 MB PNG
>>108297855
>>
>>
>>108297885
>>108297900
>>108297906
what kind of mental illness is this?
>>
File: FK9B__00008_.png (1.8 MB)
1.8 MB PNG
>>
>>
File: 1747975398465359.png (3.5 MB)
3.5 MB PNG
>>
File: Whisk_f8f6700993aadf786a14a393275fa6c6dr.jpg (583.6 KB)
583.6 KB JPG
>>
File: 1765531658581189.png (70.5 KB)
70.5 KB PNG
https://github.com/Comfy-Org/ComfyUI/pull/12773
Oh shit, we'll soon get an improved version of LTX2
>>
File: 1767181712072663.png (3.6 MB)
3.6 MB PNG
>>
File: 1753904231878732.png (3.5 MB)
3.5 MB PNG
>>
>>108298087
>>108298095
Very interesting
>>
>>
File: ComfyUI_03354_.png (1.2 MB)
1.2 MB PNG
>>
>>
File: FK9B__00009_.png (1.3 MB)
1.3 MB PNG
Klein is such a sleeper.
>>
>>
>>
>>108298181
> What is the difference between LTX-2 and LTX-2.3?
> LTX-2.3 brings four major improvements over LTX-2.
>
> A redesigned VAE produces sharper fine details, more realistic textures, and cleaner edges.
>
> A new gated attention text connector means prompts are followed more closely — descriptions of timing, motion, and expression translate more faithfully into the output.
>
> Native portrait video support lets you generate vertical (1080×1920) content without cropping from landscape.
>
> And audio quality is significantly cleaner, with silence gaps and noise artifacts filtered from the training set.
> Should I upgrade from LTX-2 to LTX-2.3?
> Yes — LTX-2.3 delivers sharper output, better prompt adherence, cleaner audio, and significantly improved image-to-video across the board. The one exception: if your workflow relies on custom LoRAs, those will need to be retrained for the 2.3 latent space before you migrate. See the Migration Guide for details.
>>
File: FK9B__00010_.png (1.8 MB)
1.8 MB PNG
>>108298266
>>
>>
>>108298306
>>108298223
What the fuck, why does their thread have such high quality while ours is infested with tranny and avatar troons?
>>
File: 1743286266863113.png (3.7 MB)
3.7 MB PNG
>>
>>108298223
>>108298306
I don't get it, why does it seem like they're talking to the void?
>>
>>
>>
File: FK9B__00011_.png (1 MB)
1 MB PNG
>>108298312
>>
>>
>>
>>108298295
This is a question I've wondered for a long time. Is anon incapable of telling the difference between "ai style" and not? Or is he able to but chooses to not acknowledge it because more authentic styles are "difficult" to achieve and thus out of his ability?
>>
>>
File: 1760059119068941.png (3.1 MB)
3.1 MB PNG
>>
>>
>>
>>108296132
What can I do with 24GB running XL and whatever the latest SD models are for image generation that I simply cannot with 12 or even 16GB?
I have a lot of LoRAs downloaded, but I would like to train my own perhaps, or at least experiment with doing so, to prevent reliance on civitai stuff. Maybe.
>>
>>108298435
>that I simply cannot with 12 or even 16GB?
Run video models.
>but I would like to train my own
You can even train on 8gb.
>XL
Antiquated lineage. For anime use Anima https://huggingface.co/circlestone-labs/Anima
>>
File: 1758886898878373.png (3.7 MB)
3.7 MB PNG
>>
>>
File: uh oh.png (94.1 KB)
94.1 KB PNG
>>108298463
>>
>>108298306
>>108298463
Their gens are so interesting! Impossible to tell them they're sublimating pedophilic sexual impulses into drawings!
>>
>>108298450
Seriously? I was under the impression, based on 8GB experience, that in order to combine a workflow consisting of many tokens/tags, 1024x resolutions, hires fix, regional prompting, and the many LoRAs I've downloaded, without compromising on image quality or resolution, you need 16-24GB.
I was also told (albeit 2 years ago by now) that low-rank and high-rank LoRAs are simply impossible unless you've got 16 or ideally 24.
I suppose if I had shelled out 800 euros for a 3090, I would have had fun dabbling in video gen too by now.
>>
File: FK9B__00012_.png (1.4 MB)
1.4 MB PNG
>>108298407
>>
>>108298498
>many tokens/tags
Does not affect VRAM usage.
>1024x resolutions, hires fix
Tiled sampling fixed this.
>regional prompt
Does not affect VRAM usage.
>using the many lora's downloaded
Barely affects VRAM usage.
>and not compromise on image quality
Quality is unaffected by VRAM.
>or resolution
See second answer.
>low rank and high rank loras are simply impossible
Most if not all trainers have low VRAM presets for every model.
Offloading to regular RAM can also help.
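To put hedged back-of-envelope numbers on the tiling point: a sketch of why tiled decode caps peak memory (fp16 tensor sizes only; illustrative, not measurements of any real model or UI):

```python
# Rough sketch: tiled VAE decode trades peak VRAM for multiple passes.
# This counts only one fp16 image tensor; real decoders allocate larger
# intermediate feature maps, but the scaling argument is the same.

def image_tensor_mb(width: int, height: int,
                    channels: int = 3, bytes_per_elem: int = 2) -> float:
    """Size in MiB of a single W x H x C fp16 tensor."""
    return width * height * channels * bytes_per_elem / 2**20

full = image_tensor_mb(2048, 2048)  # a hires-fix output decoded in one shot
tile = image_tensor_mb(512, 512)    # one 512px decode tile
print(full, tile)                   # 24.0 vs 1.5 MiB per tensor
```

Tile seams are typically handled by overlapping tiles and blending the edges, which is why tiled decode is slower but rarely visibly worse.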
>>
>>
>>108298295
Hair and shirt of the middle one are shaded differently from the rest. Might not be the best example here, but Banana is incapable of reproducing styles that aren't in its training data and makes things look extra generic and "soulless", with the upside of anatomical correctness, which is half the reason the side view looks inoffensively plausible.
Another thing that didn't turn out well is multi-view of the room interior, so I had to go autistic with manual editing, reprompting and rerolling to get workable results, reminding me that my hypocrite ass wanted to become an artist.
>>
File: 1751711725635440.png (1.6 MB)
1.6 MB PNG
>>
>>
File: sanic.png (1.1 MB)
1.1 MB PNG
>>108298626
yeah they're funny
>>
>>108298533
>Tiled sampling fixed this.
Doesn't this cause problems of its own, like edge seams and attention context loss, besides much slower sequential speed? You can run full latent diffusion without tiling, preserving full context, on 24GB, no? I noticed the loss of attention context a lot in my use of regional prompts.
>regional prompt
I am under the impression this depends on resolution, since the number of attention masks and the multiple conditioning passes increase memory and add overhead, especially at 1024x resolution.
>LoRAs barely affect vram usage
This seems misleading when you take low-rank versus high-rank LoRAs into account. Their number and modifier layers still create GPU tensors and modify attention layers, though UIs like Forge sometimes merge LoRAs into the model temporarily, reducing overhead at the cost of time and offloading to system RAM.
>quality unaffected by VRAM
Quantization/reduced precision? Attention slicing? CPU offloading? Tiled diffusion?
>trainers have low vram presets
I suppose you are right that high rank is possible today on low VRAM, but there may be instability, and training times will spike.
>offload to RAM
Still a significant penalty (10x slower), and it seems to risk crashes or instability.
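On the rank question specifically: LoRA adapter size (and hence its share of weight and optimizer VRAM during training) grows linearly with rank. A minimal sketch for one hypothetical linear layer; the 1280x1280 dimensions are illustrative, not taken from any specific checkpoint:

```python
# LoRA adds two low-rank matrices per adapted layer:
# A with shape (rank, d_in) and B with shape (d_out, rank).

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Added parameter count for one LoRA-adapted linear layer."""
    return rank * d_in + d_out * rank

low = lora_params(1280, 1280, 8)     # rank 8
high = lora_params(1280, 1280, 128)  # rank 128
print(low, high, high // low)        # 20480 327680 16
```

So a rank-128 LoRA carries 16x the adapter weights of a rank-8 one, but the frozen base model still dominates training VRAM, which is why low-VRAM trainer presets lean on quantizing or offloading the base model rather than on rank alone.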
>>
File: comfy__49.jpg (862.2 KB)
862.2 KB JPG
>>
>>108298667
I'm struggling to come to grips with someone who can reference concepts like rank, attention, quantization, conditioning, etc. while also being under the impression that you need 24gb to get the most out of something as old as XL. Are you feeding these replies into Gemini and regurgitating its answers?
The bottom line is that 8gb is painful but doable, 12-16 is fine for image generation (except Qwen and another large model that I'm forgetting), and 24+ clears everything easily.
>>
>>
>>
>>108298626
I don't mind it in faux traditional style. Plus the fact it can do logical backgrounds sends it a mile ahead of n** in the background department, since "things don't make sense" is a bigger issue to me. Can be used to make comics too.
>>
>>108298728
>something as old as XL
My setup has not changed that significantly since 2023-2025, as much as I'd like, so I've been stuck in a bit of an awkward spot.
I like to generate SDXL model/shitmix based images (especially retro-styled anime or realistic game textures) at 1024x1024 with 35-65 steps and 1-3 LoRA triggers, with adetailer and hires passes for final generations along with inpainting, plus some regional prompting if multiple subjects are involved. I do this in Forge/NeoForge. I have not tested newer models yet, so a lot of my questions are speculation or guesswork aimed at squeezing as much as possible out of low-VRAM Ampere GPUs, as opposed to Blackwell or high-VRAM Ampere, based on what updates I have made. I have felt the limitations in several instances though.
"Brute-force capacity or suffer severe time penalties and quality loss" has been my understanding since the original spec for SD, which as I recall lists 24GB as a requirement to run with no optimizations, which to my understanding are all tradeoffs.
>>
File: 1755632435600480.jpg (834.1 KB)
834.1 KB JPG
>>
>>
File: models-zit-2026-03-05_00017_.png (2.5 MB)
2.5 MB PNG
summoned here? ahh fuck
>>
File: FK9B__00015_.png (1.5 MB)
1.5 MB PNG
>>108298523
>>
>>108298774
A robust XL workflow is no match for a 16gb card. A 4gb card would be excruciating, nearing (if not) impossibility, but with 16gb you are not suffering _severe_ penalties. Arguably the biggest loss of quality comes from quants, and even then the "loss" of q8 is virtually nonexistent. In addition, quanted XL models aren't really a thing. A small number do exist, but most quanting efforts have focused on larger models because there has been little if any need for it with XL.
To say "you will suffer greatly using XL with anything less than 24gb" is highly dubious. The only difficult and painful part of your defined goal would be regional prompting, and not for its compute requirements but rather the fact that it kinda sucks and has not received much focus since modern models have a degree of spatial awareness.
>>
File: models-zit-2026-03-05_00031_.png (2.4 MB)
2.4 MB PNG
>>108298908
and they give the other thread shit for bad gens lmao
>>
>>108298884
>>108298917
dont you have an alimony payment to make
>>
>>
File: FK9B__00014_.png (919.5 KB)
919.5 KB PNG
>>108298908
>>
File: models-zit-2026-03-05_00032_.png (2.3 MB)
2.3 MB PNG
>>108298921
what a strange thing to say
>>
>>
File: 1755990519816769.png (3.8 MB)
3.8 MB PNG
i dont like the lack of 1girls
fag do better fagollages
>>
File: 1760566758705173.png (2.9 MB)
2.9 MB PNG
>>
>>
>>108298924
If you had less than 12 then I would feel sorry for you. But you'll be fine with 16.
>>108298924
>higher resolution
You'll run into the limitations of a given model's ability to generate extreme resolutions before you hit a wall due to lack of compute, honestly.
>>
File: 1742451412232338.png (3.6 MB)
3.6 MB PNG
>>
File: FK9B__00016_.png (1.6 MB)
1.6 MB PNG
>>108298933
>>
File: models-zit-2026-03-05_00040_.png (2.4 MB)
2.4 MB PNG
>>108298989
so real king
>>
>>
File: 1757440497489541.png (2.9 MB)
2.9 MB PNG
>>
>>
>>
File: FK9B__00017_.png (1.3 MB)
1.3 MB PNG
worse samplers are more gooder, in mysterious ways. just one of the reasons local is beettten ahem, better.
>>
File: FK9B__00005_.png (1.5 MB)
1.5 MB PNG
>>108299175
this one is what I meant.
>>
>>108295929
what can I realistically expect to run with 8gb of vram, 16gb regular ram? What kind of results would I get?
I'm wondering if it's worth the time and effort to set this up now or if I should wait until I can upgrade my ram and then do it after.
>>
File: burgervoi.png (2.4 MB)
2.4 MB PNG
we are hungry :)
https://suno.com/s/xBlXU9lhOP3hBdwM
>>
File: 1746349134519400.jpg (842.9 KB)
842.9 KB JPG
>>
>>
File: FK9B__00008_.png (844.8 KB)
844.8 KB PNG
>>108299180
>>
File: FK9B__00013_.png (1.7 MB)
1.7 MB PNG
>>108299292
lcm, cfg=10
>>
File: 1758993022393436.png (2.7 MB)
2.7 MB PNG
>>
>>108299198
I don't know about what >>108298916 has to say about it but when I was on a 3070 I would regularly exceed 16GB of system RAM for my image prompting.
>>
>>
>>108298222
It's interesting: the slop gets more and more accessible (I don't care, it's fine, whatever. If it's fun do it, idgaf). But the promise of easily creating great stuff (or "pseudo-great" if that term offends you) quickly escapes the grasp of the average user. I genned a bunch of shit for maybe a year or so before burning out, and now it feels like I've been left in the dust and all the shitty tool-operating knowledge I gained from tweaking the knobs and levers is useless. I see some great amazing gens made by people and think "fuck man, how the hell am I gonna catch up with this?". Even if this is all hilarious because wErE nOt aCtuAlLy cReAtInG aNyThiNg, it still branches off into schools of thought and sifting through reams and reams of eldritch scrolls (models) and components that quickly get dizzying in scope unless you've been studying every single advancement every day something changes.
>>
File: 1742198918618621.jpg (634.4 KB)
634.4 KB JPG
>>
>>
>>
File: Wan2.2animate_move.mp4 (986.8 KB)
986.8 KB MP4
gave wan2.2animate-move a try on the hf space.
source video: >>>/pw/19984525
source image: https://files.catbox.moe/tzvwne.jpeg
can I run wan2.2animate on a 3090 with 32gb ram? what limitations would 64GB ram lift (longer videos)? what would gen times be like? on the hf space 4 seconds took 11 minutes, so I assume about 30mins on a 3090 at 480p?
>>
What is the best method to train a LoRA for Z-Image? I'm using ai-toolkit.
Do I train the LoRA on the base model and then use it with the base model, or with the turbo?
Or do I train the LoRA directly for the turbo using the workarounds in ai-toolkit?
>>
>>
>>108300285
no, but i believe you, it gets shittier with every pull
anyways, i asked the free claude to vibecode a custom node based on this >>108291118 and i was surprised it actually worked for anima. not lossless with default settings (apparently it's uses some sort of clever mechanism to select which steps to skip, but only of flux models), but if you adjust settings, i get honest x1.4 speed increase
>>
>>
>>
File: Untitled.png (630 KB)
630 KB PNG
Trellis 2
>>
>>108300351
Agentic dataset collection and LoRA training + agentic image generation with fine tuned model VLLM with multiple personalities specializing in Conducting, Control Net Posing, LoRA selection, and judging result. Why the fuck hasn't anyone done this? You basically almost get proprietary model quality for free
>>
>>108300244
>so I assume about 30mins on a 3090 at 480p?
Nah, it's not that long. Like 5-8 minutes depending on how long you want to stomach a few extra steps for quality. Generally you want to use a lightning LoRA with it.
Most workflows also come with a continue node so you can kind of keep it going forever. The real limitation of the model is that it kind of sucks for anything that isn't 1girl dancing.
>>
>>
>>
>>108295929
>Local Model Meta: https://rentry.org/localmodelsmeta
>I haven't updated this in awhile. Sorry. I've been busy. I'll try to get back to it over the next couple of weeks, same with the Wan rentry. If not, someone else can take over.
time to stop including it
>>
File: Untitled.png (615.6 KB)
615.6 KB PNG
Trellis 2.
>>
File: node.png (28.8 KB)
28.8 KB PNG
>>108300923
be aware it's broken at the core (100% made by free claude). use lite node
https://files.catbox.moe/g07g91.zip
>>
File: Anima_00465_.png (810.2 KB)
810.2 KB PNG
>>108300952
thank you
>>
File: 00016-2854748890.jpg (444.4 KB)
444.4 KB JPG
>>
File: 00009-3193015821.png (3.5 MB)
3.5 MB PNG
>>
File: Our MC.jpg (32 KB)
32 KB JPG
>>108300665
Thoughts? Even GPU manufacturers don't want anything to do with Python. Why does Comfy insist on it?
>>
>>
>>108301217
python was good when hardware was cheap and abundant. we're in a new reality now and people are finally starting to pay attention to efficiency. adversity breeds opportunity and i'm unironically hyped for the future of ai without bloat
>>
>>
File: Untitled3.jpg (27 KB)
27 KB JPG
>>108301217
>>
File: animapreview_00364_.png (1.6 MB)
1.6 MB PNG
>>108301333
12GB 3060 RTX
t. VRAMlet connoisseur
>>
File: ellie.jpg (103.1 KB)
103.1 KB JPG
>>108296306
seedance 2 will soon be censored, and they'll add a filter for celebrities...
btw, i'd like to take this opportunity to tell you that i've created my own discord server called bchan. :-)
discord VAaTvbH7
ldg sisters, you are welcome :)
>>
File: 00022-3948370380.jpg (456.4 KB)
456.4 KB JPG
>tfw no thicc big frap latina gf
>>
>>
File: 00003-2501008303.png (3.6 MB)
3.6 MB PNG
>>108301333
a 16gb vram card with 32gb of ram will be good enough but anything higher than sdxl and z image turbo will have issues. I would recommend you spend big and buy a 4090 or 5090. Look for beefy 4090/5090 prebuilds that have 64gb of ram. I spent nearly $4000 on my 5090/64gb ddr5 ram build last summer and have no regrets.
>>
File: 1761806072395283.jpg (953.3 KB)
953.3 KB JPG
>>
>>
File: 1743849031429092.jpg (544.1 KB)
544.1 KB JPG
>still replying to him
lol
>>
>>108301749
don't think it's ani but anon's right. you're seething at someone more successful because you're a nobody that didn't achieve anything in this field. you contribute nothing but useless drama to the threads
>>
>>
>needs over 2 years to figure out how to build an imgui wrapper
>still crashes all the time and compilation is shit
>zero contributions to sd.cpp (which is doing all the work and MIT)
I even respect that turk furkan more, he contributes more to the ecosystem lol
>>
File: Anima_00669_.png (570 KB)
570 KB PNG
>>108301217
>>
Fresh
>>108301867
>>108301867
>>108301867
Fresh
>>