Thread #108672527
File: highlights_g_108668921_1776984968_1.jpg (1.9 MB)
1.9 MB JPG
Pear-Shaped Edition
Discussion and Development of Local Image and Video Models
Previous: >>108668921
https://rentry.org/ldg-lazy-getting-started-guide
>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe
>Z
https://huggingface.co/Tongyi-MAI/Z-Image
>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
>Qwen
https://huggingface.co/collections/Qwen/qwen-image
>Klein
https://huggingface.co/collections/black-forest-labs/flux2
>LTX-2
https://huggingface.co/Lightricks/LTX-2
>Wan
https://github.com/Wan-Video/Wan2.2
>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46
>Illustrious
https://rentry.org/comfyui_guide_1girl
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage
>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
File: 1775473589226078.png (1.2 MB)
1.2 MB PNG
Is there some gigachad with a chad gpu to try this? I wanna see how well it fares against this prompt >>108671341
https://github.com/inclusionAI/LLaDA2.0-Uni
https://huggingface.co/inclusionAI/LLaDA2.0-Uni
>>
File: 1girl pixelDit.png (1.6 MB)
1.6 MB PNG
don't let kekestone see this lol
https://pixeldit.github.io/
https://github.com/NVlabs/PixelDiT
https://huggingface.co/nvidia/PixelDiT-1300M-1024px
>>
>>
>>
File: file.webm (2.1 MB)
2.1 MB WEBM
>>108672564
>kekestone
https://xcancel.com/LodestoneRock/status/2046437094479020543#m
>recursive model distillation be like
he's not wrong though, that's what happens when you train on synthetic data
>>
File: 1759157790491765.png (2.3 MB)
2.3 MB PNG
>>
>>108672554
Damn shit has built-in turbo?
>>
File: ComparisonTwo.jpg (3.9 MB)
3.9 MB JPG
>>108672490
Cosmos 2 didn't look that great but it was compositionally coherent and good enough at text
here all three were genned at 1072x1440, hi-res-fix upscaled up to 1872x2512
>>
>>
File: 1767235136033825.png (2.9 MB)
2.9 MB PNG
>>
>>
>>
>>
File: leave.png (100.7 KB)
100.7 KB PNG
>>108672647
you have to go back >>108653190
>>
>>
File: 1756479624894773.png (2.3 MB)
2.3 MB PNG
flux klein edit is still so neat, Q8 model works fast too, one or two image inputs.
give the anime girl a white racing suit with "Marin" in stitched black lettering on the front.
>>
File: weird.png (1.5 MB)
1.5 MB PNG
>>108672564
>https://pixeldit.github.io/
damn look at those veins, that dude is surely juicing kek
>>
>>108672683
apparently you can get even better results with klein
https://www.reddit.com/r/StableDiffusion/comments/1somo2r/coming_up_tomorrow_flux2klein_identity_transfer/
>>
File: flux_nigger_vae_v2_ultrarealistic_0001.png (2.7 MB)
2.7 MB PNG
>>
>>
what's the best realism editing model that is good at following instructions for moving the camera perspective? nano banana isn't working well: if i tell it to raise the camera a little higher, it creates a satellite image of the scene
>>
>>
>>
File: 1750555552202270.png (1.4 MB)
1.4 MB PNG
>>108672683
business suit + blouse
>>
>>108672641
>>108672699
why are you posting Gippity Image 1.5 gens lol
>>
>>
>>
>>
File: Ernie be like.png (233.1 KB)
233.1 KB PNG
>>108672596
>>
File: proof.png (875.8 KB)
875.8 KB PNG
>>108672722
Hive is trained on a gorillion outputs from each model, you're not gonna get a 1.0 return for GPT Image 1.5 unless it actually is specifically that
>>
File: ernie lora be like.png (434.5 KB)
434.5 KB PNG
>>108672747
>>
>>
>>
>>
>>108672564
Apparently OOMs on 12gb because it tries to load TE and unet model at the same time.
Was gonna make some garbage for the memes but thanks for wasting my time downloading.
I can probably change load precision in the inference script to get it to work but I think I am just going to delete it.
>>
>>
File: 1750902079181552.jpg (762.3 KB)
762.3 KB JPG
>>108672695
https://github.com/capitan01R/ComfyUI-Flux2Klein-Enhancer/blob/main/example_workflow/iden_wf%20(1).json
oh shit it's not a snakeoil, it works well
>>
>mfw Resource news
04/23/2026
>ParetoSlider: Diffusion Models Post-Training for Continuous Reward Control
https://shelley-golan.github.io/ParetoSlider-webpage
>DynamicRad: Content-Adaptive Sparse Attention for Long Video Diffusion
https://github.com/Adamlong3/DynamicRad
>Normalizing Flows with Iterative Denoising
https://github.com/apple/ml-itarflow
>LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model
https://github.com/inclusionAI/LLaDA2.0-Uni
>Illustrious XL & NoobAI-XL Style Explorer
https://github.com/ThetaCursed/Illustrious-NoobAI-Style-Explorer
>AI Model & ‘MAGA’ Influencer Emily Hart Unmasked as Indian Man
https://www.yahoo.com/news/articles/ai-model-maga-influencer-emily-091027504.html
04/22/2026
>Embedding Arithmetic: A Lightweight, Tuning-Free Framework for Post-hoc Bias Mitigation in Text-to-Image Models
https://github.com/cvims/EMBEDDING-ARITHMETIC
>Denoising, Fast and Slow: Difficulty-Aware Adaptive Sampling for Image Generation
https://github.com/CompVis/patch-forcing
>TS-Attn: Temporal-wise Separable Attention for Multi-Event Video Generation
https://github.com/Hong-yu-Zhang/TS-Attn
>AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model
https://yutian10.github.io/AnyRecon
>SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing
https://github.com/vivoCameraResearch/SmartPhotoCrafter
>Soft Label Pruning and Quantization for Large-Scale Dataset Distillation
https://github.com/he-y/soft-label-pruning-quantization-for-dataset-distillation
>Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation
https://github.com/AMAP-ML/EMF
>Enhancing Continual Learning of Vision-Language Models via Dynamic Prefix Weighting
https://github.com/YonseiML/dpw
>IR-Flow: Bridging Discriminative and Generative Image Restoration via Rectified Flow
https://github.com/fanzh03/IR-Flow
>>
>mfw Research news
04/23/2026
>Image Generators are Generalist Vision Learners
http://vision-banana.github.io
>Camera Control for Text-to-Image Generation via Learning Viewpoint Tokens
https://randdl.github.io/viewtoken_control
>Hallucination Early Detection in Diffusion Models
https://arxiv.org/abs/2604.20354
>Wan-Image: Pushing the Boundaries of Generative Visual Intelligence
https://arxiv.org/abs/2604.19858
>MMCORE: MultiModal COnnection with Representation Aligned Latent Embeddings
https://arxiv.org/abs/2604.19902
>Rethinking Where to Edit: Task-Aware Localization for Instruction-Based Image Editing
https://arxiv.org/abs/2604.20258
>Amodal SAM: A Unified Amodal Segmentation Framework with Generalization
https://arxiv.org/abs/2604.20748
>FluSplat: Sparse-View 3D Editing without Test-Time Optimization
https://arxiv.org/abs/2604.20038
>HumanScore: Benchmarking Human Motions in Generated Videos
https://arxiv.org/abs/2604.20157
>Render-in-the-Loop: Vector Graphics Generation via Visual Self-Feedback
https://arxiv.org/abs/2604.20730
>Mitigating Hallucinations in Large Vision-Language Models without Performance Degradation
https://arxiv.org/abs/2604.20366
>Cognitive Alignment At No Cost: Inducing Human Attention Biases For Interpretable Vision Transformers
https://arxiv.org/abs/2604.20027
>X-Cache: Cross-Chunk Block Caching for Few-Step Autoregressive World Models Inference
https://arxiv.org/abs/2604.20289
>Self-supervised pretraining for an iterative image size agnostic vision transformer
https://arxiv.org/abs/2604.20392
>Efficient INT8 Single-Image Super-Resolution via Deployment-Aware Quantization and Teacher-Guided Training
https://arxiv.org/abs/2604.20291
>From Diffusion to Flow: Efficient Motion Generation in MotionGPT3
https://arxiv.org/abs/2603.26747
>>
>>108672628
People who say Cosmos was bad are dumb. It has an atrocious default aesthetic tune (by design, it's a fucking robotics world model), but in terms of prompt understanding, coherence, and breadth of knowledge it was basically Flux 1 level but only 2b parameters.
>>
File: SixWay.jpg (3.7 MB)
3.7 MB JPG
Updated comparison, adjusted the prompt a bit to try to force more similar results out of all the models
Prompt:
A fair-skinned young Irish woman with long, sleek copper-red hair and blue eyes stands centrally on a weathered stone walkway, posing daintily and smiling directly for the camera. She wears a whimsical pastel lavender mini-dress featuring a tiered skirt, ruffled bodice with lace trim, and sheer long sleeves, accessorized with a metallic gold crossbody bag. Her legs are clad in intricate white patterned lace tights, ending in chunky two-tone black and white platform oxford shoes. She is situated in a formal garden setting, flanked by stone balustrades topped with large white classical urns containing manicured green bushes. Immediately behind her stands a white architectural frame structure bearing the text "1GIRL GARDENS" in bold serif capital letters. The background reveals terraced flower beds, classical white statues, and a green hillside dotted with buildings. The lighting is soft, flat, and diffused from an overcast sky, creating shadow-free illumination that enhances the soft pastel colors of her dress and the even tones of her complexion. Style: whimsical DSLR street fashion photography. Mood: sweet, composed, and serene. Aspect ratio: 3:4.
>>
Has anyone tried and actually used the official Circlestone Greg Rutkowski lora, apart from reading the metadata?
Isn't it total shit? The demo images look nothing like his art but I downloaded it anyway and used it and it's super cherry picked. Only works with vague prose style and with the slop "dramatic oil painting" prefix. If you use some anime character it completely loses the oil painting brushstroke dark fantasy effect, what a garbage lora.
>>
>>
>>108672641
>>108672715
this model has a weird noise pattern, doesn't it?
>>
>>108672991
Sorry but Cosmos is literally incapable of disentangling styles and characters. I'm the anon from >>108673089. The only interesting thing it has is prompt adherence. Cosmos can't grasp the concept that an anime character can have a different style applied. I downloaded an Elden Ring style lora too and same shit, only works for vague characters. The second you mention a specific anime character the lora effect stops working and loses all the aesthetic. This never happened to me with Neta Lumina or even SDXL.
>>
>>
>>
>>108673104
Not at home now, but will share because it's very noticeable. This happens with a lot of style loras. If you prompt an anime character the lora stops working and it defaults to a diluted style. I'm using the lora trigger words and the anime character without any artist styles or coloring or mediums.
But if you're on a pc download
-The official Anima Greg Rutkowski lora
-search CivitAI for an Elden Ring lora
-and whatever other style loras you want.
Test the lora without a specific anime character, then test with a specific character and look at the difference in style.
>>
>>
File: skill issue sailor moon.png (1 MB)
1 MB PNG
>>108673135
>Cosmos is literally incapable of disentangling styles and characters
>>
>>
>>
>>
>>
>>
File: Ohio impressed.jpg (886.1 KB)
886.1 KB JPG
>>108672951
>>
File: zit-00016.jpg (1.8 MB)
1.8 MB JPG
>>108673081
have a zit again
>>
>>
>>108673200
Loss is the main signal that drives training for an AI model.
Though for diffusion it doesn't behave the way it usually does.
It zigzags, but overall it is supposed to decrease a bit initially and then stay about the same for a while.
In rare cases it can suddenly spike a lot, up or down, meaning something has gone wrong.
But most of the time you let it churn at the final value for the right amount of time; if you train longer than it needs, you end up with a fried model.
Train shorter and you get an underbaked model.
Validation loss is supposed to be a way to independently determine if the model is starting to get fried.
Though I never managed to get it working properly.
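A rough sketch of one way to read the trend through the zigzag: smooth the logged loss with an exponential moving average before judging whether it is flat, falling, or spiking. The function name, the beta value, and the sample numbers are made up for illustration; none of this comes from a particular trainer.

```python
def ema_smooth(losses, beta=0.98):
    """Bias-corrected exponential moving average of a list of loss values."""
    smoothed = []
    avg = 0.0
    for step, loss in enumerate(losses, start=1):
        avg = beta * avg + (1.0 - beta) * loss
        # Bias correction keeps the early values from being dragged toward 0.
        smoothed.append(avg / (1.0 - beta ** step))
    return smoothed


# Hypothetical zigzagging loss log (not real training data).
raw = [0.12, 0.08] * 25
smooth = ema_smooth(raw)

raw_range = max(raw) - min(raw)
smooth_range = max(smooth) - min(smooth)
print(f"raw range: {raw_range:.4f}, smoothed range: {smooth_range:.4f}")
```

A higher beta gives a flatter curve but reacts more slowly to a genuine spike, so it is worth looking at both the raw and smoothed values.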
>>
File: goofyahhhbackground.png (1 MB)
1 MB PNG
>>108673222
wtf sampler did you use lmao, it looks weird
>>
>>
>>108673272
>How do you measure that?
There is literally no way to measure that besides trial and error and experience nonnie.
It depends on which model, what type of lora, dataset quality/quantity, LR, batch, and other training parameters.
You are going to underbake or fry until it clicks.
3-4k steps is generally a good starting point for loras.
Don't pay too much attention to the loss.
>But I don't know if they are undercooked or overcooked
If you undercook, the character/style/concept has poor resemblance to the training dataset.
If you overcook, you will get weird gens showing that irrelevant details from the dataset have been learned (this can also happen due to poor dataset diversity, quantity or very poor captioning, so in effect you can undercook and overcook at the same time!).
>>
>>
>>
File: GPT Image 2.png (1.7 MB)
1.7 MB PNG
Local lost so bad...
>>
>>
>>
>>
So, I've been using automatic1111 forever, and I'm trying to transition to ComfyUI. Got it installed, it works, but I need some basic workflows... Like, one that has hires fix and lora settings? I've tried making some and "wiring" it up... Well, I'm just big dumb. Where can I get some out-of-the-box workflows for SD?
>>
>>
>>
File: 1769055197808443.jpg (467.5 KB)
467.5 KB JPG
>>108673081
i hate how they massacred nbp. theyre definitely running a quantized version now
pre quantized nbp was truly something else, $100 a month for 1000 4k images a day was so worth it at the time
>>
>>108673320
>It seems mostly undersaturated to me.
It seems like it just accurately recreates what it was trained on to me. If you prompt artists that make desaturated art it will look desaturated, saturated artists will produce saturated results. Or you could use keywords like colorful, monochrome, etc. in the positive or negative depending on your aims to counteract that. Don't tell me you guys are just prompting "masterpiece, best quality" for aesthetics and nothing else?
>>
>>108673382
That's why local is important.
The weights are yours and they can't take it.
The API jews will gladly charge you hundreds of dollars a month and serve the shittiest quality possible that they think they can get away with it.
I also think they are straight up doing undisclosed distills and still serve those models as the same model.
>>
>>
>>
>>
>>
File: comfy countdown.png (56 KB)
56 KB PNG
[BREAKING NEWS]
https://comfy.org/countdown
ComfyOrg is counting down to a major release?? What could this be?
>>
>>
>>
>>
>>
File: 1766102423868319.jpg (76.1 KB)
76.1 KB JPG
>>108673331
it's rather sad for API studios. they quickly fall into oblivion, and the local community manages to create similar things, even with outdated tools kek
>>
>>108673473
90 images, I'm trying to make a style Lora. I think I tagged all correctly. I'm experimenting with some values but I can't manage to get things right. I think I'm overcooked, the models look "creamy" and oversaturated sometimes, like greasy.
0.0003 learning rate, 2 repeats, 15 epochs.
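For reference, the step count implied by those settings can be worked out as below. Batch size is not stated in the post, so it is an assumption here, and bucketing can shift the per-epoch count slightly in real trainers. The function name is hypothetical.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size=1):
    """Optimizer steps for a run: ceil(images * repeats / batch) per epoch."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


# 90 images, 2 repeats, 15 epochs, as posted; batch sizes are guesses.
for batch_size in (1, 2, 4):
    print(batch_size, total_steps(90, 2, 15, batch_size))
# batch 1 -> 2700, batch 2 -> 1350, batch 4 -> 675
```

At batch size 1 that is a little under the 3-4k starting point mentioned earlier in the thread, so the step count alone would not explain a fried result; the learning rate is the other obvious thing to try lowering.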
>>
>>
>>
File: 1746351050967631.png (33.9 KB)
33.9 KB PNG
>>108673555
This is it, LTX 3, 8b model, as good as Seedance 2.0
>>
>>
>>
File: nbp Q8.png (2.8 MB)
2.8 MB PNG
>>108673397
>I also think they are straight up doing undisclosed distills and still serve those models as the same model.
this happens for sure, but sadly even the distilled versions are better than local
so if you want to use sota models youre kinda forced to get scammed. hopefully the gpt 2 release changes the landscape a little
>>108673555
theyre going to integrate openclaw into comfyui
>>
>>
>>
>>
File: 1.png (1.3 MB)
1.3 MB PNG
>>108672554
Ok. I am trying better parameters.
This shit also uses an absolutely deranged amount of VRAM (saw 74 gigs, and no, not on my GPU sadly), though I think I set the parameters wrong.
Trying again
>>
>>
>>
>>
File: 1773492392886208.png (39.4 KB)
39.4 KB PNG
>>108673555
that madlad finally did it
>>
>>
>>108673615
>>108673685
dont act as if you wouldnt cream your little undies if you could gen images like that locally
>>
>>
>>
>>
>>
>>
>>
>>
File: output.png (1.3 MB)
1.3 MB PNG
>>108672554
>>108673679
Ok, still not a great image, but at least it doesn't look like a lodestone model anymore.
>>
>>
File: file.png (13.3 KB)
13.3 KB PNG
>>108673555
my dick is ready
>>
File: me irl.png (3.8 MB)
3.8 MB PNG
>>108672527
rate my outfit /g/
>>
>>
>>
>>
>>
>>
File: Gemini_Generated_Image_d2k2kud2k2kud2k2 (1).png (1.7 MB)
1.7 MB PNG
>>108672527
damn, this ai stuff is pretty funny
>>
>>
File: 770783717616289.png (1.8 MB)
1.8 MB PNG
>>
File: anima_054.png (1.4 MB)
1.4 MB PNG
>>
File: Anima_01966_.png (917.8 KB)
917.8 KB PNG
>>
It would be great if tdrussell discovered a format converter for Lora from SDXL to Anima.
If you can convert a jpg file to png and a zip file to rar or Word to PDF, why not SDXL to Anima?
This is the zeitgeist that is holding back local.
>>
File: Gemini_Generated_Image_pdumiopdumiopdum.png (1.2 MB)
1.2 MB PNG
>>
>>
File: 8.png (1016.3 KB)
1016.3 KB PNG
Your Taylor Swift that took dozens of gigabytes and 5 minutes on an RTX PRO 6000 sirs.
(I had a gen with better facial likeness but the rest was not good)
Maybe they fucked the inference script on both the github and hf, but maybe it's not good and they cherry picked the images.
I am gonna run a few more tests though, I am 3 dollars into this.
Sunken cost is a bitch.
>>
ZAMN! 128 ranking made my lora like 529mb big. Dats a big lora. The samples are awful (picrel) so far but I'm thinking it's going to level out much better over all.
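Why the file got that big, as a back-of-the-envelope sketch: a standard LoRA on a linear layer adds rank * (d_in + d_out) parameters, so file size grows roughly linearly with rank. This assumes the same target layers and save precision at both ranks and ignores metadata; the layer dimensions are hypothetical, and other adapter types (LoCon, DoRA and so on) scale differently.

```python
def lora_params(d_in, d_out, rank):
    """Extra parameters a plain LoRA adds to one d_in -> d_out linear layer."""
    return rank * (d_in + d_out)


def scaled_size_mb(size_mb, rank_from, rank_to):
    """Estimate file size at another rank, assuming linear scaling with rank."""
    return size_mb * rank_to / rank_from


# Hypothetical 1024 -> 1024 layer: rank 128 holds 8x the parameters of rank 16.
print(lora_params(1024, 1024, 128) // lora_params(1024, 1024, 16))

# The 529 MB rank-128 file from the post, scaled down to rank 16.
print(scaled_size_mb(529, 128, 16))
```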
>>
>>
>>
>>
>>108674104
I'm gonna piss in your mouth
>>108674106
It's epoch 3 literally the 3rd sample. On my smaller lora at ranking 16 it was fine I think this bitch is just taking a minute to digest everything. There is 1800+ images plus tags.
>>
>>
>>
File: 1752385604939813.jpg (927.8 KB)
927.8 KB JPG
Is there a way to use new schedulers and samplers on forge neo?
>>
>>
File: file.png (414.2 KB)
414.2 KB PNG
>>108673166
>>108673106
damn I knew I've seen this one before
https://x.com/SwayStar123/status/1960325938706554950
>>
>>108674039
This is not gonna work since most SDXL loras are unet+te and you only train unet on anima.
For the unet part you might be able to apply delta between SDXL baseline and lora to anima, but between wildly different architectures this is not likely to work well.
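To make the shape problem concrete, here is a minimal pure-Python sketch of the delta a LoRA contributes to one layer: (alpha / rank) * up @ down. The matrices are toy values and the helper names are hypothetical. The delta always has the shape of the layer it was trained against, so it can only be added directly to a layer with the same dimensions.

```python
def matmul(a, b):
    """Plain nested-list matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(down, up, alpha):
    """Weight delta for one layer: (alpha / rank) * up @ down."""
    rank = len(down)  # down is rank x d_in, up is d_out x rank
    scale = alpha / rank
    return [[scale * v for v in row] for row in matmul(up, down)]


def shape(m):
    return (len(m), len(m[0]))


# Toy rank-1 LoRA on a 3 -> 2 layer.
down = [[1.0, 0.0, 2.0]]
up = [[1.0], [3.0]]
delta = lora_delta(down, up, alpha=1.0)

print(shape(delta))  # (d_out, d_in) of the layer the LoRA was trained on
```

Even where shapes happened to match, the two models learned different internal features, which is the part the reply above is pointing at.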
>>
>>
>>
File: 9_thinking.png (888.4 KB)
888.4 KB PNG
>>108672554
>Make a detailed infographic about ComfyUI.
This is with "thinking" enabled by the way. Here is what it "thought":
>Thinking: ComfyUI's user interface is showcased in a clean, minimalist design, featuring a central modal window with a white background and a blue header labeled "ComfyUI." The window displays a list of four options, each accompanied by distinct icons: "General Settings" with a gear icon, "Accessibility" with a wheelchair icon, "Accessibility" with a wheelchair icon, and "Accessibility" with a wheelchair icon. Below the list, a blue button labeled "Continue" is prominently positioned. The surrounding interface includes a sidebar on the left with icons for "Home," "Profile," and "Settings," and a top navigation bar with options such as "Home," "Profile," "Settings," and "Help." The overall layout is organized and user-friendly, emphasizing clarity and accessibility, with a focus on simplicity and ease of navigation. The visual style is modern and functional, utilizing a limited color palette of blue, white, and gray to create a professional and approachable aesthetic.
LMAO fuck this garbage.
>>
>>
>>
>>
File: 1747425132869663.jpg (3.8 KB)
3.8 KB JPG
>>108673555
i hope comfyui won't be unstable again. It's finally stable...
>>
>>
>>108674122
it's a complete waste of time given training loras never takes that long, and if the lora degrades at all in the conversion you might as well retrain it, it's also a chance to update with new data etc
>>
>>
File: ComfyUI_07024_.png (3.4 MB)
3.4 MB PNG
>>108673183
Is this the new fud?
>@stonetoss, Ayanami rei wearing a white bodysuit.
>>
>>
>>
>>108672554
It also can make you wait minutes for a mid image caption that a 2024 LLM would give, and would have done so two orders of magnitude faster!
What an amazing model hahahahaha!
I feel like a complete sucker for wasting time and money on this.
On the bright side, I lost my text diffusion virginity today.
I am not even gonna bother testing its edit capabilities, yep I am good.
I hope the curiosity of the anon who originally posted it is satisfied.
>>
File: 1765834556444184.jpg (1.2 MB)
1.2 MB JPG
the two anime girls have their hands on their hips instead of in the air.
klein edit q8 distilled, pretty cool.
>>
>>
File: 1760485396454689.jpg (627.8 KB)
627.8 KB JPG
>>108674436
oops, thats not the right image.
>>
>>
File: ComfyUI_07080_.png (3.4 MB)
3.4 MB PNG
>>108674396
>>
>>
>>
>>
File: 1775857383409052.jpg (34.3 KB)
34.3 KB JPG
Which model would be best for genning poses with mannequins to use as drawing references?
>>
>>
File: civitai.png (54.1 KB)
54.1 KB PNG
How the fuck do you have a website this big with only 30% up time?
It's like every time I want to use or upload anything the site is shitting itself.
>>
File: seedance comfy.png (527.5 KB)
527.5 KB PNG
>>108673555
>>
>>
>>
File: comfy digital id.png (88.8 KB)
88.8 KB PNG
>>108674848
They'd rather implement digital ID than improve local
>>
>>
>>
>>
>>108674881
only sfw sample I can share but it's getting better. man I hope it's an anima release but ngl regardless if I have to restart my training
>>
>>108674856
https://huggingface.co/unsloth/FLUX.2-klein-9B-GGUF/tree/main
using the q8 from there; any quant will do, but q8 is the best quality for the file size. comfy has templates for klein edit.
>>
File: 1761307784688859.png (1.2 MB)
1.2 MB PNG
>>108674436
hey I recognize this!!!
>>
>>
>>
File: 1772899325960428.png (1.4 MB)
1.4 MB PNG
the painting of the anime girl on the car is wearing a black business suit.
neat
>>
>try prompt relay to make long clips work better
"the woman pulls out a revolver and aims it at the camera and fires off one round with large muzzle flash as the camera falls back looking at the sky while the woman stands above the camera looking down and laughs at the camera.
|
the woman raises her hand holding a large revolver. she aims the revolver at the camera. she then pulls the trigger and the revolver fires with a large muzzle flash.
|
the camera falls backwards tilting the camera upwards revealing the blue sky.
|
the woman walks into frame laughing while looking down on the camera."
https://litter.catbox.moe/7znvt3pb46nyp421.mp4 (EXTREMELY LOUD)
Another schizo node, what a shame.
>>
>>
File: Screenshot 2026-04-24 083932fwqf.png (313 KB)
313 KB PNG
>download interesting workflow
AUGH
>>
>oh civitai finally has seedance 2
>test it out with an image that looks stylized but with detailed shading
>filtered for detecting a "real person"
I can't believe how cucked it became. It's legitimately unusable unless you're doing literal cartoons.
>>
File: Flux2_Klein_9b_kv_00343_.png (1.1 MB)
1.1 MB PNG
Oh nice, that klein consistency lora does wonders.
>>
>>
>>
>>
>>
File: 1770426184434932.jpg (1.8 MB)
1.8 MB JPG
>>108674924
sankyu
>>
>>
>>108675463
see: >>108672647
they're shilling api because they know they can shove as many ads into the ui as they want because local has no other options.
>>
File: civitai.png (5.9 KB)
5.9 KB PNG
>>108675463
It can always be worse.
>>
>>
>>
>>
>>
>>
>>
File: HGoffrpakAAcT21.jpg (194.8 KB)
194.8 KB JPG
>>108673555
Sorry Comfy I will spoil your news
>HANASEE will release its proprietary image generation model, “HANASEE-image-1.0,” specialized for vertical-scrolling manga expression.
>This model is based on an open-source image generation model and has been further trained with supervision and collaboration from professional manga artists. Its strengths include consistent character representation, a manga panel–optimized art style with visual coherence, and composition tailored specifically for vertical-scrolling manga.
>>
>>
>>
>>
>>
>>
>>
>>108673331
Looks like anon cloudfag have never run comfyui...
Also return to your boring thread with pegi12+ gen
>>108653190
>>
>>
>>
>>
File: AniStudio-00792.png (1.5 MB)
1.5 MB PNG
Ani Bee delivered!
>>108675788
>>108675805
lol, no, that model is not mine. In the end that project did not go anywhere, I had fun but I really want to move onto the next project I have planned
>>108675821
man, I want to get this in my app soon. I want to be able to put workflows into masking, brush and selection tools to customize everything in your mobile
>>
>>
>>
>>
>>
>>
File: AniStudio-00793.png (1.1 MB)
1.1 MB PNG
>>108675942
yup. kek.
>>108675977
I should have gone back this week with my business partner, and I miss the Japan team, but I'm too busy rn to travel as much as I did over the past two years.
>>
File: 190716868068185.png (3.1 MB)
3.1 MB PNG
>>108674089
I gave up on sampling and just disable it now. It never seems to accurately represent what the model ends up generating in comfy.
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
File: file.png (10.1 KB)
10.1 KB PNG
i tried that anima -> zit workflow but im missing some stuff it seems but now im nervous
i don't want to spend 8 hours debugging my comfyui install by updating anything
the [object Object] one is the RTX UPSCALE LATENT node
what do
>>
>>
File: 1773077258853740.png (61.1 KB)
61.1 KB PNG
>>108676178
don't you have ComfyUi manager installed? that way you can use it to install the missing custom nodes
>>
>>
File: Screenshot 2026-04-24 071031.png (108.9 KB)
108.9 KB PNG
>>108676178
install https://github.com/Comfy-Org/Nvidia_RTX_Nodes_ComfyUI
the rtx upscaler isn't a latent space upscaler, so whatever you are looking at is probably a subgraph with a vae decoder and the rtx upscaler to keep the wf clean.
>>
File: 1751932688614321.png (38.2 KB)
38.2 KB PNG
>>108676223
oh yeah you're right, you don't have much choice but to update comfyui anon, or you simply don't use the node
>>
>>108676178
Run python -m pip install -U --no-build-isolation nvidia-vfx --index-url https://pypi.nvidia.com
inside your python_embeded folder.
>>
File: IMG_20260424_172626.jpg (630.7 KB)
630.7 KB JPG
Yoo whats the best photo to drawing model/lora nowadays? I was browsing and found this ai site ads. Tried it and it was actually not bad? Pic is the result.
>>
>>
>>
>>
File: ComfyUI_07120_.jpg (579.9 KB)
579.9 KB JPG
>>108674869
It's been a while since I collected the images so my memory might be off a little, I originally made a lora on pony with them way back. I split the panels with kumiko:
https://github.com/njean42/kumiko
Then went through them by hand and deleted any that got messed up or seemed like a bad image to train on (unusual content, bad framing, etc). I wanted the font so I kept in images with speech/text, trying to keep it around 50% text/no text. After that I upscaled all the remaining images with waifu2x. For this lora I regenerated all the captions with gemma 4 and took out the tags (mostly out of laziness).
>>
>>108675913
>>108675977
>>108675998
>>108676017
>>108676032
stop replying to yourself, you subhuman raped retard
>>
>>
File: 1771614326430397.png (186.1 KB)
186.1 KB PNG
>>108673555
>a few coming
hmm...
>>
>>
>>108676750
what could it even be? they usually roll out new api and local support unceremoniously, like a blog post saying they now support seedance 2.0, have they done countdowns in the past? i never pay attention to hype.
>>
>>
>>
>>
>>
>>
>>
>>
>>
File: 1vcZ99F3RsQ.jpg (101.3 KB)
101.3 KB JPG
>>108673555
This shit is gonna be ComfyClaw or similar bloatslop
>>
>>108677089
best how? they don't optimize anything in particular. the biggest versions of current qwen can certainly pad out stuff into a pretty decent story with a LOT of VRAM and compute, but in most models that doesn't necessarily result in better scoring for x thing
and for the most part throwing in random *booru tags or nouns or w/e also gives you additional random things if that's the goal
basically LLM prompt rewrite is dubious. maybe for future video models where there's a sequence of events to invent and you want something that might often happen
>>
<4 hours for christmas
https://comfy.org/countdown
https://www.youtube.com/watch?v=tapCjTA2E9Q
>>
File: 1747511340905070.png (118.2 KB)
118.2 KB PNG
>>108677389
and then we get something like
>Nodes 2.0 is finalized now no more beta. Say good bye to the old comfyui and welcome our new comfyUI.. UI
>>
>>
>>108677404
>>108677407
I hate Nodes 1.0. Nodes 2.0 have less visual distraction, they’re more muted, easier to see, and easier to organize, which I really like. Nodes 1.0 look like mini clowns.
>>
File: ComfyUI_22447.png (2.7 MB)
2.7 MB PNG
>>108673331
>>108674003
>still puts people out in the middle of a street
Even the big boy models haven't overcome that old SD1.5 quirk, huh?
>>108677404
Maybe he got the performance up to a massive 15fps when typing now. You never know!
>>
>>
>>
>>
>>
>>108677686
preview 3 only came out 2 weeks ago, but god it would be based if they did a countdown and a big red carpet for anima.
i think that anti-anima fellow would legit kill himself.
>>108677699
i could actually see something like this happening if they really wanted to lean into comfy cloud.
>>
>>
>>
>>
>>
>>
>>
File: 1768710931277346.jpg (2 MB)
2 MB JPG
>>108677805
because fp8 is not that good, go for Q8 instead
>>
File: 1761888123515562.jpg (470.2 KB)
470.2 KB JPG
>have to make an account to download klein
>>
>>
>>108673402
>>108676104
There is quite a bit of difference between the two (v-pred and epsilon)
But here is what I post when people ask for noob parameters:
python sdxl_train_network.py \
  --v_parameterization \
  --pretrained_model_name_or_path ~/models/NoobAI-XL-Vpred-v1.0.safetensors \
  --tokenizer_cache_dir ~/lora/tokenizercache/ \
  --train_data_dir ~/lora/images/ \
  --shuffle_caption --caption_separator , --caption_extension .txt --keep_tokens 1 \
  --resolution 1024 --cache_latents --cache_latents_to_disk \
  --enable_bucket --min_bucket_reso 256 --max_bucket_reso 2048 --bucket_reso_steps 64 \
  --dataset_repeats 8 \
  --output_dir ~/lora/output/ --save_precision fp16 \
  --train_batch_size 2 --max_token_length 225 --xformers --max_train_epochs 10 \
  --persistent_data_loader_workers --max_data_loader_n_workers 1 \
  --seed 44453 --gradient_checkpointing --mixed_precision bf16 \
  --logging_dir ~/lora/logs --log_with tensorboard \
  --zero_terminal_snr --loss_type l2 \
  --training_comment "Trigger word is blabla" \
  --save_model_as safetensors \
  --optimizer_type Prodigy --learning_rate 1.0 --max_grad_norm 1.0 \
  --optimizer_args weight_decay=0.01 decouple=True d_coef=1 use_bias_correction=True safeguard_warmup=True betas=0.9,0.999 \
  --lr_scheduler cosine --lr_warmup_steps 0 \
  --min_snr_gamma 5 --prior_loss_weight 1.0 \
  --network_dim 16 --network_alpha 1 --network_dropout 0.08 --network_module networks.lora \
  --save_every_n_epochs 1
>>
>>
>>
>>
>>108677940
That wasn't a style lora.
Repeat information is useless without knowing dataset size.
What are you even basing "Snr gamma 5 seems a bit low." on? It's the value suggested in its paper and still enough to fuck up some loras sometimes.
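For context on what that gamma does, here is a minimal sketch of Min-SNR loss weighting as it is commonly described: the per-timestep weight is min(SNR, gamma) / SNR for epsilon prediction and min(SNR, gamma) / (SNR + 1) for v-prediction. The function name is hypothetical, and this illustrates the formula only; it is not the code of any specific trainer.

```python
def min_snr_weight(snr, gamma=5.0, v_prediction=False):
    """Min-SNR-gamma loss weight for one timestep's signal-to-noise ratio."""
    clipped = min(snr, gamma)
    if v_prediction:
        # Stays finite at SNR = 0, which matters with zero terminal SNR.
        return clipped / (snr + 1.0)
    # Epsilon prediction: requires snr > 0.
    return clipped / snr


# Low-noise timesteps (high SNR) get down-weighted; noisy ones are untouched.
for snr in (0.5, 1.0, 5.0, 20.0, 100.0):
    print(snr, min_snr_weight(snr), min_snr_weight(snr, v_prediction=True))
```

A lower gamma clips more timesteps and pushes training harder toward the noisy end; a very high gamma approaches the unweighted loss.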
>>
>>
>>
>>
>>108677950
if you're playing about playtime here's here
https://huggingface.co/Playtime-AI
>>
>>
>>
>>
File: 1763758150426851.png (1.8 MB)
1.8 MB PNG
>>108678077
it depends...
if you're a pedophile, go with anima
if you want straight up porn, find a zit mix (be prepared for body horror)
if you want tasteful lewds, go with gpt image 2
>>
>>
>>
>>
>>
>>
>>
>>
File: playtime.png (309.2 KB)
309.2 KB PNG
>>108678318
>>108678326
Damn, I was just going to post about the civitai hateboner against this guy too, what did he do? civitai mods just got them banned from reddit, huggingface, I think its bghira to be honest
>>
>>
>>
>>
>>108678390
>I think its bghira to be honest
probably, he's even lurking here kek >>108678330
>>
>>108678390
you might find them on
https://civitaiarchive.com/
>>
>>
>>
>>108678390
>No idea who the fuck bghira is
>Google the word
>Furtroon pfp on hf
I would believe it. I don't need further evidence at all.
As for civit jannies, they are total faggots like all jannies everywhere and they want to divest from "high risk" shit like video loras because of deepfake legal risk.
They are too pussy to ban it officially, so they slowly boil the frog by sporadically nuking major creators one by one.
>>
>>
>>
>>
>>
>>108678312
Go back to your room retard
>>108653190
>>
File: the plan is in his back!.png (267.5 KB)
267.5 KB PNG
>>108678427
what plan? you just pressed a button this isn't prison break
>>
File: 1729680298114.jpg (45.9 KB)
45.9 KB JPG
>>108678407
https://www.reddit.com/r/unstable_diffusion/comments/1srqlkb/ltx23_tit ty_drop_lora_by_playtime_ai_link_in side/
Damn, and just when he published this lora
I need a tittydrop lora for wan.2.2 so bad, I just have an old 2.1 one that works so-so
I guess I'll train one on my own. Dude needs to go the telegram group route, its the only safehaven for realistic ai nsfw content, fuck these sites (reddit, civitai, huggingface)
>>
File: furfaggot troon suicide.png (26.3 KB)
26.3 KB PNG
>>108678427
>>108678330
Kill yourself you mentally ill furfaggot troon
>>
File: 1750981375863865.png (274.1 KB)
274.1 KB PNG
I want to sleep but I'm gonna miss that, you better deliver Comfy!
>>
>>
>>
>>108678094
>>108678459
https://gofile.io/d/xsGBHe
LTX-2.3 - Titty Drop.safetensors
9CC9B261405DEC6AF8ED76BAB198BB72
literally downloaded this and saw his account was banned when i refreshed the page, what a save
>>
File: 1772421599175215.gif (956.7 KB)
956.7 KB GIF
>>108678519
not all hearos wear capes anon
>>
>>
>>108678519
>https://gofile.io/d/xsGBHe
God bless you anon, gotta download fast before bghira reports it (he is lurking here)
>>
File: reported.png (8.8 KB)
8.8 KB PNG
>>108678519
>>108678553
too late I guess
>>
>>
>>
File: Downloading.png (4.9 KB)
4.9 KB PNG
>>108678562
kek, gofile was ip-range banning me, nothing that a good VPN cannot fix ;)
Thanks anon again
>>
>>
>>