Thread #108655751
I Love LDG Edition

Discussion and Development of Local Image and Video Models

Previous: >>108652848

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>mfw Resource news

04/21/2026

>MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping
https://jeoyal.github.io/MegaStyle

>UDM-GRPO: Stable and Efficient Group Relative Policy Optimization for Uniform Discrete Diffusion Models
https://github.com/Yovecent/UDM-GRPO

>Noise-Adaptive Diffusion Sampling for Inverse Problems Without Task-Specific Tuning
https://github.com/NA-HMC/NA-HMC

>Evolutionary Negative Module Pruning for Better LoRA Merging
https://github.com/CaoAnda/ENMP-LoRAMerging

>DuQuant++: Fine-grained Rotation Enhances Microscaling FP4 Quantization
https://github.com/Hsu1023/DuQuant

>Generalizable Face Forgery Detection via Separable Prompt Learning
https://github.com/OUC-YER/SePL-DeepfakeDetection

>Adaptive receptive field-based spatial-frequency feature reconstruction network for few-shot fine-grained image classification
https://github.com/ICL-SUST/ARF-SFR-Net.git

>ComfyUI-DiffAid-Patches: Inference-time Diff-Aid-inspired text-conditioning patches for ComfyUI
https://github.com/xmarre/ComfyUI-DiffAid-Patches

>modl: Train LoRAs and generate images on your own GPU. Web UI + CLI
https://github.com/modl-org/modl

>ComfyUI-KleinRefGrid: Turns reference images into reference_latents
https://github.com/xb1n0ry/ComfyUI-KleinRefGrid

>node-banana: Free and open node based generative workflows
https://github.com/shrimbly/node-banana

>Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation
https://github.com/AMAP-ML/EMF

04/20/2026

>Elucidating the SNR-t Bias of Diffusion Probabilistic Models
https://github.com/AMAP-ML/DCW

>(1D) Ordered Tokens Enable Efficient Test-Time Search
https://soto.epfl.ch

>Frequency-Aware Flow Matching for High-Quality Image Generation
https://github.com/OliverRensu/FreqFlow

>From Zero to Detail: A Progressive Spectral Decoupling Paradigm for UHD Image Restoration with New Benchmark
https://github.com/NJU-PCALab/ERR
>>
>mfw Research news

04/21/2026

>DreamShot: Personalized Storyboard Synthesis with Video Diffusion Prior
https://arxiv.org/abs/2604.17195

>Speculative Decoding for Autoregressive Video Generation
https://arxiv.org/abs/2604.17397

>LIVE: Leveraging Image Manipulation Priors for Instruction-based Video Editing
https://arxiv.org/abs/2604.17021

>AdaCluster: Adaptive Query-Key Clustering for Sparse Attention in Video Generation
https://arxiv.org/abs/2604.18348

>Coevolving Representations in Joint Image-Feature Diffusion
https://arxiv.org/abs/2604.17492

>Reward Score Matching: Unifying Reward-based Fine-tuning for Flow and Diffusion Models
https://arxiv.org/abs/2604.17415

>UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models
https://arxiv.org/abs/2604.17565

>FlowC2S: Flowing from Current to Succeeding Frames for Fast and Memory-Efficient Video Continuation
https://arxiv.org/abs/2604.17625

>ReCap: Lightweight Referential Grounding for Coherent Story Visualization
https://arxiv.org/abs/2604.18575

>UniCSG: Unified High-Fidelity Content-Constrained Style-Driven Generation via Staged Semantic and Frequency Disentanglement
https://arxiv.org/abs/2604.17850

>Towards Robust Text-to-Image Person Retrieval: Multi-View Reformulation for Semantic Compensation
https://arxiv.org/abs/2604.18376

>mEOL: Training-Free Instruction-Guided Multimodal Embedder for Vector Graphics and Image Retrieval
https://scene-the-ella.github.io/meol

>Depth Adaptive Efficient Visual Autoregressive Modeling
https://arxiv.org/abs/2604.17286

>When Text Hijacks Vision: Benchmarking and Mitigating Text Overlay-Induced Hallucination in Vision Language Models
https://arxiv.org/abs/2604.17375

>Cross-Modal Attention Analysis and Optimization in Vision-Language Models: A Study on Visual Reliability
https://arxiv.org/abs/2604.17217

>Spatiotemporal Sycophancy: Negation-Based Gaslighting in Video Large Language Models
https://arxiv.org/abs/2604.17873
>>
>>108655785
>>108655793
stop deblessing /ldg/ thread schizo
>>
mogged
>>
>>108655803
There's a picture of my penis in there. How the fuck?
>>
>>108655803
here's the context lol
>>108654985
>>108655069
>>
File: roastie.png (8.5 KB)
Why are people so naive or ignorant when it comes to NSFW AI generated content? I always get asked "what AI you use bro?" when I share something on reddit or X, they think I just write a prompt on a site and then AI does all the magic?
>>
>>
File: roastie2.png (7.5 KB)
>>108656011
For context, this OF whore wrote me a DM asking me how I generate my videos (I use wan, VACE, LTX, post processing, etc, etc), I tell her that I have a local setup and I use several workflows, that is not that simple but I'm happy to collab (for money ofc) and then she writes me that crap >>108656011
>>
>>108656011
>they think I just write a prompt on a site and then AI does all the magic?
yes? the future is AI models using tools to correct themselves and build all pieces together (like here, GPT Image-2 makes an image -> looks at it -> notices the issues -> fixes those issue with an image edit process) >>108655670
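The generate -> inspect -> fix loop described above can be sketched roughly like this. It is only a minimal sketch: `refine_image` and the three `toy_*` callables are hypothetical stand-ins made up for illustration, not any vendor's actual pipeline or API. A real setup would plug an image generator, a vision model, and an edit model into the same three slots.

```python
def refine_image(prompt, generate, critique, edit, max_rounds=3):
    """Generate once, then alternate critique and edit until no issues remain."""
    image = generate(prompt)
    history = []
    for _ in range(max_rounds):
        issues = critique(image, prompt)
        if not issues:
            break  # the critic found nothing left to fix
        history.append(issues)
        image = edit(image, issues)
    return image, history


# Hypothetical toy stand-ins: an "image" is a dict carrying a list of defects.
def toy_generate(prompt):
    return {"prompt": prompt, "defects": ["bad hands", "garbled text"]}


def toy_critique(image, prompt):
    # Report at most one defect per round, like a reviewer fixing one thing at a time.
    return image["defects"][:1]


def toy_edit(image, issues):
    # Return a copy of the image with the reported defects removed.
    return {**image, "defects": [d for d in image["defects"] if d not in issues]}


final, history = refine_image("a cat", toy_generate, toy_critique, toy_edit)
```

The `max_rounds` cap is there because a critic that never reports "no issues" would otherwise loop forever.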
>>
>>108656084
generally when dealing with retards you charge a retard tax, I always give a stupid big quote to retards and sometimes they take it and it's worth the headache
>>
>>108656085
Have you not learned anything from the past few years? That's not how it works, especially with SaaS (tools get nerfed, rugpulled by the big corpos) and even more with NSFW content and we're talking about today, not the future you idiot
>>
>>108656105
>not the future
definitely the future, the /lmg/ fags are incorporating tools on gemma 4, as usual /ldg/ is completely clueless about the news and how to move forward, it's filled with retards like you
>>
Why do anons lurk and post here if they think every other anon here is a retard? Surely they'd find some other place to post...
>>
>>108656122
AI doesn't think so any workflow based on AI "thinking" to itself will just end in piles of aesthetic trash. Even the best models still can't handle 2000 lines of code without going schizo on the task and that's code that is relatively simple, recursion destroys most models same with complex A() -> F() -> C() -> E() -> D() relationships
>>
>>108656122
gemma4 is local thoever? anons here use it to caption and write prompts all the time desu desu.
>>
>>108656147
>thoever
saar?
>>
>>108656122
>/lmg/ fags are incorporating tools on gemma 4
no?
>>
>>108656139
localchads live rent free in their minds.
>>
I saw GPT-Image 2 released. Can anon share some gens?
>>
It's over for local
>>
>>108656122
I can tell you that any serious genner/trainer (myself included) is using gemma4, you're just ignoring my original post that was that people overlook the process of generating NSFW AI content, especially images and videos, LLM and text/code based crap is easy as shit thats why /lmg/ threads are filled with happy people and /ldg/ is filled with frustrated anons that can't generate anything that other few anons can
>>
>>108656165
yes, cute cat
prompt:
>generate cat image
>>
>>108656174
How many rugpulls until you learn that they will take away the good model after the good press dies down? We've done this like 6 times now.
>>
>>108656011
>>108656084
Total OF whore death
>>
>>108656165
>Can anon share some gens?
no, there's a thread for that and the images are already shared here >>108653190
>>
>>108656189
its always the same cycle of these "groundbreaking" models, they get hyped, users start generating viral stuff, copyright holders get mad, tools get nerfed, userbase gets mad and tools dropped
>>
>>108656227
>its always the same cycle
there's even a name for that
https://en.wikipedia.org/wiki/
>>
>>108656227
No it's the costs, they are expensive to run so the first thing they do is cut compute, that's outside of the safety. Nano Banana is so bad because they do bullshit like switch you to fast mode even though it's completely shit. Also the core model just got worse slowly but surely.
>>
>>108656174
They killed Sora for this.
>>
Tested GPT-Image in the API.
Can do celebs by attaching picture (edit mode).
No NSFW as expected though, maybe allows limited artistic stuff but didn't bother pushing too much.
Was able to give a woman cleavage by cloth swapping, but that's the extent of what I got.
Dogshit for style transfers, hallucinates what's style, what's content, omits or makes up details. 4 years into this and still not a single good model on this front.
Can make detailed infographics with text without slopping the text, for whatever that's worth.
>>
>>
>>108656284
>>108656199
>>
>>108656243
nice wiki page.
>>
>>108656320
lmao
https://en.wikipedia.org/wiki/Enshittification
>>
I want to make an anime style LoRA. I checked the official repository, but it seems like the program is only compatible with Linux. Is there another workaround?
>>108656284
How about copyrighted anime characters?
>>
>>108656153
>no?
yes >>108656365
>>
>>108656376
*Anima
I want to make an Anima LoRA but it seems it is only Linux compatible
>>
>>108656227
nope, its different now
google finally has a worthy image gen competitor, so they wont be able to fuck around with their users anymore.
here's whats going to happen. very soon, google will release nbp 2, which will be better than gpt image 2
it will look similar to the vibecoding war between claude and codex. if either of them starts to enshittify their model, then users will just jump ship.
apichads won, and as a consequence, localcucks will benefit since the chinese models will train off the api outputs
you're welcome
>>
>>108656383
https://github.com/gazingstars123/Anima-Standalone-Trainer
>>
>>108656383
https://github.com/gazingstars123/Anima-Standalone-Trainer

I'm using this on Linux but it shows Binbows support. No idea if it's any good on Winblows but try it I guess.
>>
>>108656389
Indians should be banned from the internet
>>
>>108656389
>localcuks benefit since chinese models will train off the api outputs
fuck that shit dude, China must stop eating the shit of API western models and do the Z-image turbo way (the kino way)
>>
>>108656389
NBP is still better than 2 at some things. But it's good that OpenAI has something that isn't complete dogshit.
>>
>>108656389
>words words words
APIcucks cant meme
>>
>>108656398
>>108656391
Thanks, and can you replicate the same settings tdrussell shows in his official Anima LoRA?
>>
>>108656084
think she'd let a humble localchad suck on her toes or something? how big are her tits
>>
>>108656423
stupid sexy gooks
>>
>>108656389
>so they wont be able to fuck around with their users anymore.
LOL
O
L
>>
>>108656376
>I want to make an anime style LoRA. I checked the official repository, but it seems like the program is only compatible with Linux. Is there another workaround?
sd-scripts supports it. I think it works on Windows, not sure.
>How about copyrighted anime characters?
I am done testing for today but I wouldn't expect it to be too pissy about it. If anything you are more likely to run into issues with Disney, Sony, Nintendo etc. characters.
>>
>>108656426
I don't know who that is or what you're talking about. Sorry.
>>
What is the status of copyrighted anime characters with GPT Image 2? We won?
>>
>>108656165
A plain prompt "Warhammer 40k crossover with one piece"
>>
>>108656426
Well I can't replicate the exact lora as he (understandably) does not provide the images used but the settings he provides seem like sane defaults. No "catastrophic" forgetting or anything I've heard claimed about training loras on it.
>>
>>108656466
What I mean is that tdrussell shared some LoRA training settings to use in his Linux-only workflow. My question is whether I can select the same settings here >>108656391
>>
>>108656466
He shared the training dataset for his rutkowski lora.
>>
>>108656458
why does it look like someone injected extra noise into the last steps of the diffusion process
>>
>>108656466
Sorry, I understood, thanks
>>
>>108656486
There's nothing wildly out of the scope of the average trainer that I can tell. So you should be able to use the same settings just fine.
>>
>>108656486
Training settings do not depend on OS so the answer is yes if that tool has implemented every relevant feature.
>>
>>108656165
>>
>>108656513
that's crazy good. too bad no porn so it's worthless.
>>
>>108656458
howd they manage to keep the ugly sepia poison
>>
>>108656513
wtf this is next level, holy fuck...
>>
reminder that if you want to talk about that model you have to go here, this is a fucking local thread in case you forgot
>>108653190
>>108653190
>>108653190
>>
>>108656513
the hands are still fucked though
>>
>>108656513
hands are on another level
>>
>>108656389
Midjourney still mogs both NB2 and GPT Image-2 in terms of aesthetics.
>>
>>108656552
emoboy4ever had an accident. leave his hands out of this
>>
>>108656513
No fucking way...
>>
>>108656513
I didn't realize how much better an AI image gets when the text is correct, this makes the difference
>>
nice samefagging desu
>>
Cloudcucks, I dedicate this one to you (generated by yours truly, ACEStep XL 0.7 Merge)

https://vocaroo.com/15lInkgzMLR4

What good is a censored and gated model? VISA/Mastercard ain't changing any time soon, cloud is forever cucked and just a toy that waits 2-3 years for local to catch up.
>>
>>108656513
>solved text
>hands still fucked
:(
>>
>>108656513
Ok the level of detail mogs local to oblivion, sure.
But there are lots of errors too, such as everyone but the Drama Queen shirt girl having fucked hands.
Honestly they seem to have overtuned the detail during the iterative generation process.
Anyway this still proves that the local needs:
1) Autoregressive generation that iterates on the prompt
2) A smart VLM that inspects and guides the generation throughout it
3) Non-slop training dataset
If it wants to compete with API.
>>
after years of trying stuff out and practice
i'm a master of baking goon loras
and you know what
i will not share even a single one with you plebs kek
>>
>108656603
Didn't ask but cool story bro.
>>
>>108656591
>VISA/Mastercard ain't changing any time soon
do you understand that civitai got more and more cucked over time it's because of VISA/Mastercard? it's a poison for both API and localfags
>>
ive stopped genning as much. i guess i am depressed
>>
>>108656598
>Ok the level of detail mogs local to oblivion, sure.
it mogs everything, the lmarena ranking is absurd, never seen such a gap before >>108656174
>>
>>108656603
training a lora is easy bro, get over it
>>
>>108656603
>im not sharing my precious loras
>5 billion loras on civitai alone
wow bro, how will I ever cope?
>>
>>108656591
>ACEStep XL just needs a merge to match the best cloud has to offer in terms of overall audio quality, not even a LoRA

Local audio is saved. Also, I've tested a few LoRAs with it, and even though they're all underbaked, it's absolutely insane at replicating style/voice, far trumps the earlier version.
>>
>>108656591
can you make latina dance music? or quirky goblin music?
>>
>>108656652
90% of civitai loras are crap tho
>>
>>108656676
good thing I make my own because it's literally EZPZ. you should consider joining MAID.
inb4 more cope inb4 yeah but YOUR LORA IS BAD AND STINKY

get real; grow up.
>>
>>108656513
>so many bad hands
chroma bros, api has same limit kek
>>
This is embarrassing to read even by non-existent Julien standards.
He sounds completely mentally buckbroken at this point.
>>
>>108656719
>talking about e-celeb lolcows
B O R I N G
>>
>>108656591
>ACEStep XL 0.7 Merge
where do i get it?
>>
>>108656591
it sounds so bad lool
>>
>>
>>108656652
i don't think even 2% of them come close to my datasets and finetuning but enjoy your low tier gooning slopper lol
>>
>>108656851
cool story, bro
>>
>>108656851
fuck you
>>
I want to play with anima but I don't want to play with comfy.
What are my options for frontends?
>>
>>108656671
>can you make latina dance music

Holy shit kek, prompt/lyrics included moans and it just killed it
https://vocaroo.com/1iwOck91jlFh
>>
>>108656591
kek based
>>
>>108656749
https://huggingface.co/scragnog/ace-step-1.5-gguf-merge-models/tree/main

I'm using ACEStep cpp.
>>
File: fuck off.png (320.5 KB)
>>108656591
>Cloudcucks, I dedicate this one to you
I won't be listening to this shit the quality is awful
>>
>Wan 2.2
>LTX-2
>Chroma
>Z
>Flux Klein

Did we plateau?
>>
>>108656591
>ACEStep XL 0.7
hey is there now comfyui support?
>>
>>108656911
Oh I'm sorry you thought local was about GOOD?

use case for GOOD?

Personally, I use it to make songs that praise the mustache man.
>>
>>108656885
damn nice
>>
>>108656911
Show us your definition of "quality"
Oh that's right, you can't even properly meme with audio
>>
>>
>>108657019
how does that even make sense when 80% of contemporary music is about fucking and sucking?
>>
AIs still think chubby = morbidly obese
>>
>>108657056
>he used an API model to make an image saying that API is bad
the irony is on point
>>
>>108657062
can't even put racial slurs on udio/suno as well, that means that all those poor black rappers can't use it properly, this is racism!
>>
>>108657056
why dont you gen the same image on local?
oh wait, local cant do text
LOLOLOLOLOLOLOL
>>
>>108657100
Ikr kek
>>
35 melties
>>
>>108657082
>>108657100
got 'em bubblin' kek
>>
suno has anti- ace step shills. It's wild.
>>
>>108657110
if you love local so much, then why is it an API image?
>>
>>108656882
Vibecode your own front end that uses comfy as a back end. Unironically. I haven't touched ComfyUI web interface in months. It's bliss.
>>
>>108657125
Because its a meme, the only good thing cloudkeks are good for
>>
>CLOUD API SUCKS
>.... well except for this one case in which it's really good
>and this other case but that's it i swear!!
>okay fine cloud is way better than local but ummm its not free and im poor!!!
>>
>still seething
>>
>>108657202
Can you show another example for good use of cloud generation that is not a meme or an infographic?
>>
>>108657202
>im poor!!!
Cloudcucks say you need a million dollar GPU to gen though???
>>
It's actually crazy how far ahead of local API is. Local isn't even at the level of last year's ghibli GPT. It's no surprise most of us switched to API Nodes.
>>
so anyway
>>
>>108656513
Wow, it's just like hanging out with my emo friends in 2006. Thanks anon. Local could never compete.
>>
>>108657237
ghibli is a kino style for sure, i almost melted my old gpu back in 2023 cranking out gens with the sd1.5 ghibli style lora.
>>
GUESS WHAT IM GENNING GENTS
>>
>>108656539
it's there for recognition
normies don't bother to color correct their images
therefore 99% of these images are immediately recognizable
>>
chatgpt, generate me a good local video model
>>
the jeet too poor to afford a GPU seethes as he tries to pretend us local chads can't also use any actually useful cloud software
>>
>>108656492
does GPT even still use diffusion? I thought they used an LLM style image builder for SAAS stuff
>>
>>108657335
how does that work?
>>
>>108657342
Autoregressive image models
These generate images piece by piece, like language models generate text token by token.

Instead of predicting the whole image at once, they predict image tokens sequentially.

Strengths:

clean probabilistic setup
often good prompt alignment

Weaknesses:

can be slower at high resolution
image tokenization quality matters a lot

Many newer multimodal systems use this family, especially with tokenized images.
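
A toy sketch of the "token by token" part, assuming a raster-order grid of discrete image tokens. `toy_next_token_probs` is a hypothetical stand-in for a trained model (it just favors repeating the previous token). Real systems use a learned tokenizer and a transformer, so this shows only the shape of the sampling loop.

```python
import random


def sample_image_tokens(height, width, vocab_size, next_token_probs, seed=0):
    """Sample a height x width grid of token ids one position at a time, in raster order."""
    rng = random.Random(seed)
    tokens = []
    for _ in range(height * width):
        # Each new token is conditioned on everything generated so far.
        probs = next_token_probs(tokens, vocab_size)
        tokens.append(rng.choices(range(vocab_size), weights=probs, k=1)[0])
    # Reshape the flat sequence back into rows.
    return [tokens[r * width:(r + 1) * width] for r in range(height)]


def toy_next_token_probs(prefix, vocab_size):
    """Hypothetical model: uniform for the first token, then biased toward the last one."""
    if not prefix:
        return [1.0 / vocab_size] * vocab_size
    probs = [0.5 / (vocab_size - 1)] * vocab_size
    probs[prefix[-1]] = 0.5
    return probs


grid = sample_image_tokens(4, 4, vocab_size=8, next_token_probs=toy_next_token_probs)
```

A decoder would then turn the token grid back into pixels, which is why tokenization quality matters so much.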
>>
>>108656178
>LLM and text/code based crap is easy as shit

Nta but I've never seen anyone posting their text output in /lmg/ threads. If it's so easy, then why are there no gens?
>>
>>108657351
doesn't sound right since there's a lot more pixels than the context window can fit
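For what it's worth, the usual answer is that the model predicts tokens from an image tokenizer rather than raw pixels. Rough arithmetic, assuming a hypothetical tokenizer where one token covers a 16x16 pixel patch (the factor is an assumption for illustration, not a known detail of any particular model):

```python
def image_token_count(width_px, height_px, downsample):
    """Number of tokens if each token covers a downsample x downsample pixel patch."""
    if width_px % downsample or height_px % downsample:
        raise ValueError("image size must be divisible by the downsample factor")
    return (width_px // downsample) * (height_px // downsample)


pixels = 1024 * 1024
tokens = image_token_count(1024, 1024, downsample=16)
print(pixels, tokens)  # 1048576 4096
```

Under that assumption a 1024x1024 image is about a million pixels but only 4096 tokens.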
>>
>>108656178
>jumping every time one AI company gets 0.1% ahead
yeah even for my saas stuff I am not moving from my openai api setups for gemma yet
>>
>>108657357
it's what GPT uses but it takes like 100gb vram not 4gb vram for the same image
>>
>>108657363
>it's what GPT uses
how do you know?
>>
>>108657351
considering it's openai they are probably just running a basic bitch model for the image and then an inpainting upscaler for all the text.
>>
Was ldg linked somewhere? What's with all these newfrens
>>
>>108657376
wait, openai image gen 2.0 was released today?
https://www.youtube.com/watch?v=-7JSa_luc6k
>>
>>108657371
they tell us...
>>
>>108657388
where? i thought they stopped publishing information after gpt3
>>
>>108657387
howling.

>the most accurate information
>>
>>108657400
Compare Encyclopedia Britannica. NOT AI.
>>
>>108656513
Local can do way better than that with inpainting, controlnet and Photoshop though. I mean, sure the initial image is impressive but if you think one can't locally get good text on a wall don't kid yourself.
>>
>>108657553
>Local can do way better than that with inpainting, controlnet and Photoshop though.
okay, lets see then
>>
>>108657400
the brown sepia is so good
>>
>>108657553
>Local can do way better than tha
can't wait to see that image anon!
>>
>>108657553
>just grab a pencil
>>
cloudcucks please you are embarrassing yourselves
>>
>>108657567
Not much to see, it just requires basic Photoshop skill and editing a gen. Initial image -> Z-Image, then go from there. It would take several hours, so of course I'm not just shitposting it, but it's possible.
>>
localkeks absolutely crying as another year goes by without any local developments
>>
Yet you still lurk and post here... curious
>>
holy shit... and local models don't even know a single one of these characters without loracoping. plus it gets the logos right too. saas has advanced so much that their models actually think and review the outputs
>>
>>108657583
>It would take several hours
no rush, we can wait
>>
Why aren't the SaaS gens getting posted in the SaaS thread? >>108653190
>>
>>108657583
>Not much to see
He did the meme!
>>
>>108657601
>A
>>
>>108657611
why are realistic gens posted in anime thread (here)?
>>
>>108657614
>>108657400
I'm not a biology person, but just based on idk "feel" I spotted that in their rapid fire slideshow. It's one of their best...
>>
>>108657583
>It would take several hours
vs less than a minute with that API model, the absolute state of localkeks
>>
>>108657601
>>108657617
lmao
>>
>>108657611
>last post 1 hour ago
>complaining about censorship
KEK
>>
even with a shiny new model cloudkeks cant stop thinking about local chad cock
what a shame
>>
>>108657601
but we have the nsfw fun kek
>>
>>108656513
It knows lots of products, but fails with lighting and anatomy details. Meh/10
>>
>>108657601
Ever heard of Ernie? Anyways local can only get better as a result of this. Now BFL, the Chinese and anyone else will be forced to train on stuff they forbid if they want to compete. And it's not a matter of if they will censor this, it's a matter of when.
>>
>>108657601
Ichi the killer doesn't have Ichi the killer, sad!
>>
>>108656995
There appears to be support for the model itself, dunno about gguf (though ACEStep 1.5 was considerably faster on cpp version, I might check Comfy again to see if that's changed)
https://blog.comfy.org/p/ace-step-15-xl-commercial-grade-music
>>
>>108657621
Where is it written that this thread is only for a single style?
>>
>>108657819
Yeah, I figured it out. You just need nightly.
>>
>>
>>108657601
>local models don't even know a single one of these characters without loracoping
it actually searches the internet for reference images. grok does this so openai copied them
>>
>>108657925
>it actually searches the internet for reference images. grok does this so openai copied them
yep, it's just some tool calling to gather the image + asking the model to merge everything via editing
>>
>>108657932
what tool did it call for this?
>>108657400
>>
>>108657938
it only calls the tool when it wants to
>>
File: 15686427.png (31.2 KB)
trololololol
>>
>>108657925
>>108657932
Seems like it's grabbing every image from some kind of promotional poster or something.
>>
>>
>>108657601
What is the purpose of Anima if I can use Chenkin Noob to GPT Image 2?
>>
>>108658027
you seem mentally challenged
>>
>>108657938
websearch
>>
>>108658009
kek what style?
>>
>>108658027
NAIv5 is coming too. It will be an edit model, so there will be no point in using local for anime.
>>
>>108657938
Very impressive. How did it know not to include the necessary features of real plant cells since nobody will check?
>>
>>108658027
anima will be a functional model for years to come. chatgpt is a fotm meme model that will fall apart in a few weeks.
>>
>>108658009
>MsPaint illustration of a morbid man, employed as reddit moderator, he is dressed as Hatsune Miku and wearing a wig with blue twintails. He has unfortunate round and flat face with patches of beard. He is wearing arm patch with a white face Reddit Logo. The color palette is lively and naturalistic.
No lora needed
>>
File: 1_00030_.jpg (3.4 MB)
>>108657601
but can it do goon stuff?
>>
Holy smokes, SaaS is leapfrogging local 3 times over. Local models can't think.
>>
>>
>>108657932
>yep, it's just some tool calling
this concept is wildly outside the imagination of the average poster, see: >>108658104. they have no idea how to hook up an image generator to a language model, let alone get it to pull from the web
>>
>>108658130
>>108657400
>>
>>108658093
kino
>>
>i love wasting time connecting noooodes to do tasks that saas can do in seconds
localkeks have adopted the same role as the luddite inkcels they laughed at. API models have intrinsic intelligence and don't need controlnet copium or custom nodes to understand what Jojo's Bizarre Adventure looks like.
>>
>>108658093
ai is so bad at bad art. The proportions are exact, it's like an adult swim good art bad art but it's just good art art.

It should look more like Gooby.
>>
This would be more fun if you weren't so ignorant
>>
use case for fun
>>
>>108657601
>L
>Death Note
funny
>>
How many decades are we behind GPT image 2 and NBP?
>>
>>108658154
>doing something that requires more technical knowledge makes you a luddite
>>
>>108658154
>intrinsic intelligence
my sweet summer child
>>
>>108658111
>>
>>108658159
Have you not seen much AI art?
>>
"Local, Saas, Cucks"

https://vocaroo.com/1aPv3fibpWC3
>>
Blessed thread of frenship
>>
>>108658233
seen? I'm the model behind the huge penis lora.
>>
>>108658244
do you have ears?
>>
>>108658280
get the fuck out of my thread
>>
>>108658284
>you have to accept my complete audio garbage because it was generated locally
this is like retard republicans having to pretend kid rock's music is good because he deep throats trump
>>
>
>>
were we not supposed to get a z-edit model by now?
>>
>>108658288
kid rock is good, though
https://youtu.be/rKFx0MMqb48
>>
>>108658295
I'm still waiting for Pixart Gamma
>>
>>108658288
Leave the kids alone.
>>
>>108657601
>one week later
>all removed for copyright reasons
not your workflow not your waifu
>>
>>
anyway, the point of ACE STEP is it has mad good adherence. I guess suno is good now, but when I say "cabbage fat yo yo eat that" I don't want approximately that.

why?

For memorization. It's incredible for this, add to that idk, I use veo, to make a music video. You just stretch a clip to fit the portion, that makes it slow down. It's really easy.
>>
>>108658303
I saw an irl pretty lady at a shop today. But, she turned her face and I saw she had coverup over a birthmark. local would never do this to me.
>>
>>
i'm RAPING
>>
>>
>>108658328
SHOW FEETS
>>
>>
is there such thing as audio editing models? i want to remove wind while keeping other noises
>>
>>
>>108658374
What you want doesn't really exist at all in the world.

audio is harder than images.
>>
>>108658374
yeah, ultimatevocalremover and replayai. I mostly use them to remove vocals but it might pick up noise. there's no prompt though so it might not work at all it depends on the noise. it splits files into drums, vocals, synths, and other. the noise might get captured in other and you could fix this but like I said this is just a wild hair in my butt guess
>>
Thread CURSED
>>
>>108658408
you deserve rape, benchod
>>
>>108658288
kek
>>
>>108656286
Nice one Anon.
>>
>>
since the api diffusion thread can't give me an answer, what's the current meta for an unfiltered/uncensored model that's at least better than any current finetuned/lora sdxl? don't need any fancy anatomically correct sex positions, just want a nude portrait with proper clothing context and composition. it's mostly for my dynamic game characters. anime/photoreal is fine.
>>
>>108658437
anima. no realism yet, though
>>
>>108658433
spines dont work like that
>>
>>108658408
https://files.catbox.moe/sz4hcz.flac
>>
>>108658433
Nothing in this image makes sense, nonsensical architecture, chair, pose, even the trees.
>>
>>108658468
okay but butt
>>
>>108658451
of course you think that, faggot
>>
>>108658468
>>
>>108658444
thanks, anon. i'll try anima. doesn't matter if it's anime or realistic Tbh. i'm just kinda lazy to make game assets just for a porn game. if i wanted something artistically good, i'd just pick any high-rated game from dlsite, or lurk f95 with non-aicg tags
>>
>>
>>108658478
do hassidic and torah scrolls and she has a yahweh tattoo
>>
>>108658468
>Nothing in this image makes sense, nonsensical architecture, chair, pose, even the trees.
>>
>>
>>108658474
if you can't run anima your only bet is sd1.5 and that's kind of bad.
>>
>>
>updooted comfy
>less crashy
huh
>>
>>108658518
Kate Upton
>>
>>
>>
>>
>>108658468
it's local
>>
>>
>>108658601
was that guy in a movie or something?
>>
>>
>>108658468
poor loser, I bet you count the fingers on every photo you see online
>>
>>108658433
what model is this
>>
>>108658623
I bet those guys were in a movie or something. They have bandanas on their eyes, and they are wearing boxing gloves like a professional actor would when practicing acting.
>>
>>
>>108658639
anima + zit
>>
>>108658677
nice, is there a way for zit to not butcher genitals?
>>
>all those replies
lmao @ the seethe when you point and laugh at their shitty slop

>b-but it'll be good in a decade!!
>>
>>108657148
kino
>>
Can I generate anything with my lmao 16 gigs of VRAM, 9070xt in a somewhat reasonable time?
I think I tried to mess around some time ago but I really couldn't get anything special out and I gave up pretty quickly. Someone plz spoon feed me a bit.
>>
>>
>>108658709
>amd
you can but im sorry for your choice :(
>>
File: 87654.png (158.2 KB)
>>108658709
yes you can use ltx2.3 to generate 1080p kinos which are 15 seconds long within 10 minutes
>>
isn't it kind of dumb that weight emphasis doesn't work with llm text encoders?
>>
>>108658709
You'll have to research the absolute state of rocm first since I have no clue. Idk if imagegen even has any vulkan backends..?
>>
>>108658721
Use a larger number like 2.0 instead of 1.2
At least for Anima that's what you have to do
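For anyone who wants to bump weights in bulk rather than by hand, here is a minimal sketch. It assumes the common `(text:weight)` emphasis syntax; the thread doesn't say which syntax or parser Anima workflows actually use, and `parse_emphasis` / `rescale` are hypothetical helper names, not part of any UI.

```python
import re

# Assumed syntax: "(some text:1.2)". This is an illustration, not a real UI parser.
_EMPHASIS = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")

def parse_emphasis(prompt, default=1.0):
    """Split a prompt into (text, weight) chunks; unweighted text gets `default`."""
    chunks = []
    pos = 0
    for m in _EMPHASIS.finditer(prompt):
        before = prompt[pos:m.start()].strip(" ,")
        if before:
            chunks.append((before, default))
        chunks.append((m.group(1).strip(), float(m.group(2))))
        pos = m.end()
    tail = prompt[pos:].strip(" ,")
    if tail:
        chunks.append((tail, default))
    return chunks

def rescale(chunks, factor):
    """Push every weight further away from 1.0 by `factor` (1.2 -> 2.0 when factor=5)."""
    return [(text, round(1.0 + (w - 1.0) * factor, 3)) for text, w in chunks]

def to_prompt(chunks):
    """Rebuild a prompt string, only wrapping chunks whose weight differs from 1.0."""
    return ", ".join(t if w == 1.0 else f"({t}:{w})" for t, w in chunks)

if __name__ == "__main__":
    chunks = parse_emphasis("a cat, (red hat:1.2), park")
    print(to_prompt(rescale(chunks, 5)))
```

The factor of 5 is only there to turn the 1.2 from the post above into 2.0; what value actually works is something to test per model.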
>>
>everyone is laughing at me because how retarded I am

>t-they must be seething
>>
anima is so cool
>>
>>108658722
idk how that isn't AMD's top fucking priority, even some cucked cuda emulation that ran at 70% speed would instantly sell out every amd card overnight.
>>
>>
>>108658744
They tried with ZLUDA but nvidia acted rapidly and killed it
>>
>>108658733
vintage
>>
>>108658767
>>
>>
>>
now that the dust has settled, is it better to buy a bigger GPU or is it better to max your CPU memory out?
>>
>>108658677
All this Anima->ZIT stuff really shows that Anima desperately needs a proper realism finetune. It's doing like 95% of the work here, ZIT just fixes up the textures and details. Anima could easily be made to do it all on its own. You would have the model that Chroma should have been.
>>
>>108655751
>do this shit for fun
>find out some make patreons for this
>"lmao why? you can prompt for free"
>one has 4000 paypigs and making like 32k a month

Why? the quality is ass he doesn't even check the number of fingers on the stuff, the retards even praise his "art style" which is generic anime girl covered in oil, he's not doing any weird fetish shit either. I'm not even mad at the guy I just don't get who is paying for this when you can make your own and there's already tons of AI goon material out there for free, is it just some scam?
>>
>>108658856
same reason why people pay e-thots for bikini pictures when there is a lifetime of free porn on the internet
>>
Holy fuck open source video and image is so done. Where the hell are the new image and video models? Between seedance 2.0 and gpt image 2 the gap is just way too far now.
>>
File: crap.jpg (491 KB)
>>108658862
See that's retarded too but the e-thot is real and afaik it's all about the parasocial relationship crap and gooners' mental gymnastics thinking that's like having a gf, but in this case simps are paying for stuff they could make themselves, for free even.

That guy's stuff is low quality too, he's not even making stuff like this >>108658677 look how bad the eyes are, the lack of detail, the blurry print on her shirt, why even pay for this?
>>
>>108658887
>>
>>108658856
it's because he appreciates that there are people who seek knowledge. maybe his knowledge is flawed. maybe your knowledge eclipses his. but the difference between him and you is he said "I want to give people knowledge" and put effort towards that goal. you didn't. and you won't. so why are you mad? you could choose to do what he's doing but you're too lazy for all that. but maybe that's your real gripe. you're smart but lazy, and he's dumb but motivated. in your mind, you think you should win by default because you have a gift he doesn't. the truth is that the people who win are the people who try

so will this motivate anon to try? or will anon just post about how unfair it all is and go back to gooning in his basement?
>>
>>108658856
>when you can make your own
You vastly overestimate the abilities of the average internet user.
>>108658906
Leave, fag.
>>
>>108658887
are you a bot? i see you posting this kind of thing literally all hours of the day every day
>>
>>108658856
>>108658897
anon, you overestimate what people can do, some people have real jobs, life, w/e, they don't have the time and resources to learn all this AI crap, coomers with money are the best really..
I'm looking at this new DA profile that's one month old and it's selling the most low-effort slop ai videos ever, he spammed over 700 videos and he's already selling crap, keep in mind that DA is already doing something about the AI spammers tho..
>>
>>108658910
>You vastly overestimate the abilities of the average internet user.
Point taken

>>108658906
Found the retard that's paying instead of prompting.
>>
>>108658855
leak the realism lora, russel
>>
>>108658932
>some people have real jobs, life, w/e
>coomers
I do get the point about most people being dumb and/or lazy tho, especially brainrotten gooners. Still for every AI goon patreon I see making big bucks there's a ton making nothing, especially furry ones for some reason.

>DA is already doing something about the AI spammers tho.
Doing what exactly? I heard they've been banning anyone posting anything beyond softcore for a while now.
>>
>>108658903
>>
my slop is too highbrow for the everyman. i would get nowhere churning out kino for coin.
>>
>>108658953
COME ON, BIG RUSS!
>>
File: DA2.png (42.5 KB)
>>108658954
yeah hardcore realistic is a big no-no from what I know, those accounts don't last too much, what I mean is they are doing something about those accounts that upload 1000s of videos/images per day, even tho I think DA doesn't care tbqh since they are the most prolific they have ever been thanks to AI slop, but eventually they will do something since Idk how their servers can host so much crap


https://www.tubefilter.com/2026/04/17/deviantart-paying-artists-ai-controversy-user-increase/
>>
>>108658922
Nope it's just genuinely a concern. The Chinese are open weighting/sourcing text models that are pretty much on the heels of frontier but image and video have stagnated hard. The character sheets and comic making ability of gpt image 2 are crazy in comparison to what anima puts out.
>>
>>108658953
just train one with whatever 3dpd you like yourself?
>>
File: ha96.jpg (152.1 KB)
>finish making my prompt
>crank the resolution settings all the way up
>wait 30 minutes for the generation
>it didn't follow the prompt
>>
>>108658856
>32k a month
Totally jealous here... how many make any decent money though? I have to assume it's the OF situation everywhere else where a few make millions per month and the rest fight over the scraps.
>>
>>108659035
resolutioncels crack me up, you don't need anything larger than 1MP
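If you want to stay around 1MP at a given aspect ratio, the arithmetic is quick. A rough sketch, assuming 1MP means 1,000,000 pixels and that sides should snap to a multiple of 64 (both are assumptions, check what your model actually wants); `dims_for_megapixels` is a made-up helper name.

```python
import math

def dims_for_megapixels(aspect_w, aspect_h, megapixels=1.0, multiple=64):
    """Return (width, height) near the pixel budget, snapped to `multiple`."""
    target = megapixels * 1_000_000           # assumed definition of a megapixel
    ratio = aspect_w / aspect_h
    height = math.sqrt(target / ratio)        # from width * height = target and width = ratio * height
    width = height * ratio

    def snap(value):
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)

if __name__ == "__main__":
    for aw, ah in [(1, 1), (16, 9), (2, 3)]:
        print(f"{aw}:{ah} ->", dims_for_megapixels(aw, ah))
```

Raising `megapixels` is the knob for the small-text case; the cost grows with the pixel count, so it's worth testing the prompt at 1MP first.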
>>
>>108659053
i need the resolution to be higher for the small text to not be blobbed
>>
still no model knows what an amputated limb looks like after it's healed.
>>
>>108659061
freak
>>
>>108655751
requesting this in any scenery and any effects, please
>>
Fresh

>>108659074
>>108659074
>>108659074
>>
>>108656011
>>108656084
Where the hell are you posting that an OF whore found you?
>>
>>108659059
Just inpaint.
