Spatial Edit (Apache 2.0)
Has anyone tried this out?
https://github.com/EasonXiao-888/SpatialEdit
https://huggingface.co/EasonXiao-888/SpatialEdit-16B
https://redd.it/1sjcljf
@rStableDiffusion
GitHub - EasonXiao-888/SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing
Can you use Qwen3.5 4b & Gemma 4 E4B with Z image/Turbo?
So I was wondering if I could use the latest four-billion-parameter versions of Qwen3.5 and Gemma 4 with the Z Image Turbo and base versions?
https://redd.it/1sje2ag
@rStableDiffusion
The mysterious science of LoRA training (SDXL)
I find myself still unable to train good-looking character LoRAs for Illustrious, and I don't know what I'm doing wrong. I'm using a 3D character for this purpose (a Blender model) and I've tried replicating training settings from other people's LoRAs that I consider great, but I still have questions.
1. Can you actually train a 3D character on Illustrious, or is it fighting the model too much? (It seems much better at handling 2D visuals.)
2. I've noticed most great LoRAs out there use hundreds of images in their dataset, usually 200 to 400. Mine is more on the side of 50; is there an actual benefit to such large datasets?
3. Repeats. It sounds like 10 epochs of 10 repeats would be equivalent to 100 epochs of 1 repeat, but is that truly the case? I always struggle to figure out how many repeats I should be using. (See the sketch after this list.)
4. TE. I noticed some people do not train the text encoder at all; does anyone have feedback on the benefits of doing this?
5. Batch size. I want to use a batch size of 6 or 8, because I can. But I'm not sure how to dial in the other settings based on that, in particular learning rate and repeats (also covered in the sketch below).
6. Removing backgrounds. Besides the fact that it makes captioning easier, is there an actual benefit? Have you noticed it yielding better results?
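A minimal sketch addressing questions 3 and 5, assuming a kohya-style trainer where total optimizer steps = images x repeats x epochs / batch size. The linear learning-rate scaling at the end is a common heuristic, and the batch-size-1 baseline for the 2e-4 LR is my assumption (not stated in the post), so treat it as a starting point, not a rule.

```python
# Sanity-check the repeats-vs-epochs equivalence and batch-size scaling.
# Assumptions: steps = images * repeats * epochs / batch_size (kohya-style),
# and the post's LR of 2e-4 was tuned at batch size 1.

def total_steps(images: int, repeats: int, epochs: int, batch_size: int) -> int:
    # One epoch sees each image `repeats` times, grouped into batches.
    return (images * repeats // batch_size) * epochs

# 10 epochs x 10 repeats == 100 epochs x 1 repeat in raw step count; they
# differ only in per-epoch behavior (checkpoint saves, LR schedule, shuffle).
assert total_steps(50, 10, 10, 1) == total_steps(50, 1, 100, 1) == 5000

# Heuristic: scale LR roughly linearly with batch size from a known-good
# baseline, then fine-tune by eye.
base_lr, base_bs = 2e-4, 1
for bs in (1, 6, 8):
    print(f"batch={bs}  steps={total_steps(50, 10, 10, bs)}  lr~{base_lr * bs / base_bs:.1e}")
```

With a batch size of 6, the same dataset yields far fewer optimizer steps, which is often why a larger batch "feels" undertrained unless you raise the LR or add repeats/epochs.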
I have noticed the following issues with my training attempts; perhaps this will help someone point me in the right direction on what I'm doing wrong here:
* Style locking in too much. For example, I like prompting with "dark, dim lighting" keywords, which works well with Illustrious, but my LoRAs make the result much brighter than the base model (even when tagging the dataset with "day"). The dataset has a couple of night shots, but it is mostly bright daylight.
* Faces train fast and seem to overtrain before clothes do, making it impossible to find a good balance: either one is overtrained or the other is undertrained. (I do have fewer full-body shots than upper-body and portrait shots, but this is apparently a desirable ratio?)
* I have settled on an LR of 2e-4, but have tried higher and lower with no success.
If you take the time to answer some of these, thank you =)
https://redd.it/1sjhf1d
@rStableDiffusion
Free open-source tool to instantly rig and animate your illustrations (also with mesh deform)
https://redd.it/1sjj7ta
@rStableDiffusion
Greg Rutkowski Anima LoRA from Circlestone Labs (Anima makers) with training params
https://civitai.com/models/2536147/greg-rutkowski-style-anima
https://redd.it/1sjk7dc
@rStableDiffusion
Civitai: Greg Rutkowski Style - Anima - v1.0. Greg Rutkowski style LoRA for Anima, trained on preview3. Prefix prompts with "@greg rutkowski. " Natural language prompts work best. All training d...
Suggestions on which model I should train an MC Escher Tessellation LoRA on?
https://redd.it/1sjoe8u
@rStableDiffusion
SD-FORGE EXTENSION
/r/StableDiffusion/comments/1sjty9x/sdforge_extension/
https://redd.it/1sjtyrz
@rStableDiffusion
Me whenever people on the PC building subreddits ask me why I need >32GB of system RAM.
https://redd.it/1sjsvjk
@rStableDiffusion
Does anyone know which model, and potentially which LoRA, was used to create these?
https://redd.it/1sjv1ga
@rStableDiffusion
IC-LoRA-Detailer: It's for post-processing, not just rendering (LTX2.3)
https://redd.it/1sjxoz6
@rStableDiffusion
Haven't had more fun than today with subgraphs - Subgraphs are awesome!!!
https://redd.it/1sjs2bq
@rStableDiffusion
Used LTX 2.3 anchor frame injection to maintain brand consistency across AI video — before/after
Working on a brand campaign where consistency was everything — same can, same character, same lighting across all assets including video.
The main technique I used was anchor frame injection, using LTXV guides over in-place editing. Three reference frames are injected at key points in the timeline (see the sketch below):
* a starting frame to lock the logo specifically;
* a mid-point "consistency anchor" at frame 138 to bridge the gap; the guide strength is set low, and the anchor image is designed with high, almost flat contrast in key areas;
* a hard end frame at reference strength 0.7 to leave enough room for natural movement.
This is combined with canny edges, a depth map, and pose estimation as control references.
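A minimal, runnable sketch of that anchor schedule, assuming a 192-frame clip and a 0.3 mid-point strength (neither is stated in the post). The AnchorFrame structure and the one-shot blend are illustrative stand-ins, not the actual LTXV or ComfyUI guide API, where injection happens inside the sampling loop.

```python
# Illustrative anchor-frame schedule: start (logo lock), low-strength
# mid-point bridge at frame 138, hard end frame at 0.7. All names, the
# clip length, and the mid strength are assumptions for demonstration.
from dataclasses import dataclass
import numpy as np

@dataclass
class AnchorFrame:
    frame_index: int   # where in the timeline the reference is injected
    strength: float    # how strongly the guide constrains that frame
    ref: np.ndarray    # encoded reference image (toy latent stand-in)

TOTAL_FRAMES, LATENT_DIM = 192, 16  # assumed clip length / toy latent size
rng = np.random.default_rng(0)
latents = rng.normal(size=(TOTAL_FRAMES, LATENT_DIM))

anchors = [
    AnchorFrame(0, 1.0, rng.normal(size=LATENT_DIM)),                 # lock the logo
    AnchorFrame(138, 0.3, rng.normal(size=LATENT_DIM)),               # low-strength bridge
    AnchorFrame(TOTAL_FRAMES - 1, 0.7, rng.normal(size=LATENT_DIM)),  # hard end frame
]

# One-shot linear blend per anchored frame; the real pipeline re-applies
# guidance at every denoising step rather than once up front.
for a in anchors:
    latents[a.frame_index] = (1 - a.strength) * latents[a.frame_index] + a.strength * a.ref
```

The end-frame strength of 0.7 (rather than 1.0) is what leaves the model room for natural motion into the final pose.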
The before GIF is the raw output. The after is the rerender with the anchor method applied.
The environment cleaned up significantly. One thing LTX over-interpreted was the walk — it added a fluidity that felt more runway than competitive player. Tighter pose constraints next pass.
Full case study in comments.
https://i.redd.it/fj2pl5covwug1.gif
https://i.redd.it/p0ubkd5pvwug1.gif
https://redd.it/1sk4051
@rStableDiffusion