r/StableDiffusion

Retro anime trailers made with ViduQ3 x GPT-2-img, with a sprinkle of seedance2 to get the retro anime aesthetics. A pain to edit.

https://redd.it/1tu4fzt
@rStableDiffusion

6 views · 20:40

r/StableDiffusion

I've been trying to replicate this Anima V1.0 image all day but I can't manage it. Maybe I need a special workflow or something?

https://redd.it/1tu4zsy
@rStableDiffusion

7 views · 21:40

r/StableDiffusion

[Video, 0:05]

Cosmos3-Super-Image2Video running locally on a single RTX PRO 6000 96GB

https://redd.it/1tu7bth
@rStableDiffusion

6 views · 22:40

r/StableDiffusion

Best local realistic image model that is uncensored?

Which local model gives the most realistic, candid-looking images, similar to nano banana, while still allowing uncensored and 18+ generation?

https://redd.it/1ttxmvd
@rStableDiffusion

7 views · 23:40

r/StableDiffusion

ComfyUI v0.23.0: Support NVIDIA PixelDiT and PiD (CORE-201) by @kijai in #14103

https://github.com/Comfy-Org/ComfyUI/releases/tag/v0.23.0

https://github.com/NVlabs/PixelDiT

https://redd.it/1tudtui
@rStableDiffusion

6 views · 03:40

r/StableDiffusion

GitHub - orion4d/Orion4D_anaglyph: Orion4D Anaglyph is a high-performance custom node designed to transform 2D images into stereoscopic (3D) renders via a depth map. It offers total control over parallax, convergence, and depth processing to ensure optimal visual comfort.
https://github.com/orion4d/Orion4D_anaglyph
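
For readers new to the idea, here is a rough sketch of how a depth map can drive a red/cyan anaglyph. It is not the Orion4D node's code: the function name `make_anaglyph`, the parameters `max_shift` and `convergence`, and the simple per-pixel shift are all assumptions chosen for illustration. It uses plain Python lists instead of an image library to stay self-contained.

```python
def make_anaglyph(image, depth, max_shift=2, convergence=0.5):
    """Build a red/cyan anaglyph from one image and its depth map.

    image: rows of (r, g, b) tuples.
    depth: rows of floats in [0, 1], same size as image (1.0 = nearest).
    max_shift: largest horizontal parallax in pixels (hypothetical knob).
    convergence: depth value that gets zero parallax (hypothetical knob).
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Signed parallax: zero at the convergence plane, growing with distance from it.
            shift = round((depth[y][x] - convergence) * max_shift)
            # Sample the two eye views from opposite horizontal offsets, clamped to the image.
            left_x = min(max(x + shift, 0), width - 1)
            right_x = min(max(x - shift, 0), width - 1)
            red = image[y][left_x][0]            # red channel from the left-eye view
            _, green, blue = image[y][right_x]   # green and blue from the right-eye view
            row.append((red, green, blue))
        result.append(row)
    return result


# Tiny 1x3 example: every pixel is "near", so the channels separate horizontally.
pixels = [[(10, 20, 30), (40, 50, 60), (70, 80, 90)]]
near = [[1.0, 1.0, 1.0]]
print(make_anaglyph(pixels, near, max_shift=1, convergence=0.0))
```

A real implementation would also need to fill the gaps that open up behind shifted foreground objects, which this sketch ignores.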

https://redd.it/1tuioz9
@rStableDiffusion

6 views · 07:40

r/StableDiffusion

Pallaidium: Omnimodal AI Movie Studio integrated in Blender
https://redd.it/1tujwzo
@rStableDiffusion

7 views · 08:40

r/StableDiffusion

GitHub - orion4d/Orion4D_maskpro: Mask editor for ComfyUI
https://github.com/orion4d/Orion4D_maskpro

https://redd.it/1tujq1t
@rStableDiffusion

7 views · 09:40

r/StableDiffusion

[Video, 0:13]

Bernini video test video edit

https://redd.it/1tumjhe
@rStableDiffusion

7 views · 10:40

r/StableDiffusion

Beginner Prompting Guide for LTX 2.3: Tips and Tricks

If you’ve been messing around with LTX 2.3 lately, you probably realized pretty quickly that this model is incredibly sensitive to text inputs. If you just throw standard Gen-AI prompts at it, you’re going to get a lot of mutated frames and chaotic motion.

After thousands of generations and a lot of hair-pulling, I’ve mapped out the core mechanics of how LTX 2.3 interprets prompts. If you are struggling to get clean, predictable outputs, here is the survival guide on what works, what doesn't, and how to structure your workflow.

# 1. Describe the physics, not the emotion

LTX 2.3 does not understand abstract concepts like "he is angry" or "she feels sad." When you use emotional adjectives, the model tends to over-correct or ignore them entirely.

The Fix: Describe the physical manifestation of the emotion. Instead of "furious," write "tightened jaw, narrowed eyes, stiff posture, micro-tremor in the shoulders." Give the model physical geometry to animate.
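
If you build prompts from a script, the swap above can be kept in a small lookup. This is only a sketch: `PHYSICAL_CUES` and `describe_emotion` are made-up names, not part of LTX 2.3 or any tool, and every entry except "furious" is an illustrative guess rather than a tested recipe.

```python
# Hypothetical lookup: emotion word -> physical cues the model can animate.
# Only the "furious" entry comes from the tip above; the rest are illustrative guesses.
PHYSICAL_CUES = {
    "furious": "tightened jaw, narrowed eyes, stiff posture, micro-tremor in the shoulders",
    "sad": "lowered gaze, slumped shoulders, slow blinking",
    "nervous": "darting eyes, fidgeting fingers, shallow breathing",
}


def describe_emotion(subject: str, emotion: str) -> str:
    """Return the subject followed by physical cues instead of the emotion word."""
    cues = PHYSICAL_CUES.get(emotion.lower())
    if cues is None:
        # Fail loudly so an abstract emotion word never slips into the prompt.
        raise KeyError(f"no physical cues defined for {emotion!r}")
    return f"{subject}, {cues}"


print(describe_emotion("a man in a grey coat", "furious"))
```

Raising on an unknown emotion is a deliberate choice here: it forces you to write the physical description once instead of silently falling back to the adjective.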

# 2. The prompt is a complement, not a replacement (I2V / Adapters)

When using Image-to-Video (I2V) or control guidance layers (Pose, Canny, Depth), your prompt should never try to "re-describe" what the model can already see in those inputs. More importantly, it must never contradict them.

The Fix: Treat your text prompt purely as an extension of the reference inputs. Describe only the change or the continuity of the scene, keeping the static elements strictly aligned with your source image or map. Fighting the adapters is the #1 cause of prompt-cooking and artifacts.
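
One way to catch re-description before you hit generate is a quick text check against a list of things already visible in the reference. This is a sketch under assumptions: `redundant_terms` is a hypothetical helper, the reference tags are typed in by hand (nothing here reads the image), and it only flags repeated phrases, not contradictions.

```python
import re


def redundant_terms(prompt: str, reference_tags: list[str]) -> list[str]:
    """Return the reference tags that the prompt repeats (whole phrase, case-insensitive)."""
    found = []
    for tag in reference_tags:
        # \b keeps "red" from matching inside "bored" and similar partial hits.
        if re.search(r"\b" + re.escape(tag) + r"\b", prompt, re.IGNORECASE):
            found.append(tag)
    return found


# Tags describe static elements of the source image, written by hand.
tags = ["red dress", "brick wall"]
print(redundant_terms("The woman in a red dress slowly raises her hand", tags))
```

Anything the check returns is a candidate to cut from the prompt, leaving only the change or continuity you actually want.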

# 3. Use rough timecodes for chronological flow

If you want a sequential action to happen within your generation, LTX 2.3 needs temporal anchors. It cannot naturally guess the pacing of a scene from a continuous sentence.

The Fix: Insert loose timecodes directly into your prompt text to guide the timeline. They don’t need to be frame-accurate, but writing something like `[00:00] character looks ahead, [00:02] slowly turns head to the left, [00:04] frowns` gives the architecture a clear directional roadmap.
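
If you generate prompts programmatically, a tiny helper keeps the timecode format consistent. This is a minimal sketch: `timecoded_prompt` is a hypothetical name, and the `[MM:SS]` layout simply mirrors the example above rather than any format documented for LTX 2.3.

```python
def timecoded_prompt(beats):
    """Join (seconds, action) pairs into '[MM:SS] action, [MM:SS] action, ...'.

    Beats are sorted by time, so they can be passed in any order.
    """
    parts = []
    for seconds, action in sorted(beats, key=lambda beat: beat[0]):
        if seconds < 0:
            raise ValueError("timecodes cannot be negative")
        minutes, secs = divmod(int(seconds), 60)
        parts.append(f"[{minutes:02d}:{secs:02d}] {action}")
    return ", ".join(parts)


print(timecoded_prompt([
    (0, "character looks ahead"),
    (2, "slowly turns head to the left"),
    (4, "frowns"),
]))
```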

# 4. Optimize the "generation budget" with simple backgrounds

The more complex your environment is, the fewer processing resources the model can allocate to tracking fine-grained details on your main subject.

The Fix: Keep your backgrounds as clean and minimalist as possible. A simple, uncluttered setting allows LTX 2.3 to focus its attention entirely on making the main subject's motion fluid and accurate.

# 5. Avoid complex multi-character interactions

If you are planning an action movie sequence where two characters are wrestling or executing rapid, high-speed movements, prepare for frustration.

The Fix: Keep physical interactions to a minimum. Getting a clean, viable result out of complex choreography requires an exhausting number of seeds and iterations. Save your sanity: prompt the base motion cleanly, and handle the heavy lifting or fast pacing during post-production editing.

# 6. Drive performance with high-quality audio (A2V)

Maintaining character voice consistency across different shots can be a nightmare through text alone. If your character needs to speak, relying solely on text prompts will usually result in a completely mismatched voice from one clip to the next.

The Fix: Use a dedicated TTS system to generate clean, emotionally rich dialogue audio before you run the video generation. Feeding a high-quality audio track into the Audio-to-Video (A2V) workflow acts as a powerful anchor that naturally drives the facial physics and lip-sync accuracy far better than text ever could.

Ultimately, a long phase of trial and error, mapping out the boundaries of what LTX 2.3 can and cannot do, is inevitable. Treat it less like a magic box and more like a camera rig that requires precise technical calibration.

These are just a few of the macro strategies that saved my workflow, so this list is definitely non-exhaustive. If you guys have found any other specific tweaks or prompt structures that stop the model from breaking, drop them below!

https://redd.it/1tun069
@rStableDiffusion

5 views · 11:40

r/StableDiffusion

[Video, 0:05]

Wan 2.2 with Audio works really well! Workflow included

https://redd.it/1tunl4s
@rStableDiffusion

6 views · 12:40

r/StableDiffusion

ComfyUI Tutorial: Create Two Talking AI Characters On 6GB VRAM
https://youtu.be/DAtAp4HlErE

https://redd.it/1tus405
@rStableDiffusion

YouTube

Hello everyone! In this tutorial, we explore a new Dual Character Lip Sync LoRA for the LTX 2.3 model that enables two characters to speak simultaneously using an image, prompt, and custom audio file. This LoRA was specifically trained to improve character…

5 views · 14:40

r/StableDiffusion

Fizgig Klein 9b Lora Studio v1.2.4 - update targeting 16GB card users
https://redd.it/1tuq8lw
@rStableDiffusion

5 views · 15:40

r/StableDiffusion

PixelDiT: Pixel Diffusion Transformers for Image Generation
1.3B, no VAE
https://pixeldit.github.io/

https://redd.it/1tuujjg
@rStableDiffusion

Posted by CornyShed - 6 votes and 0 comments

3 views · 16:40

r/StableDiffusion

Anima testing for complex scene
https://redd.it/1tuy3ye
@rStableDiffusion

2 views · 17:40

r/StableDiffusion

[Video, 1:45]

MISO-TTS: 8B text-to-speech model released.

https://redd.it/1tux5qx
@rStableDiffusion

1 view · 18:40
