Retro anime trailers made with ViduQ3 x GPT-2-img, with some sprinkles of seedance2 to get the retro anime aesthetic. A pain to edit.
https://redd.it/1tu4fzt
@rStableDiffusion
I've been trying to replicate this Anima V1.0 image all day but I can't manage it. Maybe I need a special workflow or something?
https://redd.it/1tu4zsy
@rStableDiffusion
Cosmos3-Super-Image2Video running locally on a single RTX PRO 6000 96GB
https://redd.it/1tu7bth
@rStableDiffusion
Best local realistic image model that is uncensored?
I would like to know which model gives the most realistic and candid images, similar to nano banana, that still allows for uncensored and 18+ generation?
https://redd.it/1ttxmvd
@rStableDiffusion
ComfyUI v0.23.0: Support NVIDIA PixelDiT and PiD (CORE-201) by @kijai in #14103
https://github.com/Comfy-Org/ComfyUI/releases/tag/v0.23.0
https://github.com/NVlabs/PixelDiT
https://redd.it/1tudtui
@rStableDiffusion
GitHub - orion4d/Orion4D_anaglyph: Orion4D Anaglyph is a high-performance custom node designed to transform 2D images into stereoscopic (3D) renders via a depth map. It offers total control over parallax, convergence, and depth processing to ensure optimal visual comfort.
https://github.com/orion4d/Orion4D_anaglyph
https://redd.it/1tuioz9
@rStableDiffusion
GitHub - orion4d/Orion4D_maskpro: Mask editor for ComfyUI
https://github.com/orion4d/Orion4D_maskpro
https://redd.it/1tujq1t
@rStableDiffusion
Beginner prompting guide for LTX 2.3: tips and tricks
If you’ve been messing around with LTX 2.3 lately, you probably realized pretty quickly that this model is incredibly sensitive to text inputs. If you just throw standard Gen-AI prompts at it, you’re going to get a lot of mutated frames and chaotic motion.
After thousands of generations and a lot of hair-pulling, I’ve mapped out the core mechanics of how LTX 2.3 interprets data. If you are struggling to get clean, predictable outputs, here is the survival guide on what works, what doesn't, and how to structure your workflow.
# 1. Describe the physics, not the emotion
LTX 2.3 does not understand abstract concepts like "he is angry" or "she feels sad." When you use emotional adjectives, the model tends to over-correct or ignore them entirely.
The Fix: Describe the physical manifestation of the emotion. Instead of "furious," write "tightened jaw, narrowed eyes, stiff posture, micro-tremor in the shoulders". Give the model physical geometry to animate.
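To make the swap mechanical, here is a minimal sketch of a prompt pre-processor that replaces emotion adjectives with physical cues before the text reaches the model. Everything in it is an assumption for illustration: the `PHYSICAL_CUES` table and the `replace_emotion_words` helper are hypothetical names, only the "furious" entry comes from the tip above, and the "sad" entry is an untested example you would want to tune yourself.

```python
import re

# Hypothetical lookup table. Only the "furious" entry comes from the guide;
# the "sad" entry is an illustrative guess, not something tested on LTX 2.3.
PHYSICAL_CUES = {
    "furious": "tightened jaw, narrowed eyes, stiff posture, micro-tremor in the shoulders",
    "sad": "lowered gaze, slumped shoulders, slow blinking",
}


def replace_emotion_words(prompt: str, cues: dict[str, str] = PHYSICAL_CUES) -> str:
    """Swap whole-word emotion adjectives for their physical descriptions."""
    if not cues:
        return prompt

    # Match any key as a whole word, ignoring case ("Furious" matches, "sadly" does not).
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in cues) + r")\b",
        re.IGNORECASE,
    )

    def swap(match: re.Match) -> str:
        return cues[match.group(0).lower()]

    return pattern.sub(swap, prompt)


rewritten = replace_emotion_words("close-up of a man, furious")
print(rewritten)
```

A plain dictionary keeps the mapping easy to extend as you find phrasings that animate well. It only does word substitution, so you will still need to reread the result for grammar.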
# 2. The prompt is a complement, not a replacement (I2V / Adapters)
When using Image-to-Video (I2V) or control guidance layers (Pose, Canny, Depth), your prompt should never try to "re-describe" what the model can already see. More importantly, it must never contradict those inputs.
The Fix: Treat your text prompt purely as an extension of the reference inputs. Describe only the change or the continuity of the scene, keeping the static elements strictly aligned with your source image or map. Fighting the adapters is the #1 cause of prompt-cooking and artifacts.
# 3. Use rough timecodes for chronological flow
If you want a sequential action to happen within your generation, LTX 2.3 needs temporal anchors. It cannot naturally guess the pacing of a scene from a continuous sentence.
The Fix: Insert loose timecodes directly into your prompt text to guide the timeline. They don’t need to be frame-accurate, but writing something like `[00:00] character looks ahead, [00:02] slowly turns head to the left, [00:04] frowns` gives the architecture a clear directional roadmap.
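If you keep your shot beats in a list, the timecoded prompt can be assembled automatically. This is a rough sketch, not an official LTX 2.3 format: the helper names `format_timecode` and `build_timed_prompt` are made up for this example, and it assumes the bracketed values read as `[MM:SS]`, which the example prompt in this tip suggests but does not confirm.

```python
def format_timecode(seconds: int) -> str:
    """Render a second offset as a loose [MM:SS] marker (assumed format)."""
    minutes, secs = divmod(seconds, 60)
    return f"[{minutes:02d}:{secs:02d}]"


def build_timed_prompt(beats: list[tuple[int, str]]) -> str:
    """Join (second, action) beats into one prompt string, ordered by time."""
    ordered = sorted(beats, key=lambda beat: beat[0])
    return ", ".join(f"{format_timecode(t)} {action}" for t, action in ordered)


prompt = build_timed_prompt([
    (0, "character looks ahead"),
    (2, "slowly turns head to the left"),
    (4, "frowns"),
])
print(prompt)
# [00:00] character looks ahead, [00:02] slowly turns head to the left, [00:04] frowns
```

Sorting the beats first means you can add or reorder actions while iterating without worrying about the prompt going out of chronological order.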
# 4. Optimize the "generation budget" with simple backgrounds
The more complex your environment is, the fewer processing resources the model can allocate to tracking fine-grained details on your main subject.
The Fix: Keep your backgrounds as clean and minimalist as possible. A simple, uncluttered setting allows LTX 2.3 to focus its attention entirely on making the main subject's motion fluid and accurate.
# 5. Avoid complex multi-character interactions
If you are planning an action movie sequence where two characters are wrestling or executing rapid, high-speed movements, prepare for frustration.
The Fix: Keep physical interactions to a minimum. Getting a clean, viable result out of complex choreography requires an exhausting number of seeds and iterations. Save your sanity: prompt the base motion cleanly, and handle the heavy lifting or fast pacing during post-production editing.
# 6. Drive performance with high-quality audio (A2V)
Maintaining character voice consistency across different shots can be a nightmare through text alone. If your character needs to speak, relying solely on text prompts will usually result in a completely mismatched voice from one clip to the next.
The Fix: Use a dedicated TTS system to generate clean, emotionally rich dialogue audio before you run the video generation. Feeding a high-quality audio track into the Audio-to-Video (A2V) workflow acts as a powerful anchor that naturally drives the facial physics and lip-sync accuracy far better than text ever could.
Ultimately, a long phase of trial, error, and mapping out the boundaries of what LTX 2.3 can and cannot do is completely inevitable. Treat it less like a magic box and more like a camera rig that requires precise technical calibration.
These are just a few of the macro strategies that saved my workflow, so this list is definitely non-exhaustive. If you guys have found any other specific tweaks or prompt
structures that stop the model from breaking, drop them below!
https://redd.it/1tun069
@rStableDiffusion
ComfyUI Tutorial: Create Two Talking AI Characters On 6GB VRAM
https://youtu.be/DAtAp4HlErE
https://redd.it/1tus405
@rStableDiffusion
YouTube
ComfyUI Tutorial: Create Two Talking AI Characters On 6GB VRAM #comfyui #comfyuitutorial #ltx2
Hello everyone! In this tutorial, we explore a new Dual Character Lip Sync LoRA for the LTX 2.3 model that enables two characters to speak simultaneously using an image, prompt, and custom audio file. This LoRA was specifically trained to improve character…
Fizgig Klein 9b Lora Studio v1.2.4 - update targeting 16GB card users
https://redd.it/1tuq8lw
@rStableDiffusion
PixelDiT: Pixel Diffusion Transformers for Image Generation
Pixel Diffusion Transformers for Image Generation, 1.3B, no VAE
https://pixeldit.github.io/
https://redd.it/1tuujjg
@rStableDiffusion
Posted by CornyShed - 6 votes and 0 comments