Reference-based inpainting for LTX-2.3 (MR2V)
Hey everyone, today I’m sharing an experimental IC LoRA I trained for LTX-2.3. It allows you to do reference-based inpainting inside a masked region in video.
This LoRA is still experimental, so don’t expect something fully polished yet, but it already works pretty well — especially when the prompt contains enough detail and the mask is large enough to properly fit the object you want to place.
I’m sharing everything here for anyone who wants to test it (a download sketch follows these links):
Hugging Face repo:
https://huggingface.co/Alissonerdx/LTX-LoRAs
Direct model download:
https://huggingface.co/Alissonerdx/LTX-LoRAs/blob/main/ltx23_inpaint_masked_r2v_rank32_v1_3000steps.safetensors
Workflow:
https://huggingface.co/Alissonerdx/LTX-LoRAs/blob/main/workflows/ltx23_masked_ref_inpaint_v1.json
Civitai page:
https://civitai.com/models/2484952
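If you'd rather script the download than click through, here is a minimal sketch using huggingface_hub; the repo ID and file names are taken directly from the links above:

```python
from huggingface_hub import hf_hub_download

# Repo ID and file names come straight from the links above.
lora_path = hf_hub_download(
    repo_id="Alissonerdx/LTX-LoRAs",
    filename="ltx23_inpaint_masked_r2v_rank32_v1_3000steps.safetensors",
)
workflow_path = hf_hub_download(
    repo_id="Alissonerdx/LTX-LoRAs",
    filename="workflows/ltx23_masked_ref_inpaint_v1.json",
)
print(lora_path)      # drop this file into your LoRA folder
print(workflow_path)  # load this JSON in ComfyUI
```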
It can also work as text-to-video if you use a blank reference and describe everything only in the prompt.
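A blank reference is just a solid-color frame; a minimal Pillow sketch (the resolution is an assumption, match whatever your workflow expects):

```python
from PIL import Image

# Resolution is an assumption; match the resolution your LTX workflow expects.
width, height = 768, 512
blank = Image.new("RGB", (width, height), color="black")
blank.save("blank_reference.png")  # feed this as the reference image
```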
Important note: this LoRA was not trained for body, head, or face swaps, or similar inpainting use cases. It was trained mainly for objects. If you want to do a head swap, use my head-swap LoRA, BFS, instead.
Since this is still experimental, feedback, tests, and results are very welcome.
https://reddit.com/link/1secygl/video/bxrfa5bu7ntg1/player
https://reddit.com/link/1secygl/video/813vpjdh6ntg1/player
https://reddit.com/link/1secygl/video/jqnwx9bi6ntg1/player
https://redd.it/1secygl
@rStableDiffusion
Pixelsmile works in ComfyUI, enabling fine-grained micro-expression control. Workflow included.
https://redd.it/1sefy2t
@rStableDiffusion
The people getting the best outputs from AI tools aren't the best prompt engineers. They're the ones who already know what good looks like.
I've been thinking about this after using both Stable Diffusion and Claude/Lovable for a commercial website rebuild, and I think the same principle applies to both.
When I started using SD, I assumed the people getting extraordinary results just had better prompt formulas. But the more I paid attention, the more I noticed something else. The photographers were getting better photorealistic outputs. The illustrators were getting better illustrated outputs. The designers were getting better-designed outputs.
The tool didn't level the playing field. It amplified whatever eyes and taste people already had.
The prompt "a woman in dramatic lighting" gets very different results from someone who understands Rembrandt lighting, split lighting, practical light sources, and motivated shadows — versus someone who just knows the words. Same prompt on paper. Completely different understanding behind it.
I think this is the underappreciated skill gap in AI generation. It's not prompting. It's visual literacy. The ability to look at an output and know precisely what's wrong and why — and then know what words to change to fix it.
A useful exercise I found: when I get an output that's close but not quite right, instead of changing the prompt randomly, I ask Claude to analyse what specific visual or compositional principle is being violated. Then I know exactly what to change. Treating the AI as both executor and critic is more powerful than using it as just one or the other.
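If you want to script that critic step, here is a minimal sketch using the Anthropic Python SDK; the model name and image path are placeholders, not something from the post:

```python
import base64
import anthropic  # pip install anthropic; reads ANTHROPIC_API_KEY from the env

client = anthropic.Anthropic()

# Placeholder path to the near-miss output you want critiqued.
with open("output.png", "rb") as f:
    image_b64 = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model name
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image",
             "source": {"type": "base64", "media_type": "image/png",
                        "data": image_b64}},
            {"type": "text",
             "text": "What specific visual or compositional principle is this "
                     "image violating, and what prompt change would fix it?"},
        ],
    }],
)
print(message.content[0].text)  # the critique to act on
```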
Curious if others here have found the same pattern. Does domain expertise in visual fields transfer more than people think? Or does the tool actually reduce the advantage of having a trained eye?
https://redd.it/1sej6y0
@rStableDiffusion
LTX2.3 Multi-Image Reference
https://youtube.com/shorts/XfgfMuNB99g
https://redd.it/1secxzc
@rStableDiffusion
The tool you've been waiting for: a FREE, LOCAL, ComfyUI-based full-movie pipeline agent. Enter anything in the prompt with a desired scene time and let it go. Plenty of cool features. Enjoy :) KupkaProd Cinema Pipeline. The 9-minute video in the post was created with fewer than 40 words.
https://redd.it/1selevy
@rStableDiffusion
Ace-Step 1.5 XL is already up! I hope it will soon be available in a ComfyUI-compatible format! ❤️
https://huggingface.co/collections/ACE-Step/ace-step-15-xl
https://redd.it/1semc1e
@rStableDiffusion
Help me find optimal hyperparameters for Ultimate Stable Diffusion Upscale and complete my master's degree!
https://redd.it/1semi3t
@rStableDiffusion
Ace Step 1.5 XL is out!!!
https://huggingface.co/ACE-Step/acestep-v15-xl-turbo
https://huggingface.co/ACE-Step/acestep-v15-xl-base
https://huggingface.co/ACE-Step/acestep-v15-xl-sft
Have fun all!
https://redd.it/1ses85i
@rStableDiffusion
Here's a trick you can perform with Depth map + FFLF
https://www.youtube.com/watch?v=1QvTmkXF-HY
https://redd.it/1seqv34
@rStableDiffusion
YouTube: "Gourmet Pyramids". This showcase is an AI visual technique I devised that enables the creation of food arranged in any desired shape. I think it can be utilised for TV commercials.
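The video doesn't come with code, but the depth-map half of the trick can be reproduced with the transformers depth-estimation pipeline; the model checkpoint and file names here are assumptions:

```python
from transformers import pipeline

# Model choice is an assumption; any depth-estimation checkpoint will do.
depth_estimator = pipeline(
    task="depth-estimation",
    model="depth-anything/Depth-Anything-V2-Small-hf",
)

# "pyramid.png" is a placeholder for the shape the food should follow.
result = depth_estimator("pyramid.png")
result["depth"].save("pyramid_depth.png")  # grayscale depth map for the control input
```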
All of my AI demos:
https://www.youtube.com/playlist?list=PLe3OBqR7FeRhZM6SNoIWibQ1PA2JREYtL