TVCG 2026: MARRS for Human Motion Action-Reaction Synthesis
# MARRS: Masked Autoregressive Unit-based Reaction Synthesis
Project page: **https://aigc-explorer.github.io/MARRS/**
Introducing MARRS: a new framework for human action-reaction synthesis that generates coordinated, fine-grained reactions conditioned on another person’s motion. By avoiding VQ and modeling body/hand units with UD-VAE + ACF + MUM, MARRS captures cross-unit perception more effectively and efficiently. It achieves state-of-the-art quantitative and qualitative results.
Overall framework
Demo
https://redd.it/1t853hg
@rStableDiffusion
Has anyone tried LTX2.3 for Image Gen?
Before I moved to ZIT, I used Wan for generating images and it worked quite well. I'm wondering if anyone has tried this with LTX and whether the results were good.
https://redd.it/1t81hrg
@rStableDiffusion
Trained a ViT model from scratch for auto-tagging
I recently trained a new anime image tagging model. To prep the data, I used SmilingWolf v3 to fix 300k bad tags and fill in 1M missing ones. I also trained an initial baseline model to help identify and add around 30k low-frequency tags.
The current V1 model is a 320x320 ViT. V1.1 is currently training at 448x448, and the higher resolution is already improving accuracy. My next goal is to wait for a 2025 dataset, clean it heavily, and train from scratch with better vocab structures (e.g., artist:name).
You can find the model, card, and demo space on HuggingFace: https://huggingface.co/Grio43/OppaiOracle Live use of the model: https://huggingface.co/spaces/Grio43/OppaiOracle
CPU-based tagger
https://huggingface.co/spaces/Grio43/OppaiCPU
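For context on how a tagger like this is typically used: ViT-based multi-label taggers usually emit one logit per tag and keep every tag whose sigmoid probability clears a threshold. The sketch below illustrates that decoding step in plain Python; the tag names, logit values, and threshold are made up for illustration and are not taken from this model.

```python
import math

def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def decode_tags(logits: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Keep tags whose sigmoid(logit) clears the threshold, highest-confidence first.

    This is the generic multi-label decoding used by most anime taggers;
    the actual model may use a different (often per-tag) threshold.
    """
    scored = {tag: sigmoid(z) for tag, z in logits.items()}
    kept = [tag for tag, p in scored.items() if p >= threshold]
    return sorted(kept, key=lambda tag: scored[tag], reverse=True)

# Fictitious logits for a handful of example tags.
logits = {"1girl": 4.2, "outdoors": 1.1, "sword": -0.3, "smile": 0.2}
print(decode_tags(logits))  # → ['1girl', 'outdoors', 'smile']
```

Lowering the threshold trades precision for recall: at 0.35, "sword" (probability ≈ 0.43) would also be kept.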
https://redd.it/1t8bzb3
@rStableDiffusion
It's still nuts to me how realistic AI is getting; incredible that I can run it on an RTX 2060 and get these results. (Z-image-Turbo)
https://redd.it/1t8ehyj
@rStableDiffusion
Anyone else using LTX locally on Mac via Draw Things? Here’s a WWII-style short I made.
https://redd.it/1t8lagy
@rStableDiffusion
How I feel after upvoting a post that got downvoted by bots for mentioning Forge Neo.
https://redd.it/1t8oha2
@rStableDiffusion
Wan 2.2 with LTX 2.3 ID-LoRA
Wan 2.2 with LTX 2.3 ID-LoRA workflow
This is a workflow that combines the Comfy Wan 2.2 image-to-video workflow with the Comfy LTX 2.3 ID-LoRA workflow. You use Wan 2.2 to make your initial video; it then runs automatically through LTX 2.3, which adds audio to the Wan 2.2 video and extends it with whatever you want to happen next.
Wan 2.2 image-to-video of Crystal Sparkle throwing a champagne bottle against a yacht to christen the yacht
LTX 2.3 adds the foley audio of the bottle smashing against the boat to the Wan 2.2 clip, and the ID-LoRA adds Crystal Sparkle's actual voice
Here is a link to the workflow: https://huggingface.co/ussaaron/workflows/blob/main/wan2_2_i2v-with-ltx-id-lora.json
https://redd.it/1t8qloh
@rStableDiffusion
Hi-Dream 01 out: 2K images in 20 seconds on a 4090 (fp8 dev), ComfyUI
https://redd.it/1t8ypmd
@rStableDiffusion