TVCG 2026: MARRS for Human Motion Action-Reaction Synthesis

# MARRS: Masked Autoregressive Unit-based Reaction Synthesis

Project page: **https://aigc-explorer.github.io/MARRS/**

Introducing MARRS: a new framework for human action-reaction synthesis that generates coordinated, fine-grained reactions conditioned on another person’s motion. By avoiding vector quantization (VQ) and modeling body and hand units with UD-VAE, ACF, and MUM, MARRS captures cross-unit perception more effectively and efficiently, achieving state-of-the-art quantitative and qualitative results.
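The core idea of treating the body and hands as separate units can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the joint counts, the partition boundaries, and the function names are hypothetical and not taken from the MARRS paper.

```python
import numpy as np

# Illustrative only: MARRS models body and hands as distinct "units",
# each handled by its own encoder branch. The joint counts and slice
# boundaries below are assumptions, not the paper's configuration.
NUM_JOINTS = 52              # hypothetical whole-body joint count
BODY_JOINTS = slice(0, 22)   # assumed body joints
HAND_JOINTS = slice(22, 52)  # assumed two-hand joints (15 per hand)

def split_units(pose_seq: np.ndarray):
    """Split a (frames, joints, 3) pose sequence into body and hand
    units, so each unit can be encoded separately."""
    assert pose_seq.ndim == 3 and pose_seq.shape[1] == NUM_JOINTS
    body = pose_seq[:, BODY_JOINTS, :]
    hands = pose_seq[:, HAND_JOINTS, :]
    return body, hands

seq = np.zeros((120, NUM_JOINTS, 3))  # 120 frames of dummy motion
body, hands = split_units(seq)
print(body.shape, hands.shape)  # (120, 22, 3) (120, 30, 3)
```

Modeling the units separately lets each branch specialize (hand motion is much higher-frequency than torso motion) while a fusion stage restores cross-unit coordination.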

Overall framework

Demo

https://redd.it/1t853hg
@rStableDiffusion
Has anyone tried LTX2.3 for Image Gen?

Before I moved to ZIT, I used Wan for generating images and it worked quite well. I'm wondering if anyone has tried this with LTX and whether the results were good.

https://redd.it/1t81hrg
Trained a ViT model from scratch for auto-tagging

I recently trained a new anime image tagging model. To prep the data, I used SmilingWolf v3 to fix 300k bad tags and fill in 1M missing ones. I also trained an initial baseline model to help identify and add around 30k low-frequency tags.

The current V1 model is a 320x320 ViT. V1.1 is currently training at 448x448, and the higher resolution is already improving accuracy. My next goal is to wait for a 2025 dataset, clean it heavily, and train from scratch with better vocab structures (e.g., artist:name).
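The jump from 320x320 to 448x448 input comes down to the preprocessing stage feeding the ViT. The sketch below shows typical ViT-style image preprocessing; the exact resize policy and normalization used by this tagger are assumptions, and `preprocess` is a hypothetical helper, not the model's actual API.

```python
from PIL import Image
import numpy as np

# Typical ViT-style preprocessing sketch. The resize filter and the
# [0, 1] scaling are assumptions; the actual tagger may use different
# normalization statistics.
def preprocess(img: Image.Image, size: int = 320) -> np.ndarray:
    """Resize to the model's square input and scale pixels to [0, 1]."""
    img = img.convert("RGB").resize((size, size), Image.BICUBIC)
    arr = np.asarray(img, dtype=np.float32) / 255.0
    return arr.transpose(2, 0, 1)  # HWC -> CHW, as most ViT stacks expect

x = preprocess(Image.new("RGB", (1024, 768)), size=320)
print(x.shape)  # (3, 320, 320)
```

Raising `size` to 448 roughly doubles the pixel budget per image, which is why the higher-resolution V1.1 run can resolve small tags (accessories, background details) that a 320x320 crop blurs away.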

You can find the model, card, and demo space on HuggingFace:
Model: https://huggingface.co/Grio43/OppaiOracle
Live use of the model: https://huggingface.co/spaces/Grio43/OppaiOracle

CPU based tagger
https://huggingface.co/spaces/Grio43/OppaiCPU

https://redd.it/1t8bzb3
Anyone else using LTX locally on Mac via Draw Things? Here’s a WWII-style short I made.

https://redd.it/1t8lagy