Nvidia solved VAE? Fast and High-Resolution Latent Decoding with Pixel Diffusion

https://redd.it/1tn3m6n
@rStableDiffusion
FeatherOps: Fast fp8 matmul on RDNA3 without native fp8, now supports more models

https://github.com/woct0rdho/ComfyUI-FeatherOps

The kernel itself has not changed much since March; most of my recent work went into ComfyUI integration. The models tested so far are Anima, LTX 2.3, Qwen-Image, and Wan, and other models may also work out of the box. Some workloads may see a 30-50% speedup, but your mileage may vary.
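
For anyone wondering what "fp8 without native fp8" can mean in practice, here is a rough pure-Python sketch of the general idea: weights are stored as 8-bit codes and decoded to a wider float type just before the multiply. This is not the FeatherOps kernel and makes no claim about how it works internally. The E4M3 layout (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7, no infinities) is an assumption on my part, and the names `decode_fp8_e4m3`, `matmul_fp8_weights`, and the `scale` parameter are hypothetical, made up for illustration.

```python
import math


def decode_fp8_e4m3(byte: int) -> float:
    """Decode one 8-bit E4M3 code to a Python float (assumed layout, see above)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0x0F
    man = byte & 0x07
    if exp == 0x0F and man == 0x07:
        # In this assumed layout the all-ones pattern is NaN; there are no infinities.
        return math.nan
    if exp == 0:
        # Subnormal: no implicit leading 1, fixed exponent of -6.
        return sign * (man / 8.0) * 2.0 ** -6
    # Normal: implicit leading 1, exponent bias 7.
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


def matmul_fp8_weights(x, w_bytes, scale=1.0):
    """Multiply x (M x K floats) by w_bytes (K x N fp8 codes), decoding on the fly.

    `scale` is a hypothetical per-tensor dequantization factor.
    """
    k = len(w_bytes)
    n = len(w_bytes[0])
    # Decode the fp8 codes to wide floats once, then do an ordinary matmul.
    w = [[decode_fp8_e4m3(w_bytes[i][j]) * scale for j in range(n)] for i in range(k)]
    return [[sum(row[i] * w[i][j] for i in range(k)) for j in range(n)] for row in x]


if __name__ == "__main__":
    x = [[1.0, 2.0]]
    w_bytes = [[0x38], [0xC0]]  # codes for 1.0 and -2.0 in the assumed layout
    print(matmul_fp8_weights(x, w_bytes))
```

A real GPU kernel would do the decode in registers and feed fp16 or bf16 matrix instructions instead of looping in Python, so this only shows the arithmetic, not the performance trick.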

https://redd.it/1tn0noo
@rStableDiffusion