Here's a trick you can perform with Depth map + FFLF
https://www.youtube.com/watch?v=1QvTmkXF-HY
https://redd.it/1seqv34
@rStableDiffusion
Gourmet Pyramids
Showcased here is an AI visual technique I devised that enables the creation of food arranged in any desired shape. I think it can be utilised for TV commercials.
All of my AI demos:
https://www.youtube.com/playlist?list=PLe3OBqR7FeRhZM6SNoIWibQ1PA2JREYtL
Open-sourcing my 10M model for video interpolation, with Comfy nodes (FrameFusion)
Hello everyone, today I’m releasing on GitHub the model that I use in my commercial application, FrameFusion Motion Interpolation.
# A bit about me
(You can skip this part if you want.)
Before talking about the model, I just wanted to write a little about myself and this project.
I started learning Python and PyTorch about six years ago, when I developed Rife-App together with Wenbo Bao, who also created the DAIN model for video frame interpolation.
Even though this is not my main occupation, it is something I had a lot of pleasure developing, and it brought me some extra income during some difficult periods of my life.
Since then, I never really stopped developing and learning about ML. Eventually, I started creating and training my own algorithms. Right now, this model is used in my commercial application, and I think it has reached a good enough point for me to release it as open source. I still intend to keep working on improving the model, since this is something I genuinely enjoy doing.
# About the model and my goals in creating it
My focus with this model has always been to make it run at an acceptable speed on low-end hardware. After hundreds of versions, I think it has reached a reasonable balance between quality and speed, with the final model having a little under 10M parameters and a file size of about 37MB in fp32.
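As a quick sanity check on those two numbers (assuming the usual 4 bytes per fp32 parameter, with no extra metadata):

```python
# fp32 stores each parameter in 4 bytes, so file size and parameter count
# are tied together: size_bytes ≈ params * 4.
file_size_mb = 37
params = file_size_mb * 1024 * 1024 // 4
print(f"{params / 1e6:.1f}M parameters")  # ≈ 9.7M, i.e. "a little under 10M"
```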
The downside of making a model this small and fast is that sometimes the interpolations are not the best in the world. I made this video with examples so people can get an idea of what to expect from the model. It was trained on both live action and anime, so it works decently for both.
I’m just a solo developer, and the model was fully trained using Kaggle, so I do not have much to share in terms of papers. But if anyone has questions about the architecture, I can try to answer. The source code is very simple, though, so probably any LLM can read it and explain it better than I can.
# Video example:
https://reddit.com/link/1sezpz7/video/qltsdwpzgstg1/player
It seems that Reddit is having some trouble showing the video; the same video can be seen on YouTube:
https://youtu.be/qavwjDj7ei8
# A bit about the architecture
Honestly, the main idea behind the architecture is basically “throw a bunch of things at the wall and see what sticks”, but the main point is that the model outputs motion flows, which are then used to warp the original images.
This limits the result a little, since it does not use RGB information directly, but at the same time it can reduce artifacts, besides being lighter to run.
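To illustrate what "outputs motion flows, which are then used to warp the original images" means, here is a minimal pure-Python sketch of backward warping with bilinear sampling. This is illustrative only: the actual model predicts the flow field itself and operates on GPU tensors, not nested lists.

```python
def bilinear_sample(img, x, y):
    """Sample a 2D image (list of rows) at a fractional (x, y), clamping at edges."""
    h, w = len(img), len(img[0])
    x0 = max(0, min(w - 1, int(x)))
    y0 = max(0, min(h - 1, int(y)))
    x1 = min(w - 1, x0 + 1)
    y1 = min(h - 1, y0 + 1)
    fx = min(max(x - x0, 0.0), 1.0)  # fractional part drives the blend weights
    fy = min(max(y - y0, 0.0), 1.0)
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bot * fy

def backward_warp(img, flow):
    """Warp img by a per-pixel flow field: out[y][x] samples img at (x+dx, y+dy)."""
    h, w = len(img), len(img[0])
    return [[bilinear_sample(img, x + flow[y][x][0], y + flow[y][x][1])
             for x in range(w)] for y in range(h)]

# A flow of (1, 0) everywhere shifts the content one pixel to the left.
row = [[0.0, 1.0, 2.0, 3.0]]
shift = [[(1.0, 0.0) for _ in range(4)]]
print(backward_warp(row, shift))  # [[1.0, 2.0, 3.0, 3.0]]
```

Because the output is assembled purely from samples of the input frames, the model can never hallucinate colours that were not there, which is exactly the artifact-reduction trade-off described above.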
# Comfy
I do not use ComfyUI that much. I used it a few times to test one thing or another, but with the help of coding agents I tried to put together two nodes to use the model inside it.
Inside the GitHub repo, you can find the folder ComfyUI_FrameFusion with the custom nodes and also the safetensors file, since the model is only 32MB and I was able to upload it directly to GitHub.
You can also find the file "FrameFusion Simple Workflow.json" with a very simple workflow using the nodes inside Comfy.
I feel like I may still need to update these nodes a bit, but I’ll wait for some feedback from people who use Comfy more than I do.
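For readers curious what such a node looks like, here is a hypothetical minimal skeleton in the usual ComfyUI custom-node shape. The class name and the averaging stand-in are my own inventions for illustration; the real nodes in ComfyUI_FrameFusion load and run the actual model.

```python
class FrameFusionInterpolateSketch:
    """Hypothetical minimal ComfyUI node: blends two IMAGE inputs.

    A real interpolation node would load the FrameFusion model, predict
    motion flows, and warp the input frames; a simple midpoint average
    stands in here so the node structure stays visible.
    """

    @classmethod
    def INPUT_TYPES(cls):
        # ComfyUI builds the node's input sockets from this dict.
        return {"required": {"frame_a": ("IMAGE",), "frame_b": ("IMAGE",)}}

    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "interpolate"   # name of the method ComfyUI invokes
    CATEGORY = "FrameFusion"

    def interpolate(self, frame_a, frame_b):
        # Stand-in for the model: midpoint of the two frames.
        return ((frame_a + frame_b) / 2,)

# ComfyUI discovers nodes exported via this mapping in the package __init__.py.
NODE_CLASS_MAPPINGS = {"FrameFusionInterpolateSketch": FrameFusionInterpolateSketch}
```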
# Shameless self-promotion
If you like the model and want an easier way to use it on Windows, take a look at my commercial app on Steam. It uses exactly the same model that I'm releasing on GitHub; it just has more tools and options for working with videos, runs 100% offline, and is still in development, so it may still have some issues that I'm fixing little by little. (There is a link to it on the GitHub page.)
I hope the model is useful for some people here. I can try to answer any questions you may have. I’m also using an LLM to help format this post a little, so I hope it does not end up looking like slop or anything.
# And finally, the link:
GitHub:
Anima preview3 was released
For those who have been following Anima, a new preview version was released around two hours ago.
Huggingface: https://huggingface.co/circlestone-labs/Anima
Civitai: https://civitai.com/models/2458426/anima-official?modelVersionId=2836417
The model is still in training. It is made by circlestone-labs.
The changes in preview3 (mentioned by the creator in the links above):
Highres training is in progress. Trained for much longer at 1024 resolution than preview2.
Expanded dataset to help learn less common artists (roughly 50-100 posts each).
https://redd.it/1sf6w2x
@rStableDiffusion
Last week in Generative Image & Video
I curate a weekly multimodal AI roundup, here are the open-source image & video highlights from the last week:
GEMS - Closed-loop system for spatial logic and text rendering in image generation. Outperforms Nano Banana 2 on GenEval2. [GitHub](https://github.com/lcqysl/GEMS) | [Paper](https://arxiv.org/abs/2603.28088)
https://preview.redd.it/16r9ffhd9wtg1.png?width=1456&format=png&auto=webp&s=325ef8a75d23cfa625ac33dfd4d9727c690c11b0
ComfyUI Post-Processing Suite - Photorealism suite by thezveroboy. Simulates sensor noise, analog artifacts, and camera metadata with base64 EXIF transfer and calibrated DNG writing. GitHub
https://preview.redd.it/mhs0fi5f9wtg1.png?width=990&format=png&auto=webp&s=716128b81d8dd091615d3ede8f0acbcb3d1327a6
CutClaw - Open multi-agent video editing framework. Autonomously cuts hours of footage into narrative shorts. [Paper](https://arxiv.org/abs/2603.29664) | [GitHub](https://github.com/GVCLab/CutClaw) | [Hugging Face](https://huggingface.co/papers/2603.29664)
https://reddit.com/link/1sfj9dt/video/uw4oz84j9wtg1/player
Netflix VOID - Video object deletion with physics simulation. Built on CogVideoX-5B and SAM 2. Project | Hugging Face Space
https://reddit.com/link/1sfj9dt/video/1vzz6zck9wtg1/player
Flux FaceIR - Flux-2-klein LoRA for blind or reference-guided face restoration. [GitHub](https://github.com/cosmicrealm/ComfyUI-Flux-FaceIR)
https://preview.redd.it/05o2181m9wtg1.png?width=1456&format=png&auto=webp&s=691420332c1e42d9511c7d1cbecf305a5d885d67
Flux-restoration - Unified face restoration LoRA on FLUX.2-klein-base-4B. GitHub
https://preview.redd.it/l69v7cfn9wtg1.png?width=1456&format=png&auto=webp&s=1711dc1321b997d4247e5db0ac8e13ec4e56180b
LTX2.3 Cameraman LoRA - Transfers camera motion from reference videos to new scenes. No trigger words. [Hugging Face](https://huggingface.co/Cseti/LTX2.3-22B_IC-LoRA-Cameraman_v1)
https://reddit.com/link/1sfj9dt/video/v8jl2nlq9wtg1/player
Honorable Mentions:
Gen-Searcher - Agentic search image generation across styles. Hugging Face | GitHub
https://preview.redd.it/suqsu3et9wtg1.png?width=1268&format=png&auto=webp&s=8008783b5d3e298703a8673b6a15c54f4d2155bd
OmniVoice - 600+ language TTS with voice cloning. [Hugging Face](https://huggingface.co/k2-fsa/OmniVoice) | [ComfyUI](https://github.com/Saganaki22/ComfyUI-OmniVoice-TTS)
https://reddit.com/link/1sfj9dt/video/im1ywh7gcwtg1/player
DreamLite - On-device 1024x1024 image gen and editing in under a second on a smartphone. (I couldn't find the models on HF.) GitHub
Check out the full roundup for more demos, papers, and resources.
https://redd.it/1sfj9dt
@rStableDiffusion
A new SOTA local video model (HappyHorse 1.0) will be released on April 10th.
https://redd.it/1sfo3dq
@rStableDiffusion
Built a tool for anyone drowning in huge image folders: HybridScorer
https://redd.it/1sg5paj
@rStableDiffusion
Anima Preview 3 is out and it's better than Illustrious or Pony.
This has the biggest potential to be the "best diffuser ever" among anime-style diffusion models. Just take a look at it on Civitai and try it, and you will never want to use Illustrious or Pony again.
https://redd.it/1sgfjbs
@rStableDiffusion
Vibe Code Your First ComfyUI Custom Node Step by Step (Ep12)
https://www.youtube.com/watch?v=oiiCkrX8hq4
https://redd.it/1sfvnnz
@rStableDiffusion
ACE-Step 1.5 XL Turbo — BF16 version (converted from FP32)
I converted the ACE-Step 1.5 XL Turbo model from FP32 to BF16.
The original weights were ~18.8 GB in FP32; this BF16 version is ~9.97 GB, with the same quality and lower VRAM usage.
🤗 https://huggingface.co/marcorez8/acestep-v15-xl-turbo-bf16
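BF16 keeps fp32's sign bit and all 8 exponent bits and truncates the mantissa from 23 bits to 7, so each parameter shrinks from 4 bytes to 2, roughly halving the file. A stdlib-only sketch of that conversion at the bit level (a real weight conversion would go through torch/safetensors, not per-value Python):

```python
import struct

def f32_to_bf16_bits(x: float) -> int:
    """Reduce an fp32 value to its 16-bit bf16 pattern (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Bias by 0x7FFF plus the kept LSB so ties round to even, then drop 16 bits.
    rounding = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding) >> 16) & 0xFFFF

def bf16_bits_to_f32(b: int) -> float:
    """Expand a bf16 bit pattern back to fp32 (exact: bf16 is a prefix of fp32)."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

# Values with short mantissas round-trip exactly; others lose low mantissa bits.
print(bf16_bits_to_f32(f32_to_bf16_bits(1.0)))      # 1.0 (exact)
print(bf16_bits_to_f32(f32_to_bf16_bits(3.14159)))  # ~3.14, small rounding error
```

Because the exponent range is unchanged, the conversion cannot overflow the way fp16 can, which is why bf16 is the usual choice for shrinking fp32 checkpoints.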
https://redd.it/1sgiqg7
@rStableDiffusion
Qwen 2512 is so underrated; its prompt understanding is really great, and only Flux 2 Dev is better. I'm using Q4KS with 4-6 steps and it is fast (20-30 sec per gen), almost as fast as the Anima model. It just needs that LoRA love from the community.
https://redd.it/1sgnfv0
@rStableDiffusion