r/StableDiffusion

Phosphene 3.0 — open source AI video + image suite for Apple Silicon. Train your own LTX characters.

Sharing Phosphene 3.0. It's a free panel that runs LTX-Video 2.3 and a couple of image models natively on Apple Silicon. Local, MIT license, no subs, no cloud.

The thing that sets it apart from "yet another LTX wrapper": you can **train your own characters** inside the panel. Drop 30 to 80 photos, click Train, get a face LoRA back. Add a voice clip and you get a voice LoRA too. Auto-captions with Gemma 3 12B locally. ~3 hours per character on an M4 Max 64 GB.

**What 3.0 ships**

- Text → video+audio (LTX-2 generates joint audio+video in one pass)
- Image → video+audio
- Audio → video (drive a clip with an audio reference)
- FFLF (first frame + last frame interpolation)
- Extend (continue an existing clip)
- Character training (face + optional voice LoRA, from a single dataset)
- Image Studio with three engines: Qwen-Image-Edit-2511, HiDream-O1, and the FLUX.1 family. Multi-reference composition up to 3 subjects.

**HiDream-O1 ported to MLX**

HiDream released their O1 image model on May 14. Got it running natively on Apple Silicon five days later. Photoreal portraits, instruction edits, multi-subject. ~67 seconds per 1024² on a 64 GB Mac.

**Hardware**

Apple Silicon only. Capability tiers auto-detected:

- 16 / 24 GB: 512 px video, text-to-image works
- 32 GB: 768 px
- 64 GB+: 1024×576 video, full HD image, character training
- A 7-second character clip with synced audio renders in ~6 min on M4 Max 64 GB
- Character training takes ~3 hours per character
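The tier list above can be read as a simple lookup on installed memory. The sketch below is not Phosphene's actual detection code; the `Tier` class, `pick_tier`, and `detect_ram_gb` are hypothetical names, and treating each figure as a "this much or more" threshold is an assumption. Only the memory sizes and resolutions come from the post.

```python
import subprocess
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    min_ram_gb: int           # smallest memory size that qualifies (assumed threshold)
    video: str                # video resolution quoted in the post
    character_training: bool  # the post lists training only at 64 GB+


# Ordered from largest to smallest so the first match wins.
TIERS = [
    Tier(64, "1024x576", True),
    Tier(32, "768 px", False),
    Tier(16, "512 px", False),
]


def pick_tier(ram_gb: float) -> Optional[Tier]:
    """Return the highest tier whose threshold fits, or None below 16 GB."""
    for tier in TIERS:
        if ram_gb >= tier.min_ram_gb:
            return tier
    return None


def detect_ram_gb() -> float:
    """Read installed memory on macOS; `sysctl -n hw.memsize` prints bytes."""
    out = subprocess.run(
        ["sysctl", "-n", "hw.memsize"],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip()) / 1024**3
```

On a Mac, `pick_tier(detect_ram_gb())` would give the matching row; a 24 GB machine lands in the 512 px tier, the same as 16 GB.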

**Install**

One-click via Pinokio (search Phosphene). Or clone the repo and run the panel directly.

**Credits**

LTX Video 2.3 by Lightricks (their license on the weights). MLX port by `dgrauet/ltx-2-mlx`. HiDream by HiDream AI. Phosphene the panel is MIT.

**Honest limits**

- Apple Silicon only. No Intel Mac, no Windows, no Linux.
- Dialogue audio is hit-or-miss. Ambient/diegetic sound is where LTX-2 shines.
- Character LoRAs are video-only (face + voice). Image LoRAs work in the Studio via Qwen/HiDream + a separate LoRA stack.
- First run downloads ~28 GB of weights. Takes a while.

Repo: github.com/mrbizarro/phosphene

X: x.com/PhospheneAI

Dev: https://x.com/AIBizarrothe

Feedback welcome. Especially curious what people make with the character training side.

https://redd.it/1tkh9c2
@rStableDiffusion


r/StableDiffusion

Tencent released Z-Image 6B with pixel space gen. No VAE & 1k Resolution.
https://redd.it/1tkipk6
@rStableDiffusion


r/StableDiffusion

Creating character turnaround sheets with Flux 2 Klein in ComfyUI

I made a small ComfyUI workflow for creating multi angle reference sheets from a single input image.

The main use case is character sheets. You give it one character image, and the workflow tries to generate multiple consistent views like front three quarter, side profile, rear view, rear three quarter, high angle, low angle, and a close detail view. The goal is to keep the same face, outfit, pose, expression, proportions, and general design while only changing the camera angle.

I built it mostly with native ComfyUI nodes. The only non native part, as far as I remember, is the GGUF loader. The prompts are written in a generic way, so it can also work for people, props, vehicles, creatures, or objects, but I mainly made it for character sheet generation.
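The per-angle prompting described above can be sketched roughly as follows. This is not the prompt text from the shared workflow (that lives in the Pastebin JSON); the wording, `VIEWS`, `KEEP`, and `build_prompts` are illustrative assumptions. Only the list of views and the list of things to preserve come from the post.

```python
# Views and preserved attributes as listed in the post.
VIEWS = [
    "front three-quarter view",
    "side profile",
    "rear view",
    "rear three-quarter view",
    "high angle",
    "low angle",
    "close detail view",
]
KEEP = ["face", "outfit", "pose", "expression", "proportions", "general design"]


def build_prompts(subject: str = "the subject") -> dict:
    """Build one generic prompt per camera angle (hypothetical wording)."""
    keep = ", ".join(KEEP)
    return {
        view: (
            f"Camera: {view}. Show {subject} from the reference image. "
            f"Keep the same {keep}. Change only the camera angle."
        )
        for view in VIEWS
    }


if __name__ == "__main__":
    for view, prompt in build_prompts("the character").items():
        print(f"{view}: {prompt}")
```

Keeping `subject` as a parameter mirrors the generic phrasing the post mentions, so the same template could be pointed at a prop, vehicle, or creature.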

I tested it with the Flux 2 Klein 4B Q4 GGUF model because I currently have access to only 4 GB VRAM. For such a small setup, it is giving acceptable results. It is not perfect, especially with difficult rear views or fine clothing continuity, but it is usable for blocking out reference angles and building rough character sheets.

I expect the 9B variant to give much better consistency and detail, especially for faces, costume continuity, proportions, and rear view inference.

This is not meant to be a final polished character turnaround solution. It is more of a practical workflow for quickly getting usable angle references from one image, especially when working with AI video, inpainting, first frame last frame generation, or character continuity.

Sharing it in case it is useful to anyone experimenting with Flux 2 Klein on low VRAM setups.

https://pastebin.com/EyRM0zed

https://preview.redd.it/y8v7v06d4o2h1.png?width=5824&format=png&auto=webp&s=3d7acb275bf8652b68501e9efb33af7d324e75ca

https://redd.it/1tkf9uc
@rStableDiffusion


r/StableDiffusion

FLUX klein: "We may monitor use"... wait what?

>Safety. Black Forest Labs takes model safety seriously. We may monitor use to detect misuse or abuse of our models and services.

https://huggingface.co/black-forest-labs/FLUX.2-klein-base-9B

How would they monitor your usage if you run it locally? Unless they spy and send data back to their servers?

https://redd.it/1tkh0m1
@rStableDiffusion


r/StableDiffusion

Nvidia released "Anyflow", based on Wan. It is basically a dynamic timestep adjuster that adapts to your compute budget.
https://redd.it/1tkmxol
@rStableDiffusion


r/StableDiffusion

An Update on Nodes 2.0 from Comfy Org

Hi r/StableDiffusion, Nodes 2.0 has been in beta since last July, and we want to be transparent with the community about where we’re headed.

**Over time, we plan to gradually make the new interface the default experience in ComfyUI.**

We know the reception has been mixed. There are many things we handled ineffectively early on, and the team has been working hard over the past months to address them. We appreciate everyone who has continued testing, giving feedback, and pushing us on where the experience falls short.

# The Problem With Canvas

Canvas rendering worked, but it cut us off from everything the modern web has built over the last two decades: component libraries, design systems, accessibility tooling, the entire ecosystem developers rely on to ship fast. Every widget had to be drawn pixel by pixel.

Generative AI doesn't sit still. New models, new modalities, new techniques, new ways of combining them. The workflows that made sense six months ago get rethought constantly. Our users are doing professional creative work, and they expect the controls that professional tools have had for years: curve editors, color grading, histograms, timeline scrubbing. We can't keep rebuilding those from scratch.

# What a Modern Frontend Unlocks

With a modern frontend framework, a curve editor that would have taken weeks now takes days. A gradient slider with live preview, hours.

Since the Nodes 2.0 beta launched, we’ve already shipped:

* Curve editors
* Histogram displays
* Live cropping UI
* Before/after comparison sliders
* Image processing nodes for color correction, film grain, chromatic aberration, sharpening, and levels
* Realtime shader nodes with subgraph blueprints
* Inline error displays and status badges directly on nodes

This foundation also unlocks things that were previously impractical or impossible:

* Live execution previews on subgraphs
* Parallel node execution with realtime feedback
* Richer interfaces for future modalities and workflows

# Custom Nodes

Most custom nodes work unchanged. For nodes that require updates, we’re investing heavily in migration support:

* A new public frontend API
* Documentation and migration guides
* Reference implementations
* Direct collaboration with node authors to identify gaps

We understand this creates additional work for maintainers. For many popular custom nodes, we’re happy to directly help submit PRs and assist with migration work ourselves.

Recent advances in coding agents have also made these frontend migrations significantly easier than they would have been even a year ago.

Thank you for your patience as we work through this transition together.

# Timeline

There is no fixed cutoff timeline yet. Right now, the priority is being transparent early and giving the ecosystem time to adapt.

Current plan:

* Nodes 2.0 remains opt-in for now (`Settings > Rendering > Nodes 2.0`)
* It later becomes the default while legacy mode remains available
* Eventually, legacy mode will become unmaintained and will likely break over time

Going forward, **new frontend-focused ComfyUI features will ship exclusively on Nodes 2.0.**

# Feedback

Please let us know what you think and the problems you run into. We need testing on complex workflows, large graphs, and custom nodes with unusual rendering. Report issues on [GitHub](https://github.com/Comfy-Org/ComfyUI_frontend/issues) or #bug-reports on Discord 🙏

Once again, thank you all for supporting Comfy.

And most importantly, thank you to all the custom node authors who continue making this ecosystem incredibly vibrant, creative, and powerful.

https://redd.it/1tkqrwy
@rStableDiffusion


r/StableDiffusion

Why isn't there a video model specifically made for anime?

Most current video models are completely focused on realism. The few that try to handle anime usually end up producing results that look like a weird mix of 3D and realism instead of something that actually feels 2D.

Wouldn't it actually be easier to create a smaller model similar to Anima, but trained exclusively on anime datasets? In theory, excluding realism and other styles should reduce compute requirements and simplify training quite a bit.

Personally, I'm already tired of almost every video model chasing the exact same goal: cinematic realism. There are dozens of models doing that already; some better, some worse, but in the end they all feel pretty similar.

Meanwhile, there’s barely anything that truly understands 2D anime physics, exaggerated expressions, or the way traditional animation moves. Or at least I don't know of any open-source model that comes close.

Back then, Sora was probably the best AI model for anime-style video because it understood 2D expressions and physics surprisingly well. Right now, Seedance seems to be the closest thing to that, with Grok somewhere behind it, but on the open-source side I still don't see anything remotely similar.

Maybe instead of trying to build one massive all-in-one model that does every style imaginable, it would make more sense to have smaller specialized models focused on specific styles.

I don't know, maybe I'm completely wrong and anime-style video generation is actually harder or more computationally expensive than realism. It's just something I've been wondering about for a while.

https://redd.it/1tksg1c
@rStableDiffusion


r/StableDiffusion

Sulphur released as a LoRA for LTX 2.3

https://huggingface.co/SulphurAI/Sulphur-2-base/blob/main/experimental/sulphur_experimental_lora_v1.safetensors

https://redd.it/1tkus7j
@rStableDiffusion


r/StableDiffusion


3D animation made with LTX 2.3 i2v and a reference voice, using the TalkVid LoRA.

https://redd.it/1tksa6t
@rStableDiffusion


r/StableDiffusion

AI image generator vs drawing by hand, an artist's honest take.

the people who frame this as one replacing the other are missing something. they are different activities that scratch different parts of my brain. generation is fast and expansive. drawing is slow and specific. both are useful. neither is the same as the other.

four years of drawing. started traditional, moved to digital, still do both.
picked up AI image generation about a year ago mostly out of curiosity. expected to use it a few times and move on.

that is not what happened.

what i did not expect was how much using AI generation made me better at drawing. having the ability to instantly visualize a composition or a lighting setup or a color palette before committing hours to it changed how i approach my own work. i use it to explore. i use it to get unstuck. i use it to see things i could not have imagined as clearly on my own. and then i draw the thing myself anyway because that is still the part i actually want to do. if you draw and have been avoiding AI generation because it feels like a threat, i get it. i felt that way too at first. it just turned out not to be true for me.

https://redd.it/1tkv6er
@rStableDiffusion

