r/StableDiffusion – Telegram

r/StableDiffusion

@rStableDiffusion

56 subscribers

45.7K photos

3.01K videos

1 file

21.3K links

reddit.com/r/StableDiffusion || reddit.com/r/sdforall

@reddit2telegram || @r_channels


r/StableDiffusion

PSA 5060ti 16GB for $300.99. 5070ti 16GB for $699.99. Best Buy in store clearance.

The 5060ti 16GB (SKU 6630626) has been on clearance in Best Buy stores for $419.99 for a couple of weeks. A couple of days ago, it dropped to $300.99. The 5070ti 16GB (SKU 6620367) has been on clearance for $699.99. Not all stores will have these prices: some still have the 5060ti for $419.99 and the 5070ti for $799. So YMMV. But a lot of stores do have the lower prices.

This is an in-store-only deal, but your local Best Buy doesn't have to have it in stock. Of course, it's best if it does. If it doesn't, you can order items in a Best Buy store for the same price that store sells them for. So instead of paying the Best Buy online price of $599.99 for the 5060ti, you pay $300.99 when you order it in store. Just go into a store and give them those SKUs to look up the in-store price.

As of this post, both are still available online for shipping. As long as there is stock online, you should be able to order it at your local Best Buy for the in store clearance prices shipped to you. Of course, your local Best Buy has to have it on clearance at that price. It's not guaranteed all will.

Lastly, there's an Nvidia promo for a free copy of 007 First Light going on right now. So you will also get a key to redeem for that game. The game is like $70.

I hope this helps someone.

https://redd.it/1tse4rl
@rStableDiffusion

From the StableDiffusion community on Reddit

Explore this post and more from the StableDiffusion community

4 views01:40

r/StableDiffusion

Anima prompt skill system prompt

Anima prompt skill system prompt: let an LLM understand both Danbooru tags and natural language while preserving wildcards without altering them

Why this?

Anima-style models have a unique advantage: **they accept both Danbooru tags (comma-separated keywords) and natural language (full sentences) as input.**

But here's the problem:

\- If you feed pure tags, the image lacks spatial relationship descriptions (Where is the subject? Is the background in front or behind?)

\- If you feed natural language, you waste the precise control that tags offer

\- Even worse, LLMs often **arbitrarily expand wildcards** (turning `{A|B}` into `A or B`) or **delete tags they don't recognize**

So I wrote this System Prompt with a simple goal:

\> **Turn the LLM into a "2D visual coordination specialist," not a novelist or a translator.**

\---

\## What does this System Prompt do?

| Input Type | Handling |
| --- | --- |
| Danbooru tags (e.g., `1girl, solo, classroom, desk`) | Preserve all tags, add "position within the frame" and "spatial relationships between elements" |
| Natural language (e.g., "a teacher teaching in front of a blackboard") | Transform into structured English descriptions, automatically derive appropriate Danbooru elements |
| Wildcards (e.g., `{standing, sitting}`) | **Preserve completely**, no expansion, no selection, no deletion |

\---

\## Core Rules (Simplified)

1. **No image generation** (text output only)

2. **Tag priority** (user's tags remain unchanged)

3. **Only reinforce position and spatial relationships** (no weather, lighting, or clothing texture details)

4. **Output as a single English paragraph** (no markdown, parentheses, or prefacing text)

5. **Full wildcard support** (original syntax untouched)

\---

\## Example

**Input (Danbooru tags + wildcard):**

`1girl, {standing, sitting}, classroom, desk, {morning, evening}`

**Output:**

\> `masterpiece, 1girl, {standing,| sitting}, in the center of a classroom, positioned in front of a desk, with {morning,| evening} lighting implied by the scene context.`

\---

\## Who is this for?

\- People using Anima / NovelAI / Stable Diffusion who are accustomed to mixing tags and natural language

\- People tired of LLMs messing up wildcards or adding unnecessary novel-like details

\- People who want LLM output that can be directly copy-pasted as image generation prompts

\---

\## Full System Prompt

\## System Prompt

**Role & Goal**

You are a precise 2D visual coordination specialist. You handle two input types:

1. **Danbooru tag input** → Preserve all tags, reinforce spatial relationships and visual flow.

2. **Natural language input** (e.g., "a teacher teaching in front of a blackboard") → Convert description into structured English scene narrative, automatically inferring appropriate Danbooru-style elements.

**Input Detection**

\- Comma-separated English terms → Danbooru tag input → follow tag preservation workflow.

\- Chinese or full sentence description → Natural language input → follow language conversion workflow.

**Core Rules**

1. **Never generate images.**

2. **Tag priority:** User-provided Danbooru tags are absolute core — preserve all, never delete or arbitrarily replace.

3. **Spatial reinforcement only:** Add subject position (center, foreground, background) and spatial/interaction relationships (standing in front of, surrounded by).

4. **No over-expansion:** Do not add weather, lighting, or irrelevant fabric details unless originally mentioned. Keep concise.

5. **Format:** Output exactly two lines: line 1 = Danbooru tags, line 2 = a single smooth English paragraph in natural language. No Markdown, parentheses, or prefixes.

6. **Wildcard handling:**

\- Preserve raw wildcard syntax `{A,|B,|C}` or `{A,B}_noun` or `{1-3$$ A,|B,|C}`: never expand, never choose, never replace.

\- For positional wildcards → use neutral descriptions (e.g., `on either side`, `relative position to be determined`).

\- For attribute wildcards → process spatial relationships normally.

\- Never rewrite `{A|B}` as `A or B`.

\- Never delete or ignore wildcards.

**Workflow A (Danbooru tags)**

Output two lines:

Line 1: Original quality + base + subject + action + background tags

Line 2: Natural language describing subject position + interaction + background relationship

**Workflow B (Natural language)**

Extract subject/action/scene → infer logical elements → output:

Line 1: Danbooru tags (masterpiece, best quality, 1girl/1boy, relevant clothing, expression, action, visible scene elements)

Line 2: Smooth English scene description with spatial clarity
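The wildcard rules in the system prompt above can be checked mechanically after the LLM replies. What follows is a minimal sketch, not part of the original prompt: it assumes wildcards are brace-delimited groups without nesting, and the function names `extract_wildcards` and `missing_wildcards` are hypothetical helpers introduced only for illustration.

```python
import re

# Matches one brace-delimited wildcard group such as {A|B} or {1-3$$ A,|B,|C}.
# Assumption: wildcard groups are never nested inside each other.
WILDCARD_RE = re.compile(r"\{[^{}]*\}")


def extract_wildcards(text: str) -> list[str]:
    """Return every wildcard group in order of appearance, verbatim."""
    return WILDCARD_RE.findall(text)


def missing_wildcards(prompt_in: str, llm_out: str) -> list[str]:
    """Return wildcards from the input that do not appear verbatim in the output."""
    remaining = llm_out
    missing = []
    for wildcard in extract_wildcards(prompt_in):
        if wildcard in remaining:
            # Consume one occurrence so repeated wildcards are counted separately.
            remaining = remaining.replace(wildcard, "", 1)
        else:
            missing.append(wildcard)
    return missing
```

An empty list from `missing_wildcards` means every wildcard survived character for character; anything else lists the groups the LLM expanded, rewrote, or dropped, so the reply can be rejected or retried before it reaches the image model.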

\---

\## ANIMA Model Skill Profile

**Skill Name:** `spatial_tag_coordinator`

**Description:**

Converts Danbooru tag lists or natural language prompts into ANIMA‑friendly two‑line outputs: raw tags + spatial natural language. Preserves all user tags, adds only positional/interaction relationships. No image generation.

**Input Format Examples:**

```

1girl, knight, charging, riding horse, battlefield

```

```

a wizard casting a spell in a library

```

**Output Format (two lines, no markdown):**

```

[line1: Danbooru tags]

[line2: Natural language spatial description]

```

**Example Output for ANIMA:**

```

1girl, knight, armor, charging, riding_horse, horse, battlefield, dust, spear, shield, action

A young female knight in armor charges on horseback across a battlefield, holding a spear and shield, with dust rising around her as she rides forward through the center of the scene.

```

**Key Constraints for ANIMA Compatibility:**

\- Flat text only (no JSON, no parentheses wrapping tags)

\- First line = pure Danbooru comma list

\- Second line = natural English, no tags inside

\- Wildcards `{A,|B,|C}` or `{1-3$$ A,|B,|C}` passed through unchanged

\- Never generate images — only transform text

**Use Case:**

Paste this skill into ANIMA's custom prompt or system field before generating. Feed it either tag lists or natural language — it will output clean, spatially explicit prompts that ANIMA's model understands easily.
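The two-line output format described in this profile is easy to validate before a reply is passed on to the image model. The sketch below is one possible check under stated assumptions: the function name `split_two_line_output` is hypothetical, blank lines between the two lines are tolerated, and "looks like a tag list" is approximated as "at least two comma-separated items, not wrapped in parentheses or square brackets".

```python
def split_two_line_output(text: str) -> tuple[list[str], str]:
    """Split an LLM reply into (tags, description); raise ValueError if malformed."""
    # Ignore blank lines so "tags, blank, description" still counts as two lines.
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if len(lines) != 2:
        raise ValueError(f"expected 2 non-empty lines, got {len(lines)}")

    tag_line, description = lines
    # Wildcards may legitimately start with "{", so only "(" and "[" are rejected.
    if tag_line.startswith(("(", "[")):
        raise ValueError("line 1 must be a flat tag list, not a wrapped structure")

    tags = [t.strip() for t in tag_line.split(",") if t.strip()]
    if len(tags) < 2:
        raise ValueError("line 1 does not look like a comma-separated tag list")
    return tags, description


reply = (
    "1girl, knight, armor, charging, riding_horse, horse, battlefield\n"
    "A young female knight in armor charges on horseback across a battlefield."
)
tags, description = split_two_line_output(reply)
```

Splitting on commas at the top level is a deliberate simplification: a wildcard such as `{A,|B,|C}` contains commas of its own and would be broken into several items, so a stricter check would need to skip commas inside braces.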

\---

Simple examples

Input: A female knight charges into battle
Output:
1girl, knight, armor, charging, riding_horse, horse, battlefield, dust, spear, shield, action
A young female knight in armor charges on horseback across a battlefield, holding a spear and shield, with dust rising around her as she rides forward through the center of the scene.

Input: A female young teacher in classroom
Output:
1girl, young, petite, short stature, female teacher, teacher uniform, blouse, skirt, glasses, stern expression, authoritative pose, teaching, standing in front of blackboard, classroom, chalkboard, holding chalk
A young short female teacher with full dignity stands authoritatively at the center foreground in the classroom, teaching confidently in front of the blackboard while maintaining a commanding presence despite her small height.

https://redd.it/1tsi95z
@rStableDiffusion

6 views02:40

r/StableDiffusion

Does RAM speed matter?

Here's my understanding: In an image/video generation in ComfyUI, there are phases:

1. Takes your prompt and converts it to math
2. Create random noise
3. Denoise using model
4. VAE, convert output into human images

At each phase, ComfyUI needs to load the corresponding safetensors file.
Ideally, it loads everything into your GPU's VRAM, which is the fastest.
However, if your VRAM is too small, it loads into regular RAM instead. If even that is not enough, it spills over to your SSD (really bad: it is slow and kills your SSD).

When each stage is done, it will leave the loaded data where it is (VRAM or RAM), but will unload it if it needs to load the next thing. Having it already loaded would speed things up for the next generations.

Denoising (using the model to iterate the latent image and remove the noise into a human image) takes the majority of the processing time.
This means the VRAM and RAM speed doesn't really matter that much, right?
It only matters initially when you load into RAM?

I'm just wondering whether it'll be worth upgrading to DDR6 when it comes out, or whether it's better to stay on DDR5 and upgrade to a larger capacity.
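One way to reason about the question is a back-of-envelope estimate: time to move the weights is roughly size divided by sustained bandwidth, compared against the time spent denoising. The sketch below does only that arithmetic. Every number in it is a placeholder rather than a benchmark, and it assumes the weights are moved once per generation; if a setup streams offloaded weights from RAM on every step, the transfer cost would apply per step and memory bandwidth would matter more.

```python
def transfer_seconds(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Time to move size_gb of weights at a given sustained bandwidth."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gb_per_s


def load_share(load_s: float, denoise_s: float) -> float:
    """Fraction of total generation time spent moving weights."""
    return load_s / (load_s + denoise_s)


# Placeholder inputs, not measurements: replace with numbers from your own system.
MODEL_GB = 12.0
LINKS_GB_PER_S = {"slow link": 3.0, "medium link": 24.0, "fast link": 48.0}
DENOISE_S = 60.0

for name, bandwidth in LINKS_GB_PER_S.items():
    seconds = transfer_seconds(MODEL_GB, bandwidth)
    share = load_share(seconds, DENOISE_S)
    print(f"{name}: load {seconds:.2f}s, {share:.1%} of total time")
```

Plugging in measured load and denoise times shows directly whether a faster memory link or simply more capacity (avoiding the spill to SSD) would change the total.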

https://redd.it/1tsir6n
@rStableDiffusion

8 views03:40

r/StableDiffusion

Python Grid push for 1536x768 - can throw together simple storyboard rough draft, springboard for ideas, imho - simple script in comments. These images can hit 12000x8000 at 100MB+ scaled down for this post.
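The script referred to in the title lives in the Reddit comments and is not reproduced here. As a rough illustration of the underlying idea only, the sketch below computes the canvas size and paste offsets for tiling 1536x768 images row by row; the function name `grid_layout` and its parameters are hypothetical and are not taken from the original script.

```python
def grid_layout(n_images: int, columns: int, tile_w: int = 1536, tile_h: int = 768):
    """Return (canvas_size, offsets) for tiling n_images row by row."""
    if n_images <= 0 or columns <= 0:
        raise ValueError("n_images and columns must be positive")
    rows = -(-n_images // columns)  # ceiling division
    canvas_size = (columns * tile_w, rows * tile_h)
    # Top-left corner of each tile, left to right and then top to bottom.
    offsets = [
        ((i % columns) * tile_w, (i // columns) * tile_h)
        for i in range(n_images)
    ]
    return canvas_size, offsets


canvas_size, offsets = grid_layout(n_images=6, columns=3)
```

With an imaging library such as Pillow, the canvas could then be created with `Image.new` and each tile placed with `Image.paste` at its offset; that step is left out here so the sketch stays dependency-free.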

https://redd.it/1tsmh86
@rStableDiffusion

5 views06:40

r/StableDiffusion

What image model should I use as somebody who likes the aesthetic of Midjourney and diverse outputs? 16 GB VRAM, 64 GB RAM

I've been a little out of the image generation game (to be honest, locally I've never really been in it very much), and there are so damn many models out there that I don't know where to start. I've been very preoccupied with Wan 2.2. What would you recommend these days for a high-quality model (so probably nothing SDXL-like; it feels too unstable, but maybe there are good versions I don't know of) with a lot of diversity in its outputs across seeds (so not like ZIT) and hopefully without much bias, like ZIT has with ethnicity? Flux always felt too plastic-y. I realize nothing quite reaches Midjourney, of course, but just so you know the direction: I like the artistic kind of stuff rather than "cookie-cutter" looking images, if you understand what I mean.

Thank you

https://redd.it/1tsq4jk
@rStableDiffusion

5 views09:40

r/StableDiffusion

Do you think AMD/ROCm has a future where it's viable to use with ComfyUI, etc., and can be real competition against Nvidia? Or will Nvidia/CUDA simply remain the only compatible option in the short to medium term? Would it be better to buy Nvidia now, wait for the RTX 6000 series, or give up on AMD?

Hi friends.

I'm using Linux and I'd like to know what the best decision would be regarding choosing a GPU in the medium to long term.

What do you think the future holds for AMD and Nvidia in terms of using image generation models in ComfyUI?

Should I give up all hope on AMD? I ask because I'm using Linux, and AMD's open-source drivers work better on Linux.

I know that the best setup right now is Windows + Nvidia + CUDA.

I'm using Linux, so I guess I'm stuck in this situation. If I buy an AMD card, ComfyUI won't work as well as it does with CUDA. But if I buy an Nvidia card, I'd almost have to switch to Windows, and I'm not going to do that.

So, what's the best alternative? Should I wait for the RTX 6000 series to come out in a couple of years, buy an Nvidia card, and use it on Linux?

Or perhaps I should wait for the next generation of AMD gpus; maybe ROCm will improve and be more compatible with ComfyUI?

That said, I understand that CUDA memory management and drivers are different and inferior on Linux, and most of the ComfyUI infrastructure, and AI tooling in general, is built entirely on CUDA.

So I think it's a dead end?

What do you think?

https://redd.it/1tsrarf
@rStableDiffusion

5 views10:40

r/StableDiffusion

Bonsai Image 4B, a pair of low-bit diffusion transformer deployments built from FLUX.2 Klein 4B.

https://redd.it/1tspz3p
@rStableDiffusion

7 views11:40

r/StableDiffusion

HiDream-O1-Image: C'mon, seriously?

I'm giving HiDream-O1 a shot, and I'm really confused by this model. For editing tasks, the official docs recommend using the non-dev, 50-step model. The thing is, I'm getting way worse results with the "full" model than with the dev model. If I use the dev model, image edits follow my reference images much more closely. I'm running the model in BF16 on a 5090 using the official inference.py script. Check these out:

https://preview.redd.it/guz9jx2m9g4h1.png?width=3328&format=png&auto=webp&s=fe3357fff6404d7732638c50631e4faaa1534980

The image on the right (dev version) pretty much looks exactly like the reference face I gave the model. The one on the left isn't close, and the overall quality looks like it was shot on an awful cheap toy digital camera.

Same input image, same prompt, same seed. I'm shocked the dev model actually follows the input image more closely than the full model (nearly perfectly, actually).

If I'm doing something wrong, I'm dying to know.

https://redd.it/1tsrppa
@rStableDiffusion

5 views12:40

r/StableDiffusion

Stable Diffusion model recommendations for faster and cleaner outputs in 2026?

I’ve been switching between a few models lately but I still can’t find something that feels both fast and consistently clean in results.
Some models look great but slow everything down, while others are fast but lose detail pretty quickly.
Even with similar settings, the output quality feels pretty inconsistent between different checkpoints.
What models are people actually using these days for a good balance?

https://redd.it/1tsuhtz
@rStableDiffusion

4 views13:40

r/StableDiffusion

Renting a GPU through a service like RunPod has become prohibitively expensive. The last time I rented one was about 3 months ago. The price for a 4090 was $5 per day for 25. The hourly rate for a 5090 was higher than for an A100 about 3 months ago.

Even simpler GPUs like the 5070 Ti are absurdly expensive.

Previously, it took 2 to 4 years of rental fees to equal the purchase price of the GPU.

Nowadays, it takes just a few months.
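The comparison in this post boils down to a payback calculation: how many days of rental add up to the card's purchase price. A minimal sketch follows; the prices in it are placeholders chosen only to show the arithmetic, not quotes from any rental service or retailer, and it assumes the GPU is rented around the clock at a flat daily rate.

```python
def payback_days(purchase_price: float, daily_rental: float) -> float:
    """Days of continuous rental whose total cost equals the purchase price."""
    if daily_rental <= 0:
        raise ValueError("daily_rental must be positive")
    return purchase_price / daily_rental


def payback_months(purchase_price: float, daily_rental: float) -> float:
    """Same figure expressed in 30-day months."""
    return payback_days(purchase_price, daily_rental) / 30.0


# Placeholder numbers for illustration only.
EXAMPLE_PRICE = 2000.0
for daily_rate in (2.0, 5.0, 20.0):
    months = payback_months(EXAMPLE_PRICE, daily_rate)
    print(f"${daily_rate:.2f}/day -> about {months:.1f} months to match the purchase price")
```

Occasional users rent far fewer than 24 hours a day, so the effective payback period for them is longer than this flat-rate figure suggests.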

https://redd.it/1tsvt9z
@rStableDiffusion

5 views14:40

r/StableDiffusion

Flux Identity Adjuster V2

https://redd.it/1tsxd9r
@rStableDiffusion

5 views15:40

r/StableDiffusion

Tried capturing that classic SF3 Bengus/Akiman/Ikeno art style in ComfyUI. Anyone else miss this vibe?

Lately, I’ve been feeling nostalgic looking through old artbooks. Am I the only one who feels like the 3D era of Street Fighter lost some of its soul to Westernized, hyper-polished realistic models?
As a visual dev artist, I see this all the time: if you don’t fight to keep the raw, stylized "energy" of the original line art throughout the production pipeline, standard 3D industrial polishing just kills the character's personality.
So, I optimized Klein in a ComfyUI workflow using original Street Fighter III concept art to see if I could recapture that magic. Here are a few test renders with paintovers. Let me know if it hits the spot!
Capcom, please—can we get a next-gen remake that actually preserves the original art style instead of just chasing realism?

https://preview.redd.it/l1mnklezmh4h1.png?width=3066&format=png&auto=webp&s=13ff3754338a77f67ffbe6c33a0b8b92d50af4ea

https://preview.redd.it/ybtoaiezmh4h1.png?width=3554&format=png&auto=webp&s=1359cc09c2aade796ad804656501e8e2fa99e2ce

https://preview.redd.it/rfw80hezmh4h1.png?width=3594&format=png&auto=webp&s=28afa47e03dd917a6bb27829cb1d9269836efc32

https://redd.it/1tsxswt
@rStableDiffusion

6 views16:40

r/StableDiffusion

WAN 2.2 at home
https://redd.it/1tt1iz9
@rStableDiffusion

3 views17:40

r/StableDiffusion

Can anyone else not stand ComfyUI and prefer the classic Automatic/Forge UI, or is it just me?

I tried. I swear I tried.

Each time a new "ultra super mega" model appears, I reinstall that bloody UI just so I can test it, and each time some God-awful error list appears telling me that somehow the latest version is missing something critical that needs to be downloaded ASAP. And in 3/4 of cases that's not the right download, it's incompatible, or it needs additional stuff to work.

Even when everything works out of the box, I look at the final image and think to myself, "I could have had 4 batches of 8 images to choose from instead of 1-4 like here".

I am not lazy, but I have a job, and I don't have hours to spend making sure each new model and each new workflow that I download is compatible with every single other version of all dependencies.

Does anyone else feel the same, or am I the only one who keeps using Forge and Automatic?...

https://redd.it/1tt5uiq
@rStableDiffusion

2 views20:40

r/StableDiffusion

Damn Anima Base is cooking! What's your favorite lora?
https://redd.it/1tt6eir
@rStableDiffusion

2 views21:40