Ideogram 4 isn't overhyped, it's underrated

Just to set some context before I dive in. I'm not someone who gets hyped over every new model that drops. Ernie, MS Lens, HiDream, even ZiT (sorry ZiT fans)... I thought most of them were overhyped. Z-Image is solid, but I personally stick to Flux and Qwen Image. So when I say Ideogram is the first model since Z-Image that genuinely caught my attention, that means something. And it did not disappoint.

I think this is the closest we've gotten to NB or GPT Image quality in an open model. In some cases, depending on how you prompt it, I'd argue it's even better. And keep in mind that this is the model with zero LoRAs, almost no custom nodes, and none of the months' worth of community optimizations older models have had. This is the floor, the worst it'll ever be, and it's already impressive.

On the safety filter

I haven't had a single image blocked. I'm using Kijai's JSON prompt builder workflow along with the safety filter bypass node, and it handles explicit content without issues. The only real limitation is genitals looking a bit rough, but that's an expected model constraint, not a filter problem. Hopefully that can be fixed through training.

On generation times

If your 3090 or 5070 is taking 15 minutes per image, something is wrong with your setup. For reference, I have a 4080 and 64GB of DDR4 RAM, and I'm running 2MP images at 20 steps in about 2 minutes. Drop to 1MP and 12 steps and you're at roughly 30 seconds. Quality takes a hit, but it's perfectly fine for quick scene testing.
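
If you want to compare your own numbers, it helps to reduce the timings above to seconds per step. Here's a rough sketch of that arithmetic. The function name is just mine, the totals are my approximate wall-clock figures, and the 20-step count for the 15-minute case is an assumption, so treat the results as ballpark only.

```python
def seconds_per_step(total_seconds: float, steps: int) -> float:
    """Average wall-clock seconds per sampling step (ignores load/decode overhead)."""
    if steps <= 0:
        raise ValueError("steps must be positive")
    return total_seconds / steps


# Approximate figures from the post: 2MP at 20 steps in ~2 min, 1MP at 12 steps in ~30 s.
full_quality = seconds_per_step(2 * 60, 20)
quick_test = seconds_per_step(30, 12)

# The 15-minute case, ASSUMING the same 20 steps (the step count is a guess).
slow_setup = seconds_per_step(15 * 60, 20)

print(f"2MP / 20 steps: {full_quality:.1f} s/step")
print(f"1MP / 12 steps: {quick_test:.1f} s/step")
print(f"15 min / 20 steps: {slow_setup:.1f} s/step")
```

If your per-step time at 2MP is several times higher than mine on comparable hardware, that points to a setup problem rather than the model itself.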

On JSON prompting

This is the complaint I find most frustrating, because it's largely a non-issue. It's not like you have to write JSON by hand. There's already a node that lets you visually draw and build your scene and generates the JSON for you. If you don't want to do even that, you can just write a normal prompt and have an LLM convert it. Having fine-grained control over composition and scene layout is a feature, not a burden. I'd much rather place elements deliberately than write a wall of text and hope the model interprets it correctly. People have been asking for open models that compete with closed ones, and now that we have one with this level of control, it seems odd to treat that control as the problem.
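
To give a feel for what "placing elements deliberately" means, here's a minimal sketch of building a scene description in code and dumping it as JSON. To be clear, the field names (`caption`, `elements`, `label`, `box`) and the normalized box format are made up for illustration. This is not the model's actual prompt schema or what the prompt builder node outputs, so check the real workflow for the exact format.

```python
import json


def build_scene(caption: str, elements: list[dict]) -> str:
    """Serialize a scene layout to JSON.

    HYPOTHETICAL schema: each element has a "label" and a "box" given as
    [x, y, width, height] in normalized 0..1 image coordinates.
    """
    for el in elements:
        x, y, w, h = el["box"]
        inside = x >= 0 and y >= 0 and w > 0 and h > 0 and x + w <= 1 and y + h <= 1
        if not inside:
            raise ValueError(f"box for {el['label']!r} falls outside the image")
    return json.dumps({"caption": caption, "elements": elements}, indent=2)


scene_json = build_scene(
    "a knight and a dragon facing each other at sunset",
    [
        {"label": "knight in silver armor", "box": [0.1, 0.3, 0.3, 0.6]},
        {"label": "red dragon", "box": [0.5, 0.2, 0.4, 0.7]},
    ],
)
print(scene_json)
```

The point is that each element gets an explicit position instead of being buried somewhere in a paragraph of prose, and a layout that doesn't fit the frame is caught before you spend time generating.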

This is still "v1": no community fine-tunes, no LoRAs, no custom nodes (except for the ones mentioned), no optimized workflows, nothing. It's only going to get better from here. I really hope the community gets behind it. A few months of training and experimentation and we could have something special.

The main reason I wrote this is that I keep seeing criticism that just doesn't match my experience with the model, and I wanted to push back on some of it with actual context.

https://redd.it/1tzwl34