Artificial Analysis needs to address HiDream-01 Benchmarks

I'm struggling to understand how an utterly deficient model like HiDream-01 could have performed so well on user preference benchmarks. I don't want to jump to conclusions or speculate baselessly on how they did it, but it absolutely warrants an investigation if people are expected to take this benchmark seriously going forward. I just want an explanation for how something like this happens and, if it was illegitimate, how they will prevent it in the future.

https://redd.it/1t9eifa
@rStableDiffusion
I made some Slider Loras for Ace-Step 1.5 if anyone is interested

https://huggingface.co/Xanthius/Ace-Step-1.5-XL-Concept-Sliders/tree/main


Unfortunately, AI Toolkit doesn't have native support for slider LoRAs for Ace-Step 1.5, but I was able to edit the code enough to get it working properly. Now I can train concept sliders in about 10 minutes to an hour each, without needing concept-specific datasets. Since nobody else has a working way to train sliders themselves, I decided to put together a collection for people to use if they want to.


My first sliders on there are:


- Male to female voice
- Studio production to lo-fi
- Bass boost
- Choir to solo vocalist
- Digital to acoustic sound
- Aggressive to gentle
- Drum intensity
- Energetic to calm
- Happiness
- Soft to projected voice
- Talking to singing
- Tempo
- Danceability


I intend to add more if people have ideas for them.
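For anyone curious how these sliders work mechanically: a concept slider is just a low-rank (LoRA) delta applied to a weight matrix with a *signed* scale, so -1.0 pushes toward one pole of the concept (e.g. male voice) and +1.0 toward the other (female). A minimal sketch of that idea, assuming the usual LoRA parameterization (the function names and shapes here are illustrative, not the actual AI Toolkit or Ace-Step internals):

```python
import numpy as np

def apply_slider(base_weight, lora_down, lora_up, scale, alpha=1.0):
    """Apply a slider LoRA delta to one base weight matrix.

    Unlike a normal LoRA, the scale is signed: negative values push
    toward one end of the concept axis, positive toward the other,
    and 0.0 leaves the base model untouched.
    """
    rank = lora_down.shape[0]
    # Low-rank delta: (out, rank) @ (rank, in) -> (out, in)
    delta = lora_up @ lora_down
    return base_weight + scale * (alpha / rank) * delta

# Toy example: a 4x4 layer with a rank-2 slider.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
down = rng.standard_normal((2, 4))   # lora_down: (rank, in)
up = rng.standard_normal((4, 2))     # lora_up:   (out, rank)

W_neutral = apply_slider(W, down, up, scale=0.0)   # identical to W
W_pos = apply_slider(W, down, up, scale=1.0)
W_neg = apply_slider(W, down, up, scale=-1.0)      # exact opposite shift
```

Because the delta is linear in the scale, opposite scales shift the weights by exactly opposite amounts, which is what makes the "slider" behave symmetrically around the base model.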

https://redd.it/1t9e5cj
@rStableDiffusion
OSTRIS about HiDream-O1 LoRA on ToolKit

I am running my first test training a HiDream-O1 LoRA in AI Toolkit. I don't want to get too excited too early, but this is the coolest model I have EVER seen. Super efficient pixel space. No VAE. No text encoder. Trains super fast. This is an industry-changing innovation!

https://x.com/ostrisai/status/2053256188142428341


https://redd.it/1t9h7ps
@rStableDiffusion
Why is realistic skin such an issue for models?

The internet is full of normal, candid photos of people with natural skin texture. There's a subset of heavily retouched editorial or beauty photography with that smooth porcelain-skin look, but that's clearly a minority of all human images online. Most photos of people are just regular snapshots where skin looks like actual skin.

So why do image models, especially open source ones, struggle so much to generate realistic looking people out of the box? Why do they default to this plasticky, airbrushed, over-retouched aesthetic when that’s not what the majority of the training data actually looks like?

It's striking how hard it is for models to reproduce something as common and statistically ordinary as normal human skin without specialized prompting, LoRAs, finetunes, or upscalers. Natural skin texture should arguably be the baseline behavior, yet it very obviously isn't. Why?

https://redd.it/1t9gv4z
@rStableDiffusion
Which workflows are you guys using now for LTX 2.3?

Since prompt relay and other new workflows have been released recently, there are far more options for using LTX 2.3. What are some of the best-quality or coolest workflows you've seen or used so far?

https://redd.it/1t9itpr
@rStableDiffusion
I built a site to create free AI videos using LTX 2.3 running on my own GPUs
https://redd.it/1t9juoy
@rStableDiffusion
TenStrip's workflow is the first LTX 2.3 workflow I've found that actually works for spicy content; it's almost like using the old Grok.

https://redd.it/1t9pbjd
@rStableDiffusion
The Anima realism model is crazy good. Don’t miss it!

I’ve been messing with the Anima realism model posted here (https://civitai.red/models/2585622/ultrareal-fine-tune-anima). If you want prompt adherence for weird stuff, it does a really good job. What’s cool is you can do hybrid danbooru / natural-language prompting and it just goes with it.

I’m stunned at how good it is and surprised it’s not getting more traction, especially since this is the author’s experiment and neither the base model nor this finetune is done yet. The output is decent if you prompt well. It’s not as photorealistic as ZIT or whatever, but it will do all the weird danbooru tags other models blush over. I actually think for the amateur-photography look you all want here, it’s a good model.

I do 50 steps, CFG 5, euler (not ancestral). Anima is slow as hell on my Mac for such a small model, but I'm hoping the devs improve that somehow. It also works with the turbo LoRA!
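The "euler (not ancestral)" choice refers to the standard deterministic Euler sampler used by k-diffusion-style schedulers: no fresh noise is injected between steps, so the same seed always gives the same image. A minimal sketch of that update rule, with a stand-in `denoise` callable in place of the real model (names and schedule are illustrative, not Anima's actual implementation):

```python
import numpy as np

def euler_sample(denoise, x, sigmas):
    """Plain (non-ancestral) Euler sampling over a decreasing sigma schedule.

    Fully deterministic: unlike 'euler ancestral', no noise is re-added
    between steps, so repeated runs from the same start are identical.
    """
    for i in range(len(sigmas) - 1):
        denoised = denoise(x, sigmas[i])         # model's estimate of the clean sample
        d = (x - denoised) / sigmas[i]           # derivative dx/dsigma
        x = x + d * (sigmas[i + 1] - sigmas[i])  # Euler step toward the next sigma
    return x

# Toy run: a denoiser that always predicts zero drives x to exactly zero
# once the schedule reaches sigma = 0.
sigmas = [10.0, 5.0, 1.0, 0.0]
out = euler_sample(lambda x, s: np.zeros_like(x), np.ones(4), sigmas)
```

The ancestral variant would add random noise scaled to the gap between sigmas after each step, trading this determinism for extra variety between runs.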

Additionally, I saw someone extracted the realism ‘stuff’ as a LoRA. It’s linked from a random Google Drive in the comments of the Civitai page.

Anyway try it out and if the author sees this thanks dude. Lmk if I can chip in for another training run. There is so much potential here.

https://redd.it/1t9r8c6
@rStableDiffusion