ComfyUI HiDream text->image and image-edit templates - multiple reference image facility. Discuss please.
A recent ComfyUI update has included the two new HiDream templates mentioned in the title. I should welcome responses to the following questions.
1. The general pros and cons of HiDream.
2. Use of multiple reference images. How best to organise? How many? How to integrate with textual instructions?
3. Is the use of multiple reference images implemented for other visual AI models?
https://redd.it/1tihygu
@rStableDiffusion
Reddit
From the StableDiffusion community on Reddit
Explore this post and more from the StableDiffusion community
How to achieve this style where the face is anime but the body is a realistic 3D render?
https://redd.it/1tiksdz
@rStableDiffusion
LTX Color Shifting
reference image
I've had a color-shifting problem more or less since I started using the ID LoRA node with LTX 2.3. I don't think the node is the cause, but every generation since then has been iffy. At first the color changed as the video progressed. The shift has become less and less perceptible since I reduced the reference image size and raised the weight in "LTXImgToVideoInplace" at the upscale stage to values above one, but the results are still iffy. The problem always happens at the upscale stage, regardless of which upscaler I use. Here are examples of how it is supposed to look and how it looks now.
working example
color shift example
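One way to put a number on "color changing as the video progressed" is to track the per-channel mean of each frame relative to the first frame. The following is only a rough diagnostic sketch, not part of any LTX or ComfyUI node: the function name `channel_drift` is hypothetical, the frames are assumed to be already decoded into a NumPy array of shape (frames, height, width, 3), and the demo uses synthetic data rather than a real render.

```python
import numpy as np

def channel_drift(frames: np.ndarray) -> np.ndarray:
    """Per-channel mean of every frame minus the per-channel mean of frame 0.

    frames: float array of shape (T, H, W, 3).
    Returns an array of shape (T, 3); values near zero mean no global color drift.
    """
    t, channels = frames.shape[0], frames.shape[-1]
    means = frames.reshape(t, -1, channels).mean(axis=1)
    return means - means[0]

# Synthetic demo (assumption: stands in for a decoded video clip).
rng = np.random.default_rng(0)
base = rng.random((1, 16, 16, 3))
stable = np.repeat(base, 8, axis=0)      # 8 identical frames, no drift

ramp = np.linspace(0.0, 0.1, 8).reshape(8, 1, 1)
shifted = stable.copy()
shifted[..., 0] += ramp                  # red channel creeps up over time

print("stable clip, last frame drift: ", channel_drift(stable)[-1])
print("shifted clip, last frame drift:", channel_drift(shifted)[-1])
```

Running the same measurement on the output before and after the upscale stage would show whether the drift is introduced there, and in which channel.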
https://redd.it/1tijjkf
@rStableDiffusion
Worth upgrading just the GPU, or does the entire system need an upgrade?
Hello,
I've read a bit about different GPUs and RAM requirements and have seen some conflicting advice. I think my system needs a full upgrade, but I want confirmation before overspending.
Right now I have a Ryzen 3600 CPU, an AMD R5700 8GB GPU and 32GB RAM (4x 8GB). The motherboard is an MSI Gaming Plus B450, so the slot is PCIe 3.0, and the PSU is a 650W Corsair RMx.
So the idea was to get a 5060 Ti 16GB or a 5070 Ti 16GB (from what I've read, don't bother with AMD or Intel if you want something that works out of the box on Windows with less tinkering).
I also have access to my wife's PC, which has an AMD 5600 CPU, a 5060 8GB GPU and 16GB RAM. Its B550M Pro-VDH motherboard has PCIe 4.0.
So is it worth getting a 16GB GPU for either system with 32GB RAM? Or do I also need 64GB RAM? Or is it better to get a newer AM5 system with something like 64-128GB RAM and a 16GB card?
A used 3090 here is around 800eur, refurb 900+eur
64GB ram DDR4 - 500eur
5060TI 16GB - 600eur
5070TI 16GB - 1000eur
5080 - 1.5k
5090 - 3.6k and up :D
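The prices above can be compared as euros per GB of VRAM, plus the total for a GPU-plus-RAM upgrade. This is only a rough sketch: the prices come from the list above (the 5090 uses its 3.6k lower bound), while the VRAM sizes for the 3090, 5080 and 5090 are the standard specs for those cards and are an assumption not stated in the post. It ignores speed, power draw and PSU limits.

```python
# Prices in EUR from the post; VRAM in GB (3090/5080/5090 sizes are assumed
# standard specs, not taken from the post).
cards = {
    "RTX 3090 (used)": (800, 24),
    "RTX 5060 Ti 16GB": (600, 16),
    "RTX 5070 Ti 16GB": (1000, 16),
    "RTX 5080": (1500, 16),
    "RTX 5090": (3600, 32),
}
RAM_64GB_DDR4 = 500  # EUR, from the post

def eur_per_gb(price: float, vram_gb: float) -> float:
    """Price divided by VRAM capacity."""
    return price / vram_gb

# Cheapest VRAM first.
ranked = sorted(cards.items(), key=lambda item: eur_per_gb(*item[1]))

for name, (price, vram) in ranked:
    total_with_ram = price + RAM_64GB_DDR4
    print(f"{name:18s} {eur_per_gb(price, vram):6.1f} EUR/GB VRAM, "
          f"{total_with_ram} EUR with 64GB DDR4")
```

On these numbers alone the used 3090 and the 5060 Ti are the cheapest per GB of VRAM, and either one with the 64GB RAM kit stays well under the roughly 3k quoted for a new system.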
I'd like to do image and video generation, TTS, and consistent characters (multiple images with the same character, like a comic), and try new stuff :)
A new system would cost me about 3k with a 5070 Ti, 64GB RAM, a new PSU that can support 2 GPUs, and a Taichi motherboard (I want to try local LLMs later as well).
But for now I'd like to know whether I can get by with my existing system and whether it's even worth trying, or whether I should save up a bit and get a completely new system.
Thanks for answers and help :)
https://redd.it/1tiljco
@rStableDiffusion
Announcing the release of Stable Audio 3!
Taken straight from the HarmonAI discord server.
We're excited to announce the launch of Stable Audio 3, our new family of text-to-audio models for music and sound effects, including new *open-weights models*! We're releasing three models today on Hugging Face as well as a GitHub repo specifically tailored to Stable Audio 3 inference, as well as LoRA fine-tuning.
* Stable Audio 3 Small Music ([https://huggingface.co/stabilityai/stable-audio-3-small-music](https://huggingface.co/stabilityai/stable-audio-3-small-music))
* Stable Audio 3 Small SFX ([https://huggingface.co/stabilityai/stable-audio-3-small-sfx](https://huggingface.co/stabilityai/stable-audio-3-small-sfx))
* Stable Audio 3 Medium ([https://huggingface.co/stabilityai/stable-audio-3-medium](https://huggingface.co/stabilityai/stable-audio-3-medium))
Stable Audio 3 GitHub: [https://github.com/Stability-AI/stable-audio-3](https://github.com/Stability-AI/stable-audio-3)
The Medium model generates music and sound effects with lengths up to **six minutes and twenty seconds**, inferencing in a matter of seconds on NVIDIA GPUs. The Small models make music and sound effects (respectively) with lengths up to **two minutes**, and can be optimized to run efficiently on CPUs.
These models are licensed under our Stability AI Community License, meaning they're totally free for personal and creative use. We don't claim any royalties or ownership on the model outputs; they're yours to do with as you please.
We've also published two academic papers on this model as well as the new SAME autoencoder architecture the models are based on.
Stable Audio 3 paper: [https://arxiv.org/abs/2605.17991](https://arxiv.org/abs/2605.17991)
SAME paper: [https://arxiv.org/abs/2605.18613](https://arxiv.org/abs/2605.18613)
Blog post: [https://stability.ai/news-updates/meet-stable-audio-3-the-model-family-built-for-artistic-experimentation-with-open-weight-models](https://stability.ai/news-updates/meet-stable-audio-3-the-model-family-built-for-artistic-experimentation-with-open-weight-models)
We're so excited to share this release with you, and we can't wait to see what you make with it!
https://redd.it/1tiq820
@rStableDiffusion
How do you figure out which samplers to use?
I usually just use whatever the example workflows give me, but there are so many to choose from. Will reading and learning about the model help inform the decision on which sampler to use? Are things like skipping steps and 2-step samplers just trial and error, or is there a method to the madness?
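Part of the "method" can be seen on a toy problem. The sketch below assumes that "2 step samplers" means samplers that call the model twice per step (second-order methods such as Heun), which may not be what was meant. It compares a one-call-per-step Euler integrator against Heun on the ODE dx/dt = -x, giving both the same budget of 8 function calls. The ODE is a stand-in for a diffusion model, not a real one, and all names are illustrative.

```python
import math

def f(x: float) -> float:
    """Toy 'model': dx/dt = -x, whose exact solution is x(t) = exp(-t)."""
    return -x

def euler(x: float, h: float, steps: int):
    """First-order sampler: one model call per step."""
    evals = 0
    for _ in range(steps):
        x = x + h * f(x)
        evals += 1
    return x, evals

def heun(x: float, h: float, steps: int):
    """Second-order sampler: two model calls per step (predict, then correct)."""
    evals = 0
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + h * k1)
        x = x + 0.5 * h * (k1 + k2)
        evals += 2
    return x, evals

exact = math.exp(-1.0)

# Same budget of 8 model calls: Euler takes 8 small steps, Heun takes 4 larger ones.
euler_x, euler_evals = euler(1.0, 1.0 / 8, 8)
heun_x, heun_evals = heun(1.0, 1.0 / 4, 4)

euler_err = abs(euler_x - exact)
heun_err = abs(heun_x - exact)

print(f"Euler: {euler_evals} calls, error {euler_err:.5f}")
print(f"Heun:  {heun_evals} calls, error {heun_err:.5f}")
```

On this toy problem Heun ends up closer to the exact answer for the same number of calls. Whether that carries over to a given image model still has to be checked empirically, so some trial and error remains.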
https://redd.it/1tiofkw
@rStableDiffusion