Echo Chamber - AceStep 1.5 song (XL version)
As an experiment, I regenerated my Ace Step 1.5 song with the XL model (same parameters, etc.). It's similar, but there are differences. I've noticed that the old 1.5 would sometimes improvise a bit to fit the lyrics to the song, while XL more often rushes through the lyrics and leaves a pause. I also had yet another version of this song that failed to generate properly with 1.5 (with interesting results) but generated properly with the XL model.
I'm not sure I like the XL version of this song better, but XL tends to be better at following lyrics (if somewhat less flexible).
Here is the non-XL version of this song (with prompt, lyrics, etc.): https://www.reddit.com/r/AceStep/comments/1sf99em/echo_chamber_acestep_15_song/
I've also noticed that the text encoder for Ace Step isn't 100% deterministic. I haven't pinned down which factor is causing this, but if I run AceStep with the same parameters (seed, model, prompt, the whole shebang) on a different machine, I get a different song. I still get the same song on the same machine, though. It might be tied to the OS, PyTorch, or ROCm version (I'm not sure which). I previously thought it was a change in ComfyUI (that might have been true at some point in the past), but I was wrong (otherwise I wouldn't have been able to generate this version of the song).
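If anyone wants to chase this down, a cheap way to localize where two machines diverge is to hash the intermediate tensors (text-encoder output, latents) at each stage of the pipeline and compare the hashes across machines. This is just a sketch with placeholder bytes; the `fingerprint` helper and the stage names are my own, not anything from AceStep or ComfyUI:

```python
import hashlib

def fingerprint(raw: bytes, ndigits: int = 16) -> str:
    """Truncated SHA-256 of raw tensor bytes; equal hashes mean bit-identical tensors."""
    return hashlib.sha256(raw).hexdigest()[:ndigits]

# On each machine, dump the same stages to bytes (with PyTorch that would be
# something like t.cpu().numpy().tobytes()) and log one fingerprint per stage.
stage_hashes = {
    "text_encoder": fingerprint(b"\x00\x01\x02\x03"),  # placeholder bytes
    "latent_step0": fingerprint(b"\x00\x01\x02\x04"),  # placeholder bytes
}
for stage, digest in stage_hashes.items():
    print(stage, digest)
```

The first stage whose hash differs between machines is where the nondeterminism enters. Bitwise comparison is the right test here, because GPU kernels can legitimately produce slightly different floats across PyTorch/ROCm builds even with identical seeds.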
https://redd.it/1sikd31
@rStableDiffusion
Built a local browser to organize my output folder chaos -- search by prompt, checkpoint, LoRA, node type, etc
https://redd.it/1siqf2v
@rStableDiffusion
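For context on how a browser like this can search by prompt: ComfyUI and A1111-style frontends typically embed the prompt/workflow as tEXt chunks in the output PNGs, so an indexer only needs to walk the PNG chunk list. A stdlib-only sketch; the chunk keys (e.g. "parameters" or "prompt") vary by frontend, so treat those as assumptions:

```python
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def extract_png_text(data: bytes) -> dict:
    """Extract tEXt metadata chunks (key -> value) from PNG bytes.

    Generation UIs often store the prompt under keys like 'parameters'
    (A1111-style) or 'prompt'/'workflow' (ComfyUI); keys vary by frontend.
    """
    if not data.startswith(PNG_SIG):
        raise ValueError("not a PNG file")
    meta = {}
    pos = len(PNG_SIG)
    while pos + 8 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        chunk = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, value = chunk.partition(b"\x00")
            meta[key.decode("latin-1")] = value.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
        if ctype == b"IEND":
            break
    return meta
```

Indexing a folder is then a matter of globbing `*.png`, calling this on each file's bytes, and substring-matching the stored values.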
Why is Wan 2.2 N.S.F.W Remix Lightning Model so much better at things like hair flip, hair combing and feminine energy than regular Wan?
I'm not talking about actual N.S.F.W content; I'm talking about the model that has that name in it, and just feminine energy: seductive performance, a shampoo-commercial hair toss, sensual movements, an elegant leg cross while sitting on a bar stool.
Whenever I use any of the regular WAN models, the output comes out very static and ignores the prompt; when I use the Remix, it comes out nearly perfect.
It's almost like using Grok, not the new Grok but the old one before it was censored.
https://redd.it/1sipeko
@rStableDiffusion
Sharing my creative node suite for ComfyUI
Hey guys, Winnougan here. It's time to give back to the community. I've been growing my node suite on GitHub, which started out as the nodes I personally wanted to make my life easier in ComfyUI. I'll keep adding to them to make my overall ComfyUI experience faster and more user-friendly. Enjoy the nodes and happy gooning!
1. Resolution picker: too many presets to count, plus custom height and width if that's your thing. Visual icons to easily pick what you want. I do a ton of high-res images, so this helps me out a lot.
2. LTX and Wan resolution picker: I cobbled together all the best resolutions for these video models and made it easy to pick and choose what you want.
3. Power Lora Loader: I wanted to add and remove LoRAs quickly. I have thousands of LoRAs stashed away, so I decided to make it easy to search for them visually. It's easy to adjust the strength, toggle them on and off, move them up and down, or remove them.
4. The beloved Cache Dit series: regular cache dit, cache dit for Wan2.2 and cache dit for LTX-2.3. Visually shows you how it speeds up your workflow.
5. More to come! Stay tuned as I'll be adding a ton more nodes to my suite.
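As an aside on why resolution pickers lean on presets: diffusion models generally want dimensions that are multiples of a fixed step (often 8, 16, or 64 depending on the model; that's my assumption about the design here, not something from this suite), so a custom width/height field usually snaps whatever you type, along these lines:

```python
def snap(dim: int, multiple: int = 64) -> int:
    """Round a dimension to the nearest accepted multiple, with a floor of one multiple."""
    return max(multiple, round(dim / multiple) * multiple)

print(snap(1000))      # -> 1024
print(snap(720, 16))   # -> 720 (already a multiple of 16)
```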
Grab the suite here: https://github.com/Winnougan/winnougan-nodes.git
Or install it via the ComfyUI Manager by searching for "Winnougan", or run "git clone https://github.com/Winnougan/winnougan-nodes.git" in your custom_nodes folder.
https://redd.it/1siu2zd
@rStableDiffusion
What are the current best models quality-wise?
Lots of models get attention for running fast or on low VRAM, but what is currently considered state of the art for local image, video, audio, etc. generation?
I've been around here since the first days of Stable Diffusion, when A1111 was the go-to, but I've always had a system with only a 2070 Super, so 8 GB of VRAM and few supported optimizations. As such, I've only really dealt with GGUF models and quants that work on lower-end systems, and I'm not as caught up on what the best models are if resources aren't an issue.
I'll have a system with a 5090 soon to try some of them out, but I'm curious what you'd rank the highest across the various categories, be they straight text2image, image editing, video models, music, TTS, etc.
I'm sure quite a few people would benefit from this since the leaderboards are constantly shifting for models.
https://redd.it/1siyftp
@rStableDiffusion