Model test: SenseNova U1 vs GPT Image2 vs Nano Banana in Infographic generation
https://redd.it/1to7gu9
@rStableDiffusion
Stable Audio 3 in ComfyUI: Create AI Music and Sound Effects (Ep19)
https://www.youtube.com/watch?v=Pzc569C3xUY
https://redd.it/1to9w18
@rStableDiffusion
Learn how to use Stable Audio 3 in ComfyUI to create AI-generated music, sound effects, and audio prompts using Stable Audio 3 Medium.
In this tutorial, you’ll see how to install the required Stable Audio 3 models, load the workflows, and generate audio…
Anima-Base is magic and i don't think people realize how good it is.
https://redd.it/1tobzgq
@rStableDiffusion
I built a full AI animation pipeline and made a 2.5 minute animated show in 5 days (Qwen, Flux, LTXV)
https://redd.it/1to7uzx
@rStableDiffusion
Anima TrainFlow — Simple One-Page LoRA Trainer for Anima (Portable, Auto-Captioning, Smart Cropping & Bucketing)
https://redd.it/1to9o8i
@rStableDiffusion
Official Turbo lora for anima 1.0 has been posted
https://civitai.com/models/2560840/anima-turbo-lora
https://redd.it/1togknz
@rStableDiffusion
Anima Turbo LoRA - v0.1 | Anima LoRA | Civitai
Turbo LoRA trained on preview3. Use CFG 1 and 8-12 steps. Works well on other Anima checkpoints. You can decrease the LoRA strength a bit below 1 f...
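The settings quoted in the description can be captured in a small helper. This is a minimal sketch: the function name and dictionary keys are hypothetical and not tied to any particular UI or library. Only the CFG value, the 8-12 step range, and the "strength a bit below 1" hint come from the description above.

```python
from typing import Dict, Union

# Values quoted from the Civitai description.
RECOMMENDED_CFG = 1.0
RECOMMENDED_STEPS = range(8, 13)  # 8 to 12 inclusive


def turbo_settings(steps: int = 8, lora_strength: float = 1.0) -> Dict[str, Union[int, float]]:
    """Return sampler settings for the Turbo LoRA (hypothetical key names)."""
    if steps not in RECOMMENDED_STEPS:
        raise ValueError(f"steps should be between 8 and 12, got {steps}")
    # The description only says strength can go "a bit below 1",
    # so anything above 1 or at/below 0 is rejected here.
    if not 0.0 < lora_strength <= 1.0:
        raise ValueError(f"lora_strength should be in (0, 1], got {lora_strength}")
    return {"cfg": RECOMMENDED_CFG, "steps": steps, "lora_strength": lora_strength}


print(turbo_settings(steps=10, lora_strength=0.9))
```

The returned values would still need to be entered into whatever sampler and LoRA loader you use.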
Testing the newly released Microsoft Lens Turbo in my low vram GPU, it is good and it works very well
https://redd.it/1togrx4
@rStableDiffusion
Old forgotten AI model fixes eyes in under 10 min! Forget about pain of randomness and lack of quality of new AI models ;)
https://redd.it/1toihvn
@rStableDiffusion
PrismML just released Binary and Ternary Bonsai Image 4B: 1-bit/ternary text-to-image diffusion transformers that can even run 100% locally in your browser on WebGPU.
https://reddit.com/link/1toi5yz/video/y6gh4lxydj3h1/player
The PrismML team really cooked with these models. They're only ~3GB in size (compared to FLUX.2 Klein 4B, which is ~16GB). Apache-2.0!
Official collection on HF: https://huggingface.co/collections/prism-ml/bonsai-image
Link to demo: https://huggingface.co/spaces/webml-community/bonsai-image-webgpu
Originally posted in r/locallama. Thank you, xenovatech!
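As a rough sanity check on the size claim, here is a back-of-the-envelope sketch of weight storage at different bit widths. It assumes "4B" means roughly 4 billion parameters and counts the weights only. The ~3GB and ~16GB figures in the post are not expected to match these numbers exactly, since a download may bundle other components or keep some layers at higher precision (that part is a guess, not something the post states).

```python
import math


def weight_storage_gb(num_params: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 4e9  # assumption: "4B" = about 4 billion parameters

for label, bits in [
    ("16-bit", 16.0),
    ("ternary (log2(3) bits)", math.log2(3)),  # three states per weight
    ("1-bit", 1.0),
]:
    print(f"{label}: {weight_storage_gb(NUM_PARAMS, bits):.2f} GB")
```

The point is only the ratio: going from 16 bits to 1 or about 1.58 bits per weight shrinks weight storage by roughly 10x to 16x.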
https://redd.it/1toi5yz
@rStableDiffusion
Anima can edit images! And this is possible in two different methods.
# Good afternoon!
Yes, that's true.
https://preview.redd.it/sn84yzrt8l3h1.png?width=1280&format=png&auto=webp&s=421a79b66f346e0335ad9dffac0fd6b2f76ec4a6
After getting interested in this topic, I found two ways to do it.
I'll start with what I found myself:
# 1. Split screen and Anima-lllite-inpainting:
https://preview.redd.it/9d2x8a3s3l3h1.png?width=1440&format=png&auto=webp&s=3acb8abb789f5f3612dc1ab6296c0ac5c2d921dd
This method is similar to what I used for SDXL in my "Consistency characters" workflow: a reference image is placed next to the generated image, and the generated side is filled in with inpainting. It was inspired by IC-LoRAs and a post about the hidden potential of SDXL.
However, it doesn't work without some additional magic in the form of the "anima-lllite-inpainting-v2" ControlNet.
https://preview.redd.it/sbuoirfdel3h1.png?width=1072&format=png&auto=webp&s=deee09e98f681fc7cd347946dae027e33d9f8da5
It's still a bit unstable and may not work at all. But, spoiler: it is the most adaptable method, letting you not only change clothes or facial expressions but also completely change the pose.
The more you change, the fewer of the character's details are preserved.
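The split-screen idea can be sketched as plain geometry: one canvas, the reference in one half, and an inpaint mask covering the other half. This is only an illustration under stated assumptions. The function names are hypothetical, placing the reference on the left is my choice (the post doesn't say which side it goes on), and the sketch covers only the layout, not the ControlNet or the sampling itself.

```python
from typing import List, NamedTuple, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


class SplitScreenLayout(NamedTuple):
    canvas_size: Tuple[int, int]  # (width, height)
    reference_box: Box            # where the reference image is pasted
    inpaint_box: Box              # region the model is asked to regenerate


def split_screen_layout(ref_width: int, ref_height: int, gap: int = 0) -> SplitScreenLayout:
    """Lay out a canvas with the reference on the left and an equal-sized inpaint area on the right."""
    if ref_width <= 0 or ref_height <= 0 or gap < 0:
        raise ValueError("sizes must be positive and gap non-negative")
    canvas_width = ref_width * 2 + gap
    reference_box = (0, 0, ref_width, ref_height)
    inpaint_box = (ref_width + gap, 0, canvas_width, ref_height)
    return SplitScreenLayout((canvas_width, ref_height), reference_box, inpaint_box)


def build_mask(layout: SplitScreenLayout) -> List[List[int]]:
    """Binary mask rows: 1 = regenerate (inpaint), 0 = keep (reference side)."""
    width, height = layout.canvas_size
    left, top, right, bottom = layout.inpaint_box
    return [
        [1 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


layout = split_screen_layout(ref_width=4, ref_height=2)
for row in build_mask(layout):
    print(row)
```

In a real workflow the mask and canvas would be images at full resolution; after generation, the right half is cropped out as the edited result.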
# 2. Apply Cosmos Reference Latent + Edit LoRA
https://preview.redd.it/v719wl8ael3h1.png?width=1891&format=png&auto=webp&s=52f4e079848a9ed087a42969dca5c8c53e2fe717
Yesterday I saw two different LoRAs that implement image editing via Reference Latent: one from mattehe (AnimaEditV1), the other from GOOKLE (lora_edit_ZeroTwo). I like the LoRA from GOOKLE better; the one from mattehe is a bit overcooked.
The problem with these LoRAs is that their training data was mainly about dressing/undressing, so they hardly change the character's pose.
See the third hand?
I also want to note that it is better to change the clothes of a naked character, because these LoRAs have trouble with clothes already present on the character's body.
https://preview.redd.it/vduyjxm5gl3h1.png?width=2200&format=png&auto=webp&s=8519449cec593fe5888f4f53904fe7acf32ae9e1
But they dress the characters well:
Yes, I see a third hand.
And also facial expressions:
https://preview.redd.it/yt9zo3hshl3h1.png?width=1960&format=png&auto=webp&s=be5e9e9825b6805ab88bfaa5bd560e29df6a023a
# Conclusions:
Overall, both approaches are capable. I will keep an eye on updates to these LoRAs, and it is also possible that someone will manage to train an IC-LoRA for Anima.
# Link to the workflow for tests
https://redd.it/1totumo
@rStableDiffusion
I created a Microsoft Lens (open-source) | Standalone App for you to try - 4090 HD generation in about 2 seconds after initial model load
https://redd.it/1tos2f1
@rStableDiffusion
WAN2.2 - DaSiWa or Remix??
This thread is not intended to advertise any model, so I won't post links. I just want to know everyone's opinion on how to use these two models.
Setting NSFW aside: are Remix (3.0) and DaSiWa (v10; I haven't tested v11) specialized models for a certain video genre, or are they versatile models for many different types of video?
On each model's CivitAI page, the owners show very different examples. For Remix, FX_FeiHou shares example videos of real people, while for DaSiWa, Darksidewalker uses anime-style example videos. Is this a fair indication of what each model is meant for? What's everyone's opinion?
If you need to create daily-life, real-life, or documentary-style videos, which model should you use?
I'm really impressed with the GalaxyACE LoRA for LTX2.3, although the prompting is torture. Hopefully its creator will release a version for Wan2.2 soon.
https://redd.it/1tot9zu
@rStableDiffusion