Official Turbo LoRA for Anima 1.0 has been posted
https://civitai.com/models/2560840/anima-turbo-lora
https://redd.it/1togknz
@rStableDiffusion
Civitai
Anima Turbo LoRA - v0.1 | Anima LoRA | Civitai
Turbo LoRA trained on preview3. Use CFG 1 and 8-12 steps. Works well on other Anima checkpoints. You can decrease the LoRA strength a bit below 1 f...
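As a rough sketch, the recommended settings from the model card can be written down as a small config object with a sanity check. `TurboSettings` and its field names are hypothetical and do not correspond to any real UI or library API; only the numbers (CFG 1, 8-12 steps, strength at or a bit below 1) come from the card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurboSettings:
    """Hypothetical container for the settings named on the model card."""
    cfg: float = 1.0            # the card says to use CFG 1
    steps: int = 8              # the card says 8-12 steps
    lora_strength: float = 1.0  # the card says it can go a bit below 1

    def problems(self) -> list[str]:
        """Return warnings for values outside the card's recommendations."""
        issues = []
        if self.cfg != 1.0:
            issues.append(f"cfg={self.cfg}: the card recommends CFG 1")
        if not 8 <= self.steps <= 12:
            issues.append(f"steps={self.steps}: the card recommends 8-12 steps")
        if self.lora_strength > 1.0:
            issues.append(
                f"lora_strength={self.lora_strength}: the card suggests 1 or a bit below"
            )
        return issues


if __name__ == "__main__":
    print(TurboSettings().problems())                  # defaults follow the card
    print(TurboSettings(cfg=4.5, steps=30).problems())  # typical non-turbo values
```

The check only flags values; it does not change them, since the card itself leaves room to tune the strength.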
Testing the newly released Microsoft Lens Turbo on my low-VRAM GPU: it is good and works very well
https://redd.it/1togrx4
@rStableDiffusion
Old, forgotten AI model fixes eyes in under 10 min! Forget about the pain of randomness and the lack of quality in new AI models ;)
https://redd.it/1toihvn
@rStableDiffusion
PrismML just released Binary and Ternary Bonsai Image 4B: 1-bit/ternary text-to-image diffusion transformers that can even run 100% locally in your browser on WebGPU.
https://reddit.com/link/1toi5yz/video/y6gh4lxydj3h1/player
The PrismML team really cooked with these models. They're only ~3GB in size (compared to FLUX.2 Klein 4B, which is ~16GB). Apache-2.0!
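For a sense of scale, here is a back-of-the-envelope sketch of how many bytes 4B weights take at different bit widths. It assumes ideal packing and counts the weights only; the download sizes quoted in the post are larger than these lower bounds, and the sketch does not try to account for the difference. `weight_size_gb` is a hypothetical helper, not part of any library.

```python
import math


def weight_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Storage for the weights alone, in GB (10**9 bytes), at ideal packing."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 4e9  # "4B", taken from the model name

# Bits per weight for each format; ternary needs log2(3) bits in theory.
FORMATS = {
    "binary (1 bit)": 1.0,
    "ternary (log2(3) bits)": math.log2(3),
    "16-bit": 16.0,
}

if __name__ == "__main__":
    for name, bits in FORMATS.items():
        print(f"{name}: {weight_size_gb(PARAMS, bits):.2f} GB")
```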
Official collection on HF: https://huggingface.co/collections/prism-ml/bonsai-image
Link to demo: https://huggingface.co/spaces/webml-community/bonsai-image-webgpu
Originally posted in r/LocalLLaMA. Thank you xenovatech!
https://redd.it/1toi5yz
@rStableDiffusion
Anima can edit images! And there are two different ways to do it.
# Good afternoon!
Yes, that's true.
https://preview.redd.it/sn84yzrt8l3h1.png?width=1280&format=png&auto=webp&s=421a79b66f346e0335ad9dffac0fd6b2f76ec4a6
After getting interested in this topic, I found two ways to do it.
I'll start with what I found myself:
# 1. Split screen and Anima-lllite-inpainting:
https://preview.redd.it/9d2x8a3s3l3h1.png?width=1440&format=png&auto=webp&s=3acb8abb789f5f3612dc1ab6296c0ac5c2d921dd
This method is similar to what I used for SDXL in my "Consistency characters" workflow: a reference is placed next to the generated image, and the rest is filled in with inpainting. It was inspired by IC-LoRAs and a post about the hidden potential of SDXL.
But it doesn't work without some additional magic in the form of the "anima-lllite-inpainting-v2" controlnet.
https://preview.redd.it/sbuoirfdel3h1.png?width=1072&format=png&auto=webp&s=deee09e98f681fc7cd347946dae027e33d9f8da5
It's still a bit unstable and sometimes doesn't work at all. But, spoiler: it is the most flexible method, letting you not only change clothes or facial expressions but also completely change the pose.
The more you change, the fewer of the character's details are preserved.
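The split-screen idea can be sketched as a canvas layout plus an inpaint mask. This is only an illustration of the geometry, not the actual workflow: putting the reference on the left, using two equal halves, and the "255 = paint here" mask convention are all assumptions, and `split_screen_layout` and `build_mask` are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitScreenLayout:
    canvas_size: tuple[int, int]              # (width, height)
    reference_box: tuple[int, int, int, int]  # (left, top, right, bottom)
    inpaint_box: tuple[int, int, int, int]    # (left, top, right, bottom)


def split_screen_layout(ref_width: int, ref_height: int) -> SplitScreenLayout:
    """Reference on the left half, area to be inpainted on the right half."""
    return SplitScreenLayout(
        canvas_size=(2 * ref_width, ref_height),
        reference_box=(0, 0, ref_width, ref_height),
        inpaint_box=(ref_width, 0, 2 * ref_width, ref_height),
    )


def build_mask(layout: SplitScreenLayout) -> list[list[int]]:
    """Row-major mask: 255 inside the inpaint box, 0 where the reference is kept."""
    width, height = layout.canvas_size
    left, top, right, bottom = layout.inpaint_box
    return [
        [255 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    layout = split_screen_layout(4, 2)  # tiny sizes so the mask is readable
    print(layout.canvas_size)
    for row in build_mask(layout):
        print(row)
```

After sampling, only the inpainted half would be cropped out as the result; the reference half exists purely so the model can see the character while it paints.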
# 2. Apply Cosmos Reference Latent + Edit lora
https://preview.redd.it/v719wl8ael3h1.png?width=1891&format=png&auto=webp&s=52f4e079848a9ed087a42969dca5c8c53e2fe717
Yesterday I saw two different LoRAs that implement image editing via Reference Latent: one from mattehe (AnimaEditV1), the other from GOOKLE (lora_edit_ZeroTwo). I like the LoRA from GOOKLE better; the one from mattehe is a bit overcooked.
But the problem with these Loras is that their training data was mainly about dressing/undressing. So they hardly change the character's pose.
See the third hand?
I also want to note that it is better to change the clothes of a naked character, because these Loras have problems with the clothes already present on the character's body.
https://preview.redd.it/vduyjxm5gl3h1.png?width=2200&format=png&auto=webp&s=8519449cec593fe5888f4f53904fe7acf32ae9e1
But they dress the characters well:
Yes, I see a third hand.
And also facial expressions:
https://preview.redd.it/yt9zo3hshl3h1.png?width=1960&format=png&auto=webp&s=be5e9e9825b6805ab88bfaa5bd560e29df6a023a
# Conclusions:
Overall, both approaches are capable. I will keep an eye on updates to these loras, and it is also possible that someone will be able to train IC-Lora for Anima.
# Link to the workflow for tests
https://redd.it/1totumo
@rStableDiffusion
I created a standalone app for Microsoft Lens (open-source) for you to try: HD generation on a 4090 in about 2 seconds after the initial model load
https://redd.it/1tos2f1
@rStableDiffusion
WAN2.2 - DaSiWa or Remix??
This thread is not intended to advertise any model so I won't post the link. I just want to know everyone's opinion on how to use these two models.
Leaving NSFW aside: are Remix (3.0) and DaSiWa (v10; I haven't tested v11) specialized models for a certain video genre, or are they versatile video models for many different types of videos?
On each model's CivitAI page, the owners show very different examples. For Remix, FX_FeiHou shares example videos of real people, while for DaSiWa, Darksidewalker uses anime-style example videos. Is this a correct assessment of the intended use of the two models? What's everyone's opinion?
If you need to create daily-life, real-life, or documentary-style videos, which model should you use?
I'm really impressed with the GalaxyACE LoRA for LTX2.3, although the prompting is torture. Hopefully the creator of the GalaxyACE LoRA will release a version for Wan2.2 soon.
https://redd.it/1tot9zu
@rStableDiffusion
NVIDIA PiD-based img upscaler (no workflow but .py)
I've "created" a simple img2img upscaler using the FLUX2VAE-variant of NVIDIA's PiD. It's a simple python script, not a Comfy workflow.
You'll need a 24GB VRAM GPU for 1024px and 32 GB for >1024px.
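A tiny sketch of those requirements as a pre-flight check. The two thresholds are the post's figures; treating sizes below 1024px the same as 1024px is an assumption, as is reading "px" as a single side length. `min_vram_gb` and `can_run` are hypothetical helpers and are not part of the linked script.

```python
def min_vram_gb(image_px: int) -> int:
    """VRAM the post says is needed: 24 GB up to 1024px, 32 GB above that."""
    if image_px <= 0:
        raise ValueError("image_px must be positive")
    return 24 if image_px <= 1024 else 32


def can_run(image_px: int, available_vram_gb: float) -> bool:
    """True if the GPU has at least the VRAM the post quotes for this size."""
    return available_vram_gb >= min_vram_gb(image_px)


if __name__ == "__main__":
    print(can_run(1024, 24))  # a 24 GB card at 1024px
    print(can_run(2048, 24))  # the same card above 1024px
```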
https://github.com/geronimi73/3090_shorts/tree/main/NVIDIA-PiD-FLUX2VAE-upscaler
It's stripped of all the training related stuff in the original nv-tlabs/PiD github repo. Just torch and transformers. That's how I burned my Claude Code tokens for the day.
I think the model is pretty good. Unfortunately, NVIDIA once again changed their mind when it comes to the license.
https://preview.redd.it/o1ko8dr7in3h1.png?width=1856&format=png&auto=webp&s=557f50b14c380ba6255acd356fdb7d26974d71ed
https://redd.it/1tp0qzx
@rStableDiffusion
Regarding Anima, is there a site where we can see the artist styles from both Danbooru and Gelbooru? I read that it uses both, but I'm only seeing sites with the Danbooru artist tags. Is there one with Gelbooru too?
https://redd.it/1tp2h86
@rStableDiffusion
Anima - Goku Transformation Series: From SSJ1 to Mastered Ultra Instinct
https://redd.it/1tp671f
@rStableDiffusion
InvokeAI 6.13 just released, its largest community-driven release ever. Adds full support for Anima & Qwen Image, support for API models (like GPT Image), support for Prompt Expansion & Image To Prompt, lasso & polygon tools, overhauled docs website and more
InvokeAI no longer has a commercial entity backing its development; this release was entirely community-driven, built by 30+ individual volunteers.
https://preview.redd.it/b1n3s1afuo3h1.png?width=2559&format=png&auto=webp&s=cd96c211b7b72f4dbba187e017a2f114512ad97f
Highlights include:
Full Support for Anima
Text to image, image to image, and LoRAs. Support was also added for the ER SDE scheduler. Improved regional guidance support and controlnet support will be added soon.
Full Support for Qwen and Qwen Image Edit
Text to image, image to image, LoRAs, reference image, regional guidance, and controlnet support.
Support for API models such as GPT Image and Nano Banana
If local models ever can't quite do what you need them to do, you can link an API key to an external API service and generate images directly in the canvas. This was originally a feature in the paid commercial version of Invoke (which no longer exists) and was built from scratch for the free community edition.
Support for Prompt Expansion and Image To Prompt
Expand your prompt using an LLM such as Gemma or Qwen Instruct, or convert your image into a prompt.
New Canvas Tools (Lasso, Polygon Tool)
In the last release, the Text and Gradient tools were added. In this release, the available tools continue to expand with the Lasso and Polygon tools.
Extended Multi-User Mode
Multi-user mode now supports creating private or shared boards and workflows.
New Website & New Documentation Site
After the original team behind the commercial entity was hired by Adobe, the website was effectively closed down. In this release, the website and documentation sites have a new coat of paint: https://invoke.ai/
Full release notes: https://github.com/invoke-ai/InvokeAI/releases/tag/v6.13.0
Download: https://github.com/invoke-ai/launcher/releases/tag/v1.8.1
https://redd.it/1tp7e6w
@rStableDiffusion