vlo 0.2.0 - A ComfyUI-powered editor designed for complex control [repost with fixed video]

https://redd.it/1to4359
@rStableDiffusion
I built a full AI animation pipeline and made a 2.5 minute animated show in 5 days (Qwen, Flux, LTXV)

https://redd.it/1to7uzx
@rStableDiffusion
Old, forgotten AI model fixes eyes in under 10 min! Forget about the pain of randomness and the lack of quality in new AI models ;)
https://redd.it/1toihvn
@rStableDiffusion
PrismML just released Binary and Ternary Bonsai Image 4B: 1-bit/ternary text-to-image diffusion transformers that can even run 100% locally in your browser on WebGPU.



https://reddit.com/link/1toi5yz/video/y6gh4lxydj3h1/player

The PrismML team really cooked with these models. They're only ~3GB in size (compared to FLUX.2 Klein 4B, which is ~16GB). Apache-2.0!

Official collection on HF: https://huggingface.co/collections/prism-ml/bonsai-image
Link to demo: https://huggingface.co/spaces/webml-community/bonsai-image-webgpu

Originally posted in r/LocalLLaMA. Thank you, xenovatech!

https://redd.it/1toi5yz
@rStableDiffusion
Anima can edit images! And there are two different ways to do it.

# Good afternoon!

Yes, that's true.

https://preview.redd.it/sn84yzrt8l3h1.png?width=1280&format=png&auto=webp&s=421a79b66f346e0335ad9dffac0fd6b2f76ec4a6

After getting interested in this topic, I found two ways to do it.

I'll start with what I found myself:

# 1. Split screen and Anima-lllite-inpainting:

https://preview.redd.it/9d2x8a3s3l3h1.png?width=1440&format=png&auto=webp&s=3acb8abb789f5f3612dc1ab6296c0ac5c2d921dd

This method is similar to what I used for SDXL in my "Consistency characters" workflow: a reference is placed next to the generated image, and the new image is produced with inpainting. It was inspired by IC-LoRAs and a post about the hidden potential of SDXL.
But it doesn't work without additional magic in the form of the "anima-lllite-inpainting-v2" controlnet.
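
To make the split-screen idea concrete, here is a minimal sketch of the canvas geometry only. It is not taken from the workflow itself: the names `SplitLayout`, `split_screen_layout` and `mask_row` are hypothetical, placing the reference on the left is an assumption, and so is the mask convention (1 = inpaint, 0 = keep).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitLayout:
    # All names here are hypothetical; boxes are (left, top, right, bottom).
    canvas_size: tuple[int, int]               # (width, height) of the combined image
    reference_box: tuple[int, int, int, int]   # where the reference image is pasted
    inpaint_box: tuple[int, int, int, int]     # region the inpaint mask should cover


def split_screen_layout(ref_size, height, target_width):
    """Fit the reference to `height` and reserve an empty panel to its right."""
    ref_w, ref_h = ref_size
    if min(ref_w, ref_h, height, target_width) <= 0:
        raise ValueError("all dimensions must be positive")
    # Keep the reference's aspect ratio when scaling it to the canvas height.
    scaled_w = max(1, round(ref_w * height / ref_h))
    canvas_w = scaled_w + target_width
    return SplitLayout(
        canvas_size=(canvas_w, height),
        reference_box=(0, 0, scaled_w, height),
        inpaint_box=(scaled_w, 0, canvas_w, height),
    )


def mask_row(layout):
    """One row of a binary mask: 0 keeps the reference, 1 marks pixels to inpaint."""
    left, _, right, _ = layout.inpaint_box
    width = layout.canvas_size[0]
    return [1 if left <= x < right else 0 for x in range(width)]


layout = split_screen_layout(ref_size=(512, 1024), height=1024, target_width=768)
row = mask_row(layout)
```

The boxes can be handed to any image library to paste the reference and draw the mask; the inpainting itself (and the controlnet mentioned above) still happens in the actual workflow.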

https://preview.redd.it/sbuoirfdel3h1.png?width=1072&format=png&auto=webp&s=deee09e98f681fc7cd347946dae027e33d9f8da5

It's still a bit unstable and may not work at all. But, spoiler: it is the most adaptable method, letting you not only change clothes or facial expressions but also completely change the pose.

The more changes you make, the fewer of the character's details are preserved.

# 2. Apply Cosmos Reference Latent + Edit lora

https://preview.redd.it/v719wl8ael3h1.png?width=1891&format=png&auto=webp&s=52f4e079848a9ed087a42969dca5c8c53e2fe717

Yesterday I saw two different LoRAs that implement image editing via Reference Latent: one from mattehe (AnimaEditV1), the other from GOOKLE (lora_edit_ZeroTwo). I like the LoRA from GOOKLE better; the one from mattehe is a bit overcooked.

But the problem with these LoRAs is that their training data was mainly about dressing/undressing, so they hardly change the character's pose.

See the third hand?

I also want to note that it works better to dress a naked character, because these LoRAs have problems with clothes already present on the character's body.

https://preview.redd.it/vduyjxm5gl3h1.png?width=2200&format=png&auto=webp&s=8519449cec593fe5888f4f53904fe7acf32ae9e1

But they dress the characters well:

Yes, I see a third hand.

And also facial expressions:

https://preview.redd.it/yt9zo3hshl3h1.png?width=1960&format=png&auto=webp&s=be5e9e9825b6805ab88bfaa5bd560e29df6a023a

# Conclusions:

Overall, both approaches are capable. I will keep an eye on updates to these LoRAs, and it is also possible that someone will train an IC-LoRA for Anima.

# Link to the workflow for tests

https://redd.it/1totumo
@rStableDiffusion
I created a Microsoft Lens (open-source) | Standalone App for you to try - 4090 HD generation in about 2 seconds after initial model load
https://redd.it/1tos2f1
@rStableDiffusion
WAN2.2 - DaSiWa or Remix??

This thread is not intended to advertise any model, so I won't post links. I just want to hear everyone's opinion on how to use these two models.
Setting N-SFW aside: are Remix (3.0) and DaSiWa (v10; I haven't tested v11) specialized models for a certain video genre, or are they versatile video models for many different types of videos?
On each model's CivitAI page, the owners show very different examples. For Remix, FX_FeiHou shares example videos of real people, while for DaSiWa, Darksidewalker uses anime-style example videos. Is this a fair reading of the intended use of the two models? What's everyone's opinion?
If you need to create daily-life, real-life, or documentary-style videos, which model should you use?
I'm really impressed with the GalaxyACE LoRA with LTX2.3, although the prompting is torture. Hopefully the creator of the GalaxyACE LoRA will release a version for Wan2.2 soon.

https://redd.it/1tot9zu
@rStableDiffusion