The issue of repetitive compositions in ANIMA.
Is anyone else having this issue? Every time I enter a prompt, the composition ends up almost identical. It lacks the randomness you get in Illustrious or NAI.
Anyone know a good way to improve this?
https://preview.redd.it/t790dskfna1h1.png?width=590&format=png&auto=webp&s=1de07356f73d4615f3cdfd00a3a8072840378209
https://preview.redd.it/bf8oyjxzma1h1.png?width=603&format=png&auto=webp&s=3b16a80daa72d4705c6b7e42cca5c928267aa57e
https://redd.it/1tdu9s9
@rStableDiffusion
It appears that Microsoft uploaded an image model on HuggingFace and then deleted it.
https://redd.it/1tdxf4t
@rStableDiffusion
I guess this happened a week after Riker rickrolled the ship. With a special ending, lol.
https://redd.it/1te17ky
@rStableDiffusion
Anima is in the process of being added to diffusers
https://github.com/huggingface/diffusers/pull/13732
Hopefully support in major trainers like OneTrainer will follow.
With all due respect to diffusion-pipe, its bucketing is a head-scratcher, and after the issues that have been reported I don't really trust the standalone trainers based on kohya-ss, nor do I want a stack of them.
https://redd.it/1te2d2i
@rStableDiffusion
GitHub
Add Anima modular pipeline by rmatif · Pull Request #13732 · huggingface/diffusers
What does this PR do?
Adds modular-only support for Anima, a text-to-image model built on top of the Cosmos Predict2 DiT architecture.
This PR adds:
AnimaModularPipeline and AnimaAutoBlocks
AnimaT...
Microsoft lens is under 4B params. The trend is toward fewer params...
OK, they have retired it. It was 3.8B, IIRC. In any case, there seems to be a trend toward smaller and smaller models that still manage to keep getting better.
My 12GB card loves it. Let's keep up the good work.
https://redd.it/1te4ieu
@rStableDiffusion
Pixal3D: Generate high-fidelity 3D assets from a single image. (TencentARC, locally runnable model)
https://huggingface.co/TencentARC/Pixal3D
"Pixal3D generates high-fidelity 3D assets from a single image. Unlike previous methods that loosely inject image features via attention, Pixal3D explicitly lifts pixel features into 3D through back-projection, establishing direct pixel-to-3D correspondences. This enables near-reconstruction-level fidelity with detailed geometry and PBR textures."
Looks like no one mentioned this in the sub, so here's everyone's notification.
Some fast points:
* It's a locally runnable model
* I got it working on an RTX 5090 by yelling "Fix it!" at Claude over and over like Philip J. Fry. (This works on most models by the way, I suggest you try it if you have Claude and want to try local models before Comfy's team gets around to it)
* To my eyes, this looks like a step up from raw Trellis.2, but don't take my word for it. There's an online demo, so give it a go.
Please note that it did take a good amount of time getting creative with the yelling-at-claude part, with me having to make some judgment calls and give it advice about how to proceed. But tenacity paid off for me, and I figure it will pay off for anyone else who cares to put in the effort, at least until someone makes a more broadly available guide.
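For anyone wondering what "back-projection" means in that quote, here's a minimal sketch of the general idea under a plain pinhole-camera assumption. This is NOT Pixal3D's actual code. The function names and the intrinsics (fx, fy, cx, cy) are made up for illustration, and the real model lifts learned pixel features rather than bare coordinates.

```python
# Generic pinhole back-projection sketch (assumption: NOT Pixal3D's implementation).
# fx, fy are focal lengths in pixels; cx, cy is the principal point. All hypothetical.

def back_project(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with a known depth to a 3D point in camera space."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def project(point, fx, fy, cx, cy):
    """Inverse step: project a 3D camera-space point back to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


if __name__ == "__main__":
    intrinsics = (500.0, 500.0, 320.0, 240.0)  # made-up camera parameters
    p = back_project(400, 300, 2.0, *intrinsics)
    print("3D point:", p)
    print("reprojected pixel:", project(p, *intrinsics))
```

The point is that every pixel maps to exactly one 3D location (and back), which is the "direct pixel-to-3D correspondence" the description contrasts with injecting image features through attention.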
https://redd.it/1te93yi
@rStableDiffusion
stable-diffusion-webui-codex v0.3.0-beta is live (now with link 😅)
https://redd.it/1te4zvv
@rStableDiffusion
Playing with Anima Base 1.0 + Flux.2 Klein 9b + Wan 2.2 (No Audio)
https://redd.it/1teatpv
@rStableDiffusion