Trained a ViT model from scratch for auto-tagging

I recently trained a new anime image tagging model. To prep the data, I used SmilingWolf v3 to fix 300k bad tags and fill in 1M missing ones. I also trained an initial baseline model to help identify and add around 30k low-frequency tags.
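The post doesn't show the tag-filling code, but the "fill in missing tags" step of a WD-style multi-label tagger boils down to thresholding per-tag sigmoid scores and unioning the result with the existing tags. A minimal sketch (the function name, threshold value, and tagger output shape are my assumptions, not from the post):

```python
import numpy as np

def fill_missing_tags(logits, tag_names, existing_tags, threshold=0.35):
    """Fill in missing tags from a multi-label tagger's raw per-tag logits.

    logits: raw scores, one per tag (assumed shape: [num_tags])
    tag_names: tag vocabulary aligned with the logits
    existing_tags: tags already present on the image
    threshold: sigmoid-probability cutoff (0.35 is a common default
               for WD-style taggers; the post doesn't state the real value)
    """
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=np.float64)))
    predicted = {name for name, p in zip(tag_names, probs) if p >= threshold}
    # Only add tags the image doesn't already have; never drop existing ones here.
    return sorted(set(existing_tags) | predicted)

# Example: a strong "smile" score gets added, a weak "dog" score does not.
tags = fill_missing_tags([5.0, -5.0, 0.0], ["1girl", "dog", "smile"], ["1girl"])
# → ["1girl", "smile"]
```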

The current V1 model is a 320x320 ViT. V1.1 is currently training at 448x448, and the higher resolution is already improving accuracy. My next goal is to wait for a 2025 dataset, clean it heavily, and train from scratch with better vocab structures (e.g., artist:name).
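At inference time, the 320 → 448 jump only changes the preprocessing side. A minimal sketch of square-pad-and-resize preprocessing for a ViT tagger — the padding color and normalization constants below are assumptions (check the model card for the values the tagger was actually trained with):

```python
import numpy as np
from PIL import Image

def preprocess(img: Image.Image, size: int = 448) -> np.ndarray:
    """Pad to a square on a white background, resize, normalize to [-1, 1].

    The white padding and [-1, 1] scaling are assumptions for illustration,
    not the model's documented preprocessing.
    """
    img = img.convert("RGB")
    side = max(img.size)
    canvas = Image.new("RGB", (side, side), (255, 255, 255))
    canvas.paste(img, ((side - img.width) // 2, (side - img.height) // 2))
    canvas = canvas.resize((size, size), Image.BICUBIC)
    arr = np.asarray(canvas, dtype=np.float32) / 127.5 - 1.0
    return arr[None]  # add batch dimension: (1, size, size, 3)
```

Padding to a square before resizing preserves aspect ratio, which matters for tags tied to composition.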

You can find the model, card, and demo space on HuggingFace: https://huggingface.co/Grio43/OppaiOracle
Live demo of the model: https://huggingface.co/spaces/Grio43/OppaiOracle

CPU-based tagger:
https://huggingface.co/spaces/Grio43/OppaiCPU

https://redd.it/1t8bzb3
@rStableDiffusion
Anyone else using LTX locally on Mac via Draw Things? Here’s a WWII-style short I made.

https://redd.it/1t8lagy
@rStableDiffusion
How I feel after upvoting a post that got downvoted by bots for mentioning Forge Neo.
https://redd.it/1t8oha2
@rStableDiffusion
Wan 2.2 with LTX 2.3 ID-LoRA

This workflow combines the Comfy Wan 2.2 image-to-video workflow with the Comfy LTX 2.3 ID-LoRA workflow. You generate your initial video with Wan 2.2; the result is then automatically run through LTX 2.3, which adds audio to the Wan 2.2 clip and extends it with whatever you want to happen next.

Wan 2.2 image-to-video of Crystal Sparkle throwing a champagne bottle against a yacht to christen it

LTX 2.3 adds the foley audio for the bottle smashing against the boat, and the ID-LoRA adds Crystal Sparkle's actual voice

Here is a link to the workflow: https://huggingface.co/ussaaron/workflows/blob/main/wan2_2_i2v-with-ltx-id-lora.json
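Beyond loading the JSON in the ComfyUI editor, the graph can also be queued against a local ComfyUI instance over its HTTP API. A minimal sketch — the server address is the default, and I'm assuming the file has been exported in API format (ComfyUI's "Save (API Format)" option), since UI-format saves can't be queued directly:

```python
import json
import urllib.request

def build_prompt_payload(workflow: dict, client_id: str = "wan-ltx-demo") -> bytes:
    """Wrap an API-format ComfyUI workflow graph in the body /prompt expects."""
    return json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")

def queue_workflow(path: str, server: str = "http://127.0.0.1:8188") -> dict:
    """POST a saved API-format workflow to a running ComfyUI server."""
    with open(path, "r", encoding="utf-8") as f:
        workflow = json.load(f)
    req = urllib.request.Request(
        f"{server}/prompt",
        data=build_prompt_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

This is how you'd batch-run the Wan → LTX chain without touching the UI for each clip.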

https://redd.it/1t8qloh
@rStableDiffusion