UniGenDet - A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection.

https://preview.redd.it/9fl7fg1l25yg1.png?width=2870&format=png&auto=webp&s=2f9a3e9832717e9320ec424c2bead3efeedf04cb

Image generation and generated-image detection have both advanced rapidly, but mostly along separate technical paths: generation is dominated by generative architectures, while detection is dominated by discriminative ones. This separation creates a persistent gap in practice: generators are not directly optimized by forensic criteria, and detectors are often trained on static snapshots of old forgeries, which limits robustness to new generators.

UniGenDet addresses this gap with a unified co-evolutionary framework that jointly optimizes generation and detection in one loop. The core idea is to make both tasks explicitly exchange useful signals instead of evolving independently.

* **Symbiotic multimodal self-attention** bridges generation and authenticity understanding in a shared architecture.
* **Generation-detection unified fine-tuning (GDUF)** equips the detector with generative priors, improving generalization and interpretability.
* **Detector-informed generative alignment (DIGA)** feeds authenticity constraints back into synthesis, improving realism and fidelity.

In short, UniGenDet turns the traditional "generator vs. detector" arms race into a closed-loop collaboration. This repository provides the full training and evaluation pipeline built on pretrained BAGEL components.
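The closed loop can be pictured with a toy sketch. Everything here is illustrative, not the UniGenDet/BAGEL API: a scalar "generator" and "detector" stand in for the real models so the signal exchange (detector feedback into synthesis, fresh generator output into detection) is visible in a few lines.

```python
# Toy sketch of a co-evolutionary generation-detection loop.
# Names and update rules are assumptions for illustration only.
import random

def co_evolve(steps=2000, lr=0.05, seed=0):
    rng = random.Random(seed)
    gen_param = 3.0   # "generator": fake = gen_param * z
    det_param = 0.0   # "detector": authenticity penalty |x - det_param|
    for _ in range(steps):
        real = rng.gauss(1.0, 0.1)   # "real" data, mean 1.0
        z = rng.gauss(1.0, 0.1)      # generator input noise
        fake = gen_param * z

        # Detection side (GDUF-flavoured): the detector keeps training
        # against current real data rather than a static snapshot; here
        # reduced to tracking the real-data mean as its notion of "authentic".
        det_param += lr * (real - det_param)

        # Generation side (DIGA-flavoured): the generator descends the
        # detector's authenticity penalty |fake - det_param|; the
        # subgradient w.r.t. gen_param is sign(fake - det_param) * z.
        gen_param -= lr * (1.0 if fake > det_param else -1.0) * z
    return gen_param, det_param
```

Run long enough, both parameters settle near 1.0: the detector converges to its model of authentic data, and the generator aligns with that authenticity signal instead of optimizing in isolation — the collaboration the framework describes, in miniature.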


HF: [Yanran21/UniGenDet · Hugging Face](https://huggingface.co/Yanran21/UniGenDet)

GH: [Zhangyr2022/UniGenDet](https://github.com/Zhangyr2022/UniGenDet)

https://redd.it/1sz0ci4
@rStableDiffusion
Anima LoRA Training Config Recommendations?

I've been trying to train an Anima Style LoRA, but thus far the results have been... lackluster. The first was okay; I might just not have liked it because of its simplistic art style.

I've been using Adam48bitKhan with Rex Annealing Warm Restarts, but I'm not very familiar with Adam, as I've let Adafactor do all the work up till now.

I see people recommend low learning rates with no text encoder training, but all of them have over 200 images while I have 50. Any time I've tried a low learning rate with that few images, it looks terrible.

I've tried finding other configs, but most people strip the metadata these days, so I can't figure out what anybody is actually doing.

Any help would be much appreciated!

https://redd.it/1sz8y14
@rStableDiffusion