Crucible - local open-source application for dataset handling
Hi,
I've created Crucible, a local dataset management app aimed at diffusion models. No cloud, no subscriptions, runs on your own hardware. I developed it for myself but decided to open-source it.
Video showcase: https://www.youtube.com/watch?v=Ig4j5ijovCI
Github: https://github.com/Blandmarrow/Crucible
Key features:
Caption images in batch using local ML models (Ollama, Florence-2, PaliGemma-2)
Score every image across aesthetic quality, technical quality, watermark detection, and style similarity
ML upscaling and LUT color grading
Filter & curate via search, quality flags, and score ranges
Batch edit captions, crops, and resizes
Version datasets with named snapshots and branches — restore any prior state
Object detection and phrase grounding via Florence-2 bounding-box detection
Built-in file browser with generation metadata preview (A1111 + ComfyUI)
Export to Kohya, AI Toolkit, or plain folder with per-export filtering and resizing
Split view — run any two pages side-by-side
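To make the "filter & curate via score ranges" idea concrete, here is a minimal Python sketch of that kind of filtering. It is not Crucible's actual code or API: the `ImageRecord` fields, the 0 to 1 score scale, and the `filter_by_scores` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ImageRecord:
    # All fields are hypothetical; Crucible's real data model may differ.
    path: str
    caption: str
    aesthetic: float   # assumed score in [0, 1], higher is better
    technical: float   # assumed score in [0, 1], higher is better
    watermark: float   # assumed probability that a watermark is present


def filter_by_scores(records, min_aesthetic=0.0, min_technical=0.0, max_watermark=1.0):
    """Keep only records whose scores fall inside the requested ranges."""
    return [
        r for r in records
        if r.aesthetic >= min_aesthetic
        and r.technical >= min_technical
        and r.watermark <= max_watermark
    ]


records = [
    ImageRecord("a.png", "a cat on a sofa", 0.90, 0.80, 0.05),
    ImageRecord("b.png", "a dog in a park", 0.40, 0.90, 0.02),   # low aesthetic score
    ImageRecord("c.png", "a bird in flight", 0.95, 0.90, 0.70),  # likely watermarked
]

kept = filter_by_scores(records, min_aesthetic=0.6, min_technical=0.5, max_watermark=0.2)
print([r.path for r in kept])  # ['a.png']
```

The same predicate could be applied at export time to get per-export filtering without modifying the underlying dataset.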
I'll keep updating it as my own workflow evolves. I'd love feedback on what's missing, particularly features and integrations you'd find useful.
I also have some automated workflows planned for creating datasets and training on them with this application. There's nothing concrete to show yet, but let me know if you'd be interested in that.
https://preview.redd.it/njfdatfpl23h1.png?width=1908&format=png&auto=webp&s=4631099ad036f269590e0273fde5f6d0fa48b459
https://preview.redd.it/xobpp25ql23h1.png?width=3835&format=png&auto=webp&s=9cc09b5b9143a63cb19e67ea02278d4c6c1c4dc4
https://preview.redd.it/tw4qe3jrl23h1.png?width=1920&format=png&auto=webp&s=6c88f0b00824efae9c553bfd7a5aba2fc785949c
https://redd.it/1tmb7n7
@rStableDiffusion
SD-WebUI-Codex + "Z-Image 6B with pixel space gen. No VAE.." thread
https://redd.it/1tmfc4a
@rStableDiffusion