sand-ai has just released the small 4.5B version of MAGI-1. Anyone tried it yet?
https://huggingface.co/sand-ai/MAGI-1/tree/main/ckpt/magi
https://redd.it/1kuz351
@rStableDiffusion
Are Diffusion Models Fundamentally Limited in 3D Understanding?
So if I understand correctly, Stable Diffusion is essentially a denoising algorithm. This means that all models based on this technology are, in their current form, incapable of truly understanding the 3D geometry of objects.
As a result, they would fail to reliably convert a third-person view into a first-person perspective, or to change the viewing angle of a scene, without introducing hallucinations or inconsistencies.
Am I wrong in thinking this way?
Edit: so they can't be used for editing existing images/videos, only for generating new content?
Edit: after thinking about it, I think I found where I was wrong. I was imagining a one-step scene-angle transition, like jumping straight from a 3D scene to the first-person view of someone inside it. Clearly that won't work in one step. But if we let the model render all the steps in between, effectively using the time dimension, then it should be able to do that accurately.
I would be happy if someone could illustrate this with an example.
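For intuition, here is a toy, purely illustrative sketch of what "essentially a denoising algorithm" means: a DDPM-style sampling loop that starts from pure Gaussian noise and repeatedly subtracts the model's noise estimate. Note there is no explicit 3D scene representation anywhere in the loop; any view consistency has to come from whatever the noise predictor has learned (here it is just a dummy stand-in so the code runs):

```python
# Toy DDPM-style sampler. Geometry only "exists" implicitly
# in the weights of eps_model -- the loop itself just denoises.
import numpy as np

def sample(eps_model, shape, timesteps=50, rng=None):
    rng = rng or np.random.default_rng(0)
    x = rng.standard_normal(shape)            # x_T: pure Gaussian noise
    betas = np.linspace(1e-4, 0.02, timesteps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    for t in reversed(range(timesteps)):
        eps = eps_model(x, t)                 # predicted noise at step t
        # DDPM posterior mean: remove the predicted noise contribution
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                             # re-inject noise except at the final step
            x += np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

# Dummy "model" that predicts zero noise, just to exercise the loop:
out = sample(lambda x, t: np.zeros_like(x), shape=(8, 8))
```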
https://redd.it/1kv12pw
@rStableDiffusion
Am I the only one who feels like they have an AI drug addiction?
Seriously. Between all the free online AI resources (GitHub, Discord, YouTube, Reddit) and having a system that can run these apps fairly decently (5800X, 96GB RAM, 4090 with 24GB VRAM), I feel like a kid in a candy store... or a crack addict in a free crack store? I get to download all kinds of amazing AI applications FOR FREE, many of which you can even use commercially for free. I feel almost like I have an AI problem and I need an intervention... but I don't want one :D
https://redd.it/1kv5i6f
@rStableDiffusion
Can Open-Source Video Generation Realistically Compete with Google Veo 3 in the Near Future?
https://redd.it/1kuwmzn
@rStableDiffusion
FREE ComfyUI Workflows + Guide | Built For Understanding, Not Just Using
https://redd.it/1kva7hz
@rStableDiffusion
I Added Native Support for Audio Repainting and Extending in ComfyUI
https://redd.it/1kvbifv
@rStableDiffusion
PSA: Flux LoRAs work EXTREMELY well on Chroma. Like very, VERY well
Tried a couple and, well, saying I was mesmerized is an understatement.
Plus Chroma is fully uncensored, so... uh, yeah.
https://redd.it/1kvenmw
@rStableDiffusion
How come Jenga is not talked about here?
https://github.com/dvlab-research/Jenga
This looks like an amazing piece of research: training-free efficient video generation via dynamic token carving, enabling Hunyuan (and soon WAN 2.1) at a much lower cost. They managed to speed up Hunyuan t2v generation by 10x and i2v by 4x. Excited to see what's gonna go down with WAN 2.1 in this project.
https://redd.it/1kvfauk
@rStableDiffusion
What is the best alternative to CivitAI now for browsing checkpoints, LoRAs, etc.?
https://redd.it/1kvg8rg
@rStableDiffusion
I Just Open-Sourced 10 Camera Control Wan LoRAs & made a free HuggingFace Space
https://redd.it/1kviphp
@rStableDiffusion
Simple, uncensored model sharing site like early Civitai. Would you use it?
What do you guys think about a simple, uncensored model sharing site like early Civitai with no generation, no paywalls, just filters, tags, and clean search for models?
I want to build it, but to be honest, I can’t fund it right now. I’ve been unemployed for about 3 years and currently live on food stamps.
Estimated costs:
- ~$15/month for 1TB on Cloudflare R2
- Bandwidth depends on traffic
- $10/year for a Namecheap domain
- Everything else runs on free tiers (rough math sketched below)
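For what it's worth, those numbers roughly check out. A quick back-of-envelope sketch; the $0.015/GB-month storage rate and free egress reflect Cloudflare R2's published pricing as I understand it, so treat both as assumptions:

```python
# Back-of-envelope hosting cost check (assumed R2 pricing).
R2_STORAGE_PER_GB_MONTH = 0.015   # USD; R2's published storage rate (assumption)
DOMAIN_PER_YEAR = 10.0            # USD; Namecheap estimate from the post

def monthly_cost(storage_gb: float) -> float:
    # R2 charges for storage but not egress, so bandwidth is ~free
    return storage_gb * R2_STORAGE_PER_GB_MONTH

print(f"1 TB storage: ${monthly_cost(1000):.2f}/month")            # ~$15.00
print(f"Yearly total: ${monthly_cost(1000) * 12 + DOMAIN_PER_YEAR:.2f}")
```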
I plan to keep it fully transparent, with a /costs page showing real usage, bills, and how long donations cover it. If it grows too big, I’ll ask the community for help or add optional premium features (like faster downloads), but core browsing and downloading will always stay free.
I believe people respect honesty more than surprise paywalls, so I want to be upfront from the start.
Still figuring out if there’s enough interest to make it worth doing. If you’d use it or want to help, please let me know.
PS. Very ashamed self-promo: you can check out what I made here: http://lucyradio.com/get. It's a tiny music app for lofi, jazz, deep house, etc., with all images AI-generated locally. So yeah, I owe a debt to the community for the checkpoints and LoRAs I used.
Thanks u/Fresh_Diffusor for encouraging me to post here.
https://redd.it/1kvjt40
@rStableDiffusion
AccVideo (accelerating video diffusion models with a synthetic dataset) released their weights for Wan 14B. Kijai has already made an FP8 version too.
https://github.com/aejion/AccVideo
https://redd.it/1kvrfuq
@rStableDiffusion
"Decay" AI generated music video
https://youtu.be/dVEQRAAYo94
https://redd.it/1kvu8fs
@rStableDiffusion
WAN VACE 14B in ComfyUI: The Ultimate T2V, I2V & V2V Video Model
Workflow: http://www.aiverseblog.site/2025/05/blog-post_26.html
https://youtu.be/EsOmNkPTl1g
https://redd.it/1kvum9w
@rStableDiffusion
Can I make my SD run slower on purpose?
My GPU is very loud when running Stable Diffusion. SD takes about 30 seconds to finish an image.
Is it possible to make SD run more gently, like when I'm playing a game, even if that makes it take longer to finish an image?
I don't mind waiting longer.
Thanks a lot!
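Two common approaches: cap the card's power limit with the driver tool (nvidia-smi -pl <watts>, needs admin rights), or, if you run SD through Python, throttle the sampler yourself. A minimal sketch with diffusers, assuming a version recent enough to have the callback_on_step_end hook; the sleep value, prompt, and model ID are just placeholders:

```python
# Trade speed for noise: sleep briefly between denoising steps so the
# GPU duty-cycles instead of running flat out. (UI frontends like
# A1111/ComfyUI won't expose this hook directly.)
import time
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def throttle(pipe, step, timestep, callback_kwargs):
    time.sleep(0.25)   # pause per denoising step; tune to taste
    return callback_kwargs

image = pipe("a quiet mountain lake", callback_on_step_end=throttle).images[0]
image.save("out.png")
```

The power-limit route is usually cleaner since it also quiets every other GPU workload, but the callback keeps the change scoped to SD alone.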
https://redd.it/1kvu922
@rStableDiffusion
The censorship and paywall gatekeeping behind Video Generative AI is really depressing. So much potential, so little freedom
We live in a world where every corporation desires utmost control over its product. We also live in a world where, for every person who sees that as wrong, there are 10-20 people defending these practices and another 100-200 on top of that who neither understand nor notice what is going on.
Google, Kling, Vidu: they all have such amazingly powerful tools, yet these tools keep getting more and more censored, and more and more out of reach for the average consumer.
My take is: so what if somebody uses these tools to make illegal "porn" for personal satisfaction? It's all fake; no real human beings are harmed. And no, the training data isn't equivalent to taking images of existing people and putting them in compromising positions or situations, unless celebrity LoRAs with 100% likeness, or LoRAs/images of existing people, are used. That is difficult to control, sure, but ultimately it's a small price to pay for having complete and absolute freedom of choice, freedom of creativity, and freedom of expression.
Artists capable of photorealistic art can still draw photorealism; if they have twisted desires, they will take the time to draw themselves something twisted, and if they don't, they won't. But regardless, paint, brushes, paper, canvas, and other art tools are not censored.
AI might have a lower skill barrier on the surface, but creating cohesive, long, well-put-together videos or images with custom framing, colors, lighting, and individual, specific positions and expressions for each character requires time and skill too.
I don't like where AI is going; it's just another amazing thing that is slowly being taken away and destroyed by corporate greed and corporate control.
I have zero interest in the statements of people who defend these practices; not a single word you say interests me, nor will I accept it. All I see is wonderfully creative tools being dangled in front of us, then taken away, while the local and free alternatives are starting to severely lag behind.
https://redd.it/1kw28p7
@rStableDiffusion