encoder-only version of T5-XL

Kinda old tech by now, but I figure it still deserves an announcement...

I just made an "encoder-only" slimmed down version of the T5-XL text encoder model.

Use it with:

from transformers import T5EncoderModel

encoder = T5EncoderModel.from_pretrained("opendiffusionai/t5-v1_1-xl-encoder-only")


I had previously found that a version of T5-XXL is available in encoder-only form. But surprisingly, not T5-XL.

This may be important to folks building their own models, because while T5-XXL outputs 4096-dim embeddings, T5-XL outputs 2048-dim embeddings.
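If you want to sanity-check the embedding width yourself, here is a minimal sketch (the google/t5-v1_1-xl tokenizer checkpoint and the example prompt are my own assumptions, not part of the release):

import torch
from transformers import AutoTokenizer, T5EncoderModel

# Tokenizer from the stock Google checkpoint (assumed compatible);
# encoder weights from the encoder-only repo above.
tokenizer = AutoTokenizer.from_pretrained("google/t5-v1_1-xl")
encoder = T5EncoderModel.from_pretrained("opendiffusionai/t5-v1_1-xl-encoder-only")

inputs = tokenizer("a photo of a cat", return_tensors="pt")
with torch.no_grad():
    out = encoder(**inputs)

# T5-XL's hidden size is 2048, so this prints torch.Size([1, <seq_len>, 2048]);
# T5-XXL would give 4096 in the last dimension.
print(out.last_hidden_state.shape)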

And unlike many other models... T5 has an Apache 2.0 license.

Fair warning: the T5-XL model itself is also smaller, roughly 3B params vs 11B for T5-XXL. But if you want it, it is now available as above.



https://redd.it/1lbquj7
@rStableDiffusion
Chroma V37 is out (+ detail calibrated)
https://redd.it/1lbvooi
@rStableDiffusion
Best Open Source Model for text to video generation?

Hey. When I looked it up, the last time this question was asked on the subreddit was 2 months ago. Since the space is fast-moving, I thought it was appropriate to ask again.

What is the best open-source text-to-video model currently? The opinion from the last post on this subject was that it's Wan 2.1. What do you think?

https://redd.it/1lbw9e2
@rStableDiffusion
Where is FLUX.1 Kontext dev?

Did I miss the "open" weights version, or did they forget to release it? I understand we are not entitled to anything, and they can simply not release it at all if they don't want to; that's fine by me. But when you announce it is coming "soon" and two weeks later there is no model, I feel the community is being used to hype closed models for free.

And no, being able to use an API through a node/app is not local. It is online generation with hidden/extra steps.

https://redd.it/1lbxjrr
@rStableDiffusion
Wan 2.1 LoRAs working with Self Forcing DMT would be something incredible

I have been absolutely losing sleep the last day playing with Self Forcing DMT. This thing is beyond amazing, and major respect to the creator. I quickly gave up trying to figure out how to use LoRAs with it. I am hoping (and praying) somebody here on Reddit is trying to figure out how to do this. I am not sure which Wan model Self Forcing is trained on (I'm guessing the 1.3B one). If anybody here has the scoop on this becoming possible soon, or if I just missed the boat and it's already possible, please spill the beans.

https://redd.it/1lc0pab
@rStableDiffusion
Any interest in a ComfyUI for dummies? (web/mobile app)

Hey everyone! I am tinkering with GiraffeDesigner. The TL;DR is "ComfyUI for dummies" that works pretty well on web and mobile.

Gemini is free to use; for OpenAI and fal.ai you can just insert your API key.

Curious whether the community finds this interesting. What features would you like to see? I plan to keep the core product free; any feedback appreciated :)

https://redd.it/1lc5qbf
@rStableDiffusion
SD 3.5 is apparently fast now, good for SFW images?

With the recent announcements about SD 3.5 getting a speed boost and lower memory requirements on new Nvidia cards, is it worth looking into for SFW gens? I know this community was down on it, but is there any upside now that the faster / bigger models are more accessible?

https://redd.it/1lcaj3w
@rStableDiffusion
Vace FusionX + background img + reference img + controlnet + 20 x (video extension with Vace FusionX + reference img). Just to see what would happen...

https://redd.it/1lccl41
@rStableDiffusion