New on Replicate: FLUX.1 Kontext – edit images with just a text prompt

We just launched FLUX.1 Kontext, an image editing model where you *describe* the edit you want, and it applies it directly to your image. You can:

* Change a person’s hairstyle, outfit, or expression
* Transform a photo into a pencil sketch or pop art
* Edit signs and labels by quoting the exact text
* Change scenes while keeping subjects in place
* Maintain character identity across multiple edits

Demo and API available now: [https://replicate.com/black-forest-labs/flux-kontext-pro](https://replicate.com/black-forest-labs/flux-kontext-pro) 
Blog with examples: [https://replicate.com/blog/flux-kontext](https://replicate.com/blog/flux-kontext)
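
If you want to try the API, here is a minimal sketch using the Replicate Python client. The input field names (`prompt`, `input_image`) are assumptions, so check the model page for the exact input schema.

```python
# Minimal sketch: edit an image with FLUX.1 Kontext via the Replicate API.
# Requires `pip install replicate` and REPLICATE_API_TOKEN set in the environment.
# Input field names ("prompt", "input_image") are assumed; see the model page for the exact schema.
import replicate

output = replicate.run(
    "black-forest-labs/flux-kontext-pro",
    input={
        "prompt": "Change the car to red",              # describe the edit in plain text
        "input_image": "https://example.com/car.png",   # image to edit (URL or local file)
    },
)
print(output)  # typically a URL or file-like object pointing to the edited image
```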

https://redd.it/1kyugrb
@rStableDiffusion
Unpopular Opinion: Why I am not holding my breath for Flux Kontext

There are reasons why Google and OpenAI use autoregressive models for their image editing. Image editing requires multimodal capability and alignment: you need LLM-level understanding of the editing instruction and an image-understanding model to identify what is in the image. But that alone isn't enough; the hard part is passing that understanding to the image generation model accurately enough for it to translate and complete the task. Since the other modalities are autoregressive, an autoregressive image generation model makes it easier to align the editing task.

Let's consider the case of Ghiblifying an image. The image-understanding model may identify what's in the picture, but how do you translate that into a conditioning signal? It can generate a detailed prompt; however, many details, such as character appearances, clothes, poses, and background objects, are hard to describe or to convey accurately in a prompt. This is where the autoregressive model comes in, as it predicts the output pixel by pixel for the task.

Given that Flux is a diffusion model with no multimodal capability, this seems to imply that there are other models involved, such as an image-understanding model and an editing-task model (possibly a LoRA), in addition to the finetuned Flux model and the deployed toolset.

So, releasing a Dev model is only half the story. I am curious what they are going to do. Lump everything together and distill it? Also, image editing requires far greater flexibility than image generation. So what is a distilled model going to do? Pretend that it can do it?

To me, a distilled dev model is just a marketing gimmick to bring people over to their paid service. And that could potentially work, as people will be so frustrated with the model that they may be willing to fork over money for something better. This is the reason I am not going to waste a second of my time on this model.

I expect this to be downvoted to oblivion, and that's fine. However, if you don't like what I have to say, would it be too much to ask you to point out where things are wrong?

https://redd.it/1kyyast
@rStableDiffusion
Finally!! DreamO now has a ComfyUI native implementation.
https://redd.it/1kz2qa0
@rStableDiffusion
Do not buy API credits inside ComfyUI. My card got charged $380 later.

I bought API credits inside ComfyUI Desktop today to test out the new Flux Kontext model. I clicked the "Purchase Credits" button in Comfy's settings, paid through the pop-up window (Stripe), got $5 worth of credits, and everything worked fine.

A couple of hours later, my card was charged $380 with a nonsense description like "LinkedIn.com IE something something." I don’t have my card linked to any LinkedIn account, and obviously, there’s no reason they’d randomly charge $380 for something.

I’m not the type of person who gives out card info easily. I haven’t bought anything else with this card recently, and nothing like this has ever happened to me before. The only purchase I made today was for those credits—and a few hours later, this happened. Coincidence? Idk, but be careful. I filed a report with my bank, let’s see what they say.

Has anyone had this happen?

https://redd.it/1kz89d0
@rStableDiffusion