Comparing a Few Different Upscalers in 2025

I find upscalers quite interesting, as their intent is both to restore an image and to make it larger. Of course, many folks are familiar with SUPIR, which is widely considered the gold standard. I wanted to test out a few different closed- and open-source alternatives to see where things stand at the moment. Now including UltraSharpV2, Recraft, Topaz, Clarity Upscaler, and others.

The way I wanted to evaluate this was by testing 3 different types of images: portrait, illustration, and landscape, and seeing which general upscaler performed best across all three.

**Source Images:**

* Portrait: [https://unsplash.com/photos/smiling-man-wearing-black-turtleneck-shirt-holding-camrea-4Yv84VgQkRM](https://unsplash.com/photos/smiling-man-wearing-black-turtleneck-shirt-holding-camrea-4Yv84VgQkRM)
* Illustration: [https://pixabay.com/illustrations/spiderman-superhero-hero-comic-8424632/](https://pixabay.com/illustrations/spiderman-superhero-hero-comic-8424632/)
* Landscape: [https://unsplash.com/photos/three-brown-wooden-boat-on-blue-lake-water-taken-at-daytime-T7K4aEPoGGk](https://unsplash.com/photos/three-brown-wooden-boat-on-blue-lake-water-taken-at-daytime-T7K4aEPoGGk)

To control for this, I'm effectively taking a large-scale image, shrinking it down, then blowing it back up with an upscaler. This way, I can see how each upscaler alters the image in the process.
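For reference, the prep step is just a plain resize; a minimal sketch of the round trip (filenames are placeholders):

```python
from PIL import Image

# Shrink the full-resolution source 4x; this degraded copy is what
# every upscaler in the comparison receives as input.
src = Image.open("portrait_full.jpg")  # placeholder filename
small = src.resize((src.width // 4, src.height // 4), Image.LANCZOS)
small.save("portrait_small.png")

# ...run portrait_small.png through the upscaler under test at 4x...
# then compare the result against portrait_full.jpg side by side.
```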

**UltraSharpV2:**

* Portrait: [https://compare.promptingpixels.com/a/LhJANbh](https://compare.promptingpixels.com/a/LhJANbh)
* Illustration: [https://compare.promptingpixels.com/a/hSwBOrb](https://compare.promptingpixels.com/a/hSwBOrb)
* Landscape: [https://compare.promptingpixels.com/a/sxLuZ5y](https://compare.promptingpixels.com/a/sxLuZ5y)

**Notes:** Using a simple ComfyUI workflow to upscale the image 4x and that's it; no sampling or Ultimate SD Upscale. It's free, local, and quick: about 10 seconds per image on an RTX 3060. The portrait and illustration outputs look phenomenal and are fairly close to the original full-scale image ([portrait original vs upscale](https://compare.promptingpixels.com/a/5zTVu25)).

However, the upscaled landscape output looked painterly compared to the original. Details are lost and a bit muddied. Here's an [original vs upscaled comparison](https://compare.promptingpixels.com/a/M4JYvjh).
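If you want to run the same plain model pass outside ComfyUI, the spandrel loader (which ComfyUI itself uses internally) works too. A rough sketch, assuming the checkpoint filename and a CUDA card:

```python
import numpy as np
import torch
from PIL import Image
from spandrel import ImageModelDescriptor, ModelLoader

# Load the checkpoint (the filename here is an assumption).
model = ModelLoader().load_from_file("4xUltraSharpV2.pth")
assert isinstance(model, ImageModelDescriptor)
model.cuda().eval()

# PIL image -> BCHW float tensor in [0, 1].
img = np.asarray(Image.open("portrait_small.png").convert("RGB")) / 255.0
x = torch.from_numpy(img).float().permute(2, 0, 1).unsqueeze(0).cuda()

# One forward pass produces the 4x result; no diffusion sampling involved.
with torch.no_grad():
    y = model(x).clamp(0, 1)

out = (y[0].permute(1, 2, 0).cpu().numpy() * 255).astype(np.uint8)
Image.fromarray(out).save("portrait_4x.png")
```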

**UltraSharpV2 (w/ Ultimate SD Upscale + Juggernaut-XL-v9):**

* Portrait: [https://compare.promptingpixels.com/a/DwMDv2P](https://compare.promptingpixels.com/a/DwMDv2P)
* Illustration: [https://compare.promptingpixels.com/a/OwOSvdM](https://compare.promptingpixels.com/a/OwOSvdM)
* Landscape: [https://compare.promptingpixels.com/a/EQ1Iela](https://compare.promptingpixels.com/a/EQ1Iela)

**Notes:** Takes nearly 2 minutes per image (depending on input size) to scale up to 4x. Quality is slightly better than the plain upscale model, but the difference is very small given the inference time. The plain upscaler model seems to keep more natural details, whereas Ultimate SD Upscale may smooth out textures; this is very much model and prompt dependent, though, so it's highly variable.

Using [Juggernaut-XL-v9 (SDXL)](https://huggingface.co/RunDiffusion/Juggernaut-XL-v9) with the denoise set to 0.20 and 20 steps in [Ultimate SD Upscale](https://github.com/ssitu/ComfyUI_UltimateSDUpscale).
[Workflow Link (Simple Ultimate SD Upscale)](https://raw.githubusercontent.com/content-and-code/promptingpixels/refs/heads/main/docs/comfy_workflows/Upscaling-with-Ultimate-SD-Upscale.json)
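Ultimate SD Upscale works by tiling the upscaled image and running low-denoise img2img over each tile with the chosen checkpoint. For intuition, here is a rough, non-tiled approximation of that refinement pass in diffusers; it assumes the HF repo ships diffusers-format weights, and the real workflow tiles precisely to keep VRAM manageable:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionXLImg2ImgPipeline

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "RunDiffusion/Juggernaut-XL-v9", torch_dtype=torch.float16
).to("cuda")

upscaled = Image.open("portrait_4x.png")  # output of the plain 4x model pass
refined = pipe(
    prompt="photograph, high detail",  # placeholder; use your own prompt
    image=upscaled,
    strength=0.20,           # the 0.20 denoise used in the workflow
    num_inference_steps=20,  # diffusers runs about strength * steps actual steps
).images[0]
refined.save("portrait_refined.png")
```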

**Remacri:**

* Portrait: [https://compare.promptingpixels.com/a/Iig0DyG](https://compare.promptingpixels.com/a/Iig0DyG)
* Illustration: [https://compare.promptingpixels.com/a/rUU0jnI](https://compare.promptingpixels.com/a/rUU0jnI)
* Landscape: [https://compare.promptingpixels.com/a/7nOaAfu](https://compare.promptingpixels.com/a/7nOaAfu)

**Notes:** For portrait and illustration, it really looks great. The landscape image looks fried, particularly for elements in the background. Took about 3–8 seconds per image on an RTX 3060 (time varies with original image size). Like UltraSharpV2, it's free, local, and quick, but I prefer the outputs of UltraSharpV2 over Remacri.

**Recraft Crisp Upscale:**

* Portrait: [https://compare.promptingpixels.com/a/yk699SV](https://compare.promptingpixels.com/a/yk699SV)
* Illustration: [https://compare.promptingpixels.com/a/FWXp2Oe](https://compare.promptingpixels.com/a/FWXp2Oe)
* Landscape: [https://compare.promptingpixels.com/a/RHZmZz2](https://compare.promptingpixels.com/a/RHZmZz2)

**Notes:** Super fast execution at a relatively low cost ($0.006 per image) makes it good for web apps and such. As with other upscale models, for portrait and illustration it performs well.

Landscape is perhaps the most notable difference in quality. There is a graininess in some areas that is more representative of a picture than a painting—which I think is good. However, detail enhancement in complex areas, such as the foreground subjects and water texture, is pretty bad.

In the portrait, the facial features look too soft, though details on the wrist and the writing on the camera are quite good.

**SUPIR:**

* Portrait: [https://compare.promptingpixels.com/a/0F4O2Cq](https://compare.promptingpixels.com/a/0F4O2Cq)
* Illustration: [https://compare.promptingpixels.com/a/EltkjVb](https://compare.promptingpixels.com/a/EltkjVb)
* Landscape: [https://compare.promptingpixels.com/a/6i5d6Sb](https://compare.promptingpixels.com/a/6i5d6Sb)

**Notes:** SUPIR is a great generalist upscaling model. However, given the price ($0.10 per run on Replicate: https://replicate.com/zust-ai/supir), it is quite expensive. It's tough to compare, but putting SUPIR's output next to Recraft's ([comparison](https://compare.promptingpixels.com/a/AuIMunX)), SUPIR scrambles the branding on the camera (MINOLTA is no longer legible) and significantly alters the watch face on the wrist, while Recraft smooths and flattens the face and makes it look more illustrative; SUPIR stays closer to the original.

While I like some of the creative liberties that SUPIR takes with the images, particularly in the illustrative example, in the portrait comparison it makes some significant adjustments to the subject, especially the details in the glasses, the watch/bracelet, and the "MINOLTA" branding on the camera. For the landscape, though, I think SUPIR delivered the best upscaling output.
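Since the pricing above comes from Replicate, a call with the Replicate Python client looks roughly like this; the slug is from the link above, but the input field name is my assumption, so check the model page for the actual schema:

```python
import replicate

# Slug from the post; input field names are assumptions -- see
# https://replicate.com/zust-ai/supir for the real schema.
output = replicate.run(
    "zust-ai/supir",
    input={"image": open("portrait_small.png", "rb")},
)
print(output)  # URL of the upscaled result
```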

**Clarity Upscaler:**

* Portrait: [https://compare.promptingpixels.com/a/1CB1RNE](https://compare.promptingpixels.com/a/1CB1RNE)
* Illustration: [https://compare.promptingpixels.com/a/qxnMZ4V](https://compare.promptingpixels.com/a/qxnMZ4V)
* Landscape: [https://compare.promptingpixels.com/a/ubrBNPC](https://compare.promptingpixels.com/a/ubrBNPC)

**Notes:** Running at default settings, Clarity Upscaler can really clean up an image and add a plethora of new details; it's somewhat like a "hires fix." To tone down the creativity of the model, I changed creativity to 0.1 and resemblance to 1.5, and it cleaned up the image a bit better ([example](https://compare.promptingpixels.com/a/SlUtEsx)). However, it still smoothed and flattened the face, similar to what Recraft did in earlier tests.

Outputs cost only about $0.012 per run.
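The two knobs mentioned above map directly onto the model's inputs. A sketch with the Replicate client, assuming the commonly used philz1337x/clarity-upscaler slug:

```python
import replicate

output = replicate.run(
    "philz1337x/clarity-upscaler",  # assumed slug; verify on replicate.com
    input={
        "image": open("portrait_small.png", "rb"),
        "creativity": 0.1,   # tone down invented detail
        "resemblance": 1.5,  # stick closer to the source image
    },
)
```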

**Topaz:**

* Portrait: [https://compare.promptingpixels.com/a/B5Z00JJ](https://compare.promptingpixels.com/a/B5Z00JJ)
* Illustration: [https://compare.promptingpixels.com/a/vQ9ryRL](https://compare.promptingpixels.com/a/vQ9ryRL)
* Landscape: [https://compare.promptingpixels.com/a/i50rVxV](https://compare.promptingpixels.com/a/i50rVxV)

**Notes:** Topaz has a few interesting dials that make it a bit trickier to compare. When first upscaling the landscape image with default settings, the output looked downright bad ([example](https://compare.promptingpixels.com/a/hyyy5Ow)). They provide a subject_detection field that can be set to all, foreground, or background, so you can be more specific about what you want the upscale to adjust. In the example above, I selected "all" and the results were quite good. Here's a [comparison of Topaz (all subjects) vs SUPIR](https://compare.promptingpixels.com/a/7ibDm4u) so you can compare for yourself.

Generations are $0.05 per image and will take roughly 6 seconds per image at a 4x scale factor. Half the price of SUPIR but significantly more than other options.
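If you call Topaz through an API wrapper like Replicate, subject_detection is just another input field. Both the slug and the schema below are assumptions; verify against the actual model page before use:

```python
import replicate

output = replicate.run(
    "topazlabs/image-upscale",  # assumed slug; verify before use
    input={
        "image": open("landscape_small.png", "rb"),
        "subject_detection": "all",  # the setting that fixed the landscape output
    },
)
```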

**Final thoughts:** SUPIR is still damn good and hard to compete with. Recraft Crisp Upscale does better with words and details and is cheaper, but it definitely takes a bit too much creative liberty. I think Topaz edges it out just a hair, though at a significant increase in cost ($0.006 vs. $0.05 per run, or $0.60 vs. $5.00 per 100 images).

UltraSharpV2 is a terrific general-use local model - kudos to /u/[Kim2091](https://www.reddit.com/user/Kim2091/).

I know there are a ton of different upscalers over on [https://openmodeldb.info/](https://openmodeldb.info/), so it may be best practice to use a different upscaler for different types of images or specific use cases. However, I don't like to get that far into the weeds on the settings for each image, as it can become quite time-consuming.

After comparing all of these, I'm still curious: what does everyone prefer as a general-use upscaling model?

https://redd.it/1kz9q84
@rStableDiffusion
Mod of Chatterbox TTS - now accepts text files as input, etc.

Yesterday this was released.

I messed with it, made some modifications, and this is my modified fork of Chatterbox TTS.

https://github.com/petermg/Chatterbox-TTS-Extended

I added the following features:

1. Accepts a text file as input.
2. Each sentence is processed separately and written to a temp folder; after all sentences have been written, they are concatenated into a single audio file (see the sketch after this list).
3. Outputs audio files to "outputs" folder.
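A minimal sketch of that per-sentence loop, using the upstream ChatterboxTTS API; the regex splitter here is a naive stand-in, and the fork's actual splitting logic may differ:

```python
import os
import re
import tempfile

import torch
import torchaudio as ta
from chatterbox.tts import ChatterboxTTS

model = ChatterboxTTS.from_pretrained(device="cuda")

text = open("input.txt", encoding="utf-8").read()
# Naive sentence split on ., !, ? -- a stand-in for the fork's splitter.
sentences = re.split(r"(?<=[.!?])\s+", text.strip())

tmp = tempfile.mkdtemp()
chunks = []
for i, sentence in enumerate(sentences):
    wav = model.generate(sentence)  # one clip per sentence
    ta.save(os.path.join(tmp, f"{i:04d}.wav"), wav, model.sr)
    chunks.append(wav)

# Concatenate all sentence clips along the time axis into one file.
os.makedirs("outputs", exist_ok=True)
ta.save(os.path.join("outputs", "combined.wav"), torch.cat(chunks, dim=-1), model.sr)
```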

https://redd.it/1kzedue
@rStableDiffusion
Which good model can be freely used commercially?

I was using Juggernaut XL and just read on their website that you need a license for commercial use, and of course it's a damn subscription. What are good alternatives that are either free or a one-time payment? Subscriptions are out of control in the AI world

https://redd.it/1kzbcxc
@rStableDiffusion
New PhantomWan14B-GGUFs 🚀🚀🚀

https://huggingface.co/QuantStack/Phantom_Wan_14B-GGUF

This is a GGUF version of Phantom_Wan that works in native workflows!

Phantom lets you use multiple reference images that, with some prompting, will appear in the video you generate; an example generation is below.

A basic workflow is here:

https://huggingface.co/QuantStack/Phantom_Wan_14B-GGUF/blob/main/Phantom_example_workflow.json

This video is the result from the two reference pictures below and this prompt:

"A woman with blond hair, silver headphones and mirrored sunglasses is wearing a blue and red VINTAGE 1950s TEA DRESS, she is walking slowly through the desert, and the shot pulls slowly back to reveal a full length body shot."

The video was generated at 720x720@81f in 6 steps with the CausVid LoRA on the Q8_0 GGUF.

https://reddit.com/link/1kzkch4/video/i22s6ypwk04f1/player

https://preview.redd.it/37cj5j9xk04f1.png?width=512&format=png&auto=webp&s=e6a0b1142685ccbf562b80859d79566104996d8d

https://preview.redd.it/cmdln9lxk04f1.jpg?width=236&format=pjpg&auto=webp&s=e985762beb6210be518236d2c6c48858ad67a229



https://redd.it/1kzkch4
@rStableDiffusion
I wanna use this photo as a reference, but depth, canny, and openpose are all not working. Help.
https://redd.it/1kzlzq0
@rStableDiffusion
Hey guys, is there any tutorial on how to make a GOOD LoRA? I'm trying to make one for Illustrious. Should I remove the background like this, or is it better to keep it?

https://redd.it/1kzn560
@rStableDiffusion
T5-SD(1.5)

\\"a misty Tokyo alley at night\\"

Things have been going poorly with my efforts to train the model I announced at https://www.reddit.com/r/StableDiffusion/comments/1kwbu2f/the_first_step_in_t5sdxl/

not because it is in principle untrainable.... but because I'm having difficulty coming up with a Working Training Script.
(if anyone wants to help me out with that part, I'll then try the longer effort of actually running the training!)

Meanwhile... I decided to do the same thing for SD1.5: replace CLIP with the T5 text encoder.


Because in theory the training script should be easier, and the training TIME should certainly be shorter. By a lot.


Huggingface raw model: https://huggingface.co/opendiffusionai/stablediffusion_t5

Demo code: https://huggingface.co/opendiffusionai/stablediffusion_t5/blob/main/demo.py
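The linked demo.py is the authoritative version; purely as a conceptual sketch of the CLIP-to-T5 swap (flan-t5-base is a placeholder whose 768-dim hidden size happens to match SD1.5's cross-attention width):

```python
import torch
from transformers import AutoTokenizer, T5EncoderModel

# Encode the prompt with T5 instead of CLIP. flan-t5-base is a placeholder;
# its hidden size happens to match SD1.5's 768-dim conditioning inputs.
tok = AutoTokenizer.from_pretrained("google/flan-t5-base")
enc = T5EncoderModel.from_pretrained("google/flan-t5-base")

ids = tok("a misty Tokyo alley at night", return_tensors="pt").input_ids
with torch.no_grad():
    cond = enc(input_ids=ids).last_hidden_state  # (1, seq_len, 768)

# cond would be passed as encoder_hidden_states to the SD1.5 unet, which is
# why the unet needs retraining: it was trained against CLIP embeddings.
```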



PS: The difference between this and ELLA is that I believe ELLA was an attempt to enhance the existing SD1.5 base without retraining? So it had a buncha adaptations to make that work.

Whereas this is just a pure T5 text encoder, with intent to train up the unet to match it.

I'm kinda expecting it to be not as good as ELLA, to be honest :-} But I want to see for myself.

https://redd.it/1kzoqd2
@rStableDiffusion