Qwen-Image-Edit-2511-Multiple-Angles-LoRA
An interesting tool for camera angles, equipped with a full ControlNet.
On the downside, the image quality isn’t great — the idea is cool, but the execution falls short.
https://huggingface.co/spaces/multimodalart/qwen-image-multiple-angles-3d-camera
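If you want to try this outside the Space, below is a rough sketch using diffusers' QwenImageEditPipeline. The LoRA repo id is a placeholder (the post doesn't name the exact Hugging Face upload), and the camera-move prompt phrasing is an assumption; check the Space's examples for the real trigger wording.

```python
# A minimal sketch, not the Space's actual pipeline. Assumes the LoRA is a
# standard diffusers-compatible LoRA; the repo id below is hypothetical.
import torch
from diffusers import QwenImageEditPipeline
from diffusers.utils import load_image

pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("your-org/Qwen-Image-Edit-2511-Multiple-Angles-LoRA")  # hypothetical repo id

image = load_image("scene.png")
out = pipe(
    image=image,
    prompt="Rotate the camera 45 degrees to the left, same scene",  # assumed trigger phrasing
    num_inference_steps=40,
).images[0]
out.save("rotated.png")
```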
Higgsfield “What’s Next?”
Higgsfield seem to be aiming to completely remove the traditional scripting component from content creation. That is, there will still be a “script,” but it will be written—or rather assembled—from AI-generated fragments. And not in text form, but directly as video snippets.
Higgs’s new feature, “What Happens Next,” lets you upload a SINGLE image, after which the AI suggests EIGHT (!) video variations of how events could unfold. You choose the one you like, watch it to the end, and then once again pick one of eight possible continuations.
GLM-Image
We’ve got a new open-source image generator, and technically it’s quite interesting. Earlier, Zhipu released the open-source LLM GLM, which crushed benchmarks and impressed many (you can try it at https://chat.z.ai/). Rumors of an image model followed — and now it’s here.
It’s already available on FAL: https://fal.ai/models/fal-ai/glm-image https://fal.ai/models/fal-ai/glm-image/image-to-image
The key idea is separating “thinking” from rendering. A 9B-parameter autoregressive model interprets complex, knowledge-heavy prompts, then hands the result to a 7B-parameter diffusion decoder for rendering. With a custom Glyph Encoder, it aims to render text accurately inside images. Editing and style transfer are included out of the box. They claim quality on par with top diffusion models and better performance on complex tasks.
In practice, results so far look modest. Editing features need more testing and don’t seem very strong yet.
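For anyone who wants to poke at it programmatically, here is a hedged sketch with the fal_client Python package. The endpoint id comes from the URLs above; the argument names and response shape are assumptions, so treat the FAL model page as the authoritative schema.

```python
import fal_client  # pip install fal-client; expects FAL_KEY in the environment

result = fal_client.subscribe(
    "fal-ai/glm-image",
    arguments={
        # "prompt" is the obvious key, but verify against the FAL schema
        "prompt": "A bookstore window sign reading 'GRAND OPENING', photo, evening light",
    },
)
print(result["images"][0]["url"])  # assumed response shape
```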
Hunyuan3D has been updated to version 3.1.
You’d still want to inspect the mesh topology yourself, but visually it looks really polished.
Probably the most advanced 3D generator available today.
Wan 2.6 Image to Video Flash
So far, it works only from the first frame.
Video length: up to 15 seconds.
You can upload your own audio, or have the audio generated for you.
There is a shot_type option — single shot or multiple shots within one video.
Very fast.
https://fal.ai/models/wan/v2.6/image-to-video/flash https://wavespeed.ai/models/alibaba/wan-2.6/image-to-video-flash
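For API access, here is a hedged Python sketch against the fal.ai endpoint above. The parameter names (image_url, duration, shot_type, audio_url) mirror the options listed in this post but are assumptions; verify them against the model's API schema.

```python
import fal_client  # pip install fal-client; expects FAL_KEY in the environment

result = fal_client.subscribe(
    "wan/v2.6/image-to-video/flash",
    arguments={
        "image_url": "https://example.com/first_frame.jpg",  # first-frame only, for now
        "prompt": "Slow push-in as the character turns to face the camera",
        "duration": 15,           # up to 15 seconds
        "shot_type": "multiple",  # assumed values: "single" or "multiple"
        # "audio_url": "https://example.com/voiceover.mp3",  # or let it generate audio
    },
)
print(result["video"]["url"])  # assumed response shape
```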
Runway 4.5 Image to Video
A few days ago, Runway released an update. The main focus is on the Image-to-Video model. On their Twitter and website they show the best examples, but I took real generations and even found a comparison with Kling and Seedance.
I can’t say it’s any kind of revolution. The quality is no better than Kling’s. Length: 5–10 seconds. 720p.
And a quick look at the rankings from Video LMArena.
Veo wipes the floor with everyone, especially in Text-to-Video.
In Image-to-Video, wan2.5 takes 3rd place, Seedance is 6th, and Kling 2.6 is 7th.
You can see that the amount of data is still pretty limited. Runway 4 is hanging out somewhere near the bottom, and for some reason mochi-1 from a year ago has snuck into the rankings.
But Veo’s hegemony will be very hard to beat.
LTX doesn’t show up in the charts at all.
https://lmarena.ai/ru/leaderboard/text-to-video https://lmarena.ai/ru/leaderboard/image-to-video
Suno Sounds
Suno quietly announced the beta of its SFX and Loops — sound-effect creation that goes beyond music. The model is still rough, which is why it’s in beta and available only to Pro and Premier users.
How to find it: on Desktop, when choosing between the Simple and Custom Create modes, there should be a dropdown under Custom that lets you select Sounds (Beta).
It’s interesting that they’re stepping into territory usually occupied by completely different startups with features like these.
Lucy 2.0 — fire, real-time, and zero censorship.
The idea itself isn’t exactly new — we’ve already seen it in various Live Portraits, Infinitoks, and of course Kling’s Motion Control. You upload an image of a character, take a video where you (or a more talented actor/character) mug for the camera, and boom — your image starts mugging in the same way. In 3D this is called retargeting.
But!
Here all of this happens in REAL TIME. That is: you take an image and a webcam, and off you go streaming at 24–30 FPS with minimal latency (they claim near-zero latency, but in reality, factoring in network round-trips, I’d guess 1–2 seconds).
Check out the videos — and remember, this is real time.
Try it here: https://lucy.decart.ai/
As I understand it, there won’t be any breathing room. Kling has been on a roll since the end of last year and clearly has no plans to slow down. We’re all running here like hamsters in an AI wheel. Not even a wheel, really, but a sphere with constantly changing radius, color, and axis of rotation. And there’s no end to the run in sight.
Back in the day, versions were updated once a year. Everything was simple and clear.
Now we live in a tangled domain of “someone, somewhere, sometimes is better at something — but not always, and it’s not exact.”
And on top of that, absolutely all closed generators hide the seed from the user. Just to add even more chaos and make the tokens fly away even faster.
We’ve quietly found ourselves in a world of hallucinations, all waiting for the next version of something to be the best ever — and for everything to finally become simple and clear.
🚀 Stop Using Just ONE AI
Why limit yourself to a single model? Integri lets you tap into multiple AI models at once—so you get better answers, faster workflows, and way more flexibility.
▶️ Watch the breakdown here:
https://youtu.be/0PtCiyzvrr0
If you’re serious about using AI efficiently, this one’s worth your time.
Eleven v3 quietly came out of alpha. No big announcements — just “okay, this is usable now.”
It feels like one of those updates where the tool finally stops arguing with you. The error rate dropped hard — almost three times lower. The voice no longer stumbles over numbers, symbols, or all those weird edge cases that used to sound like improvisation.
The most noticeable change is stability. Not in a flashy way — just fewer things breaking. In testing, people picked this version far more often than the old alpha, and that lines up with how it actually feels to use.
This isn’t the kind of release that makes you tweet “we’re living in the future.” It’s the kind that saves you from constant fixes, second guesses, and cleanup.
Eleven v3 isn’t an event. It’s an update that makes work a bit easier. And these days, that might be the real benchmark.
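If you want to hear it yourself, here is a minimal sketch with the official elevenlabs Python SDK. The "eleven_v3" model id is the one from the alpha and is assumed to carry over; the voice id is whatever you have in your library.

```python
from elevenlabs.client import ElevenLabs

client = ElevenLabs()  # reads ELEVENLABS_API_KEY from the environment

# Numbers and symbols were exactly the cases that used to stumble.
audio = client.text_to_speech.convert(
    voice_id="YOUR_VOICE_ID",  # any voice from your library
    model_id="eleven_v3",      # assumed to be the post-alpha id as well
    text="Invoice #4217 is due 03/15; that's $1,299.00, not $12,990.",
)
with open("v3_test.mp3", "wb") as f:
    for chunk in audio:
        f.write(chunk)
```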
Kling 3.0
There used to be two models — Kling 2.6 and Kling O1 — and now there are two again!
Kling 2.6 is now called VIDEO 3.0.
Kling O1 is now called VIDEO 3.0 Omni.
You can access them in the interface via the Generate and Omni buttons respectively.
Omni is basically an editing-focused model that can take not only text as input, but also audio and video.
As far as I understand, access is currently limited to Ultra subscribers, but it’s worth checking carefully what promos they’re running right now.
Here you’ll find the full deep dive into all the new features of both models — and below I’ll highlight the main goodies for now.
For VIDEO 3.0:
• Video length is now from 3 to 15 seconds, and you can choose any duration within that range.
• Multi-shot support — up to 6 cuts in a single video.
• Start Frame + Element Reference — locking characters, objects, and scenes via Elements. Even when things change, consistency should be preserved.
• More languages (Chinese, English, Japanese, Korean, Spanish), plus authentic dialects and accents, and even multi-language dialogue within a single scene. Improved native audio.
• Improved text handling.
For VIDEO 3.0 Omni:
• Upload a 3–8 second input video (for example, featuring a character), and the model will extract key personality traits and the voice, preserving appearance and overall likeness.
• Voice input as a reference.
• Multitrack works in Omni as well.
• Storyboard Narrative 3.0: flexible duration, customizable shots, and precise control up to 15 seconds.
A comparison of Kling 3.0 and Veo 3.1 in terms of lip-sync. It’s worth noting that Kling outputs up to 15 seconds, while Veo is still limited to 8 seconds. I have the impression that Veo hits the lip movements with rock-solid precision, and in this regard it outperforms Kling. At the same time, Kling layers more emotion on top of the speech. Very roughly speaking, Veo handles the lower part of the face better and more accurately, while Kling performs better in the upper part of the face, complementing the speech with emotions.