Market map for browser agents
New companies launch in the space every week, for both consumer and enterprise use cases. ManusAI is one of the most popular generalist consumer agents, and Athena Intelligence is already being used by companies like Anheuser-Busch.
Computer/browser use has become one of the most important frontiers for model capabilities, with OpenAI, Anthropic, and Google DeepMind dedicating teams to Operator, Claude Computer Use, and Project Mariner, respectively.
Open-source frameworks like Browser Use and Stagehand have become some of the most popular repos on GitHub, with tens of thousands of stars.
AI-first browsers are poised to disrupt the massive web browser market, with highly anticipated releases like Comet from Perplexity on the way. It remains to be seen how Google integrates Project Mariner and other AI tools within Chrome.
Sakana AI introduced the Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents
Researchers harness the power of open-ended algorithms to search for agentic systems that get better at coding, including improving their own code.
It's the Automated Design of Agentic Systems (ADAS), but where the system also edits itself. Experiments show that both the ability to self-improve and the open-ended search are essential. If done safely, DGMs could accelerate AI development and allow us to reap its benefits much sooner.
sakana.ai
Sakana AI
The Darwin Gödel Machine: AI that improves itself by rewriting its own code
Flux Kontext as a clothes switcher - 3
Hack: you can upload a reference image that's a collage of clothing and a face (possibly also a body pose).
Flux Kontext will digest both elements from one picture, put the clothes on the body, and the head on the neck.
Also, they say there's a version of Kontext on fal.ai where you can upload multiple images.
Bing Video Creator - powered by Sora - is now live. It is available for free. Videos are 5 seconds long and can be created in cinematic resolution with portrait on the way.
Bing
Create videos with your words for free - Introducing Bing Video Creator
Introducing Bing Video Creator, allowing you to turn your ideas into videos, for free. Powered by Sora, Bing Video Creator transforms your text prompts into short videos. Just describe what you want to see and watch your vision come to life.
Google has unveiled SignGemma, an advanced AI model that translates sign language into spoken language!
With SignGemma, deaf and hard of hearing people will be able to:
* Communicate with their voice in real time
* Use voice assistants
* Easily interact with technology and the people around them
Key features:
* Support for multiple sign languages
* Accurate recognition of hand movements and facial expressions
* Translation into natural spoken language, not just text
Google emphasizes that its main goal is to make technology and AI accessible to everyone without exception, ensuring inclusivity.
OpenAI has just announced some new stuff, and it's super useful for companies.
* Deep Research can now search GitHub, Google Docs, Gmail, Calendar, Microsoft SharePoint, Outlook, OneDrive, HubSpot, Dropbox, Box, and more, with permissions and secure storage.
* You can connect any chat to Google Docs, SharePoint, Dropbox, and Box.
* Initial MCP support in chat! Admins can also add their own MCP servers for their corporate account.
* Recording mode in ChatGPT: Capture, transcribe, and summarize meetings right in the ChatGPT app. Structured output and full transcript with timestamps via the ChatGPT Mac app. Killed a hundred startups again.
* Team SSO in ChatGPT
* Credit pricing for ChatGPT Enterprise (and soon Team) so everyone can access features even when they exceed their limits.
Deep Research connectors are available for Plus and Pro users starting today, and MCP support will be available for Pro users.
https://openai.com/business/updates-to-chatgpt-business-plans-livestream-june-2025/
OpenAI
Updates to ChatGPT business plans
Announcing Updates to ChatGPT business plans including connectors to internal tools, new security controls, ChatGPT record mode, and flexible pricing.
AI now turns simple sketches into full-fledged videos!
The Runway platform has introduced the Layout Sketches feature: just make a simple sketch or diagram, and the neural network will create a short video with animation and realistic motion physics.
Now even a child can easily turn their ideas into visuals, and the AI does all the rest of the work. A new level of creativity without limits!
ElevenLabs introduces Eleven v3 (alpha), the most expressive text-to-speech model to date
Supports 70+ languages, multi-voice mode, and now audio tags that set intonation, emotion, and even pauses in speech.
The new architecture better understands text and context, creating natural, "live" audio.
What Eleven v3 can do:
• Generate realistic dialogue with multiple voices
• Read emotional transitions
• React to context and change tone during speech
The model is controlled via tags:
- Emotions: [sad], [angry], [happily]
- Delivery: [whispers], [shouts]
- Reactions: [laughs], [sighs], [clears throat]
They promise to roll out the public API very soon.
This is a preview version, so prompts may need some tuning. But the result is really impressive.
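To get a feel for how such tags might sit inside a script, here is a minimal Python sketch. It only builds the tagged text and does not call ElevenLabs at all, since the public API is not out yet. The helper names `tag` and `build_dialogue` and the "Speaker: [tag] text" layout are assumptions made up for this illustration; only the tag names come from the list above.

```python
# Tag names come from the post above; everything else is an illustrative assumption.
KNOWN_TAGS = {
    "sad", "angry", "happily",           # emotions
    "whispers", "shouts",                # delivery
    "laughs", "sighs", "clears throat",  # reactions
}


def tag(name: str) -> str:
    """Wrap a known tag name in square brackets, e.g. 'sighs' -> '[sighs]'."""
    if name not in KNOWN_TAGS:
        raise ValueError(f"unknown audio tag: {name!r}")
    return f"[{name}]"


def build_dialogue(lines: list[tuple[str, str, str]]) -> str:
    """Join (speaker, tag name, text) triples into one tagged script.

    The 'Speaker: [tag] text' layout is a hypothetical convention for this
    sketch, not a documented ElevenLabs input format.
    """
    return "\n".join(
        f"{speaker}: {tag(name)} {text}" for speaker, name, text in lines
    )


script = build_dialogue([
    ("Anna", "whispers", "I think someone is listening."),
    ("Ben", "laughs", "You always say that."),
])
print(script)
```

Rejecting unknown tag names up front is just a convenience for catching typos in this sketch; the real model may well accept tags beyond the ones listed in the post.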
Gemini 2.5 Pro update is now in preview.
Google
Try the latest Gemini 2.5 Pro before general availability.
We're introducing an upgraded preview of Gemini 2.5 Pro, our most intelligent model yet. Building on the version we released in May and showed at I/O, this model will be…
The day before yesterday, OpenAI quietly upgraded Voice Mode: it now speaks even more naturally, with subtler intonation, more realistic cadence (including pauses and emphasis), and more accurate expression of emotions such as empathy and sarcasm.
Voice also offers intuitive and efficient language translation. Just ask Voice to translate from one language to another, and it will continue translating throughout the conversation until you ask it to stop or switch.
Only for paid subscribers, sorry:)
https://help.openai.com/en/articles/6825453-chatgpt-release-notes
OpenAI Help Center
ChatGPT - Release Notes | OpenAI Help Center
A changelog of the latest updates and release notes for ChatGPT
Krea has released its image generation model.
And it's called Krea 1.
Here's what they write:
Krea 1 is our answer to the problem of "typical AI look".
Most AI models suffer from soft textures and excessive contrast, and produce boring compositions or styles.
Krea 1 delivers highly realistic, sharp textures, a wide range of styles, and deep artistic knowledge, so that AI images no longer look like AI.
Veo3 can generate 360 videos.
You just need to add "360°" or "360° video" to the prompt and you'll get it.
Yes, of course, the resolution is not enough, but give the AI time!
OpenAI introduced o3-pro - a new level of AI
Advantages of o3-pro:
- Outperforms o3 on almost all tasks (70% increase!)
- Leads in all benchmarks, leaving competitors far behind.
- Ideal for scientific research, coding, complex analytics and mathematical calculations.
- Processes files, takes context into account and works like a whole team of experts.
- And most importantly - HALF THE PRICE of o1-pro!
Video generators from ByteDance - let's figure it out.
1. At the end of last week, a new video generator from ByteDance called Seedance 1.0 burst onto the video arena.
It looks great and beats even Veo3. You can't try it in its full, "grown-up" form yet.
But on the site https://dreamina.capcut.com/ai-tool/home there is a Seedance 1.0 Mini version (also known as Video 3.0), with 120 free credits per day.
Also yesterday, Fal.ai announced that you can test the Lite version of Seedance 1.0.
Where and when the "full version" will be available is still unknown.
Midjourney has released its Video V1 video generator.
It works in img-2-video mode. Select a generated image or upload your own and click Animate. The prompt is improved automatically. There is also a manual mode where you can describe exactly what needs to be changed or animated, and a "low motion" / "high motion" setting: the first is for static, slow shots, the second for dynamic scenes. You can also extend the video 4 seconds at a time, up to a limit of 16 seconds.
The output is aesthetic, typical for Midjourney. It handles motion well, and even seems to be good with anatomy. But there are questions about detail: artifacts and image compression are noticeable. People write that the output is 480p, which falls short even of open-source generators like Wan and today's Hailuo 02.
The vaunted Seedance, which beats Veo3, has been brought to Krea.ai.
It is 20 times cheaper than Veo3 and generates in 2 minutes.
However, Krea stubbornly keeps silent about which model it is: Lite or Full.
Chinese lab MiniMax introduced this week:
1. Open-source LLM MiniMax-M1, setting new standards in long-context reasoning.
- World's longest context window: 1M-token input, 80k-token output
- State-of-the-art agentic use among open-source models
- RL at unmatched efficiency: trained with just $534,700.
2. Hailuo 02: world-class quality, record-breaking cost efficiency
- Best-in-class instruction following
- Handles extreme physics
- Native 1080p
3. MiniMax Audio:
- Any prompt, any voice, any emotion
- Fully customizable and multilingual.
4. Hailuo Video Agent in beta: vibe videoing with zero touch.
MiniMax plans to achieve an end-to-end Hailuo Video Agent in 3 stages:
Stage 1: Prebuilt video Agent templates for high-quality creative videos. Users simply follow instructions and input text or images; with one click, a polished video is generated.
Stage 2: Semi-customizable video Agent. Users gain the flexibility to edit any part of the video creation process, from script to visuals to voiceover.
Stage 3: Fully autonomous, end-to-end video Agent. A complete, intelligent workflow that turns creative input into final-cut video with minimal manual effort.
This summer, the team plans to gradually roll out Stage 2 of the Agent creation tools.
5. MiniMax Agent, a general intelligent agent designed to tackle long-horizon, complex tasks.
From expert-level multi-step planning to flexible task breakdown and end-to-end execution, it's designed to act like a reliable teammate, with strengths in:
- Programming & tool use
- Multimodal understanding & generation
- Seamless MCP integration
MiniMax
Building AGI with our mission Intelligence with Everyone. Global leader in multi-modal models and AI-native products with over 200 million users.
The new MiniMax Hailuo 02 has already arrived on Krea.ai.
Minimax
Create Lifelike Audio - the fifth release from MiniMax in the last 7 days
Text2Speech, Voice Cloning, Voice Design and even music generation (no miracles here, Suno is the best).
Breaking! New Hailuo Video 02 Beats Google Veo 3 and Kling?
What if the best AI video model right now… wasn't from Google or OpenAI? There's a new beast climbing the leaderboard, and it just knocked out Veo 3 and Kling 2.0. This is Hailuo Video 02, and it's shaking up the whole AI video scene. Let me show you why creators are freaking out.
Get Hailuo 02 here and try their new video agents! 500 credits bonus to new signups!
Get 500 bonus credits https://hailuoai.video/?utm_source=YT&utm_medium=kol&utm_campaign=Vortex
https://www.youtube.com/watch?v=JOtM8UE0SKA
hailuoai.video
Hailuo AI: AI Video Generator from Text & Image
Turn text & images into great videos with Hailuo AI's video maker. Our tool also includes an AI image generator to create stunning posts, memes & more in seconds.