OpenAI introduced AI agent codex.
it is a software engineering agent that runs in the cloud and does tasks for you, like writing a new feature of fixing a bug.
U can run many tasks in parallel. Starting to roll out today to ChatGPT pro, enterprise, and team users.
it is a software engineering agent that runs in the cloud and does tasks for you, like writing a new feature of fixing a bug.
U can run many tasks in parallel. Starting to roll out today to ChatGPT pro, enterprise, and team users.
Openai
Introducing Codex
Introducing Codex: a cloud-based software engineering agent that can work on many tasks in parallel, powered by codex-1. With Codex, developers can simultaneously deploy multiple agents to independently handle coding tasks such as writing features, answeringβ¦
π₯219β€193π191π174
This media is not supported in your browser
VIEW IN TELEGRAM
AI now makes 4K video: in the SkyReels V2 update, you can make videos in the highest quality
Voice, script, lip sync, music and video - all in one service and without restrictions.
https://www.skyreels.ai/home
Voice, script, lip sync, music and video - all in one service and without restrictions.
https://www.skyreels.ai/home
β€152π₯149π144π123
This media is not supported in your browser
VIEW IN TELEGRAM
RECURSE is the world's first track born from a collaboration between a human and quantum AI.
British artist ILΔ and startup Moth used the Archaeo machine learning platform, which runs on a quantum computer. Unlike common AI generators such as Suno and Udio, which are trained on huge amounts of data, Archaeo trained only on ILΔ's music. This allowed quantum algorithms to capture the unique nuances of the artist's work and create more "human" music, avoiding issues with ethics and copyright.
British artist ILΔ and startup Moth used the Archaeo machine learning platform, which runs on a quantum computer. Unlike common AI generators such as Suno and Udio, which are trained on huge amounts of data, Archaeo trained only on ILΔ's music. This allowed quantum algorithms to capture the unique nuances of the artist's work and create more "human" music, avoiding issues with ethics and copyright.
β€181π165π163π₯161
Adobe introduced HUMOTO, 4D dataset for human-object interaction, developed with a combination of wearable motion capture, SOTA 6D pose estimation vision models, LLM, and the professional refining works of multiple animation studios.
HUMOTO features:
1. Over 700 diverse daily activities
2. Interactions with 60+ objects, 70+ articulated parts.
3. Fine-grained text annotations
4. Detailed hand and finger movements.
HUMOTO features:
1. Over 700 diverse daily activities
2. Interactions with 60+ objects, 70+ articulated parts.
3. Fine-grained text annotations
4. Detailed hand and finger movements.
π168π₯165π164β€159
Google is launching their own coding agent, Jules, at I/O today
It lets you make changes to your GitHub repos with English prompts in a VM using Gemini 2.5 Pro. It's Googleβs version of Devin.
Here's the leaked official repo of prompts it can do.
It lets you make changes to your GitHub repos with English prompts in a VM using Gemini 2.5 Pro. It's Googleβs version of Devin.
Here's the leaked official repo of prompts it can do.
jules.google
Jules - An Autonomous Coding Agent
Jules is an Autonomous agent that gets out of your way. It lets you focus on the coding you want to do, meanwhile picking up all the other random tasks that you rather not do.
π66π₯61π58β€54
Media is too big
VIEW IN TELEGRAM
What if I told you this viral music video was made with zero filming, zero crew⦠and just one AI tool? See full tutorial!
π₯122β€105π101π100
This media is not supported in your browser
VIEW IN TELEGRAM
βΌοΈ HOLLYWOOD IS IN IT! Google's new Veo 3 neural network can now not only create cinematic videos, but also voice them with real people's voices
βΈ The recently updated Veo 3 can now generate sound effects, background noise, and even dialogue based on your requests
βΈ The model also perfectly perceives complex instructions and accurately reproduces the sequence of events, simplifying the creation of realistic scenes with reliable movement and sound
Considering the rapid pace of development of neural networks, it may be possible to make films in the future at all... π€
βΈ The recently updated Veo 3 can now generate sound effects, background noise, and even dialogue based on your requests
βΈ The model also perfectly perceives complex instructions and accurately reproduces the sequence of events, simplifying the creation of realistic scenes with reliable movement and sound
Considering the rapid pace of development of neural networks, it may be possible to make films in the future at all... π€
π59π₯55π54β€40
Elon Musk's xAI announced Live Search in the API
The new beta (free for a limited time) feature allows apps leveraging Grok models to search real-time info from X and the internet, including news.
Here's how easy it is to try out Grok 3's new live search:
1/ Grab a key from xAI
2/ Remix our template
3/ Add your API key to Secrets
4/ Click Run and start chatting with Grok.
Since it's built with Agent, you can remix and keep editing with Agent.
The new beta (free for a limited time) feature allows apps leveraging Grok models to search real-time info from X and the internet, including news.
Here's how easy it is to try out Grok 3's new live search:
1/ Grab a key from xAI
2/ Remix our template
3/ Add your API key to Secrets
4/ Click Run and start chatting with Grok.
Since it's built with Agent, you can remix and keep editing with Agent.
π₯130π124π116β€100
This media is not supported in your browser
VIEW IN TELEGRAM
AI showed what a CS map would look like in real life. π«
As if the role of the chickens is not fully revealed:)
As if the role of the chickens is not fully revealed:)
π254π247π₯238β€221
Media is too big
VIEW IN TELEGRAM
Everything you see in this video is entirely generated by a neural network π
The video and sound were created using text prompts for Google's Veo 3 tool. The author first wrote out each clip separately, and then edited them together.
Now even actors will be sent into retirement - everything is done by AI π
The video and sound were created using text prompts for Google's Veo 3 tool. The author first wrote out each clip separately, and then edited them together.
Now even actors will be sent into retirement - everything is done by AI π
π105β€102π101π₯90
Singapore's Sharpa unveiled SharpaWave, a lifelike robotic hand
βFeatures 22 DOF to balance for dexterity and strength
βEach fingertip has 1,000+ tactile sensing pixels and 5 mN pressure sensitivity
βAI models adapt the hand's grip and modulate force
βFeatures 22 DOF to balance for dexterity and strength
βEach fingertip has 1,000+ tactile sensing pixels and 5 mN pressure sensitivity
βAI models adapt the hand's grip and modulate force
β€61π55π49π₯46
Media is too big
VIEW IN TELEGRAM
Pavel Durov announced the integration of GROK into Telegram π
Telegram and XAI agreed to cooperate for 1 year. The GROK neural network integrates into all applications in Telegram.
Telegram will also receive $ 300 million from the XII and will receive 50% of the income from the subscriptions sold through the messenger.
Ton added 18% π€
Telegram and XAI agreed to cooperate for 1 year. The GROK neural network integrates into all applications in Telegram.
Telegram will also receive $ 300 million from the XII and will receive 50% of the income from the subscriptions sold through the messenger.
Ton added 18% π€
π₯119π117β€110π104
Market map for browser agents
A new companies launch in the space every week, for both consumer and enterprise use cases. ManusAI is one of the most popular generalist consumer agents, and Athena Intelligence is already being used by companies like Anheuser-Busch.
Computer/browser use has become one of the most important frontiers for model capabilities, with OpenAI, Anthropic, and Google DeepMind having dedicated teams to Operator, Claude Computer Use, and Project Mariner.
Open source frameworks like Browser use and Stagehand have become some of the most popular repos on Github, with tens of thousands of stars.
AI-first browsers are poised to disrupt the massive web browser market, with highly anticipated releases like Comet from Perplexity on the way. It's yet to be seen how Google integrates Project Mariner and other AI tools within Chrome.
A new companies launch in the space every week, for both consumer and enterprise use cases. ManusAI is one of the most popular generalist consumer agents, and Athena Intelligence is already being used by companies like Anheuser-Busch.
Computer/browser use has become one of the most important frontiers for model capabilities, with OpenAI, Anthropic, and Google DeepMind having dedicated teams to Operator, Claude Computer Use, and Project Mariner.
Open source frameworks like Browser use and Stagehand have become some of the most popular repos on Github, with tens of thousands of stars.
AI-first browsers are poised to disrupt the massive web browser market, with highly anticipated releases like Comet from Perplexity on the way. It's yet to be seen how Google integrates Project Mariner and other AI tools within Chrome.
π62π62β€56π₯50
Sakana AI introduced the Darwin GΓΆdel Machine: Open-Ended Evolution of Self-Improving Agents
Researchers harness the power of open-ended algorithms to search for agentic systems that get better at coding, including improving their own code.
Itβs the Automated Design of Agentic Systems (ADAS), but where it also edits itself. Experiments show both the ability to self-improve and the open-ended search are essential. If done safely, DGMs could accelerate AI development and allow to reap its benefits much sooner.
Researchers harness the power of open-ended algorithms to search for agentic systems that get better at coding, including improving their own code.
Itβs the Automated Design of Agentic Systems (ADAS), but where it also edits itself. Experiments show both the ability to self-improve and the open-ended search are essential. If done safely, DGMs could accelerate AI development and allow to reap its benefits much sooner.
sakana.ai
Sakana AI
The Darwin GΓΆdel Machine: AI that improves itself by rewriting its own code
π192π165β€150π₯133
Flux Kontext as a clothes switcher β 3
Hack: you can upload a reference image thatβs a collage of clothing and a face (possibly also a body pose).
Flux Kontext will digest both elements from one picture, put the clothes on the body, and the head on the neck.
Also, they say thereβs a version of Kontext on fal.ai where you can upload multiple images.
Hack: you can upload a reference image thatβs a collage of clothing and a face (possibly also a body pose).
Flux Kontext will digest both elements from one picture, put the clothes on the body, and the head on the neck.
Also, they say thereβs a version of Kontext on fal.ai where you can upload multiple images.
π₯90π82π74β€72
Bing Video Creator - powered by Sora - is now live. It is available for free. Videos are 5 seconds long and can be created in cinematic resolution with portrait on the way.
Bing
Create videos with your words for free β Introducing Bing Video Creator
Introducing Bing Video Creator, allowing you to turn your ideas into videos, for free. Powered by Sora, Bing Video Creator transforms your text prompts into short videos. Just describe what you want to see and watch your vision come to life.
β€172π154π₯148π145
This media is not supported in your browser
VIEW IN TELEGRAM
Google has unveiled SignGemma, an advanced AI model that translates sign language into spoken language!
*οΈβ£With SignGemma, deaf and hard of hearing people will be able to:
*οΈβ£Communicate with their voice in real time
*οΈβ£Use voice assistants
*οΈβ£Easily interact with technology and the people around them
Key features:
*οΈβ£Support for multiple sign languages
*οΈβ£Accurate recognition of hand movements and facial expressions
*οΈβ£Translate into natural spoken language, not just text
Google emphasizes that its main goal is to make technology and AI accessible to everyone without exception, ensuring inclusivity.
*οΈβ£With SignGemma, deaf and hard of hearing people will be able to:
*οΈβ£Communicate with their voice in real time
*οΈβ£Use voice assistants
*οΈβ£Easily interact with technology and the people around them
Key features:
*οΈβ£Support for multiple sign languages
*οΈβ£Accurate recognition of hand movements and facial expressions
*οΈβ£Translate into natural spoken language, not just text
Google emphasizes that its main goal is to make technology and AI accessible to everyone without exception, ensuring inclusivity.
π31π22π₯18β€9
OpenAI has just announced some new stuff, and it's super useful for companies.
* Deep Research can now search GitHub, Google Docs, Gmail, Calendar, Microsoft SharePoint, Outlook, OneDrive, HubSpot, Dropbox, Box, and more, with permissions and secure storage.
* You can connect any chat to Google Docs, SharePoint, Dropbox, and Box.
* Initial MCP in Chat! And admins can add their own MCP for their corporate account.
* Recording mode in ChatGPT: Capture, transcribe, and summarize meetings right in the ChatGPT app. Structured output and full transcript with timestamps via the ChatGPT Mac app. Killed a hundred startups again.
* Team SSO in ChatGPT
* Credit pricing for ChatGPT Enterprise (and soon Team) so everyone can access features even when they exceed their limits.
Deep Research connectors are available for Plus and Pro users starting today, and MCP support will be available for Pro users.
https://openai.com/business/updates-to-chatgpt-business-plans-livestream-june-2025/
* Deep Research can now search GitHub, Google Docs, Gmail, Calendar, Microsoft SharePoint, Outlook, OneDrive, HubSpot, Dropbox, Box, and more, with permissions and secure storage.
* You can connect any chat to Google Docs, SharePoint, Dropbox, and Box.
* Initial MCP in Chat! And admins can add their own MCP for their corporate account.
* Recording mode in ChatGPT: Capture, transcribe, and summarize meetings right in the ChatGPT app. Structured output and full transcript with timestamps via the ChatGPT Mac app. Killed a hundred startups again.
* Team SSO in ChatGPT
* Credit pricing for ChatGPT Enterprise (and soon Team) so everyone can access features even when they exceed their limits.
Deep Research connectors are available for Plus and Pro users starting today, and MCP support will be available for Pro users.
https://openai.com/business/updates-to-chatgpt-business-plans-livestream-june-2025/
Openai
Updates to ChatGPT business plans
Announcing Updates to ChatGPT business plans including connectors to internal tools, new security controls, ChatGPT record mode, and flexible pricing.
π195β€186π₯180π171
This media is not supported in your browser
VIEW IN TELEGRAM
AI now turns simple sketches into full-fledged videos!
The Runway platform has introduced the Layout Sketches function β just make a simple sketch or diagram, and the neural network will create a short video with animation and realistic physics of movement.
Now even a child can easily embody their ideas in visual form β all the rest of the work will be done by AI. A new level of creativity without limits!
The Runway platform has introduced the Layout Sketches function β just make a simple sketch or diagram, and the neural network will create a short video with animation and realistic physics of movement.
Now even a child can easily embody their ideas in visual form β all the rest of the work will be done by AI. A new level of creativity without limits!
β€74π72π₯72π61
Media is too big
VIEW IN TELEGRAM
ποΈ ElevenLabs introduces Eleven v3 (alpha) β the most expressive text-to-speech model
The most expressive text-to-speech model to date.
Supports 70+ languages, multi-voice mode, and now β audio tags that set intonation, emotion, and even pauses in speech.
π§ The new architecture better understands text and context, creating natural, "live" audio.
π£οΈ What Eleven v3 can do:
β’ Generate realistic dialogue with multiple voices
β’ Read emotional transitions
β’ React to context and change tone during speech
π The model is controlled via tags:
- Emotions: [sad], [angry], [happily]
- Delivery: [whispers], [shouts]
- Reactions: [laughs], [sighs], [clears throat]
π‘ They promise to roll out the public API very soon.
β οΈ This is a preview version - it may require fine-tuning of prompts. But the result is really impressive
The most expressive text-to-speech model to date.
Supports 70+ languages, multi-voice mode, and now β audio tags that set intonation, emotion, and even pauses in speech.
π§ The new architecture better understands text and context, creating natural, "live" audio.
π£οΈ What Eleven v3 can do:
β’ Generate realistic dialogue with multiple voices
β’ Read emotional transitions
β’ React to context and change tone during speech
π The model is controlled via tags:
- Emotions: [sad], [angry], [happily]
- Delivery: [whispers], [shouts]
- Reactions: [laughs], [sighs], [clears throat]
π‘ They promise to roll out the public API very soon.
β οΈ This is a preview version - it may require fine-tuning of prompts. But the result is really impressive
π₯112π107β€94π77
Gemini 2.5 Pro update is now in preview.
Google
Try the latest Gemini 2.5 Pro before general availability.
Weβre introducing an upgraded preview of Gemini 2.5 Pro, our most intelligent model yet. Building on the version we released in May and showed at I/O, this model will beβ¦
π168β€161π₯150π148