Meta Locate 3D: A model for accurately localizing objects in 3D environments.
Learn how Meta Locate 3D can help robots accurately understand their environments and interact more naturally with humans.
You can download the model and dataset at https://go.fb.me/je2t32
OpenAI just released HealthBench, a new eval for AI systems for health.
Developed with 262 physicians who have practiced in 60 countries.
OpenAI
Introducing HealthBench
HealthBench is a new evaluation benchmark for AI in healthcare which evaluates models in realistic scenarios. Built with input from 250+ physicians, it aims to provide a shared standard for model performance and safety in health.
Create applications IN SECONDS: a new neural network, Appalchemy, has been released.
AI ITSELF will come up with the design and features of the application based on your description; all you have to do is admire the finished layout or ask for edits.
It can be used for free: http://appalchemy.ai/
Meta just released new models, benchmarks, and datasets that will transform the way researchers approach molecular property prediction, language processing, and neuroscience.
1. Open Molecules 2025 (OMol25): A dataset for molecular discovery with simulations of large atomic systems.
2. Universal Model for Atoms: A machine learning interatomic potential for modeling atom interactions across a wide range of materials and molecules.
3. Adjoint Sampling: A scalable algorithm for training generative models based on scalar rewards.
4. FAIR and the Rothschild Foundation Hospital partnered on a large-scale study that reveals striking parallels between language development in humans and LLMs.
Nvidia has made the world happy: the most powerful speech recognition model is absolutely FREE and right in your browser!
Forget about paid services! This new product will transcribe anything without fuss: videos, podcasts, lectures, chats with friends, and even street noise. It understands sarcasm (well, almost), punctuates like a born philologist, and adds timestamps like an experienced director.
It doesn't matter what the input is: song lyrics, a stream of numbers or a multi-hour discussion - this beast weighs only 2 GB and is ready to work.
Test it online in a browser or download it to your computer
https://huggingface.co/spaces/nvidia/parakeet-tdt-0.6b-v2
OpenAI introduced Codex, an AI agent.
It is a software engineering agent that runs in the cloud and does tasks for you, like writing a new feature or fixing a bug.
You can run many tasks in parallel. It starts rolling out today to ChatGPT Pro, Enterprise, and Team users.
OpenAI
Introducing Codex
Introducing Codex: a cloud-based software engineering agent that can work on many tasks in parallel, powered by codex-1. With Codex, developers can simultaneously deploy multiple agents to independently handle coding tasks such as writing features, answering…
AI now makes 4K video: in the SkyReels V2 update, you can make videos in the highest quality
Voice, script, lip sync, music and video - all in one service and without restrictions.
https://www.skyreels.ai/home
RECURSE is the world's first track born from a collaboration between a human and quantum AI.
British artist ILĀ and startup Moth used the Archaeo machine learning platform, which runs on a quantum computer. Unlike common AI generators such as Suno and Udio, which are trained on huge amounts of data, Archaeo was trained only on ILĀ's music. This allowed quantum algorithms to capture the unique nuances of the artist's work and create more "human" music, avoiding issues with ethics and copyright.
Adobe introduced HUMOTO, a 4D dataset for human-object interaction, developed with a combination of wearable motion capture, SOTA 6D pose estimation vision models, LLMs, and professional refinement by multiple animation studios.
HUMOTO features:
1. Over 700 diverse daily activities
2. Interactions with 60+ objects and 70+ articulated parts
3. Fine-grained text annotations
4. Detailed hand and finger movements
Google is launching its own coding agent, Jules, at I/O today.
It lets you make changes to your GitHub repos with English prompts in a VM using Gemini 2.5 Pro. It's Google's version of Devin.
Here's the leaked official repo of prompts it can do.
jules.google
Jules - An Autonomous Coding Agent
Jules is an autonomous agent that gets out of your way. It lets you focus on the coding you want to do while picking up all the other random tasks that you'd rather not do.
What if I told you this viral music video was made with zero filming, zero crew… and just one AI tool? See full tutorial!
‼️ HOLLYWOOD IS IN IT! Google's new Veo 3 neural network can now not only create cinematic videos, but also voice them with realistic human voices
▸ The recently updated Veo 3 can now generate sound effects, background noise, and even dialogue based on your requests
▸ The model also follows complex instructions and accurately reproduces the sequence of events, simplifying the creation of realistic scenes with believable movement and sound
Given the rapid pace of neural network development, entire films may be made this way in the future...
Elon Musk's xAI announced Live Search in the API.
The new beta feature (free for a limited time) allows apps built on Grok models to search real-time information from X and the internet, including news.
Here's how easy it is to try out Grok 3's new live search:
1/ Grab a key from xAI
2/ Remix our template
3/ Add your API key to Secrets
4/ Click Run and start chatting with Grok.
Since it's built with Agent, you can remix and keep editing with Agent.
AI showed what a CS map would look like in real life.
Though the role of the chickens isn't fully explored :)
Everything you see in this video is entirely generated by a neural network.
The video and sound were created using text prompts for Google's Veo 3 tool. The author first generated each clip separately, then edited them together.
Now even actors will be sent into retirement: everything is done by AI.
Singapore's Sharpa unveiled SharpaWave, a lifelike robotic hand
▸ Features 22 DOF to balance dexterity and strength
▸ Each fingertip has 1,000+ tactile sensing pixels and 5 mN pressure sensitivity
▸ AI models adapt the hand's grip and modulate force
Pavel Durov announced the integration of Grok into Telegram.
Telegram and xAI agreed to a one-year partnership. The Grok neural network will be integrated into all Telegram apps.
Telegram will also receive $300 million from xAI, plus 50% of the revenue from subscriptions sold through the messenger.
TON rose 18%.
Market map for browser agents
New companies launch in the space every week, for both consumer and enterprise use cases. ManusAI is one of the most popular generalist consumer agents, and Athena Intelligence is already being used by companies like Anheuser-Busch.
Computer/browser use has become one of the most important frontiers for model capabilities, with OpenAI, Anthropic, and Google DeepMind dedicating teams to Operator, Claude Computer Use, and Project Mariner, respectively.
Open source frameworks like Browser Use and Stagehand have become some of the most popular repos on GitHub, with tens of thousands of stars.
AI-first browsers are poised to disrupt the massive web browser market, with highly anticipated releases like Comet from Perplexity on the way. It's yet to be seen how Google integrates Project Mariner and other AI tools within Chrome.
Sakana AI introduced the Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents
Researchers harness the power of open-ended algorithms to search for agentic systems that get better at coding, including improving their own code.
It's the Automated Design of Agentic Systems (ADAS), but where it also edits itself. Experiments show that both the ability to self-improve and the open-ended search are essential. If done safely, DGMs could accelerate AI development and allow us to reap its benefits much sooner.
sakana.ai
Sakana AI
The Darwin Gödel Machine: AI that improves itself by rewriting its own code
Flux Kontext as a clothes switcher, part 3
Hack: you can upload a reference image that's a collage of clothing and a face (possibly also a body pose).
Flux Kontext will digest both elements from one picture, put the clothes on the body, and the head on the neck.
Also, they say there's a version of Kontext on fal.ai where you can upload multiple images.
Bing Video Creator, powered by Sora, is now live and available for free. Videos are 5 seconds long and can be created in cinematic resolution, with portrait on the way.
Bing
Create videos with your words for free: Introducing Bing Video Creator
Introducing Bing Video Creator, allowing you to turn your ideas into videos, for free. Powered by Sora, Bing Video Creator transforms your text prompts into short videos. Just describe what you want to see and watch your vision come to life.