OpenAI Announces the o3 Model
Benchmark Achievements:
• 2700+ rating on CodeForces (surpassing top programmers)
• 96.7% accuracy on AIME 2024 mathematics
• 87.7% on PhD-level GPQA Diamond questions
• 71.7% on SWE-bench (software engineering)
• 25.2% on the ultra-hard EpochAI Frontier Math (up from 2%)
Reasoning Breakthrough:
• 87.5% on the ARC-AGI private evaluation
• 3x performance improvement over o1
• Verified performance on completely unseen tasks
• No memorization: pure reasoning ability
Technical Highlights:
• Built on scaled-up Reinforcement Learning (RL)
• Most compute-intensive model at test-time
• Introduces an efficient o3-mini version
• Sets new standards across all technical benchmarks
Industry Impact:
• Opens a new era in AI scaling
• Demonstrates the effectiveness of increased compute power
• Expected reduction in token pricing
• Available for safety testing
Instagram is preparing to launch its AI video editor.
Adam Mosseri, the head of the company, announced the Movie Gen AI tool, which will allow editing videos based on text queries.
Voice AI Market Map: 2024 Results and 2025 Forecasts, Cartesia Report
The cost of using language models has fallen dramatically: from $45 to $2.75 per million tokens. At the same time, the quality of speech recognition and synthesis has increased significantly.
Interest in voice technologies is growing rapidly - the number of startups in this field at Y Combinator increased by 70% between the winter and fall 2024 intakes. Voice AI assistants are actively being implemented in:
- Healthcare
- Insurance
- Logistics
- Hotel business
- Small business
What awaits us in 2025?
1. More advanced speech-to-speech conversion systems with a latency of only 160 ms are expected to appear (for comparison: human conversational response latency is about 230 ms).
2. The development of compact models will allow voice assistants to be used without the Internet - on phones, in cars and various devices.
3. Voice assistants will begin to cope with complex tasks.
2025 promises to be the year of mass implementation of voice technologies.
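The price drop quoted above is easy to sanity-check; a quick sketch of what the per-million-token figures mean for a hypothetical workload (the 50M-token monthly volume is invented for illustration):

```python
# Illustrative arithmetic for the figures quoted above (the prices are the
# article's numbers; the workload volume is a made-up example).

OLD_PRICE = 45.00   # USD per million tokens (earlier figure)
NEW_PRICE = 2.75    # USD per million tokens (current figure)

def monthly_cost(tokens_per_month: int, price_per_million: float) -> float:
    """Cost in USD for a given monthly token volume."""
    return tokens_per_month / 1_000_000 * price_per_million

tokens = 50_000_000  # hypothetical voice-assistant workload: 50M tokens/month
print(f"old: ${monthly_cost(tokens, OLD_PRICE):,.2f}")   # $2,250.00
print(f"new: ${monthly_cost(tokens, NEW_PRICE):,.2f}")   # $137.50
print(f"reduction: {OLD_PRICE / NEW_PRICE:.1f}x")        # ~16.4x
```

At these prices, per-token cost stops being the bottleneck for most voice workloads; latency (the 160 ms vs. 230 ms comparison above) becomes the limiting factor instead.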
A team of researchers unveiled Genesis, an open-source physics engine that combines generative AI with simulations
It is a generative physics engine able to generate 4D dynamical worlds, powered by a physics simulation platform designed for general-purpose robotics and physical AI applications.
Genesis's physics engine is developed in pure Python, while being 10-80x faster than existing GPU-accelerated stacks like Isaac Gym and MJX. It delivers a simulation speed ~430,000x faster than real-time, and takes only 26 seconds to train a robotic locomotion policy transferable to the real world on a single RTX 4090; see the tutorial.
This is a significant step forward in the simulation ecosystem and could help accelerate robots' ability to understand our physical world.
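A back-of-the-envelope reading of the speed claim, using only the ~430,000x figure quoted above:

```python
# How much simulated experience one wall-clock minute buys at the claimed
# speedup (pure arithmetic on the quoted figure, not a measurement).

SPEEDUP = 430_000               # claimed simulated seconds per wall-clock second

wall_clock_s = 60               # one minute of wall-clock time
sim_s = SPEEDUP * wall_clock_s  # simulated seconds
sim_days = sim_s / 86_400       # 86,400 seconds per day

print(f"{sim_days:,.0f} simulated days per wall-clock minute")  # ~299 days
```

That throughput is what makes the quoted 26-second locomotion-policy training plausible: the policy sees months of simulated experience in seconds.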
Breaking Research: BiomedCLIP - A New Era in Medical AI
Microsoft Research and their partners have developed BiomedCLIP, an AI model that could transform how we process medical imaging and data.
Here's what makes it special:
1. Key Achievements:
- Created PMC-15M: The largest biomedical dataset with 15 million image-text pairs from 4.4 million scientific articles
- Significantly outperformed existing models in medical imaging tasks
- Even beat specialized models in their own domains (like radiology-specific AI systems)
2. What Makes It Different:
- 100x larger than existing datasets
- Covers diverse medical imaging types
- Fully open-access
- Domain-specific adaptations for biomedical data
3. Capabilities:
- Cross-modal retrieval: Finding matching images from text descriptions and vice versa
- Image classification: Identifying medical conditions from images
- Visual question answering: Responding to questions about medical images
4. Real-world Impact:
- Could help doctors analyze medical images more accurately
- Enables efficient searching through medical literature
- Supports medical research and education
- Offers privacy-preserving analysis of proprietary medical data
5. Availability:
The model and related resources will be available at aka.ms/biomedclip
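The cross-modal retrieval described above reduces to ranking image embeddings by cosine similarity against a text embedding. A minimal self-contained sketch with toy 3-d vectors; a real pipeline would obtain the embeddings from BiomedCLIP's image and text encoders, and the filenames and values here are invented:

```python
# Toy text -> image retrieval via cosine similarity (embeddings are
# hand-made stand-ins; real ones would come from BiomedCLIP's encoders).
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical embeddings (in reality ~512-d vectors from the encoders).
image_embeddings = {
    "chest_xray.png":  [0.9, 0.1, 0.0],
    "histology.png":   [0.1, 0.8, 0.2],
    "retina_scan.png": [0.0, 0.2, 0.9],
}
text_embedding = [0.85, 0.15, 0.05]  # e.g. "chest X-ray showing pneumonia"

ranked = sorted(image_embeddings,
                key=lambda k: cosine(image_embeddings[k], text_embedding),
                reverse=True)
print(ranked[0])  # chest_xray.png
```

Running the same similarity in the other direction (one image embedding against many text embeddings) gives the "vice versa" retrieval and, with class-name prompts, zero-shot image classification.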
huggingface.co: microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224
Forwarded from BlockChainWORLD.ai - daily crypto and AI news and promos!
Looking for a reliable crypto card that works everywhere? Meet HyperCard by HyperPay, the ultimate crypto spending solution! And right now, they've got an insane holiday event you don't want to miss.
Here's the scoop: apply for a HyperCard and get a $15 reimbursement just for signing up. Plus, you're automatically in the running for a 1,000 USDT prize pool. Spend $100, and you get a spin on their Lucky Spin Draw, with prizes as big as 1 ETH up for grabs. Invite a friend? That's another spin. It's rewards on rewards, all delivered to your wallet within three days.
With no annual fees and the ability to spend your crypto anywhere (online, in stores, or at ATMs), HyperCard is a no-brainer. Don't wait: this event ends January 19, 2025.
Get HyperPay and start winning big!
OpenAI Announces Major Structural Changes for 2025
The company plans to transform its current for-profit arm into a Delaware Public Benefit Corporation (PBC), marking a significant evolution from its original 2015 structure.
Key Changes and Motivations:
1. The core reason behind this restructuring is the need for substantially more capital than initially anticipated. While OpenAI began in 2015 expecting that progress would mainly depend on research breakthroughs, they've since realized that developing advanced AI systems requires massive computing resources and corresponding financial investments.
2. Under the new structure, OpenAI will maintain both non-profit and for-profit elements, but with important changes:
- The for-profit entity will become a Delaware Public Benefit Corporation
- The non-profit will receive shares in the PBC at a fair market value
- This transformation aims to make the non-profit one of the best-resourced in history
3. The new PBC structure will allow OpenAI to raise capital with more conventional terms, similar to other major players in the AI space. This is crucial as the company faces competition from well-funded competitors investing hundreds of billions in AI development.
Progress and Impact:
OpenAI has come a long way from its initial research lab status. The company now serves over 300 million weekly ChatGPT users and has made significant strides in AI development, including recent breakthroughs with their o-series models showing new reasoning capabilities.
Looking Forward:
The company views this restructuring as essential for advancing its mission of ensuring artificial general intelligence (AGI) benefits all of humanity. The PBC will handle operations and business aspects, while the non-profit arm will focus on charitable initiatives in sectors like healthcare, education, and science.
OpenAI: Why OpenAI's structure must evolve to advance our mission
A stronger non-profit supported by the for-profit's success.
OpenAI and Microsoft have revealed their true understanding of AGI, and it's measured not in technological achievements but in dollars.
For a long time, the definition of AGI remained fuzzy and subjective.
OpenAI publicly described it as "automated systems that outperform humans at most economically valuable work." However, thanks to leaked documents, we now have a much more specific definition.
For OpenAI and Microsoft, achieving AGI has a clear financial criterion - the ability of AI systems to generate $100 billion in profits.
This is particularly significant given their partnership terms: once OpenAI reaches this milestone, the company can terminate its collaboration with Microsoft, and the tech giant will lose access to OpenAI's new developments.
This story perfectly illustrates how lofty ideals of creating technology "for the benefit of humanity" have transformed into purely commercial metrics.
AGI has evolved from a philosophical concept into a business indicator, and the question of its achievement has been reduced to a number on a bank account.
As stated in the leaked documents: "For OpenAI and Microsoft, AGI has a very specific definition: the point when OpenAI develops AI systems that can generate at least $100 billion in profits."
This revelation not only provides clarity about the companies' priorities but also raises questions about the future of AI development and the true meaning of technological progress in our increasingly profit-driven world.
The Information: Microsoft and OpenAI's Secret AGI Definition
Ylla Zen music video: "Light it up"
Created entirely with the help of AI: Runway and Kling AI. The music, lyrics, and vocals are all the work of artificial intelligence. However, the mixing and mastering were still handled by the music label's team. But overall, it turned out pretty cool.
In the updated version of Canvas in ChatGPT, it is now possible to translate code from one programming language to another.
Forwarded from BlockChainWORLD.ai - daily crypto and AI news and promos!
Will Bitcoin reach $250k in 2025?
Anonymous Poll
74%
Yes, to the moon!
26%
No, it will stay around 100k
The AI video revolution is here, and 2025 brings the most hyped tools to the forefront. From groundbreaking innovation to unexpected flops, here are the top 7 players reshaping visual storytelling, warts and all.
Hailuo MiniMax
Pros: Exceptional text-to-video quality; excels in rendering detailed technical environments and is the best for 2D animation.
Cons: Limited to 6-second videos at 720p resolution; lacks advanced customization features.
Kling
Pros: Versatile text-to-video and video-to-video capabilities; offers high-resolution outputs with customizable settings.
Cons: Longer rendering times for complex scenes; advanced features may require a learning curve.
Sora
Pros: Generates highly realistic and imaginative videos; integrates unique features like Storyboard for enhanced control.
Cons: A flop; overpriced at $200 per month with limited practical use, and some issues with physics inaccuracies.
Pika
Pros: Affordable with features like lip-sync and pre-made VFX options; user-friendly interface.
Cons: Visual fidelity and performance lag behind competitors; issues with motion dynamics and text-to-video quality.
Vidu
Pros: Offers a wide range of services with many video generation controls, image generation, and extra apps; creative flexibility.
Cons: Quality not yet matching others; may require acclimation for new users.
Gen-3 Runway
Pros: High-quality, realistic video generation; fast processing times; versatile with features like image-to-video support.
Cons: Lazy and uninspired, only capable of slow zoom effects, and lacks proper animation features.
Luma
Pros: Generates high-quality, creative videos; includes canvas expansion and looping features.
Cons: A morphing mess with inconsistent results; slower motion compared to competitors.
AI enthusiasts have run a modern neural network on a 26-year-old Windows 98 computer. They adapted the Llama language model to the ancient software, and it was even able to write a story.
The future met the past.
Anthropic's Bold Vision: Building the HTTP of AI with Model Context Protocol
Anthropic published a near-term development roadmap for the Model Context Protocol (MCP).
In a strategic move that could reshape the AI landscape, Anthropic has revealed its ambitious plans for the MCP - potentially laying the groundwork for how we'll interact with AI in the years to come.
Just as HTTP revolutionized the web by standardizing how we access and share information, MCP aims to become the universal language for AI interactions.
Anthropic's H1 2025 roadmap reveals a vision that extends far beyond developing individual AI models like Claude. Instead, they're architecting the fundamental infrastructure that could power the next generation of AI interactions.
Here's what makes this approach revolutionary:
1. Building an Open Ecosystem
- Development of an open protocol for standardized AI model interactions
- Inviting other AI providers to shape MCP as an industry standard
- Focus on community-led development and shared governance
2. Enabling Decentralization
- Support for remote MCP connections
- Secure cross-system AI interactions
- Infrastructure for distributed AI systems
3. Scaling for the Future
- Advanced support for hierarchical agent systems
- Preparation for multimodal interactions (text, audio, video)
- Standardized packaging and distribution mechanisms
4. Democratizing Access
- Simplified installation and usage processes
- Creation of a universal server registry
- Open community participation in protocol development
The HTTP Parallel
The comparison to HTTP is particularly apt. Just as HTTP provided the foundational protocol that enabled the modern web to flourish, MCP could serve as the standard protocol for AI interactions. This standardization could:
- Enable seamless communication between different AI systems
- Create a more accessible and interoperable AI ecosystem
- Foster innovation through standardized interfaces
Strategic Implications
This move positions Anthropic not just as an AI company, but as a potential architect of the fundamental infrastructure that could power the future of AI interactions. By focusing on building this foundation, they're taking a long-term view that could significantly influence how AI systems are developed, deployed, and integrated in the years to come.
The success of this initiative could establish MCP as the de facto standard for AI interactions, similar to how HTTP became the backbone of web communications. This would not only benefit the broader AI community but could also cement Anthropic's position as a key player in shaping the future of AI.
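Concretely, MCP messages are JSON-RPC 2.0 objects, so the "universal language" is just structured requests and responses. A minimal sketch of the shape (the method name follows the spec's tools listing; the tool entry in the response is invented for illustration):

```python
# Minimal sketch of an MCP-style JSON-RPC 2.0 exchange. The request method
# mirrors the protocol's tools listing; the server's tool entry is made up.
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}

# What a server might answer (hypothetical tool entry):
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {"name": "search_docs", "description": "Search internal docs"}
        ]
    },
}

wire = json.dumps(request)  # what actually travels over the transport
assert json.loads(wire)["method"] == "tools/list"
print(response["result"]["tools"][0]["name"])  # search_docs
```

The HTTP parallel holds at exactly this level: any client that can produce these messages can talk to any conforming server, regardless of which model sits behind either end.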
Model Context Protocol: Roadmap
Our plans for evolving Model Context Protocol
Enthusiasts from ZooDev have built an AI-powered CAD system that, they claim, can handle any (!) task.
According to the developers, AI is already speeding up design by 10 times, and soon by 50 (!) times.
• Every little detail, hole, or optimization will be taken over by the built-in AI. All you need is a text prompt.
• The internal simulator allows you to immediately (!) test the model.
• Everything you do in the editor turns into code: you change a number and the model is updated with the new parameters.
Google released white paper on AI agents
It covers the basics of LLM agents and a quick LangChain implementation.
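The loop such papers describe (model + tools + orchestration) can be sketched framework-free. The "model" below is a hard-coded stub standing in for an LLM call, and all names are invented for illustration:

```python
# Minimal agent loop: the model proposes an action, the loop executes the
# tool, the observation feeds back in, until the model emits an answer.

def calculator(expression: str) -> str:
    """A 'tool' the agent can invoke (toy only; never eval untrusted input)."""
    return str(eval(expression, {"__builtins__": {}}))

def stub_model(question: str, observations: list) -> dict:
    """Stands in for the LLM: call the calculator once, then answer."""
    if not observations:
        return {"action": "calculator", "input": "17 * 3"}
    return {"answer": f"The result is {observations[-1]}."}

def run_agent(question: str, tools: dict, max_steps: int = 5) -> str:
    observations = []
    for _ in range(max_steps):
        decision = stub_model(question, observations)
        if "answer" in decision:
            return decision["answer"]
        observations.append(tools[decision["action"]](decision["input"]))
    return "gave up"

print(run_agent("What is 17 * 3?", {"calculator": calculator}))
# The result is 51.
```

A LangChain implementation like the one in the white paper replaces `stub_model` with a real LLM call and the `tools` dict with registered tool objects; the control flow is the same.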
Wow! A really big humanoid robotics dataset just got open-sourced: AgiBot World is the first large-scale robotic learning dataset designed to advance multi-purpose humanoid policies.
With 1M+ trajectories from 100 robots, AgiBot World spans 100+ real-world scenarios across five target domains, tackling fine-grained manipulation, tool usage, and multi-robot collaboration.
Cutting-edge multimodal hardware features visual tactile sensors, durable 6-DoF dexterous hands, and mobile dual-arm robots with whole-body control, supporting research in imitation learning, multi-agent collaboration, and more.
GitHub.
HuggingFace
Dataset Highlights:
- Cutting-edge sensor and hardware design
- Wide-spectrum of scenario coverage
- Quality assurance with human-in-the-loop
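Imitation learning on trajectory data like this is, in its simplest form, behavior cloning: regress actions on states from demonstrations. A toy 1-d linear-policy sketch on synthetic data (nothing here reflects the actual AgiBot World format):

```python
# Behavior cloning in one dimension: fit a linear policy a = w*s + b to
# synthetic expert demonstrations via gradient descent on squared error.

# Synthetic "demonstrations": expert action = 2*state + 0.5
demos = [(s / 10.0, 2.0 * (s / 10.0) + 0.5) for s in range(10)]

w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    # Gradients of mean squared error over the demonstration set
    gw = sum((w * s + b - a) * s for s, a in demos) / len(demos)
    gb = sum((w * s + b - a) for s, a in demos) / len(demos)
    w -= lr * gw
    b -= lr * gb

print(round(w, 2), round(b, 2))  # close to 2.0 and 0.5
```

Real humanoid policies swap the linear map for a neural network over images and proprioception, but the objective (match the demonstrated action for each observed state) is the same.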
GitHub: OpenDriveLab/AgiBot-World
[IROS 2025 Best Paper Award Finalist & IEEE TRO 2026] The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems