Another top-notch open-source model at OpenAI/Meta/Google levels from MiniMax AI (Chinese lab, ex-SenseTime founders, $850M raised). Massive MoE similar to DeepSeek.
Excels at long context (4M tokens!), which is really interesting; need to dig into their lightning attention variant.
Paper.
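For context: lightning attention belongs to the linear attention family, which swaps softmax attention's O(n²) cost for a kernelized O(n) computation — that is what makes a 4M-token window plausible. Below is a minimal sketch of plain (non-causal) linear attention; this is illustrative only, not MiniMax's exact variant, which adds block tiling and interleaves softmax layers.
```python
# Minimal linear attention sketch (illustrative; NOT MiniMax's
# lightning attention, which adds block tiling and hybrid softmax layers).
import torch

def linear_attention(q, k, v, eps=1e-6):
    """O(n) attention: phi(q) @ (phi(k)^T v), with phi = elu + 1."""
    phi = lambda x: torch.nn.functional.elu(x) + 1.0   # positive feature map
    q, k = phi(q), phi(k)                              # (B, n, d)
    kv = torch.einsum("bnd,bne->bde", k, v)            # sum_n phi(k_n) v_n^T
    z = k.sum(dim=1)                                   # normalizer sum_n phi(k_n)
    out = torch.einsum("bnd,bde->bne", q, kv)
    denom = torch.einsum("bnd,bd->bn", q, z).unsqueeze(-1) + eps
    return out / denom

q = torch.randn(1, 4096, 64)  # cost grows linearly in sequence length
k = torch.randn(1, 4096, 64)
v = torch.randn(1, 4096, 64)
print(linear_attention(q, k, v).shape)  # torch.Size([1, 4096, 64])
```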
agent.minimax.io
MiniMax Agent: Minimize Effort, Maximize Intelligence
The Berkeley Sky Computing Lab just trained Sky-T1-32B-Preview, an o1-level reasoning model, spending only $450 to create the instruction dataset.
The data is 17K math and coding problems solved step by step. They created this dataset by prompting QwQ, at a cost of about $450.
Can it be done without another reasoning model to distill from?
Teach a 1,000-student class and assign 17 homework problems each. Side benefit: make $10M by charging $10K tuition.
Model, data, and full code here. Very interesting work showing that simple SFT is all you need (if you have good data).
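A minimal sketch of the recipe — plain SFT on distilled reasoning traces. The dataset ID below is an assumption based on the release's naming; the student model is Qwen2.5-32B-Instruct per the project page.
```python
# Sketch: supervised fine-tuning on reasoning traces distilled from a
# stronger reasoning model (QwQ). Dataset ID is assumed, not verified.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

student_id = "Qwen/Qwen2.5-32B-Instruct"
data_id = "NovaSky-AI/Sky-T1_data_17k"   # assumed HF dataset name

tokenizer = AutoTokenizer.from_pretrained(student_id)
model = AutoModelForCausalLM.from_pretrained(student_id, torch_dtype="bfloat16")
dataset = load_dataset(data_id, split="train")  # expects chat-style records

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(output_dir="sky-t1-sft",
                   num_train_epochs=3,
                   per_device_train_batch_size=1,
                   gradient_accumulation_steps=16),
)
trainer.train()
```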
A new way to teach robots has been created, comparable to how we teach small children.
Former Google employees, founders of the company Physical Intelligence, have developed FAST (Frequency-space Action Sequence Tokenization), a new method for more efficient robot control. Previously they presented π0, their "GPT for robots."
The problem that FAST solves
Existing AI models for robot control rely on simple per-timestep discretization of actions, which is too inefficient for complex, high-frequency, precise tasks.
Solution: FAST is a new way to tokenize (compress) robot actions that:
- uses compression methods similar to JPEG's (sketched below);
- handles complex, high-frequency manipulation tasks;
- trains policies about 5x faster than previous methods.
Robots with FAST can perform complex tasks such as folding laundry, cleaning tables, and packing groceries.
For the first time, a universal model can solve varied tasks in new environments simply by receiving commands in natural language.
The system was successfully tested at three American universities.
The company released a public version of the FAST tokenizer, trained on 1 million real robot action sequences, which researchers can use for their own projects.
https://www.pi.website/research/fast
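A minimal sketch of the JPEG-like idea: transform an action chunk into frequency space with a discrete cosine transform and quantize, so smooth trajectories compress into a few non-zero tokens. Illustrative only; the released FAST tokenizer additionally runs byte-pair encoding over the quantized coefficients.
```python
# Sketch of DCT-based action compression, the JPEG-like core idea.
# Illustrative: the real FAST tokenizer also applies BPE on top.
import numpy as np
from scipy.fft import dct, idct

def tokenize_actions(chunk, scale=10.0):
    """chunk: (T, D) array of continuous actions for one action chunk."""
    coeffs = dct(chunk, axis=0, norm="ortho")          # frequency space per dim
    return np.round(coeffs * scale).astype(np.int32)   # coarse quantization

def detokenize_actions(tokens, scale=10.0):
    return idct(tokens.astype(np.float64) / scale, axis=0, norm="ortho")

chunk = np.cumsum(np.random.randn(50, 7) * 0.01, axis=0)  # smooth 7-DoF trajectory
tokens = tokenize_actions(chunk)
recon = detokenize_actions(tokens)
print(np.abs(chunk - recon).max())   # small reconstruction error
print((tokens == 0).mean())          # most high-frequency coeffs quantize to zero
```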
www.pi.website
FAST: Efficient Robot Action Tokenization
A new robot action tokenizer that allows us to train generalist policies 5x faster than previous models.
AI video generator progress in just a year is incredible!
Forwarded from BlockChainWORLD.ai - daily crypto and AI news and promos!
$TRUMP is now officially available for both futures and spot trading on CoinW!
We're inviting users to trade TRUMP with up to 60x leverage and win random rewards ranging from 5 to 200 USDT, plus additional TRUMP token gifts!
Event Details:
Event Time: 2025/1/18 16:00 - 1/27 16:00 (UTC)
How to Participate:
1. Register on CoinW
Rewards:
- Trade & Win: Get 5-200 USDT in random rewards for any TRUMP futures trade.
- Earn More TRUMP Tokens: Reach a 50,000 USDT trading volume to receive an additional 2-100 USDT in TRUMP spot tokens.
- Exclusive Welcome Bonus: New futures users who trade TRUMP will receive an extra 10 USDT bonus.
Event Rules:
- Rewards will be distributed within 7 working days after the event.
- CoinW strictly prohibits fraudulent activities (e.g., volume manipulation, mass account creation).
- CoinW reserves the right of final interpretation of the event.
Start here https://www.coinw.com/en_US/referral-code?r=3240303 @CoinWOfficial
The developers of the AI image editor Krea AI have shown a new feature: turning pictures into 3D objects. Any picture becomes a model that can be rotated into a better pose.
DeepSeek-R1 is here! Performance on par with OpenAI-o1. Fully open-source model & technical report. MIT licensed: Distill & commercialize freely.
Try DeepThink.
API guide.
Bonus: Open-Source Distilled Models!
Distilled from DeepSeek-R1, 6 small models fully open-sourced. 32B & 70B models on par with OpenAI-o1-mini.
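A minimal sketch of running one of the distilled checkpoints locally; the model ID follows the DeepSeek-R1-Distill naming on Hugging Face (assumed here — check the repo for the exact names).
```python
# Sketch: chat with a DeepSeek-R1 distilled model via transformers.
# Model ID assumed from the DeepSeek-R1-Distill naming on Hugging Face.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16",
                                             device_map="auto")

messages = [{"role": "user", "content": "How many primes are below 30?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                       return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=1024)  # reasoning traces run long
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```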
GitHub
DeepSeek-R1/DeepSeek_R1.pdf at main Β· deepseek-ai/DeepSeek-R1
Dario Amodei of Anthropic stated that AI could surpass human intelligence by 2027.
The Anthropic founder also said what is coming soon:
1. Claude's voice mode
2. Anthropic will use over a million GPUs by 2026
3. Claude will get improved memory
4. More intelligent models will appear in the coming months
The Wall Street Journal
Anthropic CEO Says AI Could Surpass Human Intelligence by 2027
Anthropic Chief Executive Officer Dario Amodei said that his AI startup is racing to secure the computing power needed to meet demand for its generative AI chatbot Claude.
"The surge in demand we've seen over the last year, and particularly in the last three…
OpenAI is prepping to release "Operator" this week: a new ChatGPT feature that will take actions on behalf of users in their browsers.
Interesting details:
- Operator provides suggested prompts
- Users can save/share tasks
- Not available in API
The Information
OpenAI Preps βOperatorβ Release For This Week
OpenAI is preparing to release a new ChatGPT feature this week that will automate complex tasks typically done through the Web browser, such as making restaurant reservations or planning trips, according to a person with direct knowledge of the plans.
OpenAI released Operator, a computer-using agent, as a research preview.
Ensuring safety for agentic models is far more complex than for chatbots.
Errors can lead to serious consequences: for instance, the agent might make costly real-world decisions, like accidentally spending money from a credit card.
Achieving full agentic safety will be as challenging as ensuring the safety of self-driving cars, but with an added layer of complexity.
OpenAI
Introducing Operator
NVIDIA Senior Researcher: DeepSeek Proves AI Infrastructure and Base AI Models Will Become a Commodity
Jim Fan, NVIDIA, says: "Whether you like it or not, the future of AI is its democratization, where every Internet user will be able to run advanced models even on weak devices. This is a historical trend that is pointless to fight."
1. DeepSeek has shown the best results on several independent tests.
2. Most importantly, they achieved this with far less compute.
DeepSeek proves that the same level of intelligence can be had for 10x less. This means that with current computing power it is possible to create 10x more powerful AI. The timeline of AI development is compressing.
DeepSeek showed us in just 4 days:
1. Open-source AI is less than 6 months behind closed AI.
2. China is leading the open-source AI race.
3. We are entering the golden era of RL for LLMs.
4. Distilled models are powerful; we'll have highly intelligent AI running locally on our phones.
Reactions:
- OpenAI o3-mini is now available on the free tier.
- Hopefully we will see less AGI/ASI vagueposting.
It's hard to predict who will ultimately win, but don't forget the power of the last-mover advantage: Google invented the Transformer, but OpenAI unlocked its true potential.
Neural networks make schoolchildren smarter. In Nigeria, children were given access to a chatbot for 6 weeks, and their knowledge was then assessed with a test. The results of the experiment are striking:
- Students with AI scored above average; students without AI scored below average.
- 6 weeks of studying with AI turned out to be equal to 2 years of regular schooling.
- AI outperformed other learning interventions by 80%.
Forwarded from BlockChainWORLD.ai - daily crypto and AI news and promos!
HyperCard: Shop Anywhere, Earn Big: The Secret Behind HyperPay's Crypto Card!
https://youtu.be/IrDC9TayyRg?si=tXgPx_juAD2jKxyU
YouTube
HyperCard: Shop Anywhere, Earn Big: The Secret Behind HyperPay's Crypto Card!
HyperPay's HyperCard is redefining the way crypto enthusiasts shop, invest, and manage their assets. With seamless integration into platforms like Google, Amazon, eBay, and more, it turns your crypto into a versatile payment tool for everyday use. Featuring…
Why DeepSeek's Success Doesn't Change the AI Race: Dario Amodei's View
Anthropic's CEO explains why the apparent breakthrough fits into the expected trajectory of AI development.
3 Laws of AI Development
1. Scaling Law:
- More resources = better results
- Illustrative numbers: $1M of compute solves 20% of tasks, $10M solves 40%, $100M solves 60% (see the toy curve after this list)
- Progress is smooth and predictable
2. Curve Shifting:
- Innovations improve efficiency
- Typical improvements:
* Small (1.2x)
* Medium (2x)
* Large (10x)
- Overall pace: ~4x per year
3. Paradigm Shifts:
- 2020-2023: text training
- 2024: adding Reinforcement Learning
- Now: unique "crossover point"
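A toy illustration of the first law, fitting the post's schematic numbers as a capability-linear-in-log-cost curve (the numbers are illustrative, as in the original essay):
```python
# Toy log-linear scaling curve through the schematic numbers:
# $1M -> 20% of tasks, $10M -> 40%, $100M -> 60%. Each 10x in spend
# buys a roughly constant capability jump, i.e. capability is linear
# in log10(cost).
import numpy as np

cost = np.array([1e6, 1e7, 1e8])
solved = np.array([0.20, 0.40, 0.60])

slope, intercept = np.polyfit(np.log10(cost), solved, 1)
predict = lambda c: slope * np.log10(c) + intercept

print(f"+{slope:.0%} of tasks per 10x spend")    # +20% per decade of spend
print(f"$1B would predict {predict(1e9):.0%}")   # naive extrapolation: 80%
```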
What did DeepSeek actually achieve?
- Performance similar to 7-10 month old US models
- Lower costs, but within normal trend
- Significant resources (~50,000 chips, ~$1B)
Not a revolution because:
- Cost reduction follows expected 4x/year trend
- V3 more innovative than R1
- Total company spending comparable to US labs
The Future (2026-2027)
According to Amodei, truly advanced AI will require:
- Millions of chips
- Tens of billions of dollars
- 2-3 years of work
Key Takeaway
DeepSeek demonstrates an expected point on the progress curve, not a revolutionary breakthrough. The real race for superhuman AI is just beginning, and it will require unprecedented resources.
"Making AI that is smarter than almost all humans at almost all things will require millions of chips, tens of billions of dollars (at least), and is most likely to happen in 2026-2027"- Dario Amodei
Darioamodei
Dario Amodei: On DeepSeek and Export Controls
Mistral AI released a new small model
Mistral Small 3, a latency-optimized 24B-parameter model released under the Apache 2.0 license.
Mistral Small 3 is competitive with larger models such as Llama 3.3 70B or Qwen 32B, and is an excellent open replacement for opaque proprietary models like GPT-4o-mini.
Mistral Small 3 is on par with Llama 3.3 70B instruct, while being more than 3x faster on the same hardware.
Mistral Small 3 is a pre-trained and instruction-tuned model catered to the "80%" of generative AI tasks: those that require robust language and instruction-following performance with very low latency.
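A minimal sketch of querying it locally via transformers; the Hugging Face model ID below is assumed from the release's naming, so check the model card for the exact name.
```python
# Sketch: run Mistral Small 3 locally with the transformers chat pipeline.
# Model ID is assumed (mistralai/Mistral-Small-24B-Instruct-2501).
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="mistralai/Mistral-Small-24B-Instruct-2501",
    torch_dtype="bfloat16",
    device_map="auto",
)

messages = [{"role": "user",
             "content": "Summarize the Apache 2.0 license in one sentence."}]
result = chat(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])  # assistant reply
```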
mistral.ai
Mistral Small 3 | Mistral AI
Mistral Small 3: Apache 2.0, 81% MMLU, 150 tokens/s
Google rolled out Gemini 2 Flash for free to its products.
- one of the highest-quality non-reasoning LLMs
- super fast (150+ tok/s)
- 1M-token context window
API pricing isn't out yet but was previously $0.075/$0.30 per M input/output tokens. Big move from Google.
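At those previous prices, per-request cost is a one-liner to estimate (a sketch using the old Flash pricing, since Gemini 2 Flash API pricing was not yet announced):
```python
# Back-of-envelope cost at the previous Flash pricing:
# $0.075 per 1M input tokens, $0.30 per 1M output tokens.
def request_cost(input_tokens: int, output_tokens: int,
                 in_price=0.075, out_price=0.30) -> float:
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# e.g. a full 1M-token context plus a 2K-token answer:
print(f"${request_cost(1_000_000, 2_000):.4f}")  # $0.0756
```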
OpenAI has released o3-mini, available today to all ChatGPT users.
OpenAI
OpenAI o3-mini
Pushing the frontier of cost-effective reasoning.
Top 3 AI Tokens That May Rise in February
- TAO. Over the past month, TAO has fallen by 18% and formed a bottom at $362. It then began to recover, which highlights the interest in AI tokens. If the upward momentum accelerates, TAO may test the resistance levels at $459 and $495.
- GRIFFAIN. The token grew steadily in December, and by January 22 the coin's market cap reached $600 million. But then GRIFFAIN faced a sharp correction. If interest in AI agents returns, the coin may recover above the resistance levels of $0.218 and $0.31.
- ARC. The token showed strong growth, reaching a peak capitalization of $622 million. The subsequent correction reduced the total value of all circulating ARC tokens to $221 million. In case of a reversal, ARC may rise to the resistance levels of $0.279 and $0.348.