Amazon launched Nova Sonic speech-to-speech AI for human-like interactions
—Outperforms OpenAI's voice models at ~80% lower cost
—4.2% word error rate across languages
—46.7% better accuracy than GPT-4o in noisy environments
—On Amazon Bedrock
Amazon also dropped an upgraded Nova Reel 1.1 video model
—Delivers improved quality and style consistency
—Extends generations to 2 min via automated and manual, shot-by-shot modes
—Also available on Amazon Bedrock
Stanford students' research discussion forum AlphaXiv introduced Deep Research for arXiv
The tool compiles literature reviews from trending papers, turning hours of research work into mere seconds of natural language search
OpenAI released GPT-4.1, GPT-4.1 mini, and the ultra-fast GPT-4.1 nano, all designed for developers
— Each model beats GPT-4o and 4o mini on dev tasks
— 1M token context windows
— GPT-4.1 scored 55% on SWE-Bench Verified
— Starting at $0.10 input / $0.40 output per million tokens
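For a rough sense of what the entry-level pricing above means in practice, here is a minimal sketch of the arithmetic. It assumes the quoted figures are $0.10 per million input tokens and $0.40 per million output tokens for the cheapest tier. The function name and the example token counts are illustrative and do not come from the announcement.

```python
# Illustrative cost estimate, not an official calculator.
# Assumption: the quoted price is $0.10 per 1M input tokens
# and $0.40 per 1M output tokens for the cheapest model.
INPUT_USD_PER_MILLION = 0.10
OUTPUT_USD_PER_MILLION = 0.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Hypothetical request: a full 1M-token prompt with a 2,000-token reply.
print(f"${estimate_cost_usd(1_000_000, 2_000):.4f}")  # prints $0.1008
```

Under these assumptions, filling the entire 1M-token context once costs about ten cents, with the reply adding a fraction of a cent.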
Google released DolphinGemma, an AI that can generate dolphin vocalizations
The system analyzes dolphin whistles and sounds to identify patterns and predict subsequent sounds!
Set to be open-sourced soon!
Hugging Face acquired Pollen Robotics, a French startup building open-source humanoids
Pollen is already selling Reachy 2, an open and VR-compatible humanoid for research, education, and embodied AI
A big move from HF in open robotics!
TikTok parent ByteDance just dropped Seaweed, a hyper-efficient 7B-param video AI
—Supports text-to-video, image-to-video, and audio-driven synthesis
—Clips up to 20s
—Matches or outperforms larger models like Sora, Kling 1.6, and Veo
Ex-OpenAI chief scientist Ilya Sutskever's Safe Superintelligence (SSI) has reportedly raised $2B at a $32B valuation
This comes as SSI continues to climb a "different mountain" for developing advanced, superintelligent AI
OpenAI dropped the new o3 and o4-mini reasoning models
o3 pushes SOTA performance across coding, math, science, and multimodality, while o4-mini offers fast, cost-efficient performance
Both have agentic access to ChatGPT tools and can "think with images"
Anthropic added a Research feature in Claude with Google Workspace integration
Research will perform searches across the web and users’ connected work data
This data will also include users' emails, calendars, and docs, thanks to the Workspace link
Microsoft also started rolling out Copilot Vision in its Edge browser
It will read what's on screen to summarize aloud, working as a real-time collaborator/assistant when browsing the internet.
Best part: it's free—and opt-in (not active by default)!
China's Kling AI released two new models: KLING 2.0 Master for video generation and KOLORS 2.0 for images
Both come with improved prompt adherence, with KLING 2.0 standing out on prompts that involve sequential actions and complex motions
Elon Musk's xAI started rolling out a memory feature for its Grok assistant (in beta)
Just like ChatGPT, Grok will reference past chats to provide personalized answers.
There's also a dedicated "forget" button to exclude specific chats from its memory