MINIMAX 🔥: A new open-weights model, MiniMax M3, has been released to the public on APIs and MiniMax Agent.
MiniMax Agent Updates
> Meet M3: Our most intelligent and responsive model designed to handle any task.
> Persistent Memory: Your Agent remembers what you've shared, so you never have to repeat yourself.
> Evolving Skills: It learns as you collaborate, turning complex tasks into skills made just for you.
> Unified Billing: Fully integrated with Token Plan for a smoother, more consistent experience.
MiniMax M3 scores 59% on SWE Bench Pro (on par with GPT-5.5), supports a 1M context window via MiniMax Sparse Attention, and is natively multimodal.
OpenAI ❤️ AWS
OpenAI models are now generally available on AWS Bedrock! Daybreak will be available later on AWS as well.
> That includes future availability for Daybreak, OpenAI's vision for changing how software is built and defended.
> Daybreak, which includes cyber models and Codex Security, is designed to help cyber defenders see risk earlier.
Claude for iOS will get a redesigned settings menu along with support for the upcoming Memory Files feature.
> A slightly redesigned UI is being prepared for both Claude web and mobile, primarily revamping settings and navigation bar.
> Memory Files is the upcoming knowledge-based memory system for Claude.
OPENAI 🔥: New Sites, role-specific Plugins, and Annotations features are rolling out in preview for Business and Enterprise plans.
> Today, we're introducing new ways to do more of your work with Codex: plugins that adapt Codex to your role and tools, annotations that help you refine the result in place, and a preview of the ability to create interactive websites and apps you can share with your workspace using a URL.
SpaceX is targeting a $1.75T valuation for its upcoming IPO, according to The Information.
> SpaceX is seeking a valuation of $1.75 trillion in its initial public offering next week, including additional shares the underwriting banks could sell if investor demand is strong, according to people familiar with the matter.
TinyFish Bigset turns text prompts into live datasets from web
TinyFish launched Bigset, an open-source, self-hosted multi-agent system that turns plain-language prompts into live web datasets. It infers schema, verifies sources, deduplicates rows, supports scheduled refreshes, and exports CSV/XLSX.
#sponsored @testingcatalog
MICROSOFT 🔥: New MAI Code 1 Flash and MAI Thinking 1 models have been revealed on the official MAI website!
Also, MAI Image 2.5, MAI Voice 2, and MAI Transcribe 1.5 are there too.
> MAI-Code-1-Flash plans and reasons through complex coding tasks from start to finish, so you spend less time debugging and more time building.
> MAI-Thinking-1 (35B active, ~1T total parameters, MoE) has a smaller inference footprint than much larger models, yet is competitive with Claude Opus 4.6 on SWE-Bench Pro.
MAI Thinking 1 Benchmarks
Google tests Planning Mode for NotebookLM Video Overviews
Google is testing a planning mode for NotebookLM Video Overviews that lets users review and edit a draft outline before generation. The feature points to tighter editorial control and a possible shift from Veo to Gemini Omni.
h/t @Thomasguka
#notebooklm @testingcatalog
HERMES 🔥: A new Hermes Desktop app from Nous Research is now available on macOS, Windows, and Linux!
Testing time