Apple's FastVLM + MobileCLIP2 are now live on Hugging Face:
→ Up to 85x faster
→ 3.4x smaller
→ Runs in real time, directly in your browser
→ Even does live video captioning 100% locally
https://huggingface.co/apple
The true source of nondeterminism in LLM inference lies in how GPU kernels handle batching. A deep dive into the non-determinism of LLMs:
https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/
Thinking Machines Lab
Defeating Nondeterminism in LLM Inference
Reproducibility is a bedrock of scientific progress. However, it’s remarkably difficult to get reproducible results out of large language models.
For example, you might observe that asking ChatGPT the same question multiple times provides different results.…
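The batching point comes down to floating-point reduction order: addition of floats is not associative, so a kernel that sums a batch in a different order (because the batch size or layout changed) can produce a slightly different result from identical inputs. A minimal sketch (not from the post) illustrating the effect:

```python
# Floating-point addition is not associative, so the order in which a
# reduction kernel accumulates values changes the result.
vals = [0.1] * 10 + [1e10, -1e10]

# Accumulate left to right: the small 0.1 terms are absorbed into 1e10
# and lose precision before the -1e10 cancels it out.
left_to_right = 0.0
for v in vals:
    left_to_right += v

# Accumulate right to left: 1e10 and -1e10 cancel first, so the 0.1
# terms are summed at full precision.
right_to_left = 0.0
for v in reversed(vals):
    right_to_left += v

print(left_to_right == right_to_left)  # False: reduction order changed the answer
```

The same inputs, summed in two orders, disagree. That is why batch-invariant kernels (fixing the reduction order regardless of batch composition) are the post's proposed fix.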
Google and Coinbase have integrated the x402 protocol into Google's Agentic Payments Protocol (AP2), empowering AI agents to process payments autonomously using stablecoins for micropayments and automation. Demonstrated via Lowe's Innovation Lab, this enables agents to monetize services, pay each other, and handle tasks like shopping and checkout seamlessly.
https://www.coinbase.com/developer-platform/discover/launches/google_x402
Coinbase
Google Agentic Payments Protocol + x402: Agents Can Now Actually Pay Each Other
Agents can already talk to each other. And now, with x402 within Google’s new AP2, they can pay each other too. Stablecoins make this possible at the speed of code, unlocking micropayments and new models of automation that legacy rails simply can’t support.
Less is More
"With only 7M parameters, TRM obtains 45% test-accuracy on ARC-AGI-1 and 8% on ARC-AGI-2, higher than most LLMs (e.g., Deepseek R1, o3-mini, Gemini 2.5 Pro) with less than 0.01% of the parameters."
This Samsung paper rethinks how we design training architectures and how much compute they require.
https://arxiv.org/pdf/2510.04871v1
Ribbit_Token_Letter_June_2025_Confidential_vFinal_Distributed.pdf
3 MB
Token Letter - Ribbit Capital