Breakthrough in Robot Design: Universal Controllers Transform How We Build Robots
Northwestern University researchers have made a significant breakthrough in robotics design, introducing a method that could revolutionize how we create and evolve robots.
Their paper "Accelerated co-design of robots through morphological pretraining" presents a novel approach that solves a decades-old challenge in robotics.
Code is coming soon.
And here are more robots
Key Innovations:
1. Universal Controller
- Developed a single controller that can work with multiple robot body types
- Pre-trained on millions of different robot morphologies
- Uses gradient-based optimization through differentiable simulation
- Can immediately adapt to new robot designs without extensive retraining
2. Zero-Shot Evolution
- Allows rapid testing of new robot body designs
- Enables immediate evaluation of design changes
- Supports successful recombination of robot parts
- Dramatically speeds up the design process
3. Diversity Maintenance
- Identified and solved "diversity collapse" - a previously unknown problem in robot co-design
- Developed methods to maintain morphological diversity while improving performance
- Enabled successful crossover between different robot designs
Technical Details:
- Controllers are trained on over 10 million distinct robot morphologies
- Uses differentiable simulation for gradient-based optimization
- Supports complex 3D environments with varying terrains
- Enables robots to perform adaptive behaviors like phototaxis (movement toward light)
Future Implications:
- Could dramatically accelerate robot design and development
- Opens new possibilities for self-reconfigurable robots
- Provides a framework for more complex multi-material robots
- May help bridge the simulation-to-reality gap in robotics
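The core trick, letting gradients flow through a differentiable simulator into both the body plan and the controller, can be sketched in miniature. The toy below is NOT the paper's system: the "simulator" is a hand-written differentiable function with one morphology parameter (limb length) and one controller parameter (gain), so the gradients can be derived by hand instead of by autodiff through a physics engine.

```python
# Toy co-design sketch: jointly optimize morphology and controller by
# gradient ascent through an invented, differentiable "simulator".

def simulate(limb, gain):
    """Distance traveled; rewards body/controller coordination with
    diminishing returns (a made-up, differentiable stand-in)."""
    d = limb * gain
    return d - 0.5 * d ** 2

def grads(limb, gain):
    """Analytic partial derivatives of simulate() w.r.t. both params."""
    d = limb * gain
    return gain * (1 - d), limb * (1 - d)

def co_design(limb=0.1, gain=0.1, lr=0.1, steps=200):
    """Ascend the gradient on morphology AND controller at once."""
    for _ in range(steps):
        g_limb, g_gain = grads(limb, gain)
        limb += lr * g_limb
        gain += lr * g_gain
    return limb, gain, simulate(limb, gain)
```

In the paper the simulator is a real differentiable physics engine and the controller is a network pretrained across millions of morphologies, but the shape of the loop (simulate, differentiate, update both body and brain) is the same.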
Abstract
The co-design of robot morphology and neural control typically requires using reinforcement learning to approximate a unique control policy gradient for each body plan, demanding massive amounts of training data to measure the performance of each…
HuggingFace released the "Ultra-Scale Playbook"
A free, open-source book to learn everything about 5D parallelism, ZeRO, fast CUDA kernels, and how and why to overlap compute and communication: all scaling bottlenecks and tools are introduced with motivation, theory, interactive plots from our 4,000+ scaling experiments, and even NotebookLM podcasts to tag along with you.
- How was DeepSeek trained for only $5M?
- Why did Mistral train an MoE?
- Why is PyTorch's native Data Parallelism implementation so complex under the hood?
- What are all the parallelism techniques, and why were they invented?
- Should I use ZeRO-3 or Pipeline Parallelism when scaling, and what's the story behind both techniques?
- What is the Context Parallelism that Meta used to train Llama 3? Is it different from Sequence Parallelism?
- What is FP8, and how does it compare to BF16?
The largest factor in democratizing AI will always be teaching everyone how to build AI, and in particular how to create, train, and fine-tune high-performance models. In other words, making the techniques that power all recent large language models accessible to everybody is essential, and efficient training is possibly the most essential of them.
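As a taste of one technique the playbook covers, here is a pure-Python caricature of ZeRO-style sharding: each of N ranks stores and updates only its 1/N shard of the parameters, and the full vector is rebuilt with an all-gather. Real systems (DeepSpeed ZeRO, PyTorch FSDP) do this with GPU collectives; plain lists stand in for ranks here.

```python
# ZeRO-style sharding in miniature: partition, local update, all-gather.

def shard(vec, world_size):
    """Split vec into world_size contiguous shards."""
    per = (len(vec) + world_size - 1) // world_size
    return [vec[r * per:(r + 1) * per] for r in range(world_size)]

def local_sgd_step(params_shard, grads_shard, lr=0.1):
    """Each rank applies SGD only to the parameters it owns."""
    return [p - lr * g for p, g in zip(params_shard, grads_shard)]

def all_gather(shards):
    """Concatenate every rank's shard back into the full vector."""
    return [p for s in shards for p in s]

# one simulated training step across 4 "ranks"
params = [float(i) for i in range(8)]
grads = [1.0] * 8
params = all_gather([local_sgd_step(p, g)
                     for p, g in zip(shard(params, 4), shard(grads, 4))])
```

The memory win is that each rank holds only len(params) / world_size optimizer entries between steps, which is exactly what lets ZeRO-3 fit models that data parallelism alone cannot.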
huggingface.co
The Ultra-Scale Playbook - a Hugging Face Space by nanotron
The ultimate guide to training LLMs on large GPU clusters
Wow, DeepSeek announced Day 0: Warming up for OpenSourceWeek
Starting next week, they'll be open-sourcing 5 repos, sharing sincere progress with full transparency.
These humble building blocks in their online service have been documented, deployed and battle-tested in production.
Daily unlocks are coming soon. No ivory towers - just pure garage-energy and community-driven innovation.
The Pika neural network has introduced the Pikaswaps feature, which changes objects in a video to any kind of thing. You can replace a pancake with a human face, a dog with an iguana, or your hand with a cyber prosthesis.
Special effects are no longer needed
A new android has been created in Norway: NEO Gamma. The mechanical servant from 1X Technologies will do housework and bring coffee to its owners.
Robots do the hard work, not humans
Anthropic to release Claude 3.7 Sonnet on Feb 26
It's expected to have step-by-step thinking, never-before-seen coding capabilities, and web search.
The best coding model which powers Cursor and Windsurf is about to get a whole lot better.
Claude 3.7 Sonnet is Anthropic's most intelligent model to date and the first Claude model to offer extended thinking - the ability to solve complex problems with careful, step-by-step reasoning.
Anthropic is the first AI lab to introduce a single model where users can balance speed and quality by choosing between standard thinking for near-instant responses or extended thinking for advanced reasoning.
Claude 3.7 Sonnet is state-of-the-art for coding, and delivers advancements in computer use, agentic capabilities, complex reasoning, and content generation. With frontier performance and more control over speed, Claude 3.7 Sonnet is the ideal choice for powering AI agents, especially customer-facing agents, and complex AI workflows.
Supported use cases: RAG or search & retrieval over vast amounts of knowledge, product recommendations, forecasting, targeted marketing, code generation, quality control, parsing text from images, agentic computer use, content generation
Model attributes: Reasoning, Text generation, Code generation, Rich text formatting, Agentic computer use
AI models now handle voice and speech, yet building with them in Python is very frustrating.
FastRTC is here to solve that:
- Automatic Voice Detection
- Handling WebRTC & the backend for real-time apps
- Calling Phones
GitHub
huggingface.co
FastRTC: The Real-Time Communication Library for Python
We're on a journey to advance and democratize artificial intelligence through open source and open science.
DeepSeek makes 2 major announcements
1. Starting today, DeepSeek is offering significant discounts on their API Platform during off-peak hours (16:30-00:30 UTC daily):
β’ DeepSeek-V3: 50% OFF
β’ DeepSeek-R1: Massive 75% OFF
This means you can access powerful AI models at a fraction of the cost during these hours. For example, DeepSeek-R1 output cost drops from $2.19 to just $0.550 per 1M tokens!
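The discount arithmetic checks out against the list price:

```python
# Off-peak discount arithmetic from the announcement.
peak = 2.19                    # USD per 1M DeepSeek-R1 output tokens
off_peak = peak * (1 - 0.75)   # 75% off during 16:30-00:30 UTC
# 2.19 * 0.25 = 0.5475, which matches the advertised $0.550 up to rounding
```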
2. DeepSeek has also released DeepGEMM - an impressive FP8 GEMM library that supports both dense and MoE GEMMs, powering their V3/R1 models.
Key features:
- Up to 1350+ FP8 TFLOPS on Hopper GPUs
- Lightweight with no heavy dependencies
- Fully Just-In-Time compiled
- Core logic at just ~300 lines of code
- Outperforms expert-tuned kernels on most matrix sizes
- Supports dense layout and two MoE layouts
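To illustrate conceptually what an FP8 GEMM does, here is a pure-Python toy: scale each matrix into a small symmetric integer range, multiply in the narrow format, then rescale the product. DeepGEMM's real kernels (fine-grained per-block scaling, Hopper tensor cores, JIT compilation) are far more sophisticated, and none of the names below come from the library.

```python
# Quantize -> narrow matmul -> dequantize: the skeleton of a low-precision GEMM.

def quantize(mat, levels=127):
    """Symmetric quantization to the range [-levels, levels] with one
    per-matrix scale factor (real FP8 kernels scale per block)."""
    amax = max(abs(v) for row in mat for v in row) or 1.0
    scale = amax / levels
    q = [[round(v / scale) for v in row] for row in mat]
    return q, scale

def gemm_quantized(a, b):
    """Multiply in the quantized domain, rescale with both scales."""
    qa, sa = quantize(a)
    qb, sb = quantize(b)
    n, k, m = len(qa), len(qb), len(qb[0])
    return [[sum(qa[i][t] * qb[t][j] for t in range(k)) * sa * sb
             for j in range(m)] for i in range(n)]
```

The accumulated sum stays in wide precision and only the operands are narrow, which is also why real FP8 GEMMs accumulate in higher precision to control error.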
New announcements from DeepSeek: Optimized Parallelism Strategies
1. DualPipe - a bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
2. EPLB - an expert-parallel load balancer for V3/R1.
3. An analysis of computation-communication overlap in V3/R1.
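The job an expert-parallel load balancer does can be sketched as a classic scheduling problem: place experts with uneven loads onto GPUs so the hottest GPU carries as little as possible. The greedy "heaviest expert to the least-loaded GPU" rule below (textbook LPT scheduling) is a stand-in, not EPLB's actual algorithm, which also replicates heavily loaded experts.

```python
# Greedy expert placement: heaviest expert goes to the least-loaded GPU.
import heapq

def balance(expert_loads, num_gpus):
    """Return a gpu -> list-of-expert-indices placement."""
    heap = [(0.0, g) for g in range(num_gpus)]   # (current load, gpu id)
    heapq.heapify(heap)
    placement = {g: [] for g in range(num_gpus)}
    # place heaviest experts first so big items don't spoil the balance
    for e in sorted(range(len(expert_loads)),
                    key=lambda i: -expert_loads[i]):
        load, g = heapq.heappop(heap)
        placement[g].append(e)
        heapq.heappush(heap, (load + expert_loads[e], g))
    return placement
```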
GitHub
GitHub - deepseek-ai/DualPipe: A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek…
A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training. - deepseek-ai/DualPipe
GPT-4.5 is out! Knowledge is still stuck in October 2023; it's not going to blow your mind, but it might befriend you.
It's more like a personality, communication, and creativity upgrade than a huge intelligence leap. It's like OpenAI is pivoting its base model from "bland assistant" to "AI bestie."
What it does do well:
- OpenAI says it scores 64% on SimpleQA (double GPT-4's score)
- Much better writing with cleaner, better structured, more human-like prose
- Genuinely warmer and more emotionally intelligent (gave me some good advice!)
- Less robotic, more opinionated responses
4.5 is more extroverted, agreeable, and less neurotic than 4o.
It's sometimes worse at following instructions, perhaps because it's less sycophantic and more creative.
The model received approximately 10x more computational resources during pre-training compared to GPT-4. Training occurred simultaneously across multiple data centers.
Pricing: $75 per million input tokens and $150 per million output tokens, 15-30x more expensive than GPT-4o! This pricing reflects the model's scale and resource requirements.
Performance and Context: Generation is noticeably slower than its predecessors, and context length remains at 128K tokens. The knowledge cutoff stays at October 2023, which is disappointing for many users.
Functionality: Supports Canvas, search, and file uploads. Currently lacks multimodal features like voice mode or video.
Availability:
Already available to Pro users and developers of all API tiers
Coming to Plus subscribers ($20) next week
OpenAI plans to add "tens of thousands of GPUs" next week to expand access
Independent Benchmark Results:
Aider Polyglot Coding Benchmark: Recent tests show that GPT-4.5 Preview significantly outperforms its predecessor but lags behind specialized models:
- Claude 3.7 Sonnet with thinking mode (32k tokens): 65%
- Claude 3.7 Sonnet without thinking mode: 60%
- DeepSeek V3: 48%
- GPT-4.5 Preview: 45%
- ChatGPT-4o: 27%
- GPT-4o: 23%
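The "15-30x more expensive than GPT-4o" range follows directly from the list prices, assuming GPT-4o at $2.50 / $10.00 per 1M input/output tokens (that figure is my assumption, not stated in the post):

```python
# Price-ratio check; GPT-4.5 prices are from the post, GPT-4o's are assumed.
gpt45 = {"input": 75.0, "output": 150.0}   # USD per 1M tokens
gpt4o = {"input": 2.50, "output": 10.00}   # assumed GPT-4o list price
ratios = {k: gpt45[k] / gpt4o[k] for k in gpt45}
# input comes out 30x, output 15x: hence the quoted "15-30x" range
```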
OpenAI
Introducing GPT-4.5
We're releasing a research preview of GPT-4.5, our largest and best model for chat yet. GPT-4.5 is a step forward in scaling up pre-training and post-training.
The Magnific neural network can change the style of pictures in seconds, while accurately preserving their essence. AI can turn even a simple sketch on paper into a high-quality render or photorealistic picture.
Want next-level access control that is fast, secure, and scalable? Mercury Access by JM Digital is a cutting-edge solution designed to provide seamless authentication, advanced security, and unparalleled access management for businesses and institutions.
- Smart & secure access control for any environment
- Scalable & flexible to meet your security needs
- Seamless integration with modern digital infrastructure
- Enhanced authentication & encrypted security measures
Discover Mercury Access today: https://www.jmdigital.tech/mercury-access
https://vt.tiktok.com/ZSM5LXeq1/
JMD
Mercury Access NFT Membership | Exclusive Blockchain Rewards
Get exclusive access to NFT membership benefits with Mercury Access. Enjoy private communities, discounts, and early product launches. Join today!
AI has learned to play Pokémon. The newest model, Claude 3.7 Sonnet, was set to play Pokémon Red, and it has already beaten the first gym leader.
DeepSeek introduced DeepSeek-V3/R1 Inference System Overview
Optimized throughput and latency via:
1. Cross-node EP-powered batch scaling
2. Computation-communication overlap
3. Load balancing
Statistics of DeepSeek's Online Service:
- 73.7k/14.8k input/output tokens per second per H800 node
- Cost profit margin 545%
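Why computation-communication overlap (point 2) pays off can be seen with a back-of-the-envelope pipeline model: with k micro-batches, running compute and communication strictly one after the other costs k*(C+T), while overlapping them behaves like a two-stage pipeline whose slower stage sets the pace. The numbers below are invented, not DeepSeek's.

```python
def makespan(k, compute, comm, overlap):
    """Total time for k micro-batches through a compute-then-communicate
    schedule, in arbitrary time units."""
    if not overlap:
        return k * (compute + comm)          # strictly sequential
    # two-stage pipeline: after filling, the slower stage sets the pace
    return compute + comm + (k - 1) * max(compute, comm)

sequential = makespan(8, 3, 2, overlap=False)   # 8 * (3 + 2) = 40
overlapped = makespan(8, 3, 2, overlap=True)    # 3 + 2 + 7 * 3 = 26
```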
GitHub
open-infra-index/202502OpenSourceWeek/day_6_one_more_thing_deepseekV3R1_inference_system_overview.md at main · deepseek-ai/open…
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation - deepseek-ai/open-infra-index
Huge VLM release from Cohere for AI is just in
Aya-Vision is a new VLM family based on SigLIP and Aya, and it outperforms many larger models.
> 8B and 32B models covering 23 languages, and two new benchmark datasets
> supported by HF transformers from the get-go
huggingface.co
Cohere Labs Aya Vision - a CohereLabs Collection
Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages.
Today Anthropic submitted their recommendations to the OSTP for the U.S. AI Action Plan
Anthropic predicts powerful AI systems will appear by late 2026 or early 2027, with intellectual abilities matching Nobel Prize winners, able to autonomously handle digital tasks (text, audio, video, internet browsing), reason independently over hours or weeks, and control physical equipment digitally
They recommend stronger national security actions, including government testing of AI models for security risks, stricter export controls on key chips like the H20, and secure communication channels between AI labs and intelligence agencies
They suggest the government build 50 gigawatts of additional power capacity dedicated to AI by 2027, speed up AI adoption across federal agencies, and improve economic data collection to prepare for AI's impact on jobs and society
Anthropic
Anthropic's recommendations to OSTP for the U.S. AI action plan
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
a16z introduced the Top 100 Gen AI Consumer Apps
In just 6 months, the consumer AI landscape has shifted: some AI apps surged, others stalled, and a few unexpected players vaulted over the competition.
A few key insights:
- DeepSeek is outpacing competing general-assistant LLMs in both growth and engagement
- The AI apps people are willing to pay for diverge from the most popular
- After a year-long plateau, ChatGPT's growth has come roaring back
- AI video has finally broken through, with some high-quality new players
- So-called "vibecoding" tools are reshaping who can create with AI, not just who can use it