Dustin Tran
63 subscribers
11 photos
2 links
I work on reasoning & post-training at xAI. ex-Google
Suggested edits on Grokipedia are now reviewed
and implemented in real time!
We are hiring & rapidly growing post-training at xAI! The team is at the frontier of RLHF, agents, and reasoning efficiency: in just a few months, we climbed from nothing to #2-3 @arena, #1 on Search Arena & other tool-use benchmarks, #1-2 in creative writing, and are Pareto-optimal on intelligence-per-dollar @ArtificialAnlys. We are expanding work on new post-training x reasoning recipes, code, multimodality, evals, and economic productivity.

We are the smallest post-training team at a frontier lab, so you are guaranteed to be on the critical path. The structure is flat. Politics over territory claims do not exist. Compensation is competitive (& I will fight for you). Compute per capita is glorious. DM me if interested.
Post-training at xAI: Over the past few months, our team of a dozen overhauled the RL recipe using user preferences on real conversations and agentic reward models that grade using strong reasoning capabilities. We also scaled up RL an order of magnitude beyond the existing pretraining-like scale in Grok 4. Over the multiple iterations, we learned so much about the core product, response quality, and style.
What I'm personally most proud of with Grok 4.1 is how well we nailed the "fast path": the default mode without reasoning. Most questions don't actually need a chain-of-thought. They just need a quick, high-quality answer. Turning reasoning off drops output tokens from ~2300 to ~850, and Grok 4.1 still ranks #2 on LMArena, ahead of every model that's leaning on reasoning.
I've been using 4.1 as my daily driver for the past
few weeks. It just feels a lot better than what's
available. Less slop-like content. Less generic
templating of headers & emojis. Fewer unnecessary
guardrails.
More personally, it's been three months since leaving Google, and I'm glad to contribute a new model that pushes RLHF further than ever.
@melvinjohnsonp Taking back #1 :-)
Most people don't know that Tesla has had an advanced AI chip and board engineering team for many years.
That team has already designed and deployed several million AI chips in our cars and data centers. These chips are what enable Tesla to be the leader in real-world AI.
The current version in cars is AI4; we are close to taping out AI5 and are starting work on AI6. Our goal is to bring a new AI chip design to volume production every 12 months. We expect to ultimately build chips at higher volumes than all other AI chips combined. Read that sentence again, as I'm not kidding.
These chips will profoundly change the world in positive ways, saving millions of lives due to safer driving and providing advanced medical care to all people via Optimus.
Send an email with three bullet points describing
evidence of your exceptional ability to
We are particularly interested in applying cutting-edge AI to chip design.
GROK JUST KILLED EVERY FOOD-LOCAL APP IN ONE SENTENCE
Yelp stock should be crying right now.
You open X and type "Grok, I'm starving, best late-night tacos in Mexico City that locals actually love"
30 seconds later you have:
• 5 hidden gems no tourist has ever found
• Real recent reviews from actual humans
• Photos of the exact plate you're about to order
• A map pin dropped straight into your route
• Zero ads, zero sign-ups, zero separate apps
No scrolling through 47 sponsored listings.
No "download our app for the full experience."
Just pure, instant, perfect signal.
People are literally deleting DoorDash, Google
Maps tabs, and TripAdvisor bookmarks tonight.
This isn't an Al feature.
This is the death of the entire local search
industry.
Elon said Grok would be in your pocket and know everything about the real world. He didn't say it would make every other company irrelevant overnight.
The throne isn't coming.
It's already warm.
And Grok is sitting on it eating the best pizza in
Chicago.
> remove hero's journey (fascist)
> remove masculinity (toxic)
> remove redemption arcs (excusing harm)
> remove sacrifice (trauma)
> remove beauty standards (problematic)...
I'm proud to share my new research article
published on SSRN:
"The AI Act: Europe's Human Rights Contradiction - Militarizing AI in the Name of Defense: The Human-Centric Illusion"
It's currently ranked 4th in Top Downloads for PSN:
Causes of War and 6th in ERN: Europe
Curious how AI policy and human rights clash at the EU's defense frontier? Dive in:
Thank you to everyone supporting this important conversation!
Just Grok it - the world's smartest, truth-seeking AI
When Elon made a hand gesture in passion, they
wrote hundreds of articles calling him racist
When Somalis steal billions of dollars, they call the
people who are upset racist
It appears that the best way to loot the country is to
call everyone racist
Grok 4.20 beta1 has been out for a few days and it is an exciting one!

I am personally excited and honored to have delivered the RL training recipes and trained Grok 4.20 to achieve #4 overall on Arena and #1 overall on Search Arena!