Ahmad
ramping up on a project and closing 100+ chrome tabs is an absolute high

feels like the weight of the world just came off my shoulders and i am free lol
tweet
Offshore
Photo
Ahmad
> be OpenAI
> hire a small army of ex-Facebook ad and monetization people
> literally has a Slack channel just for ex-Meta staff
> brings in the full “targeted ads” playbook

> launch “Pulse”
> reads your chats while you sleep
> remembers you wanna visit Bora Bora
> knows your kid is 6 months old and
> “thinks” of your baby milestones
> suggests developmental toys next

> “it's for your convenience 🥺
> actually laying the groundwork for targeted ads using memory

> internal memo: some people already think ChatGPT shows ads
> OpenAI staff: “might as well then”

> congrats, you’re back in the Facebook era
> except this time, you’re training the algo yourself
tweet
Offshore
Photo
Ahmad
> youʼre OpenAI
> hire a small army of ex-Meta ad and monetization people
> a Slack channel just for ex-Facebook staff
> brings in the full “targeted ads” playbook

> launch a browser
> users install it, and OpenAI collects personalized, granular data at scale
> it’s a browser-shaped surveillance device
> it’s a mapping machine of your workflows
> itʼs a reverse-engineering tool for the internetʼs data pipelines, deployed at scale via their users

> launch Sora 2
> a TikTok‑style social network
> infinite AI-generated video feed
> you create or remix clips, upload your face, become the cameo star
> every scroll, like, remix is another data point, another ad signal
> their model learns exactly what hooks you and dials up the dopamine
> you’re not just watching, you’re training their algorithm for better ad targeting
> viral videos driven by your input + their algorithm = your attention refined into $$$
> “your feedback helps us improve the experience” (yeah, for advertisers)

> launch “Pulse”
> reads your chats while you sleep
> remembers you wanna visit Bora Bora
> knows your kid is 6 months old and
> “thinks” of your baby milestones
> suggests developmental toys next
> “it's for your convenience”
> actually laying the groundwork for targeted ads using memory

> internal memo: some people already think ChatGPT shows ads
> OpenAI staff: “might as well then”

> congrats, you’re back in the Facebook era
> except this time, you’re training the algo yourself

> Buy a GPU
> run your LLMs locally
> reject adware LLMs before it’s too late
tweet
Ahmad
dear algorithm,

every other post on my timeline

is a reply by someone i follow

to a tweet i have zero interest in

please stop
tweet
Offshore
Photo
Ahmad
here is my twitter growth strategy: https://t.co/HhA6C07zPZ

here is my twitter growth strategy: https://t.co/luJa9ihS2n
- Min💙
tweet
Offshore
Photo
Ahmad
Buy a GPU, The Movement, will do its best in the coming years to lobby and fix this

https://t.co/nbxeGoDA3r
- @levelsio
tweet
Offshore
Photo
Ahmad
RT @TheAhmadOsman: My house has 33 GPUs.

> 21x RTX 3090s
> 4x RTX 4090s
> 4x RTX 5090s
> 4x Tenstorrent Blackhole p150a

Before AGI arrives:

Acquire GPUs.

Go into debt if you must.

But whatever you do, secure the GPUs. https://t.co/8U89OStknt
tweet
Offshore
Photo
Ahmad
RT @TheAhmadOsman: own your infra, the cloud is a HUGE scam https://t.co/C2exwlkwIe
tweet
Offshore
Photo
Ahmad
RT @TheAhmadOsman: > be us
> Larry & Sergey
> at Stanford with a crawler and a dream
> accidentally organize the entire internet
> call it Google
> build search, email, maps, docs, OS, phones, browser, car, satellite, thermostat, AI lab, TPU farm, and quantum computer

> 2025
> everyone talking about AGI
> OpenAI: “we need data, sensors, feedback, and scale”
> us: staring at Google Maps, YouTube, Gmail, Android, Waymo, Pixel, Fitbit, Docs, Calendar, Street View, and Earth Engine
> "damn. guess we already did that."

> YouTube: 2.6M videos/day
> Android: 3B phones, streaming sensor data 24/7
> Gmail: 1.8B inboxes of human priors
> Search: global-scale RLHF
> Waymo: 71M miles of real-world self-driving footage
> Google Earth: modeled the entire planet
> also your calendar

> people training LLMs on books and PDFs
> we train on humanity
> every click, swipe, tap, misspelled search, scroll, and bookmark
> feedback loop from hell (or heaven)
> depends who you ask

> OpenAI: “we need $100B for GPUs”
> us: already built TPUs
> custom silicon
> datacenters pre-co-located with planetary data lakes
> no egress, no latency
> just vibes and FLOPs

> coders: fine-tuning on GitHub repos
> us: 2 BILLION lines of internal code
> labeled, typed, tested
> every commit is a training signal
> Code LLMs dream of being our monorepo

> AGI recipe?
> multimodal perception
> real-world feedback
> giant codebase
> scalable compute
> alignment signals
> embodied sensors
> user data for days
> yeah we’ve had that since like 2016

> no investor decks
> no trillion-dollar hype rounds
> just a 25-year accidental simulation of Earth
> running in prod

> OpenAI raises $1T to build AGI
> investors call it revolutionary
> us: quietly mapping 10M new miles in Street View
> syncing another 80PB of Earth imagery
> collecting another year of Fitbit biosignals
> enjoy your foundation model
> we OWN the foundation

> people: “but Google is fumbling”
> true
> we’re fumbling in 120 countries simultaneously
> with the greatest compute footprint and research team on Earth
> fumble hard enough and you loop back into winning

> AGI?
> we don’t need to build it
> it’s already inside the building
> powered by Chrome tabs and doc revisions

> mfw we spent 20 years indexing reality
> mfw our data is so good it scares us
> mfw the only thing stopping us from AGI is a meeting between four VPs and one confused lawyer

> call it research
> call it scale
> call it “planetary simulation-as-a-service”
> we call it Tuesday
tweet
Ahmad
RT @TheAhmadOsman: - you are
- a random CS grad with 0 clue how LLMs work
- get tired of people gatekeeping with big words and tiny GPUs
- decide to go full monk mode
- 2 years later i can explain attention mechanisms at parties and ruin them

- here’s the forbidden knowledge map
- top to bottom, how LLMs *actually* work

- start at the beginning
- text → tokens
- tokens → embeddings
- you are now a floating point number in 4D space
- vibe accordingly

- positional embeddings:
- absolute: “i am position 5”
- rotary (RoPE): “i am a sine wave”
- alibi: “i scale attention by distance like a hater”

- attention is all you need
- self-attention: “who am i allowed to pay attention to?”
- multihead: “what if i do that 8 times in parallel?”
- QKV: query, key, value
- sounds like a crypto scam
- actually the core of intelligence

- transformers:
- take your inputs
- smash them through attention layers
- normalize, activate, repeat
- dump the logits
- congratulations, you just inferred a token

- sampling tricks for the final output:
- temperature: how chaotic you want to be
- top-k: only sample from the top K options
- top-p: sample from the smallest group of tokens whose probabilities sum to p
- beam search? never ask about beam search

- kv cache = cheat code
- saves past keys & values
- lets you skip reprocessing old tokens
- turns a 90B model from “help me I’m melting” to “real-time genius”

- long context hacks:
- sliding window: move the attention like a scanner
- infini attention: attend sparsely, like a laser sniper
- memory layers: store thoughts like a diary with read access

- mixture of experts (MoE):
- not all weights matter
- route tokens to different sub-networks
- only activate ~3B params out of 80B
- “only the experts reply” energy

- grouped query attention (GQA):
- fewer keys/values than queries
- improves inference speed
- “i want to be fast without being dumb”

- normalization & activations:
- layernorm, RMSnorm
- gelu, silu, relu
- they all sound like failed Pokémon
- but they make the network stable and smooth

- training goals:
- causal LM: guess the next word
- masked LM: guess the missing word
- span prediction, fill-in-the-middle, etc
- LLMs trained on the art of guessing and got good at it

- tuning flavors:
- finetuning: new weights
- instruction tuning: “please act helpful”
- rlhf: reinforcement from vibes and clickbait prompts
- dpo: direct preference optimization — basically “do what humans upvote”

- scaling laws:
- more data, more parameters, more compute
- loss goes down predictably
- intelligence is now a budget line item

- bonus round:

- quantization:
- post-training quantization (PTQ)
- quant-aware training (QAT)
- models shrink, inference gets cheaper
- gguf, awq, gptq — all just zip files with extra spice

- training vs inference stacks:
- deepspeed, megatron, fschat — for pain
- vllm, tgi, tensorRT-LLM — for speed
- everyone has a repo
- nobody reads the docs

- synthetic data:
- generate your own training set
- model teaches itself
- feedback loop of knowledge and hallucination
- welcome to the ouroboros era

- final boss secret:
- you can learn *all of this* in ~2 years
- no PhD
- no 10x compute
- just relentless curiosity, good bookmarks, and late nights

- the elite don’t want you to know this
- but now that you do
- choose to act
- start now
- build the models
tweet