Ars Dev

I enjoyed Andrej Karpathy's latest insight: clean, simple, and actually useful.

He reminds us that LLMs are simulators, not independent entities. And honestly, this shift in thinking changes how you interact with them.

Instead of asking:
“What do you think about xyz?”

Try this:
“What group of people would be best suited to discuss xyz? What would they say?”

There’s no “you” in there. The model doesn’t hold opinions or think things through the way we do. It hasn’t reflected on the topic and formed a stance. When you force it into a “you” framing, it still responds — but it’s essentially adopting a personality vector drawn from training data statistics and simulating it.

That’s fine. It works. But there’s far less mystery here than people assume when they put questions to “artificial intelligence.”

This is what good prompting advice looks like.
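Here is a minimal sketch of the reframing as code. The helper names (`panelPrompt`, `opinionPrompt`) and the two follow-up instructions in the panel prompt are my own illustration, not something from Karpathy's post.

```typescript
// Hypothetical helper: asks the model to simulate a panel of people
// instead of asking for "its" opinion.
function panelPrompt(topic: string, panelSize: number = 3): string {
  return [
    `What group of ${panelSize} people would be best suited to discuss ${topic}?`,
    "Name each person's background, then write what each of them would say.",
    "Finish with the points where they would disagree.",
  ].join("\n");
}

// The "you" framing, kept only for comparison.
function opinionPrompt(topic: string): string {
  return `What do you think about ${topic}?`;
}

console.log(opinionPrompt("monorepos"));
console.log(panelPrompt("monorepos"));
```

Either string can be sent to whatever model you use; only the framing differs.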

Ars Dev

#TipsAndTools

Found an interactive tool for learning TypeScript types: Visual Types.

It’s basically everything from the TypeScript handbook, but way more visual and hands-on. Instead of just reading docs, you click through examples — some using objects, others showing Venn diagrams and charts to illustrate how types work.

And it’s not just about basic types. There’s unknown vs any, conditional types, common patterns, and even mapped types. Pretty comprehensive.

Worth bookmarking if you’re into TypeScript 👩‍💻📌
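For a quick taste of the topics mentioned above, here is a small sketch of `unknown` vs `any`, a conditional type, and a mapped type. The names (`lengthOfAny`, `ElementOf`, `Nullable`, `User`) are made up for illustration and are not taken from Visual Types.

```typescript
// `any` switches type checking off; `unknown` forces a check before use.
function lengthOfAny(value: any): number {
  return value.length; // compiles, but nothing guarantees `length` exists
}

function lengthOfUnknown(value: unknown): number {
  // `value.length` is a compile error here until the type is narrowed.
  if (typeof value === "string" || Array.isArray(value)) {
    return value.length;
  }
  return 0;
}

// Conditional type: choose a type based on a check against another type.
type ElementOf<T> = T extends (infer U)[] ? U : T;
type A = ElementOf<string[]>; // string
type B = ElementOf<number>; // number

// Mapped type: transform every property of an existing type.
type Nullable<T> = { [K in keyof T]: T[K] | null };

interface User {
  id: number;
  name: string;
}

const draft: Nullable<User> = { id: 1, name: null };
const first: A = "hello";
const count: B = 42;

console.log(lengthOfAny(123)); // undefined at runtime, despite the `number` return type
console.log(lengthOfUnknown("shrimp"), lengthOfUnknown(123), draft, first, count);
```

The `any` version is the one that bites you later: the compiler trusts it, and the bug only shows up at runtime.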

Ars Dev

#WeeklyDigest №2

🔹 Google launches Workspace Studio, enabling no-code agents that automate tasks across Gmail, Docs, and Sheets

🔹 Google presents Gemini 3 Pro instructions that improve agentic benchmark performance by roughly 5%. The model features a 1M-token context window and 64K output, is priced at $2-$4 input / $12-$18 output per million tokens, and has training data from late 2024 to early 2025. Scores: 91.9% on GPQA Diamond, 37.5% on HLE, 95% on AIME 2025, 76.2% on SWE-bench Verified, 2,439 Elo on LiveCodeBench Pro, 45.1% on ARC-AGI-2, 54.2% on Terminal-Bench 2.0

🔹 Anthropic launched its new model, Claude Opus 4.5. This is the company's flagship model, specifically optimized for development. It offers a 200K-token context window and 32K output, priced at $5/$25 per million tokens, which is more expensive than competitors. That said, the model shows strong (and in some cases, best-in-class) results: 80.9% on SWE-bench Verified, 87.0% on GPQA Diamond, 59.3% on Terminal-Bench 2.0

🔹 Mistral launches Devstral 2, an open-source coding model alongside its first autonomous Vibe CLI agent

🔹 OpenAI reports 320× growth in enterprise reasoning tokens as organizations integrate AI into workflows
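To make the per-million-token prices in the model items above tangible, here is a rough cost sketch. It assumes the first number in a pair like $5/$25 is the input price and the second is the output price; the `Pricing` shape, the function name, and the token counts are hypothetical.

```typescript
// Prices in USD per one million tokens.
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Claude Opus 4.5 pricing as quoted in the digest ($5/$25).
const opus45: Pricing = { inputPerMillion: 5, outputPerMillion: 25 };

function requestCostUsd(
  pricing: Pricing,
  inputTokens: number,
  outputTokens: number
): number {
  return (
    (inputTokens / 1e6) * pricing.inputPerMillion +
    (outputTokens / 1e6) * pricing.outputPerMillion
  );
}

// A hypothetical request: 200,000 input tokens and 32,000 output tokens.
console.log(requestCostUsd(opus45, 200000, 32000).toFixed(2)); // "1.80"
```

Output tokens dominate quickly: in this example 32K output tokens cost almost as much as 200K input tokens.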

Ars Dev

OpenAI’s new State of Enterprise AI report drops like a reality check for anyone building with frontier models. It shows that enterprises now treat AI less like a shiny tool and more like infrastructure.

The standout moment comes from a single number: reasoning token consumption grows 320× in one year, a clear signal that real workloads now run on AI at scale.

Teams experiment with AI, see early wins, but struggle to turn scattered prompts into dependable systems. The report shows how organizations fix that by shifting to structured workflows: predefined steps, shared context, and consistent execution.

The breakthrough appears when companies use these workflows inside engineering, data, and product pipelines. They move from trying AI to relying on it.

Key features and results:

- Projects and Custom GPTs grow 19×, standardizing multi-step tasks.

- ChatGPT Enterprise messages increase 8×, with the average worker sending 30% more messages.

- Workers save 40-60 minutes daily through AI-assisted workflows.

- Frontier users send 6× more messages than typical workers.

Ars Dev

I found a pretty cool piece of software 😂

Protect your spine: An AI assistant that watches your posture through your webcam 🤨

If you start turning into a shrimp or faceplanting into your monitor, the app will gently (or persistently) remind you to straighten up.

What it does:

• AI analyzes your neck and shoulder angles in real-time through your camera

• When it detects “tech neck syndrome,” you get a notification with recommendations

• Assigns your posture a score (0-100) and tracks your progress over time

• Fully configurable sensitivity and check intervals, so it won’t spam you every second
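I don't know how the app computes its score, but the idea can be sketched roughly like this. Everything here is an assumption for illustration: the landmark points, the side-view geometry, the function names, and the 45-degree cutoff.

```typescript
interface Point {
  x: number;
  y: number;
}

// Angle in degrees between the shoulder-to-ear line and the vertical axis.
// 0 means the ear sits directly above the shoulder (side view assumed).
function neckAngleDeg(ear: Point, shoulder: Point): number {
  const dx = ear.x - shoulder.x;
  const dy = shoulder.y - ear.y; // image y grows downward
  return Math.abs(Math.atan2(dx, dy)) * (180 / Math.PI);
}

// Maps the angle to a 0-100 score; maxAngleDeg is an arbitrary illustrative cutoff.
function postureScore(angleDeg: number, maxAngleDeg: number = 45): number {
  const clamped = Math.min(Math.max(angleDeg, 0), maxAngleDeg);
  return Math.round(100 * (1 - clamped / maxAngleDeg));
}

const upright = neckAngleDeg({ x: 100, y: 50 }, { x: 100, y: 150 });
const shrimp = neckAngleDeg({ x: 200, y: 50 }, { x: 100, y: 150 });
console.log(postureScore(upright), postureScore(shrimp)); // 100 0
```

A real tool would get the points from a pose-estimation model and smooth the score over time, which is where the configurable sensitivity and check intervals come in.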

#TipsAndTools

Ars Dev

#WeeklyDigest №3

🔹 Cursor releases Debug Mode so agents debug using real runtime logs instead of static guesses

🔹 OpenAI releases GPT-5.2, boosting reasoning, coding reliability, and long-context performance for production agents

🔹 Google launches Code Wiki, an automated system that keeps repo documentation continuously up to date

🔹 NVIDIA releases Nemotron 3, an open model family built for large-scale multi-agent AI systems

🔹 Anthropic rolls out syntax highlighting, prompt suggestions, and a plugin marketplace in Claude Code

#TipsAndTools

OpenAI just dropped GPT-Image-1.5, and it's genuinely impressive 🔥

The new model follows prompts much more accurately, handles edits without breaking details, and generates images 4× faster than before. It's now free in ChatGPT and available via API.

What I love: you can edit existing images iteratively without losing quality — faces, text, and composition stay intact. Plus there's a dedicated Images panel with preset styles to speed up workflows.

Been testing it for two days now. If you do any work with AI images, definitely worth checking out.

Official prompting guide

Ars Dev

Big news 🚀

After months of work, my private developer community is finally live 🥹

Ars Dev Hub

Join now to get:
📚 +5 Practical AI courses
📰 WeeklyDigest
🛠️ Collection of Tools & Resources
☕ MondaySync
💬 Real community
🚀 and more...

💎 First 50 members lock in $5/month for life (then $25/month)
💎 Try it risk-free for 7 days

See you inside 🔥

Ars Dev

The 7 skills that matter more than frameworks in 2025-26

If you're learning to code in 2025, don't just chase frameworks.

The game has changed. AI is here, and the developers winning are the ones who know how to think, not just how to code.

I broke down the 7 skills that actually matter now 👇

Read post in ADH Community

Ars Dev

#WeeklyDigest №1 2026

🔹 Along with the regular GPT-5.2, OpenAI released GPT-5.2 Pro, their most powerful model to date. It costs a whopping $21/$168 per million tokens and is available via API.

🔹 Right after that, GPT-5.2 Codex dropped — SOTA among closed models for development. Currently only available in the Codex app and CLI, API access promised for early 2026.

🔹 Google launches Gemini 3 Flash, matching Pro-level reasoning while cutting inference cost by almost 50%

🔹...

The complete WeeklyDigest is now available in HUB

Ars Dev

Andrej Karpathy’s 2025 Year in Review

Andrej Karpathy published his comprehensive year-end retrospective, and it’s packed with insights on where AI actually stands right now.

🔗 Read the full post: karpathy.bearblog.dev/year-in-review-2025/

Key Takeaways:

1. RLHF is out, Verifiable Rewards are in
Reinforcement Learning from Verifiable Rewards has replaced RLHF as the dominant training paradigm. Models now learn from automatically verifiable rewards rather than human feedback—and this shift is what unlocked reasoning capabilities. The year’s biggest breakthrough? Scaling test-time compute.

2. Benchmarks have lost their credibility
LLMs don’t work like human brains. They can be superhuman in some domains while surprisingly weak in adjacent ones. 2025 made it crystal clear: high benchmark scores ≠ AGI.

3. The Cursor era has begun
Cursor and similar tools represent a new layer of LLM applications. AI is no longer just about raw models—it’s about orchestrating API calls, managing context, designing UX, balancing autonomy, and optimizing costs for specific tasks.

4. “Vibe coding” went mainstream
Writing code is now cheap and accessible. Anyone can vibe-code something functional. The democratization is real.

5. Autonomous AI agents have arrived
We’re seeing the first examples of truly capable autonomous agents working directly on your computer—tools like Claude Code. They’re here, and they work.

6. Image and video generation made a huge leap
The progress has been so dramatic that an LLM-powered operating system no longer feels like distant sci-fi.

LLMs turned out to be simultaneously smarter and dumber than expected. But they’re already incredibly useful—and the industry has barely tapped into their potential, maybe 10% at most.
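To make takeaway 1 concrete: a "verifiable reward" is just a score that a program can compute without a human rater. Below is a toy sketch, assuming the model is asked to end its output with `Answer: <value>`. That convention and both function names are my own illustration, not from Karpathy's post.

```typescript
// Pulls the final answer out of a completion.
// Assumed convention: the completion ends with "Answer: <value>".
function extractFinalAnswer(completion: string): string | null {
  const match = completion.match(/Answer:\s*(.+?)\s*$/);
  return match?.[1] ?? null;
}

// Reward is 1 when the final answer matches the known result, otherwise 0.
function verifiableReward(completion: string, expected: string): number {
  return extractFinalAnswer(completion) === expected ? 1 : 0;
}

console.log(verifiableReward("17 * 3 = 51. Answer: 51", "51")); // 1
console.log(verifiableReward("17 * 3 = 41. Answer: 41", "51")); // 0
```

Math answers and passing unit tests are the typical cases where a check like this is possible, which fits with reasoning being the capability this training style unlocked.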

Definitely worth reading in full 👉 karpathy.bearblog.dev/year-in-review-2025/

Ars Dev

Today marks exactly one year since the term “vibe coding” entered our vocabulary.

On February 3, 2025, Andrej Karpathy introduced it to the world — and it stuck.

Today at work we will celebrate the anniversary with our favorite agent 😅
