Ars Dev
Hi, I’m Ars! Here I share practical insights on programming and AI 🚀

To learn more JOIN my private community https://www.skool.com/ars-dev-hub-3159/about?ref=71f574f3ce3542eb976d068c3e133e1b

Contact: @ars_kylnyk
A new interview with Ilya Sutskever has recently been published

Ilya co-founded OpenAI and was one of the key minds behind GPT. In 2024, he left OpenAI to start Safe Superintelligence Inc., a company focused on building superintelligence with safety at its core.


Here are the main points he talked about:

1. Models today can crush benchmarks, ace IQ tests, and solve olympiad-level problems. But they struggle with real-world tasks. The current training approach has hit a ceiling.

2. The "throw more compute at it" formula is exhausted.
What matters now isn't adding more power — it's discovering new training methods. We have enough compute, but it's not delivering the exponential gains we used to see. The focus is shifting to new algorithms and research with existing models.

3. The difference between a model and a human? Humans learn fast from tiny amounts of data. We literally build our worldview from fragments of information and self-correct along the way. AI needs to consume entire knowledge bases and still struggles with context. Bottom line: humans are way more efficient learners right now.

4. Interesting take on AGI: it's not a model that knows everything. It's a model that knows how to learn. When that happens, progress will accelerate dramatically. Why? Millions of AI workers learning like humans, but faster. That's going to shake up the job market hard.

5. AI can't verify its own actions yet. Humans have emotions, intuition, that gut feeling when something's off — it's a feedback system. AI just executes functions. Without this mechanism, it's unreliable.

6. Big models will be rolled out gradually so society can adapt, just as GPT existed and functioned for 3 years before being shown to the public.

The overall picture? We're not racing toward one massive Skynet. Instead, we're heading toward specialized AIs, each mastering its own domain. Once we crack that, we'll clone millions of copies — and that's when the real shift begins. Some roles will vanish, others will survive.

So what should we do? Learn to work with these tools, not against them. And double down on the skills that can't be automated

watch interview
👍4🔥4👎2
DeepSeek introduces open-source V3.2 models with Speciale variant matching Gemini-3.0-Pro on hard reasoning

DeepSeek’s new V3.2 models arrive like a sequel that actually fixes the plot. The setup is simple: developers want open models that think, plan, and act with the precision of top proprietary systems.

The problem is that long-context reasoning and agent workflows usually break when attention costs spike or post-training budgets run thin.

The insight came from studying where open models fall short: slow attention, weak RL signals, and limited agent data. DeepSeek answers with a redesigned attention layer and a scaled reinforcement learning pipeline that treats reasoning as a first-class target.
The standout moment comes from V3.2-Speciale, which reaches gold-level scores on the 2025 IMO, CMO, ICPC, and IOI and matches Gemini-3.0-Pro on complex reasoning.

Key features and results:

• DeepSeek Sparse Attention reduces long-context compute without hurting accuracy.

• Reinforcement learning uses over 10% of pre-training compute to sharpen reasoning.

• Agent data spans 1,800 environments and 85,000 prompts for stronger generalization.

• V3.2-Speciale matches Gemini-3.0-Pro on demanding reasoning benchmarks.

• Open weights on Hugging Face support fine-tuning with LoRA or full training.
🔥5👍1
#WeeklyDigest №1

🔹 OpenAI released GPT-5.1 with 400K context, 128K output, priced at $1.25/$10 per million tokens. The model scores 76.3% on SWE-bench Verified, 88.1% on GPQA Diamond, and 94.0% on AIME 2025.

🔹 Codex also got updated, including the most powerful version yet: GPT-5.1 Codex Max. It scores 77.9% on SWE-bench Verified. However, it's currently only available in Codex CLI and Codex plugins. They promise to add it to the API soon.

🔹 xAI released Grok 4.1: the model became much more empathetic and sensitive, with improved creative writing. It still falls behind GPT-5.1 on major benchmarks and isn't available in the API yet, although the grok-4-1-fast-reasoning and grok-4-1-fast-non-reasoning versions were added to the API with a 2M context window at $0.2/$0.5 per million tokens. Overall, not particularly interesting for programming tasks; waiting for the updated Grok Code.


🔹 One of the most interesting developments: a new data structuring format for working with models is gaining momentum — Token-Oriented Object Notation (TOON). The format surpasses everything in tokenization efficiency: from XML to JSON and even CSV. There are already tons of adapters and converters available online for the new format.

🔹 Cursor in version 2.1 improved Plan Mode, updated the search functionality (now faster and more accurate), and added AI Code Reviews with the option to run automatically on every commit (can be enabled in settings).

🔹 Qoder released a beta version of their IDE for Linux and launched plugins for JetBrains IDEs.
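
To make the TOON item above a bit more concrete, here is a minimal sketch of the tabular idea behind the format: field names are declared once in a header and each row is just comma-separated values. `toToonTable` is a hypothetical helper written for this example, not the official encoder. It only handles flat, uniform arrays of primitives and skips quoting and escaping entirely. Character counts are used as a rough stand-in for token counts.

```typescript
type Primitive = string | number | boolean | null;
type FlatRow = Record<string, Primitive>;

// Hypothetical helper (not the official TOON library): encodes a uniform
// array of flat objects as a TOON-style table. No quoting or escaping.
function toToonTable(name: string, rows: FlatRow[]): string {
  // Assumed representation for an empty array in this sketch.
  if (rows.length === 0) return `${name}[0]:`;

  // Field names come from the first row and are written once in the header.
  const fields = Object.keys(rows[0]);
  const header = `${name}[${rows.length}]{${fields.join(",")}}:`;

  // Each row becomes one indented line of comma-separated values.
  const lines = rows.map(
    (row) => "  " + fields.map((field) => String(row[field])).join(",")
  );
  return [header, ...lines].join("\n");
}

const users: FlatRow[] = [
  { id: 1, name: "Alice", role: "admin" },
  { id: 2, name: "Bob", role: "user" },
];

const asJson = JSON.stringify({ users });
const asToon = toToonTable("users", users);

console.log(asToon);
console.log(`JSON: ${asJson.length} chars, TOON-style: ${asToon.length} chars`);
```

The saving comes from not repeating keys, braces, and quotes on every row, so it grows with the number of rows. For deeply nested or irregular data the gain is likely smaller.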

---
P.S. Hey everyone! This is the first edition of our new Weekly Digest. I'll be sharing the most interesting dev and AI news every week. Let me know what you think!
👍3🔥21
#TipsAndTools

Hey frontenders👋, this one's for you: the legendary icon library and toolkit just got even better

It was recently updated again and has grown to an impressive scale: 63,119 icons, 30 styles, a full SVG library, and font ligatures that work like normal text.

What's new:
▫️ Refreshed design with new icon sets

▫️ Official NPM packages for React and other frameworks

▫️ CDN, SVG packages, fonts: use whatever works for you

▫️ Everything rebuilt from scratch, no legacy baggage

Here's the link — enjoy!
👍7🔥2🌚2
I enjoyed Andrej Karpathy's latest insight: clean, simple, and actually useful.

He reminds us that LLMs are simulators, not independent entities. And honestly, this shift in thinking changes how you interact with them.

Instead of asking:
“What do you think about xyz?”

Try this:
“What group of people would be best suited to discuss xyz? What would they say?”

There’s no “you” in there. The model doesn’t hold opinions or think things through the way we do. It hasn’t reflected on the topic and formed a stance. When you force it into a “you” framing, it still responds — but it’s essentially adopting a personality vector drawn from training data statistics and simulating it.

That’s fine. It works. But there’s far less mystery here than people assume when they ask questions to “artificial intelligence.”
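
If you send prompts from code, the reframing above is easy to wrap in a tiny helper. This is only a sketch: `panelPrompt` is a made-up name, and the wording simply reuses the two questions from this post.

```typescript
// Hypothetical helper: turns a "what do you think about X?" topic
// into the "who would discuss X, and what would they say?" framing.
function panelPrompt(topic: string): string {
  return [
    `What group of people would be best suited to discuss ${topic}?`,
    "What would they say?",
  ].join(" ");
}

console.log(panelPrompt("monorepos vs. polyrepos"));
```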


This is what good prompting advice looks like.
👍3🔥2
#TipsAndTools

Found an interactive tool for learning TypeScript types: Visual Types.

It’s basically everything from the TypeScript handbook, but way more visual and hands-on. Instead of just reading docs, you click through examples — some using objects, others showing Venn diagrams and charts to illustrate how types work.

And it’s not just about basic types. There’s unknown vs any, conditional types, common patterns, and even mapped types. Pretty comprehensive.
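
For a quick taste of the topics mentioned above, here is a small sketch of unknown vs any, a conditional type, and a mapped type. All names (`describeValue`, `ElementOf`, `Flags`, `Settings`) are invented for this example and are not taken from the tool itself.

```typescript
// unknown vs any: `any` switches type checking off,
// while `unknown` forces you to narrow before using the value.
function describeValue(value: unknown): string {
  if (typeof value === "string") return `string of length ${value.length}`;
  if (typeof value === "number") return `number ${value.toFixed(1)}`;
  return "something else";
}

// Conditional type: extracts the element type of an array, otherwise never.
type ElementOf<T> = T extends readonly (infer U)[] ? U : never;
type Id = ElementOf<number[]>; // resolves to number

// Mapped type: same keys as T, but every value becomes a boolean.
type Flags<T> = { [K in keyof T]: boolean };

interface Settings {
  theme: string;
  fontSize: number;
}

const changed: Flags<Settings> = { theme: true, fontSize: false };
const firstId: Id = 42;

console.log(describeValue("abc"), describeValue(firstId), changed);
```

Swapping `unknown` for `any` in `describeValue` would still compile, but so would calling `value.toFixed(1)` on a string, which is exactly the kind of mistake `unknown` is there to catch.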

Worth bookmarking if you’re into TypeScript 👩‍💻📌
👍4
#WeeklyDigest №2

🔹 Google launches Workspace Studio, enabling no-code agents that automate tasks across Gmail, Docs, and Sheets

🔹 Google presents Gemini 3 Pro instructions that improve agentic benchmark performance by roughly 5%. The model features a 1M-token context window and 64K output, is priced at $2-$4/$12-$18 per million tokens, and has training data from late 2024 to early 2025. Scores: 91.9% on GPQA Diamond, 37.5% on HLE, 95% on AIME 2025, 76.2% on SWE-bench Verified, 2,439 on LiveCodeBench Pro, 45.1% on ARC-AGI-2, 54.2% on Terminal-Bench 2.0

🔹 Anthropic launched its new model, Claude Opus 4.5. This is the company's flagship model, specifically optimized for development. It offers 200K token context, 32K output, priced at $5/$25 per million tokens, which is more expensive than competitors. That said, the model shows strong (and in some cases, best-in-class) results: 80.9% on SWE-bench Verified, 87.0% on GPQA Diamond, 59.3% on Terminal-Bench 2.0

🔹 Mistral launches Devstral 2, an open-source coding model alongside its first autonomous Vibe CLI agent

🔹 OpenAI reports 320× growth in enterprise reasoning tokens as organizations integrate AI into workflows
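
To put the per-million-token prices above into perspective, here is a rough cost sketch. It assumes the "$5/$25" notation means input/output price per million tokens (the usual convention, but an assumption here), and the request sizes are made-up numbers. `estimateCostUsd` is a hypothetical helper.

```typescript
// Prices in USD per one million tokens.
interface Pricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Hypothetical helper: linear cost estimate that ignores caching,
// batching discounts, and long-context price tiers.
function estimateCostUsd(
  pricing: Pricing,
  inputTokens: number,
  outputTokens: number
): number {
  return (
    (inputTokens / 1e6) * pricing.inputPerMTok +
    (outputTokens / 1e6) * pricing.outputPerMTok
  );
}

// Claude Opus 4.5 prices as quoted in the digest: $5 input / $25 output.
const opus45: Pricing = { inputPerMTok: 5, outputPerMTok: 25 };

// Made-up request: 200K tokens in, 10K tokens out.
const requestCost = estimateCostUsd(opus45, 200000, 10000);
console.log(`~$${requestCost.toFixed(2)} per request`);
```

With these numbers, the input side is 0.2 × $5 = $1.00 and the output side is 0.01 × $25 = $0.25, so one such request comes to about $1.25.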
🔥5
OpenAI’s new State of Enterprise AI report drops like a reality check for anyone building with frontier models. It shows that enterprises now treat AI less like a shiny tool and more like infrastructure.

The standout moment comes from a single number: reasoning token consumption grows 320× in one year, a clear signal that real workloads now run on AI at scale.

Teams experiment with AI, see early wins, but struggle to turn scattered prompts into dependable systems. The report shows how organizations fix that by shifting to structured workflows: predefined steps, shared context, and consistent execution.

The breakthrough appears when companies use these workflows inside engineering, data, and product pipelines. They move from trying AI to relying on it.

Key features and results:

- Projects and Custom GPTs grow 19×, standardizing multi-step tasks.

- ChatGPT Enterprise messages increase , with workers sending 30% more.

- Workers save 40-60 minutes daily through AI-assisted workflows.

- Frontier users send  more messages than typical workers.
👍2🔥2
I found a pretty cool piece of software 😂

Protect your spine: An AI assistant that watches your posture through your webcam 🤨

If you start turning into a shrimp or faceplanting into your monitor, the app will gently (or persistently) remind you to straighten up.
What it does:

• AI analyzes your neck and shoulder angles in real-time through your camera

• When it detects “tech neck syndrome,” you get a notification with recommendations

• Assigns your posture a score (0-100) and tracks your progress over time

• Fully configurable sensitivity and check intervals, so it won’t spam you every second

#TipsAndTools
👍2🔥21
#WeeklyDigest №3

🔹 Cursor releases Debug Mode so agents debug using real runtime logs instead of static guesses

🔹 OpenAI releases GPT-5.2, boosting reasoning, coding reliability, and long-context performance for production agents

🔹 Google launches Code Wiki, an automated system that keeps repo documentation continuously up to date

🔹 NVIDIA releases Nemotron 3, an open model family built for large-scale multi-agent AI systems

🔹 Anthropic rolls out syntax highlighting, prompt suggestions, and a plugin marketplace in Claude Code
🔥2
#TipsAndTools

OpenAI just dropped GPT-Image-1.5, and it's genuinely impressive 🔥

The new model follows prompts much more accurately, handles edits without breaking details, and generates images 4× faster than before. It's now free in ChatGPT and available via API.

What I love: you can edit existing images iteratively without losing quality — faces, text, and composition stay intact. Plus there's a dedicated Images panel with preset styles to speed up workflows.

Been testing it for two days now. If you do any work with AI images, definitely worth checking out.

Official prompting guide
1🔥2
Big news 🚀

After months of work, my private developer community is finally live 🥹

Ars Dev Hub

Join now to get:
📚 +5 Practical AI courses
📰 WeeklyDigest 
🛠️ Collection of Tools & Resources
 MondaySync 
💬 Real community
🚀 and more...

💎 First 50 members lock in $5/month for life (then $25/month)
💎 Try it risk-free for 7 days


See you inside 🔥
👍5🔥1
The 7 skills that matter more than frameworks in 2025-26

If you're learning to code in 2025, don't just chase frameworks.

The game has changed. AI is here, and the developers winning are the ones who know how to think, not just how to code.

I broke down the 7 skills that actually matter now 👇

Read post in ADH Community
2
#WeeklyDigest №1 2026

🔹 Along with the regular GPT-5.2, OpenAI released GPT-5.2 Pro, their most powerful model to date. It costs a whopping $21/$168 per million tokens and is available via API.

🔹 Right after that, GPT-5.2 Codex dropped, SOTA among closed models for development. It's currently only available in the Codex app and CLI, with API access promised for early 2026.

🔹 Google launches Gemini 3 Flash, matching Pro-level reasoning while cutting inference cost by almost 50%

🔹...

The complete WeeklyDigest is now available in HUB
🔥2👍1
Andrej Karpathy’s 2025 Year in Review

Andrej Karpathy published his comprehensive year-end retrospective, and it’s packed with insights on where AI actually stands right now.

🔗 Read the full post: karpathy.bearblog.dev/year-in-review-2025/

Key Takeaways:

1. RLHF is out, Verifiable Rewards are in
Reinforcement Learning from Verifiable Rewards has replaced RLHF as the dominant training paradigm. Models now learn from automatically verifiable rewards rather than human feedback—and this shift is what unlocked reasoning capabilities. The year’s biggest breakthrough? Scaling test-time compute.

2. Benchmarks have lost their credibility
LLMs don’t work like human brains. They can be superhuman in some domains while surprisingly weak in adjacent ones. 2025 made it crystal clear: high benchmark scores ≠ AGI.

3. The Cursor era has begun
Cursor and similar tools represent a new layer of LLM applications. AI is no longer just about raw models—it’s about orchestrating API calls, managing context, designing UX, balancing autonomy, and optimizing costs for specific tasks.

4. “Vibe coding” went mainstream
Writing code is now cheap and accessible. Anyone can vibe-code something functional. The democratization is real.

5. Autonomous AI agents have arrived
We’re seeing the first examples of truly capable autonomous agents working directly on your computer—tools like Claude Code. They’re here, and they work.

6. Image and video generation made a huge leap
The progress has been so dramatic that an LLM-powered operating system no longer feels like distant sci-fi.
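
On takeaway 1: a "verifiable reward" just means the training signal comes from an automatic check instead of a human rating. Here is a toy sketch, assuming the model is asked to end its output with a line like `Answer: 42`. That convention and the `verifiableReward` name are made up for illustration, and real pipelines use much richer checkers such as unit tests or math verifiers.

```typescript
// Hypothetical verifier: returns 1 if the final "Answer: ..." line matches
// the known correct answer, otherwise 0. No human judgment involved.
function verifiableReward(modelOutput: string, expected: string): number {
  // Look for "Answer: <text>" at the very end of the output.
  const match = modelOutput.match(/Answer:\s*(.+)\s*$/);
  if (!match) return 0; // no parsable final answer means no reward
  return match[1].trim() === expected.trim() ? 1 : 0;
}

const goodOutput = "6 * 7 is 42.\nAnswer: 42";
const badOutput = "6 * 7 is 41.\nAnswer: 41";

console.log(verifiableReward(goodOutput, "42"), verifiableReward(badOutput, "42"));
```

Because the check is automatic, it can be run on a huge number of attempts, which is what makes this kind of reward practical to scale.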

LLMs turned out to be simultaneously smarter and dumber than expected. But they’re already incredibly useful—and the industry has barely tapped into their potential, maybe 10% at most.


Definitely worth reading in full 👉 karpathy.bearblog.dev/year-in-review-2025/
Today marks exactly one year since the term “vibe coding” entered our vocabulary.

On February 3, 2025, Andrej Karpathy introduced it to the world — and it stuck.

Today at work we will celebrate the anniversary with our favorite agent 😅
🔥8