Agent made me do this

Good morning, little goblins

Research story from OpenAI on why gpt loves goblins and other creatures 👹
Spoiler - it comes from SFT and reward signals for “nerdy” personality setting

https://openai.com/index/where-the-goblins-came-from/

OpenAI

Where the goblins came from

How goblin outputs spread in AI models: timeline, root cause, and fixes behind personality-driven quirks in GPT-5 behavior.

47 views06:08

Agent made me do this

Tons of new features in Codex app, and also this really nice dynamic UI

https://chatgpt.com/codex/for-work/

40 views06:12

Agent made me do this

49 views06:22

Agent made me do this

https://telegra.ph/Lessons-from-3-days-of-using-goal-on-OpenClaw-05-03

Telegraph

Lessons from 3 days of using /goal on OpenClaw

The core insight: /goal isn’t a “do my ticket” button — it’s a constraint workflow for keeping the project on course. Key takeaways: 1. Don’t fire and forget. Pasting the goal and walking away produces garbage. Cold-start runs hallucinate problems; warmed…

✍1

47 views18:49

Agent made me do this

https://telegra.ph/Lessons-from-3-days-of-using-goal-on-OpenClaw-05-03

This is the golden advice for any agentic workflow, not only for new “/goal”
I’ve come to something similar by trying and failing, but this is much more structured and useful than I could ever write

48 viewsedited 06:16

Agent made me do this

5.5 Instant - non-thinking version of 5.5 and a replacement for 5.3 Instant

Incremental bump in performance, model got bit smarter and also more concise - nice for a quick chatting

https://openai.com/index/gpt-5-5-instant/

OpenAI

GPT-5.5 Instant: smarter, clearer, and more personalized

GPT-5.5 Instant updates ChatGPT’s default model with smarter, more accurate answers, reduced hallucinations, and improved personalization controls.

61 views17:46

Agent made me do this

Anthropic partnered up with SpaceX to use all compute capacity at Colossus 1 data center, which results in doubled rate limits and removed peak hours limit

They’re talking about 300 megawatts and 220k GPUs btw

👍1

68 views18:45

Agent made me do this

Codex + iOS is next level

serve-sim - streams simulator to a local webpage, so you can open browser page in Codex app and let it do the work.

https://github.com/EvanBacon/serve-sim

76 views07:47

Agent made me do this

TIL both codex and claude read only a fraction of the file, hence hallucinations when you don’t expect it

https://x.com/badlogicgames/status/2052499245903593736

76 views08:11

Agent made me do this

New voice models from OpenAI

- gpt realtime 2 - first voice model with reasoning
- gpt realtime translate
- gpt realtime whisper

https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/

OpenAI

Advancing voice intelligence with new models in the API

Explore new realtime voice models in the OpenAI API that can reason, translate, and transcribe speech, enabling more natural and intelligent voice experiences.

✍1

93 views08:14

Agent made me do this

Well well well
Bun is fully rewritten from Zig to Rust
Passes all tests and closes around 200 issues
Looking forward to reading the details ✍🏻✍🏻✍🏻

https://github.com/oven-sh/bun/pull/30412

GitHub

Rewrite Bun in Rust by Jarred-Sumner · Pull Request #30412 · oven-sh/bun

Blog post with details coming soon.
It passes Bun's pre-existing test suite on all platforms (and fixes several memory leaks and flaky tests), the binary size shrinks by 3 MB - 8 MB, the be...

👍1

92 views12:32

Agent made me do this

Well well well Bun is fully rewritten from Zig to Rust Passes all tests and closes around 200 issues Looking forward to reading the details ✍🏻✍🏻✍🏻 https://github.com/oven-sh/bun/pull/30412

It took Jarred just 6 days btw
You know who you need to send this to 👀

90 views12:37

Agent made me do this

Finally!

🔥1

97 views20:27

Agent made me do this

Codex in the ChatGPT mobile app is a fully-featured mobile experience for getting work done with Codex. When you connect to any of your machines where Codex is running (whether that’s your laptop, a dedicated Mac mini, or a managed remote environment), the app loads the live state from that environment so you can work fluidly across active threads, approvals, plugins, and project context.

This is more than the ability to remotely control a single task or dispatch new tasks to your computer. From your phone, you can work across all of your threads, review outputs, approve commands, change models, or start something new. Your files, credentials, permissions, and local setup stay on the machine where Codex is operating, while updates flow back to your phone in real time, including screenshots, terminal output, diffs, test results, and approvals.

Under the hood, Codex uses a secure relay layer that keeps trusted machines reachable across devices without exposing them directly to the public internet. That relay also keeps active session state and context synced anywhere you’re signed in with ChatGPT.

https://openai.com/index/work-with-codex-from-anywhere/

OpenAI

Work with Codex from anywhere

Use Codex anywhere with the ChatGPT mobile app. Monitor, steer, and approve coding tasks in real time across devices and remote environments.

125 viewsedited 20:32

Agent made me do this

And cherry on top - computer use from mobile app 🤌

👍2

133 views20:35

Agent made me do this

TIL - cache prewarm is real
If you want to cut time-to-first-token when using Claude via API - send your system prompt before the user prompt with

max_tokens=0

Claude writes it to cache, but skips output generation
When user request lands, it’ll hit a warm cache

More on this - https://platform.claude.com/docs/en/build-with-claude/prompt-caching#pre-warming-the-cache

110 viewsedited 08:20

Agent made me do this

AI is technology, not a product

https://daringfireball.net/2026/05/ai_is_technology_not_a_product

Daring Fireball

AI Is Technology, Not a Product

It’s not even a feature. It’s just technology.

67 views07:38

Agent made me do this

Every two weeks I see a new tool, that claims x2/x5/x10/x100 over previous implementations
And still, from my personal experience, the least tools you expose to the model - the better it performs

Our engineering nature is to strive for better tools, but there’s a big roadblock - RL
My belief is that until we solve Continuous Learning in some form, models will continue to perform only on tools from their RL environments

Anyway, looks cool, check it out:
https://github.com/MinishLab/semble

GitHub

GitHub - MinishLab/semble: Fast and Accurate Code Search for Agents. Uses ~98% fewer tokens than grep+read

Fast and Accurate Code Search for Agents. Uses ~98% fewer tokens than grep+read - MinishLab/semble

👍2

71 views07:59

Agent made me do this

Composer 2.5 - new model from Cursor
Based on Kimi K2.5, same as Composer 2

Given that all the improvements from previous version came only from RL - very impressive

109 views16:55

Agent made me do this

Mostly agree.
Show this to me one year ago, and I’d never believe you 🥲

33 views08:49

About

Blog

Apps

Platform