Good morning, little goblins
Research story from OpenAI on why gpt loves goblins and other creatures 👹
Spoiler -it comes from SFT and reward signals for “nerdy” personality setting
https://openai.com/index/where-the-goblins-came-from/
Research story from OpenAI on why gpt loves goblins and other creatures 👹
Spoiler -
https://openai.com/index/where-the-goblins-came-from/
OpenAI
Where the goblins came from
How goblin outputs spread in AI models: timeline, root cause, and fixes behind personality-driven quirks in GPT-5 behavior.
Tons of new features in Codex app, and also this really nice dynamic UI
https://chatgpt.com/codex/for-work/
https://chatgpt.com/codex/for-work/
Agent made me do this
https://telegra.ph/Lessons-from-3-days-of-using-goal-on-OpenClaw-05-03
This is the golden advice for any agentic workflow, not only for new “/goal”
I’ve come to something similar by trying and failing, but this is much more structured and useful than I could ever write
I’ve come to something similar by trying and failing, but this is much more structured and useful than I could ever write
5.5 Instant - non-thinking version of 5.5 and a replacement for 5.3 Instant
Incremental bump in performance, model got bit smarter and also more concise - nice for a quick chatting
https://openai.com/index/gpt-5-5-instant/
Incremental bump in performance, model got bit smarter and also more concise - nice for a quick chatting
https://openai.com/index/gpt-5-5-instant/
OpenAI
GPT-5.5 Instant: smarter, clearer, and more personalized
GPT-5.5 Instant updates ChatGPT’s default model with smarter, more accurate answers, reduced hallucinations, and improved personalization controls.
Codex + iOS is next level
serve-sim - streams simulator to a local webpage, so you can open browser page in Codex app and let it do the work.
https://github.com/EvanBacon/serve-sim
serve-sim - streams simulator to a local webpage, so you can open browser page in Codex app and let it do the work.
https://github.com/EvanBacon/serve-sim
TIL both codex and claude read only a fraction of the file, hence hallucinations when you don’t expect it
https://x.com/badlogicgames/status/2052499245903593736
https://x.com/badlogicgames/status/2052499245903593736
New voice models from OpenAI
- gpt realtime 2 - first voice model with reasoning
- gpt realtime translate
- gpt realtime whisper
https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/
- gpt realtime 2 - first voice model with reasoning
- gpt realtime translate
- gpt realtime whisper
https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/
OpenAI
Advancing voice intelligence with new models in the API
Explore new realtime voice models in the OpenAI API that can reason, translate, and transcribe speech, enabling more natural and intelligent voice experiences.
✍1
Well well well
Bun is fully rewritten from Zig to Rust
Passes all tests and closes around 200 issues
Looking forward to reading the details ✍🏻✍🏻✍🏻
https://github.com/oven-sh/bun/pull/30412
Bun is fully rewritten from Zig to Rust
Passes all tests and closes around 200 issues
Looking forward to reading the details ✍🏻✍🏻✍🏻
https://github.com/oven-sh/bun/pull/30412
GitHub
Rewrite Bun in Rust by Jarred-Sumner · Pull Request #30412 · oven-sh/bun
Blog post with details coming soon.
It passes Bun's pre-existing test suite on all platforms (and fixes several memory leaks and flaky tests), the binary size shrinks by 3 MB - 8 MB, the be...
It passes Bun's pre-existing test suite on all platforms (and fixes several memory leaks and flaky tests), the binary size shrinks by 3 MB - 8 MB, the be...
👍1
Agent made me do this
Well well well Bun is fully rewritten from Zig to Rust Passes all tests and closes around 200 issues Looking forward to reading the details ✍🏻✍🏻✍🏻 https://github.com/oven-sh/bun/pull/30412
It took Jarred just 6 days btw
You know who you need to send this to 👀
You know who you need to send this to 👀
Codex in the ChatGPT mobile app is a fully-featured mobile experience for getting work done with Codex. When you connect to any of your machines where Codex is running (whether that’s your laptop, a dedicated Mac mini, or a managed remote environment), the app loads the live state from that environment so you can work fluidly across active threads, approvals, plugins, and project context.
This is more than the ability to remotely control a single task or dispatch new tasks to your computer. From your phone, you can work across all of your threads, review outputs, approve commands, change models, or start something new. Your files, credentials, permissions, and local setup stay on the machine where Codex is operating, while updates flow back to your phone in real time, including screenshots, terminal output, diffs, test results, and approvals.
Under the hood, Codex uses a secure relay layer that keeps trusted machines reachable across devices without exposing them directly to the public internet. That relay also keeps active session state and context synced anywhere you’re signed in with ChatGPT.
https://openai.com/index/work-with-codex-from-anywhere/
OpenAI
Work with Codex from anywhere
Use Codex anywhere with the ChatGPT mobile app. Monitor, steer, and approve coding tasks in real time across devices and remote environments.
TIL - cache prewarm is real
If you want to cut time-to-first-token when using Claude via API - send your system prompt before the user prompt with
Claude writes it to cache, but skips output generation
When user request lands, it’ll hit a warm cache
More on this - https://platform.claude.com/docs/en/build-with-claude/prompt-caching#pre-warming-the-cache
If you want to cut time-to-first-token when using Claude via API - send your system prompt before the user prompt with
max_tokens=0
Claude writes it to cache, but skips output generation
When user request lands, it’ll hit a warm cache
More on this - https://platform.claude.com/docs/en/build-with-claude/prompt-caching#pre-warming-the-cache
Every two weeks I see a new tool, that claims x2/x5/x10/x100 over previous implementations
And still, from my personal experience, the least tools you expose to the model - the better it performs
Our engineering nature is to strive for better tools, but there’s a big roadblock - RL
My belief is that until we solve Continuous Learning in some form, models will continue to perform only on tools from their RL environments
Anyway, looks cool, check it out:
https://github.com/MinishLab/semble
And still, from my personal experience, the least tools you expose to the model - the better it performs
Our engineering nature is to strive for better tools, but there’s a big roadblock - RL
My belief is that until we solve Continuous Learning in some form, models will continue to perform only on tools from their RL environments
Anyway, looks cool, check it out:
https://github.com/MinishLab/semble
GitHub
GitHub - MinishLab/semble: Fast and Accurate Code Search for Agents. Uses ~98% fewer tokens than grep+read
Fast and Accurate Code Search for Agents. Uses ~98% fewer tokens than grep+read - MinishLab/semble
👍2
Composer 2.5 - new model from Cursor
Based on Kimi K2.5, same as Composer 2
Given that all the improvements from previous version came only from RL - very impressive
Based on Kimi K2.5, same as Composer 2
Given that all the improvements from previous version came only from RL - very impressive