The day came when I crossed the line
Yesterday I paid for the Claude Max subscription at $100 - the one that supposedly gives you the equivalent of $1,354 in API usage. No idea how to verify that, but I'd been hitting the limits every day, and pretty fast at that.
The most noticeable shift happened after the period of increased limits a couple weeks ago, when Anthropic pulled off their little act of unprecedented generosity. During that window, working with agents felt exactly the way it should - I'd actually manage to make meaningful progress on my tasks before hitting the wall. Codex, for what it's worth, feels exactly like that all the time.
But once the inflated (ambitions) limits went back to normal, the amount of work Claude could get through in a single session became genuinely depressing.
The new 5x limits are looking good so far - we'll see how it holds up in practice.
One thing worth noting: after switching to Max, the Sonnet model gets tracked on a separate progress bar. Haven't quite figured out why they did it that way, but I'll keep watching.
<written by a human being>
I spent a whole week being dumb and launching the review agent manually!
And only today it hit me that Claude Code and Codex can do this autonomously. Even though I knew that running subagents (spinning up a new session that runs in the background of an active session) was possible - and I watch it happen all the time - it just hadn't clicked that I could configure their launch myself, exactly the way I need.
We're moving toward AI agent orchestration, which I'm diving deeper and deeper into, but for now let's stick to the "simple" practical stuff.
At first, I had written into my agent instructions that after creating a PR in the repository (a branch with code changes), the agent had to wait for a review of that branch to be completed - the code review process. And I was manually kicking off the review in a new session, passing it the right PR branch, then coming back to the coding agent with the results and demanding fixes.
But then I noticed that Superpowers (the Claude Code one) - in one of the sessions - had launched the review agent on its own and waited for its results! And that's when it hit me: I need to update the instructions to explicitly spell out that a subagent should be launched for the review, along with the rules for working with it.
Now the agent hands me back a fully completed task result, code review included.
Oh god, I love AI!
<written by a human being>
Today's topic is, of course, the new Opus 4.7.
Honestly, I haven't tried it yet - just got back from my Friday grocery run for the week and writing this post is literally the first thing I'm doing. But I'll definitely run a few tasks to test the fresh model today.
What I've been thinking about is how many more tokens it's going to devour. For some reason they show us these tables where one model version beats another by 0.2% on some synthetic benchmark. What does that even mean in practice?
Honestly, I couldn't care less about fractions of a percent on this or that test. Obviously the next model is supposed to be smarter than the previous one and handle tasks better - which in practice means fewer corrections on my end.
What actually interests me is how much faster it'll burn through my limits, because I'm paying for those out of my own pocket. I'd honestly be way happier seeing comparison tables on model costs rather than model smartness.
Have you already had a chance to test the new model?
<written by a human being>
Desktop App Architecture Starts with One Question
<written by a human being>
I can't believe what just happened...
My video I posted yesterday... was edited by AI!
Guys, it's so over and at the same time it's so cool!
First of all, there's a thing called programmatic video editing. I didn't know it existed! In simple terms, it's code that animates motion graphics (things that move in a video) the same way it's done on websites. Basically, we're using web technology to create video.
Where do you get these motion graphics, though? You still need to design them, right? Well, again, we can use modern programming languages to build visual things the way we do on the web.
And code also decides when to pop up an animation, when to switch the scene, when to show a caption.
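To make the "code decides when" part concrete, here's a minimal sketch in plain Python. It is not the tool I used and not any real library's API: the `Cue` class, the `TIMELINE` contents and the 30 fps frame rate are all made up for illustration. The idea is just that a video is a list of timed cues, and for any frame you can ask what should be on screen.

```python
from dataclasses import dataclass

FPS = 30  # assumed frame rate, purely illustrative


@dataclass
class Cue:
    start_s: float  # when the element appears, in seconds
    end_s: float    # when it disappears, in seconds
    kind: str       # "scene", "caption" or "animation"
    name: str


# Hypothetical timeline for a short clip
TIMELINE = [
    Cue(0.0, 4.0, "scene", "intro"),
    Cue(4.0, 10.0, "scene", "demo"),
    Cue(1.0, 3.0, "caption", "Programmatic video editing"),
    Cue(4.5, 6.0, "animation", "logo-pop"),
]


def active_cues(frame: int, timeline=TIMELINE, fps: int = FPS) -> list[str]:
    """Return the names of everything that should be visible at this frame."""
    t = frame / fps
    return [cue.name for cue in timeline if cue.start_s <= t < cue.end_s]


print(active_cues(45))   # frame 45 is 1.5 s into the clip
print(active_cues(140))  # frame 140 is about 4.67 s into the clip
```

A renderer would call something like `active_cues` for every frame and draw whatever comes back, which is why an AI agent that can write code can also "edit" a video.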
Don't get me wrong - it's still an enormous amount of work. But! We now have freaking AI agents that can build all this code for us!
And that's exactly what I did...
You can find my video here (just scroll my timeline) or go to my YouTube channel and watch the last short video. Captions, transitions, motion graphics and animations - made by AI.
I'm still in shock...
<written by a human being>
What AI does really well and fast - data processing.
The custom ERP system I built for a client on top of no-code solutions occasionally needs investigations because of data errors. Users entered something wrong, the backend processing didn't fire correctly, database indexes didn't update in time.
Other problems come from some very specific nuance that isn't sitting on the surface: you have to analyze the entire chain of data update operations in detail, track down the root cause, and then restore order.
Easy enough to do at "low RPM," but when there are several hundred tables and several hundred thousand records in them, the task turns into hours of digging the culprit out of the sand.
Turns out, if you feed Claude Opus (in many cases Sonnet is enough) or Codex the table exports, the data structure export - or better yet, give the agent an API with direct access to the data in those tables - solving these things becomes pleasant, fast, and happens in plain language rather than in SQL queries or pipelines inside a no-code tool.
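For a feel of what such an investigation boils down to, here's a rough sketch of one typical check an agent might write against table exports: finding records that point to a parent row that doesn't exist. The table names, columns and data below are invented stand-ins, not my client's system.

```python
import csv
import io

# Stand-ins for two table exports; real ones would be files exported from the no-code tool.
ORDERS_CSV = """id,customer_id,total
1,10,250
2,11,90
3,99,40
"""
CUSTOMERS_CSV = """id,name
10,Acme
11,Globex
"""


def load(text: str) -> list[dict]:
    """Parse a CSV export into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def orphaned_rows(children, parents, fk, pk="id"):
    """Rows in `children` whose foreign key matches no row in `parents`."""
    known = {row[pk] for row in parents}
    return [row for row in children if row[fk] not in known]


orphans = orphaned_rows(load(ORDERS_CSV), load(CUSTOMERS_CSV), fk="customer_id")
print(orphans)  # the order that references a customer missing from the export
```

The point isn't this particular check. It's that you describe the symptom in plain language and the agent writes and runs dozens of checks like this across all the tables.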
Of course, don't forget to check the result. But after a couple rounds of corrections and saving things to memory and agent instructions, you've got a real personal data analyst in your hands that knocks these out in no time.
Recommend rolling this out in any business that has at least one table with data.
<written by a human being>
AI agents, just like people, love structure. If everything is clear and systematic, the agents will work the same way.
For a while I was tracking project progress in markdown files of the same name, which is generally pretty convenient - drop a link to the file in the agent instructions and every session the agent appends its log there and always knows what's been done.
But in complex projects - like when you're building information systems - those files balloon fast and start eating a ton of tokens right at the start of a session just to get the agent up to speed on the context of what needs doing.
Or you can hook up a human-grade task manager via MCP and enjoy life. I really took a liking to Linear for this - AI agents work with it super smoothly.
At the planning stage, ask the agent to create all the tasks there, link them with dependencies (some can't be picked up until blockers are closed), milestones, descriptions, labels, and all the other good stuff that helps you actually navigate the project.
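Why dependencies matter to an agent can be shown in a few lines. This is a sketch with made-up task IDs and a plain dictionary. It does not use Linear's API or its data model, it only illustrates the "can't be picked up until blockers are closed" rule.

```python
# Hypothetical tasks: id -> (status, ids of the tasks that block it)
TASKS = {
    "T-1": ("done", []),
    "T-2": ("todo", ["T-1"]),
    "T-3": ("todo", ["T-2"]),
    "T-4": ("todo", []),
}


def ready_tasks(tasks: dict) -> list[str]:
    """Open tasks whose blockers are all done, i.e. what an agent may pick up next."""
    done = {task_id for task_id, (status, _) in tasks.items() if status == "done"}
    return sorted(
        task_id
        for task_id, (status, blockers) in tasks.items()
        if status != "done" and all(blocker in done for blocker in blockers)
    )


print(ready_tasks(TASKS))
```

With the links in place, "take the next task" stops being a judgment call: T-3 simply isn't on the list until T-2 is closed.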
PRs from the repo plug in natively too, which is great - you can clearly trace exactly what code changes were made for a given task.
And also ask the agent to write the task result in the comments, so that history sticks around for future generations.
Your "Desktop" Apps Are Just a Browser in Disguise
<written by a human being>
Ever noticed how at some point AI starts approaching the intelligence level of a loaf of bread?
This happens because of context window overflow. If you work with agents, you know exactly what I'm talking about. If not - imagine your head at the end of a very intense, information-packed workday. So much information that producing anything with your cognitive machinery becomes really hard.
AI models work in a similar way. And roughly like you need a good night's sleep to start fresh in the morning, AI needs its context window cleared.
Recent versions already have a built-in auto-compacting feature for the context window, but by the time it kicks in, it's usually already too late - the quality of the model's output has degraded badly by that point, which means you'll be burning more tokens to get things done.
Better to do it way ahead of time and not let the context run up to the limit. But how do you know how much context is already used? Claude Code has an option to configure a status bar with a visual progress bar specifically for this. The setup command:
/statusline show model name and context percentage with a progress bar
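If you're curious what sits behind a bar like that, it's just arithmetic. Here's a minimal sketch: the function name, the bar style and the 200,000-token window are my assumptions for illustration, not how Claude Code actually implements its status line.

```python
def context_bar(used_tokens: int, window_tokens: int, width: int = 20) -> str:
    """Render context usage as a text progress bar with a percentage."""
    pct = min(100, round(100 * used_tokens / window_tokens))
    filled = round(width * pct / 100)
    return f"[{'#' * filled}{'-' * (width - filled)}] {pct}%"


# Assumed numbers: 50k tokens used out of a 200k-token window
print(context_bar(50_000, 200_000))
```

Once the percentage is always in front of you, it's easy to pick your own threshold and reset the session well before auto-compacting kicks in.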
<written by a human being>
I started working on creating an educational series in micro-learning format - using AI tools, obviously. And already from the start I can say: prep work, or pre-production, is the key moment.
It all starts with an idea and a script - and here, yeah, AI-slop won't cut it. The script was written by a professional screenwriter, and you can clearly feel that when you read it.
But after that comes this massive layer of work that someone unprepared like me would find pretty damn hard to get through without the right experience. And that's where I called in Claude - to prepare absolutely everything needed and be my assistant while directing the videos.
The AI agent created the file structure, helps with artifact organization - for example, sorts the photo references and character voice recordings I upload into the right folders, keeps a registry of all this stuff, chops the script into chunks to simplify generation, writes prompts for generators, tells me what to do next.
We'll see the result, of course, but what would I do without it...
<written by a human being>
I never get tired of saying this, but the beauty of working with AI is that you can automate even the things you wouldn't have dared to automate not that long ago - because they just weren't worth the time or didn't bring enough value to justify the resources spent on them.
I wrote recently about how important it is to refresh the context window, to not let it overflow when working with AI agents. And to do that, we need to brief the next agent on what's been happening in the current session.
Yeah, we have agent instructions - but that's a huge document, and it tends to grow even bigger during a project. Feeding it into every new session would be pointless, because you don't always need it. Sometimes you just need to carry the conversation chain forward from where you left off and save tokens doing it.
So yesterday I decided to shorten my path to getting a handoff prompt for the next session - so I'm not writing it manually every time, not explaining what I want it to do.
I created a handoff-prompt skill that I can call with a simple slash command in the terminal. Before, whenever I wanted to reset the context, I had to type out a full request at the end of the session asking the agent to write a prompt for the next agent. Now I just press literally four keys: slash, the first letter of the command (h), Tab, and Enter.
Attaching the skill if anyone needs it.
https://github.com/sidorovanthon/handoff-prompt.git
<written by a human being>
Well, if you think that AI nowadays can make a video for you, you are wrong.
It took me almost a week of non-stop work, with several sleepless nights, to get somewhat final renders of 3 vertical videos with a total length of 3.5 minutes.
Yes, AI generates shots. But what shots do you need to generate? How should they look? What scenes do you want to create? Characters, locations, script as the essence of storytelling. What about sound design? Montage?
Still a ton of work on the human side. And we are still far away from the point when everything is done by AI. More room for artistic people though. And if you have good taste, you will thrive in this day and age.
Desktop App Security: What Web Devs Don't Know
<written by a human being>
Something strange happened - and completely unexpected for me.
I got offered a sponsored video deal. And yeah, of course, the first thing I did was everything I could to make sure it wasn't a fake or another scam scheme. I've been used to those for a while now: I get at least one or two every single day, and they usually get marked as spam with the sender reported within a second.
But this one had zero phishing links - even in the second email! Nothing that would require any action on my part that could somehow compromise my info or anything else. The sender's address was on the domain of the actual service they're asking me to promote. And the service itself is real, even though I'd never heard of it before.
In the end I threw the whole thing into ChatGPT to do some research, and it confirmed the request looks legit.
And now, attention - the question. Should I start selling my ass already? Or is that the wrong path? What would you do?
<written by a human being>
Video stuff is getting serious.
I found a tool that can edit video! And I'm not just talking about adding animations and b-roll, which I'm already doing with Remotion - wrote about it recently, and you can see the results in my latest videos.
But it can also do what I'm still doing by hand right now - cut out silence and bad takes!
The thing works alongside ElevenLabs, which handles audio transcription - and that's the key piece when it comes to video work. But you can also hook it up to the free Whisper model, which I'm already using for exactly that purpose.
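Why transcription is the key piece: once you have a timestamp for every word, cutting silence becomes a small data problem. Here's a rough sketch of that idea. It is not the tool's actual code, and the word timings, the 0.6 s pause threshold and the 0.1 s padding are numbers I made up.

```python
# Hypothetical word-level timestamps in seconds, the kind a transcription model returns.
WORDS = [
    ("Hello", 0.0, 0.4),
    ("everyone", 0.5, 1.0),
    ("today", 3.2, 3.6),  # 2.2 s of silence before this word
    ("we", 3.7, 3.8),
    ("edit", 3.9, 4.3),
]


def keep_segments(words, max_gap: float = 0.6, pad: float = 0.1):
    """Merge words into (start, end) segments, splitting where a pause exceeds max_gap."""
    segments = []
    for _, start, end in words:
        if segments and start - segments[-1][1] <= max_gap:
            segments[-1][1] = end  # pause is short: extend the current segment
        else:
            segments.append([start, end])  # long pause: start a new segment
    # Add a little padding so words are not clipped, never going below zero.
    return [(max(0.0, round(s - pad, 2)), round(e + pad, 2)) for s, e in segments]


for start, end in keep_segments(WORDS):
    print(f"keep {start:.2f}s to {end:.2f}s")
```

Everything outside the returned segments is silence to cut. Spotting bad takes is harder, because that needs the text of the transcript and not just the timings.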
I haven't tried it in action yet, but tomorrow I'm planning to do it live - I'll be streaming the whole process from setup to final edit on my YouTube, Twitch, and Kick channels. If you're curious, come hang out at the stream, roughly 09:00 Bangkok / 02:00 UTC / 22:00 ET (prev day) / 23:00 CET.
<written by a human being>
Yesterday during a long vibe-coding session for a home accounting system, Claude and I evolutionarily arrived at a new structure for managing project context.
I've already gotten used to vibe-coding with Superpowers, which helps you first thoroughly break down the task bone by bone, plan its execution, and only after that fire off agents to actually write the code.
This works beautifully in isolation, when you need to one-shot an implementation. But since my project is very long and complex, with many sessions, I keep tasks in Linear. And I noticed that every time, I ended up asking Superpowers NOT to write the spec plan as a separate document, but to immediately create tasks and detail the plan in their descriptions.
It turns out to be super convenient, because in a new session the agent reads the instructions, understands that it needs to go to Linear and pick up the next task - and the description is already there, ready. And you don't have to drag some rudimentary document from session to session; everything is clearly captured in the task manager.
Tomorrow I'll write about how I plan to go even further and eliminate another batch of project documents.
<written by a human being>
Yesterday's stream didn't go as planned. Usually in a couple of hours I fully finish editing 1 video and start working on a second. With Remotion, which I've been using for the last few videos, I cut that time by about a factor of three: in one stream I'd manage to get three videos ready for final editing with animation, and the AI agent handled the rest.
Yesterday I was planning to set up a new editing system, but the task wasn't just to replace Remotion - I want to fully automate the whole video editing process from start to finish.
I feed in raw talking-head footage - and get back a publish-ready video with clean audio, color correction, properly cut (meaning no pauses, silence, or bad takes), ready motion-graphics animations edited to context, plus captions and background music.
Two hours of streaming wasn't enough to set up this beast. And honestly a whole day wasn't enough either, so I'm continuing today, and yesterday's video never went up.
Although the first part is already working - and yeah, it genuinely fully automatically cleans audio, cuts the video, transcribes, does color grading and renders a clip ready for final editing. Pure magic.
<written by a human being>
The new ChatGPT 5.5 is topping all the charts in image generation, and rightfully so.
But what about practical tasks? Since I'm constantly using AI for project management these days, I decided to test the new model on a fairly trivial (by today's standards) task - parse two hour-long client meeting transcripts, review the website and a previous project plan, and based on all that input, put together a launch plan for a new project, essentially a todo list with tasks.
I usually give this kind of work to Claude and he handles it effortlessly, delivering a nearly perfect result I can use right away.
But Codex didn't exactly shine here. It invented extra tasks, produced some strange hallucinations that clearly fell outside the project scope and anything that had actually been discussed on the calls.
It took me 5-6 rounds of corrections and explanations of exactly what it got wrong. I even had to ask it to spin up an independent agent to check alignment with the task and source materials - which basically just repeated the original task all over again.
In the end, I still never got a working version of the project plan. Meanwhile, it handles coding tasks brilliantly. But when it comes to cognitive tasks, Claude's models still have the edge for now.
<written by a human being>
In today's time crunch, I hate stepping away from the computer - what if Claude finishes its work and needs my input!
It's genuinely annoying. You walk away hoping to come back and review the results, only to find that 30 seconds after launching, the agent hit a blocker and needs your call.
But there's a fix! Notifications, configurable via hooks! I asked Claude to set up hook-based notifications that send a signal to my system - for when I'm working away from the AI agent console - and to my ntfy mobile app, for when I've stepped away from the computer entirely.
ntfy, by the way, is completely free and dead simple - you just subscribe to the notifications you want and they work. Exactly how I like things.
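For the curious, sending an ntfy notification is a single HTTP POST to a topic URL, with the message as the request body. Here's a minimal sketch of a script a hook could call. The topic name and the texts are made up, the function name is mine, and how exactly the hook is wired up in Claude Code isn't shown here.

```python
import urllib.request


def build_notification(topic: str, message: str,
                       title: str = "Claude needs you") -> urllib.request.Request:
    """Prepare a POST to the public ntfy.sh server; the message goes in the request body."""
    return urllib.request.Request(
        f"https://ntfy.sh/{topic}",
        data=message.encode("utf-8"),
        headers={"Title": title},  # shown as the notification title in the app
        method="POST",
    )


# Hypothetical topic name: anyone who knows it can subscribe, so pick something hard to guess.
req = build_notification("my-hypothetical-topic", "Agent is blocked and waiting for input")
print(req.get_method(), req.full_url)

# Uncomment to actually send it:
# urllib.request.urlopen(req, timeout=10)
```

Subscribe to the same topic in the mobile app and the message shows up on your phone.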
So now, whenever Claude needs my attention, I always know about it.
Also - besides banking alerts, these are the only notifications enabled on my phone.