Software Engineer Updates
https://www.nature.com/articles/s41467-025-63804-5

This research paper proposes a new way to help AI language models (like ChatGPT) get better at planning and multi-step reasoning by taking inspiration from how the human brain works.

### The Problem

Large language models struggle with tasks that require planning ahead or working through problems step-by-step, even though they’re good at many other things. For example, they might hallucinate (make up invalid steps), get stuck in loops, or fail at classic puzzles like the Tower of Hanoi.

### The Brain-Inspired Solution

The researchers noticed that different parts of the human brain’s prefrontal cortex handle different planning tasks - like monitoring for errors, predicting what happens next, evaluating options, breaking big tasks into smaller ones, and coordinating everything.

They created MAP (Modular Agentic Planner) - a system where multiple AI modules work together, each with a specialized job:

- Task Decomposer: Breaks big goals into smaller subgoals (like planning a route)
- Actor: Proposes actions to take
- Monitor: Checks if actions are valid (catches mistakes)
- Predictor: Predicts what will happen if you take an action
- Evaluator: Judges how good a predicted outcome is
- Orchestrator: Tracks when goals are achieved

### How It Works

Think of it like a team where each member has expertise. Instead of one AI trying to do everything at once, these specialized modules talk to each other - the Actor proposes moves, the Monitor catches bad ones, the Predictor thinks ahead, and so on.
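The division of labor above can be sketched as a minimal planning loop. This is an illustrative toy, not the paper's implementation: the module names mirror MAP, but the task (count up to a goal number), the candidate actions, and the scoring heuristic are all invented for the example, and the Task Decomposer is omitted for brevity.

```python
# Toy MAP-style loop: each function plays the role of one specialized module.
GOAL = 10

def actor(state):
    """Propose candidate actions (here: add 1, 3, or 7)."""
    return [1, 3, 7]

def monitor(state, action):
    """Reject invalid actions (here: anything that overshoots the goal)."""
    return state + action <= GOAL

def predictor(state, action):
    """Predict the next state if the action is taken."""
    return state + action

def evaluator(state):
    """Score a predicted state: closer to the goal is better."""
    return -(GOAL - state)

def orchestrator(start):
    """Coordinate the modules until the goal is reached."""
    state, trace = start, [start]
    while state != GOAL:
        valid = [a for a in actor(state) if monitor(state, a)]
        best = max(valid, key=lambda a: evaluator(predictor(state, a)))
        state = predictor(state, best)
        trace.append(state)
    return trace

print(orchestrator(0))  # → [0, 7, 10]
```

In MAP each of these roles is played by a separately prompted language model rather than a hand-written function, but the control flow is the same idea: propose, filter, simulate, score, commit.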

### The Results

MAP significantly outperformed standard AI methods on challenging tasks like Tower of Hanoi, graph navigation puzzles, logistics planning, and multi-step reasoning questions. For example, on Tower of Hanoi problems, regular GPT-4 only solved 11% correctly, while MAP solved 74%.

### Bottom Line

By organizing AI more like how the brain organizes planning - with specialized modules working together rather than one system doing everything - the researchers dramatically improved AI’s ability to plan and reason through complex problems.
You can now run Claude Code directly in Zed and use it side-by-side with Zed's first-party agent, Gemini CLI, and any other ACP-compatible agent. Make sure you’re on the latest version of Zed and find your available agents in the Plus menu in the Agent Panel.

https://zed.dev/blog/claude-code-via-acp
https://techcrunch.com/2025/10/06/a-19-year-old-nabs-backing-from-google-execs-for-his-ai-memory-startup-supermemory/

## Simpler Explanation

This is about a 19-year-old entrepreneur who just raised $2.6 million for an AI memory startup.

### The Founder’s Story

Dhravya Shah, originally from Mumbai, started building bots and apps as a teenager and even sold one to a social media company called Hypefury. With that money, he moved to the U.S. to attend Arizona State University, where he challenged himself to build something new every week for 40 weeks.

### The Problem He’s Solving

While AI models have gotten better at “remembering” information within a single conversation, they still struggle to maintain context across multiple sessions over time. Think of it like this: if you chat with an AI today and then come back tomorrow, it usually forgets everything from yesterday.

### What Supermemory Does

Supermemory is a “universal memory API for AI apps” that extracts insights from unstructured data and helps applications understand context better. It’s basically a memory layer that other AI apps can plug into.

Practical examples:

- A writing app could search through month-old entries
- An email app could find relevant past conversations
- A video editor could fetch assets from a library based on prompts
- It can ingest files, docs, chats, emails, PDFs, and connect to services like Google Drive or Notion
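To make the "memory layer other apps plug into" idea concrete, here is a hypothetical sketch of such an API. None of this reflects Supermemory's actual interface; the class, method names, and keyword-overlap retrieval are invented for illustration (a real service would use embeddings and semantic ranking).

```python
# Hypothetical memory-layer sketch: ingest unstructured text, recall by query.
class MemoryLayer:
    def __init__(self):
        self.entries = []

    def ingest(self, text, source):
        """Store a document along with its origin (file, email, chat, ...)."""
        self.entries.append({"text": text, "source": source})

    def recall(self, query, limit=3):
        """Naive keyword-overlap retrieval, standing in for semantic search."""
        words = set(query.lower().split())
        scored = [(len(words & set(e["text"].lower().split())), e)
                  for e in self.entries]
        return [e for score, e in sorted(scored, key=lambda p: -p[0])
                if score > 0][:limit]

mem = MemoryLayer()
mem.ingest("Draft blog post about AI planning", source="notion")
mem.ingest("Invoice for October cloud hosting", source="email")
print(mem.recall("AI blog draft"))  # returns the Notion entry
```

The value proposition is that every AI app needs something like this, and building it well (ingestion connectors, ranking, latency) is hard enough to outsource.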

### The Backing

Shah raised $2.6 million from major investors including Google’s AI chief Jeff Dean, Cloudflare executives, and investors from OpenAI, Meta, and Google. Pretty impressive for a 19-year-old!

### Why Investors Are Excited

One investor said he was impressed by how quickly Shah moves and builds things. The startup already has multiple customers including AI assistants, video editors, and search tools.

### The Competition

Supermemory faces competition from other memory-focused startups, but Shah claims his advantage is lower latency - meaning it retrieves relevant information faster.

Bottom line:
A teen founder turned a weekly coding challenge into a funded startup that helps AI apps remember things better - and convinced some of the biggest names in tech to back him.
Don’t just add AI agents to your current setup and expect magic. Instead, you need to rethink your entire approach to how work gets done, focusing on what you’re trying to accomplish rather than just automating individual tasks. Companies that fail to do this reorganization will struggle, just like companies in the past that failed to adapt to previous technological changes.

Three Keys to Success

1. Focus on Outcomes, Not Tasks

- Assign a “mission owner” responsible for complete customer experiences (not just one department’s piece)
- Example: One company measures success by “time until new hires write their first code” - not just “did we send them a laptop?”

2. Connect Your Data (But Don’t Obsess Over Perfection)

- Good news: You don’t need to merge all your systems into one perfect database
- AI agents can work across messy, separated systems - they just need clear instructions about where to find things and what decisions to make

3. Prepare Your People

- Train everyone to understand these “digital teammates”
- Address fears about job loss (emphasize AI handles boring tasks, humans do meaningful work)
- Set up proper oversight and safety rules

https://hbr.org/2025/10/designing-a-successful-agentic-ai-system
https://aws.amazon.com/message/101925/

The Big Picture:
On October 19-20, 2025, AWS had a major outage in its Northern Virginia (US-East-1) region that lasted about 14 hours and affected many services. It started with a DNS problem affecting their DynamoDB database service and cascaded into multiple other failures.

What Went Wrong (in order):

1. DynamoDB Database Crash (11:48 PM - 2:40 AM)

- DynamoDB is AWS's core key-value database service, which many other AWS services depend on to store their own data
- The automated system that manages DynamoDB's DNS records (the internet's "phonebook" entries that tell clients where to connect) had a software bug - think of it like two workers cleaning up old records who accidentally delete the address of a building that's still in use
- Without this address, nobody could connect to DynamoDB, including other AWS services that depend on it
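The "two workers" failure mode can be sketched as a race on stale state. This is a deliberately simplified illustration of the class of bug described, not AWS's actual DNS management code; the table layout and plan names are invented.

```python
# Sketch: a cleanup worker holding a stale view of which DNS "plan" is
# active deletes the record that is still live.
dns_table = {"dynamodb.us-east-1": "plan-42"}  # the live record

def cleanup(worker_view):
    """Delete records the worker believes belong to obsolete plans."""
    for name, plan in list(dns_table.items()):
        if plan != worker_view["active"]:
            del dns_table[name]

# Worker A has already applied plan-42, but Worker B still holds an old
# snapshot in which plan-41 is current -- so it deletes the live record.
stale_snapshot = {"active": "plan-41"}
cleanup(stale_snapshot)
print(dns_table)  # → {} : the endpoint's address is gone
```

The standard defenses are the ones AWS describes adopting: version checks before deletion, and refusing to ever leave a record empty.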

2. EC2 Server Problems (11:48 PM - 1:50 PM next day)

- EC2 manages the virtual computers that customers rent
- These computers have “managers” (DWFM) that check in with them regularly using DynamoDB
- When DynamoDB went down, the managers lost track of their computers
- When DynamoDB came back, there was such a backlog that the system got overwhelmed and couldn’t launch new computers
- It was like a traffic jam that kept getting worse
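The "traffic jam that kept getting worse" is a classic congestive-collapse pattern: after the outage, every host tries to re-establish its lease at once, and if work arrives faster than it can be processed, the backlog never drains. A minimal sketch (the numbers are illustrative, not from the postmortem):

```python
# Queue that receives `arrivals` units of work per tick and can process
# `capacity` units per tick; returns the backlog after `ticks` steps.
def drain(backlog, arrivals_per_tick, capacity_per_tick, ticks):
    for _ in range(ticks):
        backlog += arrivals_per_tick
        backlog = max(0, backlog - capacity_per_tick)
    return backlog

# Unthrottled recovery: arrivals exceed capacity, so the backlog grows.
print(drain(1000, arrivals_per_tick=150, capacity_per_tick=100, ticks=20))  # → 2000

# Throttled intake: arrivals below capacity, so the backlog drains to zero.
print(drain(1000, arrivals_per_tick=50, capacity_per_tick=100, ticks=20))   # → 0
```

This is why the recovery required throttling: reducing the arrival rate below processing capacity is the only way such a queue ever empties.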

3. Network Load Balancer Issues (5:30 AM - 2:09 PM)

- These distribute internet traffic across servers
- New servers were being added before their network connections were fully set up
- Health checks kept failing, causing servers to be removed and re-added repeatedly
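That remove-and-re-add cycle is known as health-check "flapping." One common mitigation (illustrative here, not necessarily what AWS changed) is to require several consecutive passing checks before admitting a target, so a server that is still warming up never enters rotation:

```python
# Admit a target only after N consecutive passing health checks.
def admit(check_results, required_passes=3):
    streak = 0
    for ok in check_results:
        streak = streak + 1 if ok else 0
        if streak >= required_passes:
            return True
    return False

print(admit([True, False, True, True, False]))  # → False (flapping target)
print(admit([False, True, True, True]))         # → True  (stable after warm-up)
```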

The Domino Effect:
Many other AWS services broke because they depend on DynamoDB and EC2, including Lambda, EKS, ECS, Connect, and Redshift. Even AWS’s login system had problems.

The Fix:
Engineers had to manually fix the DNS issue, restart systems, throttle traffic, and gradually bring everything back online over 14+ hours.

What AWS is doing to prevent this:

- Fixing the software bug that caused the DNS deletion
- Adding safety checks to prevent similar cascading failures
- Improving their testing and recovery procedures