Coding Agents and Software Development: Understand Before You Act


It may seem hard to believe, but ChatGPT, whose launch sparked what we now call the AI boom, was first released less than four years ago. Since then, most of us have seen the way we work change dramatically, across a wide variety of jobs.

Software development was one of the most affected fields, and the reactions were understandably mixed. The first wave of concern was existential: would developers be made redundant? I do not believe this is the case. In past technological revolutions, humans were not made redundant; they adapted, and developers will adapt too. But in every adaptation process some old ways of doing things disappear and new ways emerge. This process of change can be painful, because it comes with a form of resistance, an inertia around changing the way you work and the way you think.

To overcome that resistance, I believe it’s important to have an abstract idea of how an agent actually works. It helps you understand what an agent is and isn’t, where it shines and where it doesn’t, so that you can take full advantage of this amazing technology. In this post, I want to share my mental picture of agents and the practical habits that follow from it. To be clear, this is what worked for me. It may not be the only way or the right way to think about it, but having a clear abstraction in mind definitely made my work easier.

The mental picture draws on ideas popularized in 2025 by Andrej Karpathy.


A Mental Model for Agentic Systems

We are probably all familiar with the main hardware components of a computer. At the core is the processing unit, i.e. the CPU. The CPU is where computation actually happens, where bits of information are processed. All the other components are built around it: the RAM, the hard disk, the keyboard, the display. Together they make up the hardware, the substrate where computation lives. But hardware alone does not specify what to compute; for that you need software.

On the software side, the Operating System (OS) orchestrates the flow of bits from external sources (the hard disk, keyboard inputs, and so on) through the CPU, and back to the display, or any other output. But it still doesn’t specify how this information is processed. This last step — the critical step for the resolution of the specific problem — is specified in a program: a set of instructions that defines how bits of information must be processed for a specific task.

At this point, note that if we stop thinking about “bits” of information and start thinking about “tokens” of information, i.e. written text, an agentic system turns out to operate under a very similar conceptual scheme. The table below summarizes the analogy.

Layer              Coding agent equivalent
CPU                The LLM: natural language processor
(no equivalent)    The human: co-processor alongside the LLM
RAM                Context window
OS                 The agent’s harness
Program            The initial task (e.g. first user prompt)
Subprogram         A skill
Runtime            The session (input context)

The OS is analogous to the agent’s harness. You can think of the harness as the TypeScript or Python code that manages the sessions and orchestrates tool execution. To go further, you can think of the model’s context window as the RAM. Operations like context compression, which are part of the agent’s harness, are analogous to an OS freeing up RAM when usage approaches maximum capacity. But it is in the program, subprogram, runtime and CPU parallels that the analogy gets more interesting.
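To make the RAM parallel concrete, here is a minimal Python sketch of context compression. Everything in it is an assumption made for illustration: the `ContextWindow` class, the word-count "tokenizer", the 80% threshold and the placeholder summary are hypothetical and are not taken from any real harness. A real harness would count tokens with the model's tokenizer and would typically ask the LLM itself to write the summary.

```python
from dataclasses import dataclass, field


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


@dataclass
class ContextWindow:
    """Hypothetical model of the context window as a bounded buffer (the "RAM")."""
    max_tokens: int
    messages: list[str] = field(default_factory=list)

    def used(self) -> int:
        return sum(count_tokens(m) for m in self.messages)

    def append(self, message: str, threshold: float = 0.8) -> None:
        self.messages.append(message)
        # Like an OS reclaiming memory, compact once usage crosses the threshold.
        if self.used() > threshold * self.max_tokens:
            self.compact()

    def compact(self, keep_last: int = 2) -> None:
        # Assumed strategy: collapse older messages into one short placeholder.
        old, recent = self.messages[:-keep_last], self.messages[-keep_last:]
        if old:
            summary = f"[summary of {len(old)} earlier messages]"
            self.messages = [summary] + recent


window = ContextWindow(max_tokens=30)
for turn in [
    "read the failing test and report the error message",
    "open parser.py and inspect the tokenize function for boundary handling",
    "the bug is an off by one error",
    "apply the fix and rerun tests",
]:
    window.append(turn)

print(window.messages)
print(window.used(), "of", window.max_tokens, "tokens used")
```

The most recent turns survive verbatim while older ones are collapsed, which is the trade-off compression makes: detail is lost in exchange for room to keep working.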


The Program, the Runtime, and the Natural-Language CPU

In a traditional computer, the program is a fixed set of instructions written before execution. These instructions might also imply that other subprograms are called during the execution in order to perform specific operations. The runtime is the live state that accumulates while the program runs. 

In an agentic system, the program is the initial task — the first user message, or the task assigned by a higher-level agent to a subagent. A subprogram maps to a skill: a pre-written set of instructions that can be invoked during the execution of the main program for a specific operation. The runtime is the session — the input context that accumulates in the context window as the program (the task) is executed. 

But what about the CPU? In a traditional computer, the CPU executes binary instructions deterministically. The LLM occupies the CPU position, but it is not a CPU: it is a natural language processor. This difference is important. The CPU speaks binary, while the LLM processes natural language, the same medium the human uses. This makes the boundary between human and machine far more permeable. The “brain” of the machine now speaks human, so the human can contribute to the runtime directly, without a formal specification standing between them and the execution.

Moreover, the shared language implies that the program can be steered during execution. Each message shapes the context, and therefore the output, for every subsequent turn. The human is no longer outside the execution — they are a co-processor of the runtime, alongside the LLM.
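A toy Python sketch can show the mapping of program, subprogram and runtime, and what steering means in practice. All names here (`Session`, `invoke_skill`, `steer`, `fake_llm`) are hypothetical, and `fake_llm` is a deliberately trivial stand-in for a model: the only property it shares with a real LLM is that its output depends on the whole accumulated context.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """The runtime: everything said so far, starting from the initial task."""
    task: str                                              # the "program"
    skills: dict[str, str] = field(default_factory=dict)   # named "subprograms"
    context: list[tuple[str, str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.context.append(("user", self.task))

    def invoke_skill(self, name: str) -> None:
        # Invoking a skill loads its pre-written instructions into the context.
        self.context.append(("skill", self.skills[name]))

    def steer(self, message: str) -> None:
        # A human message mid-execution becomes part of the runtime state.
        self.context.append(("user", message))


def fake_llm(context: list[tuple[str, str]]) -> str:
    # Stand-in for the model: the answer is a function of the full context.
    text = " ".join(content for _, content in context).lower()
    return "plan: use sqlite" if "sqlite" in text else "plan: use postgres"


session = Session(
    task="Add persistence to the todo app",
    skills={"review": "Check each diff against the existing tests"},
)
session.invoke_skill("review")
before = fake_llm(session.context)

session.steer("Keep it dependency-free, SQLite is enough")
after = fake_llm(session.context)

print(before, "->", after)
```

The task never changed, yet the output did, because the human added to the runtime while the program was executing.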

From this mental picture, I drew three practical consequences.


Three Habits, One Principle

1. Start by talking, not doing

Working with a coding agent is not configuring a machine. It is beginning a collaborative session with a peer processor. The opening turns establish the context that shapes the probability distribution for everything that follows: how the problem is framed, what constraints are clarified, what assumptions are made explicit. 

As a rule of thumb, the first move in a session should be a question, not an instruction. A few minutes of exploration before the first command often produces better outcomes. In this regard, you might want to check out grill-with-docs, a Claude Code skill built by Matt Pocock, which I use a lot before I start writing code.

2. Beware of the tangent

Exploration is natural and often productive, but it becomes harmful when it loses focus on the main task. Asking a related question or considering a different approach can be useful, but sometimes the conversation goes off on a tangent, which in our analogy means that the system starts running a different, unrelated main program in the same runtime. A tangent in a session does not disappear. Even if you return to the main task, its content stays in the input context, consuming tokens and confusing the agent.

The fix is a fresh session. When you recognise the tangent, you can run something like handoff, another skill by Matt Pocock. It compresses the current session into a focused summary that a new session can pick up from immediately. In our analogy, it’s a self-contained subprogram that creates the main program for a new runtime, where the idea from the tangent can be developed, while the original runtime stays focused on the original program.
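The idea of a handoff can be sketched in a few lines of Python. This is not how the handoff skill is implemented; the `handoff` function, its parameters and the prompt wording are all hypothetical, and the "compression" below simply keeps the first sentence of each message, where a real skill would rely on the LLM to write the summary.

```python
def handoff(tangent_messages: list[str], topic: str) -> str:
    """Build the opening prompt (the new "program") for a fresh session."""
    # Placeholder compression: keep only the first sentence of each message.
    notes = [m.split(". ")[0].rstrip(".") for m in tangent_messages]
    lines = [f"Topic: {topic}", "What we know so far:"]
    lines += [f"- {note}" for note in notes]
    lines.append("Pick up from here; no other context is needed.")
    return "\n".join(lines)


seed = handoff(
    [
        "We could cache the parsed AST. That would skip re-parsing on each run.",
        "Cache invalidation should key on file hash.",
    ],
    topic="AST caching",
)
print(seed)
```

The returned text is meant to be pasted as the first message of the new session, so it has to be understandable without access to the old one.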

3. You can outsource your thinking, but not your understanding

The specs-to-code paradigm (write a full specification, hand it over, let the agent build everything autonomously) tries to restore the original boundary between the human and the machine: human as author outside, agent as executor inside. It may work, but you lose the steering advantage. The human is no longer a co-processor; they are waiting for output, which they must still review and understand, and all at once.

The alternative is incremental steps: break the problem into small units, complete one at a time, and review each before moving to the next. Remember that the LLM can go on tangents just as the human can. Incremental review is the only way to catch them before their content accumulates in the input context and compounds. If you understand each step before proceeding, you stay inside the execution.
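As a rough illustration of that workflow, the loop below runs one step at a time and stops at the first step the reviewer rejects, so nothing is built on top of it. The names `run_incrementally`, `fake_agent` and `fake_review` are hypothetical; in practice `execute` would be a call to the agent and `review` would be you reading the diff.

```python
from typing import Callable, Optional


def run_incrementally(
    steps: list[str],
    execute: Callable[[str], str],
    review: Callable[[str, str], bool],
) -> tuple[list[str], Optional[str]]:
    """Run steps one at a time; stop at the first one the reviewer rejects."""
    accepted: list[str] = []
    for step in steps:
        result = execute(step)
        if not review(step, result):
            # Return early so later steps never build on a rejected result.
            return accepted, step
        accepted.append(result)
    return accepted, None


def fake_agent(step: str) -> str:
    # Stand-in for the agent carrying out one small unit of work.
    return f"done: {step}"


def fake_review(step: str, result: str) -> bool:
    # Stand-in for the human: reject anything that looks like a tangent.
    return "rewrite" not in step


accepted, rejected = run_incrementally(
    ["add failing test", "fix parser bug", "rewrite the whole module", "update docs"],
    fake_agent,
    fake_review,
)
print(accepted)
print("stopped at:", rejected)
```

The last step is never executed: the review gate keeps the tangent from reaching the rest of the session.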

The phrase “You can outsource your thinking, but not your understanding” beautifully captures what can go wrong when a coding agent is treated as an autonomous contractor rather than as a collaborative runtime.


Conclusion

An agentic system architecture is similar to a traditional computer architecture, but with a CPU that speaks human language. The LLM is the CPU, the session is the runtime, and you are a co-processor. You can program the agent in plain English, and you can steer the runtime by acting directly on the CPU. Three practical habits follow naturally from this view: start by establishing context before issuing commands, recognise tangents before they compound, and stay inside the execution through incremental review. Remember that you can outsource your thinking, but not your understanding.


Author: Giovanni Vacanti, Data Scientist @ Bitrock
