You should read it, especially now that more and more code is written by LLMs. The important thing is not the code itself but your mental model of the software you're building. Sadly we seem to be moving away from that. We're accumulating more and more code that we don't understand or haven't even read.
I was going to say that an LLM can't do this, because it loses everything at the end of the session. But... could an LLM write out its "state" or "understanding" so that you could recover that for the next session? Do any LLMs currently have that ability?
It's very common, but (like most things with LLMs) it's not as deterministic as you might imagine. A common technique for agents is to have them create a "handoff" document (usually markdown) that summarizes the previous session: goals, important files/links, etc. There are dozens of proprietary ways of doing this, and Claude Code automates the process with its /compact command and even does auto-compaction as you reach your context limit. ChatGPT has been doing auto-compaction since the beginning, as it started out with a comically small context window.
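A minimal version of that handoff pattern is easy to roll yourself. To be clear, everything below (the file name, the section headings, the function) is my own invented sketch, not Claude Code's actual format:

```python
from pathlib import Path

def write_handoff(goals, key_files, decisions, path="HANDOFF.md"):
    """Summarize a session as markdown so the next session can rebuild
    context. Section names here are arbitrary -- use whatever your
    agent is prompted to read back in at startup."""
    lines = ["# Session Handoff", "", "## Goals"]
    lines += [f"- {g}" for g in goals]
    lines += ["", "## Important files"]
    lines += [f"- {f}" for f in key_files]
    lines += ["", "## Decisions"]
    lines += [f"- {d}" for d in decisions]
    text = "\n".join(lines) + "\n"
    Path(path).write_text(text)
    return text

# Next session: feed HANDOFF.md back in as the opening context.
```

The non-determinism the parent mentions lives entirely in what the model chooses to put in those bullets, not in this scaffolding.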
The problem with auto-compaction is that you aren't given the opportunity to review its compacted understanding to confirm that it's correct or doesn't contain large omissions. I try to avoid letting it compact whenever possible and stick to plans that I review, because it seems to get extremely dumb after an auto-compaction.
Yeah, I still find Opus to be pretty unreliable once you get past around 150K tokens, so at that point I usually run a custom hand-off command that extracts specific elements into specialized documents. The command contains a "Documentation Map" with single-line summaries of each of those documents to help the agent sort everything out. Like most memory systems, it works pretty well around 80% of the time. I messed around with RAG and other complex solutions, and I never got much better results than my KISS system.
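A "Documentation Map" like that can be as simple as an index file the agent reads first. The document names and summaries below are hypothetical, just to show the shape:

```python
from pathlib import Path

# Hypothetical layout: one specialized doc per concern, plus an index
# with one-line summaries so the agent knows which doc to open.
DOCS = {
    "ARCHITECTURE.md": "Module boundaries and how data flows between them.",
    "DECISIONS.md":    "Why we chose X over Y, with dates.",
    "GOTCHAS.md":      "Known footguns and their workarounds.",
}

def build_doc_map(docs, path="DOC_MAP.md"):
    """Write a one-line-per-document index for the agent to read first."""
    lines = ["# Documentation Map", ""]
    lines += [f"- `{name}`: {summary}" for name, summary in docs.items()]
    text = "\n".join(lines) + "\n"
    Path(path).write_text(text)
    return text
```

The point of the one-line summaries is cheap routing: the agent spends a few dozen tokens deciding which document to load instead of loading all of them.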
This brings up a philosophical question. Are we willing to hand over the role of "theory building" to LLMs, if that's even possible? If so, what will be the role of human beings?
It may destroy many foundational assumptions that humans have had for thousands of years.
In theory, maybe, in some sense. But if we read Naur's definition of "theory" in a stricter, more philosophical way, they can't in full. An LLM can't build a theory, because it doesn't have "real" experience; it's essentially just following rules. It also can't really argue for or justify its choices the way a person can.
This is discussed in the "Ryle's Notion of Theory" section of the original essay.
The name "theory building" doesn't really resonate with me - I think effective design ("programming" if you will) is more about things like decomposition, factoring and representations.
The larger the project, the more ways there are to decompose it, but only some of these will have good outcomes in terms of a concise, flexible implementation that is easy to read, write, debug, and extend.
You are mentally exploring the alternatives, trying to find the ideal factorization: one that minimizes complexity, keeps interfaces between parts simple and friction-free, and results in an implementation where the code almost reads like a high-level description of the requirements, with additional detail exposed only as you descend each level of the implementation.
I can't, off the top of my head, think of a super pithy way of expressing it, but optimizing the factorization and the representations exchanged between parts (the two go hand in hand) is the key. How do you reduce the requirements to a design with the fewest moving parts and the simplest interfaces between them? It's a kind of co-evolution.
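A toy illustration of that "reads like the requirements" ideal (the domain and all names are invented): the top level states the requirement, and detail only appears as you descend a level.

```python
# Requirement: "parse the orders, reject the invalid ones, total the rest."
# The top-level function reads almost exactly like that sentence.
def process_orders(raw_orders):
    orders = parse(raw_orders)
    valid = [o for o in orders if is_valid(o)]
    return total(valid)

# One level down, each part exposes its own detail -- and the interface
# between parts is just a plain list of small dicts.
def parse(raw_orders):
    return [{"id": i, "amount": a} for i, a in raw_orders]

def is_valid(order):
    return order["amount"] > 0

def total(orders):
    return sum(o["amount"] for o in orders)
```

The factorization and the representation co-evolve here: choosing a flat list of `{"id", "amount"}` records is what lets each function stay one line of intent.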
I think you did use an appropriate word: design.
In my mind design and theory are inseparable. Design is the accumulation of many design decisions. Theory explains what influenced those decisions.
Design needs theory to be intentional. It can of course be accidental ("seems to work, I guess") or intuitive ("I know in my gut this is right but can't explain it").
While both can end up as functional systems, if you can't vocalize the design journey, the system is not very maintainable in the industrial sense (hence: theory is the vocalization of the design and the forces that influenced it).
> In my mind design and theory are inseparable
In the sense that you seem to be using "theory", I'd somewhat agree; otherwise all you really have is an artifact, not an intentional design. You could say the design is the what-it-is, while the theory is the why-it-is, and both are certainly useful to document.
Where I'd tend to disagree is whether the process of programming is well described as "theory building" in this sense. The process is not the same as the destination, and the rationale for the destination is not always going to honestly reflect the process of getting there. I would also say that the process is not really about building/evolving these design-specific theories, although others may disagree.
For me, what drives the programming/design process is more the things I mentioned: general principles of software design (decomposition, factorization, orthogonalization, decoupling, minimal interfaces, etc.) at least as much as problem-specific design and theory. One does of course also have an evolving mental model of the parts of the problem and how they inter-relate (how one might break the problem down into modules and classes, etc.), but I think that when things "click" and you go "Aha! I like this!", it's often more about generic design principles and familiar design patterns than about some problem-specific "theory" that you are evolving.
It's very hard, if not impossible, to describe the exact mental process of software design. It seems it has to be, like all reasoning and learnt-skill-based behavior, a process of subconscious pattern matching and doing what worked before. Biological RL, if you like. I think it's possible to understand some of the patterns involved, such as these generic software design principles, but it'd be post-hoc rationalization to claim that some specific design rationale had been driving the whole process.
It seems it's the same for experts in any domain, e.g. chess: they probably can't tell you exactly why they selected a particular move, or why they intuitively ignored other potential lines, but they can provide endless post-hoc theory and analysis of why it nonetheless made sense!
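The what-it-is / why-it-is split from earlier in this thread can be made concrete in a single function. Everything here, including the rationale, is an invented example:

```python
def retry_delays(attempts: int) -> list[float]:
    # What (the design): exponential backoff, capped at 30 seconds.
    # Why (the theory): suppose the upstream API rate-limits in 30s
    # windows, so waiting longer than one window buys nothing.
    # (Invented rationale, purely illustrative.)
    return [min(2.0 ** n, 30.0) for n in range(attempts)]
```

A reader can recover the "what" from the code alone; the "why" lives only in the comment, and it is exactly the part that is lost when the theory isn't written down.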
I wrote a series of blog posts about this a few years ago: https://creating.software/essays/theory_of_a_program/, some of the few I ever actually finished, lol.
Most of my posts have aged terribly in the age of AI (especially the ones I didn't finish... so long, extended discussion of how to use a lab notebook when debugging, we hardly knew ye; Claude fixes our bugs now), but one job that engineers still have is the collection and retention of context that AI doesn't have and can't easily get.
This helps describe my biggest pain point when engineering a program with LLMs. They do not have the full theory of the program, which makes things difficult. Additionally, the more hands-off approach to programming (even when I try to stay as involved as I can) means that I lose the clear conceptualization of that piece of code. I'm still trying it to see if it can work, but it is definitely a vibe shift from making 20 micro-architectural decisions in every function.
By the Curry-Howard correspondence, this is literally true.
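Loosely sketched in Python type hints (Haskell or a proof assistant would make the point properly; this is only suggestive): under Curry-Howard, a program inhabiting a type is a proof of the corresponding proposition, so the artifact and the "theory" are the same object.

```python
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")

# A term of type Tuple[A, B] -> A is a proof of the proposition
# A AND B implies A:
def fst(pair: Tuple[A, B]) -> A:
    return pair[0]

# A term of type A -> (B -> A) is a proof of A implies (B implies A),
# the K axiom. Here the design literally *is* the theory:
# program as proof, type as proposition.
def const(a: A) -> Callable[[B], A]:
    return lambda _b: a
```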
The only reason I recommend this paper is that I encounter so many people who have a very myopic view of the software they're building. They are focused on individual features and how to quickly make them happen, regardless of what happens to its cohesiveness. You start to talk about interfaces and contracts and they're like a deer blinded by a car's headlights.
I wouldn't start to think of someone as a real developer until they've designed and written something of 10K LOC or so of complexity, from scratch, a few times. At the least, you're not going to be able to understand these lessons and characterizations of programming in the large without at least that level of experience.
The larger and more varied the projects you have designed from scratch, the more you start to understand what programming/designing is really about.