I’ve always liked to think of the A in AI as augmented instead of artificial. Humans plus AI instead of versus. This is hardly controversial if you look at it in terms of any technological invention: cars made horse buggies obsolete. EVs and solar panels are making petrol engines go the way of the dodo. And in every case, the buggy drivers suffered and had to reinvent themselves, but the human quality of life improved. But perhaps for the first time, generative LLMs seem to be a good enough replacement for a whole class of human knowledge work. Even with the optimistic lessons from the Jevons paradox, we fear that the transition window will be extra painful this time, if it does not lead to outright displacement.
I’ll be honest: I don’t know what’s going to happen. LLMs are definitely improving significantly every generation, and since around last September, when I realized that I could program more effectively in English, I’ve been thinking about the future of my much-loved craft. What would future Software Engineering look like?
Abstractions
I feel one way we can understand the puzzle is to look at it in terms of historical abstractions in the field. Assembly gave way to higher-level, procedure-oriented programming, which in turn gave way to patterns: OOP and functional. At every point, programmers took a step back, refined the principles behind their craft a bit more, learnt about patterns in their craft, understood more of the world around them, and built better software.
Would LLMs lead to a utopian ideal where programmers just describe what they want in English, and magic will happen? Two counter arguments to this:
Why should those magic users be programmers?
I’ve had designers at Chronicle raise PRs that programmers would never have been able to, simply because of the different qualities that they’ve cultivated through their craft: attention to detail, and just a better understanding of what makes good UX. Product folks at the company have built prototypes with ideas that have then been refined into crucial bits of the product.
Let’s face it: programmers were wizards with spells that wove code and produced magic. But now, those abracadabra khul ja sim sim magic incantations are just plain English instead, and everybody can use these spells. What’s a pavam wizard to do?
Leaky Cauldrons Abstractions
The second counter argument is of course that, as Joel told us a long time ago, abstractions are leaky. They “save us time working, but they don’t save us time learning”. And now, we’ve built an abstraction on a word completing engine, and a language that is just plain crazy. Despite my liberal analogies here, LLMs aren’t magic: how do we predict a good outcome when we instruct a number crunching machine with a non-specific tool that is functionally equivalent to a D20 die?
There is another, more nuanced problem here: when programmers were talking in code, non-programmers were talking in terms of business requirements, PRDs, specs, design documents, et al. The abstraction here was always leaky too: design handoffs, QA, and release engineering were places where the “real world” impinged on the ideal of code as the source of truth. Can we build a new shared reality, perhaps above the layer that is code? And can programmers, these new reinvented wizards, be responsible for maintaining that layer?
Aside: But is this a new problem?
The reality every working programmer knows is that programming is instructing a machine on how to reliably make something, in a way that humans would consider both pedantic and obtuse. Computers are just plain dumb: if you haven’t seen it already, just watch this exact instruction challenge where a dad behaves like a computer. How the kids react is pretty much the frustration a lot of programmers experience.
Abstractions have always been an effort to make this process easier. And now, we have a new tool in our belt: computers are just a bit less dumb than they have been before. They, through the wonders of neural networks and inference, can now grok more of plain English speech. But the same problems remain:
- How do we make these dumb creatures work to our intent every time?
  - i.e. reliability, scalability, correctness and so forth.
- How do we make these systems evolve gracefully over time?
  - system design, architecture, business intelligence, and well, just teaching ourselves about a domain.
- How can we make sure they have the right affordances?
  - usability & user experience, the customer experience journey, delight, and the works.
Problems to Solutions
Let’s flip the problems we had to solutions. How do we build a new abstraction that will:
- make sure everybody (not just traditional programmers) can instruct this dumb computer effectively, thereby multiplying productivity?
- build a substrate that leaks less, and on our terms?
So that programmers can, as they have always done, sit at the bridge between the world of messy problems and the world of too-literal computers. Their job now would be to maintain this new substrate: this interface that helps everybody (not just us mystical wizards) perform magic spells to solve problems.
Fawkes: The Phoenix Abstraction
Let’s be clear, I’ve not built such a system yet (I don’t think anybody has), but it’s something I’m keen to explore. The closest articulation that I’ve found to it is Chad Fowler’s Phoenix Architecture. The core idea is the same: code is no longer the substrate that we should build on. It’s an artifact of a higher abstraction, one shared with the entire company or community. Chad envisions software as a set of interconnected components that have four distinct artifacts:
- Specification: A contract from which code is derived
- Evaluation: A series of tests that define what is “correct” for the component
- Context Boundary: The interfaces that one component exposes to others
- Provenance: A decision record for when, how and why the other artifacts: specs, evaluation, or boundaries change.
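To make the four artifacts a little more concrete, here is a minimal sketch of how one component might be modelled. Everything in it is my own assumption for illustration (the `Component` and `ProvenanceEntry` names, the fields, the `revise_specification` helper); it is not taken from Chad's write-up.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceEntry:
    """One decision record: which artifact changed, why, and who decided."""
    artifact: str  # "specification", "evaluation" or "boundary"
    reason: str
    author: str
    at: datetime


@dataclass
class Component:
    """Hypothetical model of a component; code is derived from these, not stored here."""
    name: str
    specification: str        # the contract from which code is derived
    evaluation: list[str]     # descriptions of what counts as "correct"
    boundary: dict[str, str]  # operation name -> signature exposed to other components
    provenance: list[ProvenanceEntry] = field(default_factory=list)

    def revise_specification(self, new_spec: str, reason: str, author: str) -> None:
        """Change the specification and record why in the same step."""
        self.specification = new_spec
        self.provenance.append(
            ProvenanceEntry("specification", reason, author, datetime.now(timezone.utc))
        )


cart = Component(
    name="cart",
    specification="The cart total is the sum of item prices.",
    evaluation=["an empty cart totals zero", "prices are summed"],
    boundary={"total": "(prices: list[int]) -> int"},
)
cart.revise_specification(
    "The cart total is the sum of item prices, in cents.",
    reason="Avoid floating point rounding",
    author="product",
)
```

The one design choice worth noting is that the specification cannot change without a provenance entry being written alongside it, which is the property the fourth artifact seems to be asking for.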
While all 4 have a bunch of priors, in this new age of AI wiring, all are in their infancy. How do you build specification and evaluation that result in reproducible results while retaining the advantages of broader access? How would you architect components and boundaries in a way that maps sensibly to business domains but at the same time accounts for cohesion and dependency? What do decision records and traceability look like?
What I’d like to do is build Fawkes, a programming abstraction that works this way. Here’s what I am considering as building blocks:
- Specification & Evaluation: a version of BDD that is LLM native, striking a balance between being reviewable by non-programmers and being strict enough to support deterministic non-LLM or cheap LLM evaluation. For UX, we build on existing standards like DESIGN.md and MOTION.md.
- Context Boundary: A deterministic contract (perhaps via Protobuf / TypeSpec) that is generated from the specification.
- Provenance: Some form of DeltaDB/Entire where agent decisions and changes are stored inside version control. We store and read both the intent and the extent of the change.
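As a rough illustration of the first building block, here is a sketch of BDD-style scenarios that stay readable to non-programmers but are checked deterministically, with no LLM in the loop. The `Scenario` shape, the `evaluate` function and the cart example are all hypothetical; Fawkes does not exist yet, so treat this as one possible direction and nothing more.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Scenario:
    """A Given/When/Then scenario that a non-programmer can read and review."""
    title: str
    given: dict  # inputs passed to the operation as keyword arguments
    when: str    # name of an operation on the component boundary
    then: dict   # expected result, compared exactly


def evaluate(scenarios: list[Scenario], boundary: dict[str, Callable[..., dict]]) -> list[str]:
    """Run each scenario against the boundary and return the titles that fail."""
    failures = []
    for scenario in scenarios:
        operation = boundary.get(scenario.when)
        if operation is None or operation(**scenario.given) != scenario.then:
            failures.append(scenario.title)
    return failures


# Hypothetical specification for a cart component.
SPEC = [
    Scenario("an empty cart costs nothing", {"prices": []}, "total", {"amount": 0}),
    Scenario("prices are summed", {"prices": [250, 100]}, "total", {"amount": 350}),
]


def generated_total(prices: list[int]) -> dict:
    """Stand-in for code an LLM regenerated from the specification."""
    return {"amount": sum(prices)}


def broken_total(prices: list[int]) -> dict:
    """Stand-in for a bad regeneration that counts items instead of summing them."""
    return {"amount": len(prices)}


good_failures = evaluate(SPEC, {"total": generated_total})
bad_failures = evaluate(SPEC, {"total": broken_total})
```

Because the comparison is plain equality, the same scenarios give the same verdict on every regeneration of the code, which is the reproducibility half of the balance; whether scenarios written this way stay reviewable at scale is still an open question.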
The new world
Programmers are still responsible for abstracting away the dumb yet smart box, but they delve more deeply into architecture and maintenance. LLMs do the building; humans do the decision making, the domain mapping and the specification. When problems occur and expectations mismatch, a record of the decision tree that led teams there helps them recover and build towards a new good state. Code is an artifact, endlessly regenerated. The spells are now prose, accessible to any speaker. UX and CX are first class concerns: as designers and product folks work directly with these new abstractions to tailor new software, they lean on engineers to decide where component boundaries should lie or how software design should evolve. It’ll be nice to live in such a world 🙂