Dimensions

I’ve been thinking quite a bit about various dimensions along which we might assess how we use code and how codebases behave as environments.

This piece takes off from my recent musing on Cognition Enhancing Technologies on my personal blog, which the reader might find useful as background. Today’s piece is an attempt to start getting more concrete about how we might find our way out of the software-construction bottleneck we’re in.

Longevity

One such dimension is the Longevity Axis. I decline contracting jobs with [A Certain Company] because their projects tend to fall near one end of the Longevity axis — the end that says, “Write code fast, get it working, don’t bother much about structure, design or code hygiene because the customer just wants as much functionality out of their limited budget as they can get in the available time. And there’s every good chance that nobody will ever touch the code again.” (Because a great many of those projects are speculative in nature and have a high probability of not getting further oxygen/funding when the conjectured advantage turns out to be Not So Great.) You end up using lots (lots!) of Other People’s Code (libraries, cloud infra, remote APIs) with all the attendant pains of making those parts play together. Mostly you’re writing Glue Code, and not really much stuff that warrants any real level of design or introspection. (I am temperamentally ill suited to that environment, so it’s best for both me and potential clients not to go there.)

At the other end of that axis live the codebases that people inhabit for years or decades. Yes, there’s a lot of ancient stuff in there, but it tends to be pruned and kept reasonably clean. (Because if you shit in the nest, the person most likely to step in it is yourself. So you don’t.) Version control rules, refactoring is taken seriously, code hygiene is a valued quality (though not always!), design and structure get discussed, debated, improved, experimented upon. And such experiments are abandoned with alacrity once they’ve taught us what we need to learn. Yes, of course, tools, libraries and Other People’s Infra get used, but carefully, judiciously, with deliberation — very likely isolated behind interfaces under our control so that they can be replaced or augmented without perturbing all the rest of Our Stuff. Dependencies get actively managed. (/me Looks hard at the Node ecosystem!)
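
To make that “isolated behind interfaces under our control” idea concrete, here is a minimal sketch in Python; every name in it (BlobStore, InMemoryBlobStore, archive_report) is hypothetical and purely illustrative, not drawn from any real project:

```python
"""Toy sketch: keep Other People's Infra behind an interface we own.
Hypothetical names throughout; not drawn from any real system."""

from abc import ABC, abstractmethod


class BlobStore(ABC):
    """The seam we control. The rest of the codebase talks only to this."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class InMemoryBlobStore(BlobStore):
    """Trivial implementation for tests and local hacking. A cloud-backed
    adapter (S3, GCS, whoever) would implement the same interface."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def archive_report(store: BlobStore, report_id: str, body: bytes) -> None:
    """Application code depends on the interface, never on a vendor SDK."""
    store.put(f"reports/{report_id}", body)


if __name__ == "__main__":
    store = InMemoryBlobStore()
    archive_report(store, "2024-06", b"quarterly numbers")
    print(store.get("reports/2024-06"))
```

The mechanics barely matter; the point is that the seam belongs to us, so a vendor SDK can be swapped, mocked or retired by touching one adapter rather than every call site.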

I’m much reminded in this of Nat Lunn’s parable of Bridges And Airports. The Long-Lived end of the axis is very much the Airports World.

Distribution Surface

There’s another axis: the Area Of Distribution Axis. At one extreme, I might write a bit of software just for me. It will tend to be small. There’s a limit to how much time a piece of software can save me relative to how much time it takes to conceive, implement and debug it. (And here we’re back to the Central Question of why our tooling is so pathetic and weak.) So it tends to be ad-hoc, messy, incomplete, flaky and half-baked. It’s those quick-and-dirty awk scripts and cron jobs that ease some minor pain-point in our lives.

At the other end of the axis lies the stuff that the Hacker News crowd loves to imagine they’re all going to do one day: Web Scale! The stuff that requires multiple datacentres filled with server and storage racks. The code embodies all those sexy computer sciency problems in distributed systems. Measurement, monitoring, failure management and control are overriding concerns, and are frequently much more important than the actual function of the software.

Mostly the systems we write are somewhere between those two extremes. A bit of a letdown, that. A bit humdrum. Welcome to reality.

The Nature of the Work

A third dimension, this one more about the nature of our work than the nature of the codebase itself: The writing of greenfields code versus the (more commonplace) reality of fixing problems in existing code or adding small increments of function to a large, pre-existing codebase. And by “greenfields” I don’t only mean truly clean-slate new projects; I also include adding more substantial chunks of new function in the context of larger existing systems. After all, there’s not really that much difference. Even the greenest of greenfield projects exists in the context of existing data sources, user-bases and legacy infrastructure.

At the greenfields end of things we’re likely to be in a more exploratory, experimental mode of development. At the brownfields end our need is more for insight into what the existing code does, how it does it and why it is the way it is, so that, in fixing some problem, we don’t trigger a chain reaction of further failures. Every veteran dev has war stories about this.


Are these three dimensions useful in any way? I’m not sure yet.

Does this help us think about…

  • What tooling is appropriate to a given situation? Is it tooling that helps us wire together Other People’s Shit, that improves the legibility of what’s happening (dynamically! in time!) as bits fly hither and yon between our own glue code and the opaque black boxes we barely understand? Is it tooling that allows us to model and remodel the world we’re exploring with agility and grace, rewiring all the attendant bits of code that hang off our plasticine world? Is it tooling to better expose the structures (static and dynamic, at many levels) of (often large!) legacy code so that we can move quickly and, with better understanding and greater confidence, insert surgical changes? (A toy sketch of that last kind of tooling follows this list.)

  • Code hygiene. Codebase cleanliness. Good design. (Whatever that means. We recognise it when we see it, but have a tough time pinning down a concise definition.) All of these really boil down to “comprehensibility” — the ability of other people (i.e. ourselves in six weeks’ or six months’ time) to understand the intent behind code: WHAT it’s meant to be doing, WHY it’s doing it this particular way, HOW the pieces interact with other pieces internal and external. Glaringly absent from this picture is WHAT ELSE we tried that didn’t work, which is a critically important piece of the WHY. Occasionally this might be documented in a comment. “Don’t remove the next line of code. Nothing works without it. Nobody knows why. But don’t fuck with it.” Not even a multi-day trawl through the murky underwaters of our version-control system will reveal what went down there. The original programmer is long gone, possibly retired or dead. And our existing tools are, frankly, utterly and dismally useless at capturing this kind of (critically important!) knowledge and learning.
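
As a toy illustration of the “expose the structures of legacy code” kind of tooling from the first bullet above, here is a deliberately minimal, hypothetical sketch in Python that maps which modules each file in a codebase imports. Real structure-exposing tooling would go much further (dynamic behaviour, call graphs, data flow), but even this crude a view changes how quickly you can orient yourself in unfamiliar code:

```python
"""Toy sketch: expose one slice of a codebase's static structure
(who imports what) using only the standard library. Illustrative, not
a real tool."""

import ast
import sys
from pathlib import Path


def imports_of(path: Path) -> set[str]:
    """Return the top-level module names imported by one Python file."""
    tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
    found: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found


def import_map(root: Path) -> dict[str, set[str]]:
    """Map every .py file under `root` to the modules it imports."""
    return {str(p.relative_to(root)): imports_of(p) for p in root.rglob("*.py")}


if __name__ == "__main__":
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for filename, modules in sorted(import_map(root).items()):
        print(f"{filename}: {', '.join(sorted(modules)) or '(nothing)'}")
```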


I think there’s some useful insight here, but really, all this talk so far of “better tooling” is merely incremental, building upon the existing paradigms of programming. It’s all still stuck in the ways of thinking that got us into the situation we’re in, and, if Einstein was at all right, the same kind of thinking won’t get us out of it, won’t get us to a quantum improvement in how we conceive software.

The most useful thing to have come out of all this might just be the idea of code as environment/code as ecosystem. More to think about… We flex the environment; the environment flexes us right back.