Good with a keyboard and fluent in gibberish

AI Taught Me Development is Engineering

and we need to take that seriously.

AIs and Code

I believe it is uncontroversial to say that generative AIs produce bad code. AI proponents will say that trusting it blindly is negligent: you need to review it or tweak it after generation, or run two agents in an adversarial loop with a test suite in between, or whatever. I don’t use AI, too much has already been written about AI and its proponents, and this post isn’t about that. I’m taking it axiomatically that AI-generated code is bad, and that humans are bad at moderating that badness.

There have been lots of anecdotes about how generated code was bad and had bad impacts. I haven’t seen this studied academically, but I’m again going to assume that this is a thing that has happened.

It is arguable that AI-produced code is good enough, that the bug rate of AI is comparable to that of people. Or that a couple of bugs isn’t that big of a deal. Or some other argument. The human brain is shockingly good at rationalizing. I’m not here for an AI debate; I’m laying out my assumptions.

What we don’t know is how deep the rabbit hole of “bizarre behaviors” from Choices That AI (Didn’t) Make is. And that possibility space is infinitely infinite–everything from “lying about saving a file” to “capitalize every third word” to “CC some emails to a random person in your address book”. Software doesn’t really place fundamental limits on what’s possible, so everything is possible.

AIs and Language

Even in situations where an AI system is created soundly, LLMs lack comprehension and are therefore simply incapable of understanding the natural language they would seem like they should be good at. Whether it’s translation, dictation, summarization, writing, or something else, these applications are by their nature about language–about communication. Communication with a similarly infinite possibility space of content and importance–everything from “haha butts” to “99.9% of transgender people do not regret the procedures they have received”.

There is no end to the anecdotes of significant mistakes, everything from the subtle to the wildly incorrect. In most uses, the human is incapable of double-checking, or the checking would defeat the purpose of the tool.

To use the above example about medical regret rates: the removal of a single word changes the meaning of the sentence. A sentence that generalizes an entire demographic and is absolutely capable of shaping policy and impacting millions of people.

Again, the amount of possible bizarre behaviors is a recursive rabbit hole.

What is Engineering?

Back in 2021, Hillel Wayne published The Crossover Project (Are We Really Engineers?, We Are Not Special, and What engineering can teach (and learn from) us), and it’s been rattling around in my head since. If I were to summarize it, it’s that physical engineering doesn’t deserve the pedestal that software developers often put it on, and physical engineering often also involves ugly hacks and last-minute “just make it work”. It’s worth reading, and it takes an empirical approach to the question. I’m going to take a moral and philosophical approach in exploring engineering.

Society is, at its core, an emergent phenomenon of millions of people and the billions of interactions between them. However, those interactions hang off of and are mediated by a constructed world: buildings, roads, cars, telephones, electricity, data. A superstructure giving society a shape.

Engineering is the construction and maintenance of the superstructure of society. Engineers are participants in that maintenance.

Computers have been part of this ever since governments started using them to analyze demographics and develop policy based on those answers. Since then, software has moved far beyond simple databases. All of business depends on software in every aspect. Basically everything that moves has a blob of firmware. An uncountable number of communications transit through software. Entire relationships exist only because software connected distant people.

Software is critical to society; it is part of the superstructure. Software developers create and maintain that superstructure.

The Collision

So on one hand, we have people going “eh, it’s fine”, and on the other hand we have society.

This is how engineering disasters happen. And while software is not generally in a position to literally and directly kill people (Therac-25 is so well known because it’s an exception), it can absolutely cause real harm to people.

Some of these systems mediate welfare programs, carry communications about employee performance, or shuffle draft policies through a bureaucratic process. While it is quite impossible for this software to reach out and literally touch someone, it is perfectly capable of taking food from them, swinging them between a raise and a termination, or deciding whether they are warm or homeless.

Or denying life-altering medical care. Inside or outside of a provider setting.

If you don’t understand the concept of metaphorical blood spilled, of indirect harm, we won’t agree on much. Can I suggest making friends outside of your socioeconomic bracket? Or reading Terry Pratchett’s Going Postal?

Beyond traceable harm to individuals, I worry about the harm to that superstructure.

I am concerned that wide adoption of AI will lead to an overall decline in the quality of software. I can already hear the “it won’t be that bad”. But users are already frustrated with software–with enshittification and SaaS behaving in exciting ways. What happens when user confidence is eroded further by the bizarreness? Will software continue to be trusted by people and organizations? Will we put the cat back in the bag, or will the systems of society become an extra spicy flavor of capricious?

The Call

Software is no longer a weird hobby of society–it is as critical as bridges. It carries memes, relationships, medical care, and well-being.

And we must not ignore the potential harms that brings with it.

Software Development is Engineering. And it’s time we took that seriously.