The End of Programming

Matt Welsh
Published in Level Up Coding · Oct 5, 2022


The end of classical Computer Science is coming, and most of us are dinosaurs waiting for the meteor to hit.

[Image: An old computer crumbling and falling apart in a desert, digital art]
All of the art in this post was generated by DALL-E 2.

I came of age in the 1980s, programming personal computers like the Commodore VIC-20 and Apple ][e at home. Going on to study Computer Science in college and ultimately getting a PhD at Berkeley, the bulk of my professional training was rooted in what I will call “classical” CS: programming, algorithms, data structures, systems, programming languages. In Classical Computer Science, the ultimate goal is to reduce an idea to a program written by a human — source code in a language like Java or C++ or Python. Every idea in Classical CS — no matter how complex or sophisticated — from a database join algorithm to the mind-bogglingly obtuse Paxos consensus protocol — can be expressed as a human-readable, human-comprehensible program.

When I was in college in the early ’90s, we were still in the depths of the AI Winter, and AI as a field was likewise dominated by classical algorithms. My first research job at Cornell was working with Dan Huttenlocher, a leader in the field of computer vision (and now Dean of the MIT Schwarzman College of Computing). In Dan’s PhD-level computer vision course in 1995 or so, we never once discussed anything resembling deep learning or neural networks — it was all classical algorithms like Canny edge detection, optical flow, and Hausdorff distances. Deep learning was in its infancy, not yet considered mainstream AI, let alone mainstream CS.

Of course, this was 30 years ago, and a lot has changed since then, but one thing that has not really changed is that Computer Science is taught as a discipline with data structures, algorithms, and programming at its core. I am going to be amazed if in 30 years, or even 10 years, we are still approaching CS in this way. Indeed, I think CS as a field is in for a pretty major upheaval that few of us are really prepared for.

[Image: A futuristic cyborg Moai Easter Island head statue with glowing blue eyes, with a futuristic cityscape in the background, synthwave, long shot]

Programming will be obsolete

I believe that the conventional idea of “writing a program” is headed for extinction, and indeed, for all but very specialized applications, most software, as we know it, will be replaced by AI systems that are trained rather than programmed. In situations where one needs a “simple” program (after all, not everything should require a model of hundreds of billions of parameters running on a cluster of GPUs), those programs will, themselves, be generated by an AI rather than coded by hand.

I don’t think this idea is crazy. No doubt the earliest pioneers of Computer Science, emerging from the (relatively) primitive cave of Electrical Engineering, stridently believed that all future Computer Scientists would need to command a deep understanding of semiconductors, binary arithmetic, and microprocessor design to understand software. Fast forward to today, and I am willing to bet good money that 99% of people who are writing software have almost no clue how a CPU actually works, let alone the physics underlying transistor design. By extension, I believe the Computer Scientists of the future will be so far removed from the classic definitions of “software” that they would be hard-pressed to reverse a linked list or implement Quicksort. (Hell, I’m not sure I remember how to implement Quicksort myself.)
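(For anyone who, like me, needs the reminder, the textbook version fits in a few lines of Python.)

```python
def quicksort(xs):
    # Textbook recursive quicksort: pick a pivot, partition, recurse.
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)
```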

AI coding assistants like CoPilot are only scratching the surface of what I’m talking about. It seems totally obvious to me that of course all programs in the future will ultimately be written by AIs, with humans relegated to, at best, a supervisory role. Anyone who doubts this prediction need only look at the very rapid progress being made in other aspects of AI content generation, like image generation. The difference in quality and complexity between DALL-E v1 and DALL-E v2 — announced only 15 months later — is staggering. If I have learned anything over the last few years working in AI, it is that it is very easy to underestimate the power of increasingly large AI models. Things that seemed like science fiction only a few months ago are rapidly becoming reality.

[Image: Cave painting showing a group of AI researchers training a giant brain]

So I’m not just talking about CoPilot replacing programmers. I’m talking about replacing the entire concept of writing programs with training models. In the future, CS students aren’t going to need to learn such mundane skills as how to add a node to a binary tree or code in C++. That kind of education will be antiquated, like teaching engineering students how to use a slide rule.

The engineers of the future will, in a few keystrokes, fire up an instance of a four-quintillion-parameter model that already encodes the full extent of human knowledge (and then some), ready to be given any task required of the machine. The bulk of the intellectual work of getting the machine to do what one wants will be about coming up with the right examples, the right training data, and the right ways to evaluate the training process. Suitably powerful models capable of generalizing via few-shot learning will require only a few good examples of the task to be performed. Massive, human-curated datasets will no longer be necessary in most cases, and most people “training” an AI model won’t be running gradient descent loops in PyTorch, or anything like it. They will be teaching by example, and the machine will do the rest.
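To make “teaching by example” concrete, here is a minimal sketch of a few-shot prompt in Python. The complete() function is a made-up placeholder for whichever model API one happens to use; the point is that the entire “program” is nothing more than a handful of worked examples.

```python
def complete(prompt: str) -> str:
    """Placeholder for a call to a large language model; wire this up to any provider's API."""
    return "<model output goes here>"

# The "program" is nothing but a few demonstrations of the task.
few_shot_prompt = """Extract the city from each sentence.

Sentence: I flew into Tokyo last night.
City: Tokyo

Sentence: The meeting is in Berlin next week.
City: Berlin

Sentence: She just moved to Nairobi for a new job.
City:"""

# A sufficiently capable model completes the pattern: no gradient descent,
# no dataset curation, no code describing *how* to find the city.
answer = complete(few_shot_prompt)
print(answer)
```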

In this New Computer Science — if we even call it Computer Science at all — the machines will be so powerful and already know how to do so many things that the field will look less like an engineering endeavor and more like an educational one; that is, how to best educate the machine, not unlike the science of how to best educate children in school. Unlike (human) children, though, these AI systems will be flying our airplanes, running our power grids, and possibly even governing entire countries. I would argue that the vast majority of Classical CS becomes irrelevant when our focus turns to teaching intelligent machines rather than directly programming them. Programming, in the conventional sense, will in fact be dead.

[Image: A dirty, crumbling computer keyboard, planted in the ground, on a grassy hill, with a large tree growing out of the top of the keyboard. The roots of the tree are growing through the keyboard. Digital art, 4K]

How does all of this change how we think about the field of Computer Science?

The new atomic unit of computation becomes not a processor, memory, and I/O system implementing a von Neumann machine, but rather a massive, pre-trained, highly adaptive AI model. This is a seismic shift in the way we think about computation — not as a predictable, static process, governed by instruction sets, type systems, and notions of decidability. AI-based computation has long since crossed the Rubicon of being amenable to static analysis and formal proof. We are rapidly moving towards a world where the fundamental building blocks of computation are temperamental, mysterious, adaptive agents.

This shift is underscored by the fact that nobody actually understands how large AI models work. People are publishing research papers actually discovering new behaviors of existing large models, even though these systems have been “engineered” by humans. Large AI models are capable of doing things that they have not been explicitly trained to do, which should scare the shit out of Nick Bostrom and anyone else worried (rightfully) about a superintelligent AI running amok. We currently have no way, apart from empirical study, to determine the limits of current AI systems. As for future AI models that are orders of magnitude larger and more complex — good friggin’ luck!

The shift in focus from programs to models should be obvious to anyone who has read any modern machine learning papers. These papers barely mention the code or systems underlying their innovations; the building blocks of AI systems are much higher-level abstractions like attention layers, tokenizers, and datasets. A time traveller from even 20 years ago would have a hard time making sense of the three sentences in the (75-page-long!) GPT-3 paper that describe the actual software that was built for the model:

We use the same model and architecture as GPT-2 [RWC+19], including the modified initialization, pre-normalization, and reversible tokenization described therein, with the exception that we use alternating dense and locally banded sparse attention patterns in the layers of the transformer, similar to the Sparse Transformer [CGRS19]. To study the dependence of ML performance on model size, we train 8 different sizes of model, ranging over three orders of magnitude from 125 million parameters to 175 billion parameters, with the last being the model we call GPT-3. Previous work [KMH+20] suggests that with enough training data, scaling of validation loss should be approximately a smooth power law as a function of size; training models of many different sizes allows us to test this hypothesis both for validation loss and for downstream language tasks.
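To give a sense of what those higher-level building blocks look like in code, here is a stripped-down, single-head self-attention layer in PyTorch. This is a toy sketch for illustration, not the GPT-3 implementation; real systems add multiple heads, masking, dropout, and a great deal of engineering around them.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """A bare-bones single-head self-attention layer (no masking, no dropout, no multi-head)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.query = nn.Linear(d_model, d_model)
        self.key = nn.Linear(d_model, d_model)
        self.value = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, sequence_length, d_model)
        q, k, v = self.query(x), self.key(x), self.value(x)
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)  # scaled dot-product
        weights = torch.softmax(scores, dim=-1)                 # attention weights
        return weights @ v                                      # weighted sum of values
```

Entire models are just stacks of layers like this one, glued together with tokenizers and datasets; almost none of the interesting behavior is visible in the code itself.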

This shift in the underlying definition of computing presents a huge opportunity, and plenty of huge risks. Yet I think it’s time to accept that this is a very likely future, and evolve our thinking accordingly, rather than just sit here waiting for the meteor to hit.

Sidebar: What have I been up to lately?

My long-time readers will notice that I haven’t been blogging much lately. Sorry about that! I’ve been a bit busy.

About three years ago, I left Google for a startup that was acquired by Apple. I subsequently left Apple to lead engineering at another startup for a couple of years. In that time I learned a ton about startup life, building AI systems, and building teams. It was great.

A couple of months ago I jumped ship to start my own company. I’m now the co-founder and CEO of Fixie.ai, a stealth startup in the AI space, and the blog post above may or may not represent some of the thinking going into this new venture. We have a fantastic founding team and I’m really excited about what we’re going to be able to accomplish. Over the next few months I hope to share more about what we’re up to. Until then, hang tight!

