The Apollo Guidance Computer: Architecture and Operation by Frank O'Brien
Publisher: Springer Praxis Books
By now I've spent almost two decades in the software industry. Before that I used to hack around on my own toy projects, so it's probably fair to say I've got a decent grasp of programming. But despite all these years thinking about code I noticed that I knew embarrassingly little about how computers actually work. What happens to all the code we write once it has been compiled down to machine instructions? How does it get executed? And, just as interesting, how can ones and zeros control peripherals like keyboards, printers or even rocket engines? My degree in electrical engineering gave me a decent understanding of several necessary components but it was far from enough. And yes, for those who know me, I may not have been the most motivated student back in the early 90s. Thus I decided to make up for lost time by diving deeper into this subject and developing my mechanical sympathy. My first stop on this learning tour was the brilliant Code by Charles Petzold. A book about the Apollo Guidance Computer seemed like a natural next step for several reasons: first of all because I'm genuinely fascinated by both space flight and the Apollo program, but also as a way to study a real computer architecture in detail.
Studying an old and obsolete computer architecture is advantageous for a hardware newbie like me. The relative simplicity of the architecture makes it possible to grasp; you don't need to worry about CPU caches, multi-core processors or mesh network topologies. The machines of the 60s were simple. The abstraction layers between the code and the hardware are also shallow since application programs were written in assembly or machine code. This makes it straightforward to reason about code execution.
Reliable Hardware: Core Rope Memory
The book kicks off with a tour of the hardware in the Apollo Guidance Computer. I was surprised by all the oddities in the architecture, like squeezing a 3-bit opcode and the associated operand address into a 15-bit word. 3 bits don't take you far, and neither do 12-bit addresses, given that the computer boasts a blazing 38K words of memory. So how can you address 38K words with just 12 bits, which on their own reach only 4,096 locations? Math isn't really patchable. Instead, the solution to this addressing problem was to introduce memory banking. Memory banking basically means that the available memory is divided into different slots, the banks, and special hardware registers control which bank the 12-bit address references. While it solves the addressing problem, memory management in the Apollo Guidance Computer definitely feels complicated.
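To make the banking idea concrete, here is a minimal sketch in Python. It is not the real memory map of the Apollo Guidance Computer: the window position, the bank size and the function names are all assumptions picked for illustration. It only shows the principle, namely that a 12-bit address plus a bank register can reach far more than 4,096 words.

```python
# Illustrative sketch of memory banking. WINDOW_START and BANK_SIZE are
# hypothetical values and do NOT reproduce the real AGC memory layout.

OPCODE_BITS = 3
ADDRESS_BITS = 12
WINDOW_START = 3072   # hypothetical: addresses below this are never banked
BANK_SIZE = 1024      # hypothetical: size of the switchable window, in words


def decode(word):
    """Split a 15-bit instruction word into (opcode, address)."""
    if not 0 <= word < 1 << (OPCODE_BITS + ADDRESS_BITS):
        raise ValueError("not a 15-bit word")
    opcode = word >> ADDRESS_BITS
    address = word & ((1 << ADDRESS_BITS) - 1)
    return opcode, address


def physical_address(bank, address):
    """Map a 12-bit address plus the current bank number to a physical word."""
    if not 0 <= address < 1 << ADDRESS_BITS:
        raise ValueError("address does not fit in 12 bits")
    if address < WINDOW_START:
        # Unbanked region: the address is used as-is.
        return address
    # Banked window: the bank register selects which slot the window shows.
    return WINDOW_START + bank * BANK_SIZE + (address - WINDOW_START)
```

With these made-up numbers, the same 12-bit address 4095 lands on physical word 4095 when bank 0 is selected and on word 38911 when bank 34 is selected. The price is the one the book describes: the program has to set the bank register correctly before every access that crosses a bank boundary.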
Another interesting surprise was the use of core rope memory that the Apollo Guidance Computer pioneered. Core rope memory is read-only and makes up the majority of the memory in the guidance computer. A rope memory is indeed just what it sounds like: you thread thin wires through ferrite rings. Those rings are paired, one for '1' and one for '0'. The coolest thing (yes, your mileage regarding coolness may vary) is that these cores, or rings, may be re-used; a single core could be shared for the encoding of 24 different machine words. Pretty cool.
Despite its marvelous retro coolness, what else is so great about core rope memory? Well, let me tell you: reliability. A core rope memory has limited failure modes, in particular when compared to, let's say, a memory system built of transistors. Another great advantage is that rope memory will maintain its content no matter how many times the computer restarts. And of course rope is immune to the radiation errors that are possible in the hostile environment of space. Any drawbacks? Well, a patch to the software now involves threading rope through cores. Say farewell to continuous delivery.
Beyond Raw Performance
Compared to today's desktop computers, the Apollo Guidance Computer is of course a sorry excuse for a computer. But, and this is one of Frank O'Brien's main points, performance comparisons with today's computers aren't fair. The Apollo Guidance Computer wasn't designed as a general-purpose computer - instead it had a number of very specific goals - and while performance was a challenge, safety was even more important. In fact, you'll find that safety-critical computers, like the Apollo Guidance Computer, are often built on yesterday's technology. The reason is that it's extremely expensive to verify and validate that all components meet their expected operating requirements. I spent several years designing safety-critical systems for the railway industry so I'm all too familiar with these constraints. It's a conservative strategy, but vital to these kinds of systems no matter how frustrating such systems may be to us as developers.
The Apollo Guidance Computer also had to operate in a hostile environment and still maintain high availability. I mean, you don't want a single program bug to put your spaceship at risk. Neither do you want your guidance computer to grind to a halt because it decided to install El Capitan as you approach the lunar surface at 5,500 ft/sec. Yet, in a way, that sort of happened, which brings us to the error handling strategies of the Apollo Guidance Computer.
Most of you are probably familiar with the infamous program alarms during the lunar descent of Apollo 11. What actually happened was that the rendezvous radar, the one used when re-docking with the command module left in lunar orbit, flooded the computer with timer interrupts. You see, the Apollo Guidance Computer used cooperative multitasking, which means each program has to behave correctly and give up the processor on time. Those timer interrupts stole 15% of the available processor time. That meant the mission programs didn't get the chance to run to completion. As a result, the Executive (i.e. the real-time task scheduler) kept accumulating unfinished jobs in its queue until it ran out of room. Fortunately, the built-in error handling strategies dealt beautifully with the situation. After a software restart, the programs were re-started according to their priorities. Since it was a software restart, as opposed to a hardware reset, the computer was up and running again in a few seconds.
I think that there's much value to such a re-start on error strategy. In fact, modern languages like Erlang help systems achieve remarkable up-time with a similar approach. It's a simple programming model, which makes it much more likely that the system actually resumes in a known state after the re-start. I can only speculate, but I think the lunar module would have been in a much worse situation if the programmers had attempted to detect jobs that run longer than expected and then auto-terminate them. It's a complex pattern to get right and you don't want complexity to kick in once the unexpected happens, which is pretty much the definition of an error condition.
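The restart-on-error idea can be sketched in a few lines of Python. This is a toy model under loudly stated assumptions, not the Executive's actual design: the class and method names, the slot count and the notion of an "essential" flag are all hypothetical. It only illustrates the principle of throwing away the whole job queue on overflow and re-queuing the important jobs by priority.

```python
import heapq

MAX_JOBS = 4  # hypothetical number of job slots, chosen only for the demo


class ExecutiveOverflow(Exception):
    """Raised when a job is scheduled but every slot is already taken."""


class Executive:
    """Toy priority scheduler with a restart-on-overflow strategy."""

    def __init__(self):
        self.queue = []          # heap of (-priority, sequence, name)
        self.restart_table = {}  # name -> priority for jobs that survive a restart
        self.sequence = 0        # tie-breaker that keeps insertion order

    def _push(self, name, priority):
        heapq.heappush(self.queue, (-priority, self.sequence, name))
        self.sequence += 1

    def schedule(self, name, priority, essential=False):
        if essential:
            # Remember the job first, so it is re-queued even if it overflows.
            self.restart_table[name] = priority
        if len(self.queue) >= MAX_JOBS:
            raise ExecutiveOverflow(name)
        self._push(name, priority)

    def run_next(self):
        """Pop the highest-priority job and return its name (stands in for running it)."""
        _, _, name = heapq.heappop(self.queue)
        self.restart_table.pop(name, None)  # a finished job needs no restart entry
        return name

    def restart(self):
        """Software restart: drop every queued job, re-queue only the essential ones."""
        self.queue = []
        for name, priority in self.restart_table.items():
            self._push(name, priority)


def demo():
    executive = Executive()
    executive.schedule("guidance", priority=20, essential=True)
    executive.schedule("display", priority=10, essential=True)
    try:
        for i in range(MAX_JOBS):
            executive.schedule(f"radar-{i}", priority=5)  # the flood of extra work
    except ExecutiveOverflow:
        executive.restart()
    return [executive.run_next() for _ in range(len(executive.queue))]
```

Calling demo() returns ["guidance", "display"]: the flood of low-priority work is simply gone after the restart and the essential jobs run in priority order. Note what the sketch does not do. It never tries to work out which job misbehaved, which is exactly the complexity the restart strategy avoids.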
Understanding the Mission Programs
The Apollo Guidance Computer doesn't make for light reading. The first parts of the book read more like an engineering manual. Frank O'Brien takes a systematic approach and each major hardware part or category in the instruction set is discussed in detail. It's great because you really get a deeper understanding. The possible drawback is that it's hard to get the overall picture.
However, once you've pushed yourself through the technical details you're rewarded with Part 4: Mission Programs and Operations. This part alone is worth the price of the book. It's just brilliant. Here O'Brien walks us through each part of the mission: everything from the initial launch through the lunar landing and the lunar rendezvous to, of course, the re-entry into Earth's atmosphere. In these chapters you'll learn the basics of space navigation, the optimal re-entry angle should you ever find yourself outside Earth's gravity (spoiler: 6.48 degrees), and much more. This is a part I'll make sure to re-read soon.
To summarize, I'm a big fan of the Apollo Guidance Computer (both the machine and the book). The Apollo mission was an amazing achievement and a book like this brings well-deserved attention to the engineering marvel behind its guidance computer. I find it genuinely fascinating how the team behind it managed to come up with several technologies that have only become mainstream in recent years, such as virtual machines (yes, the Apollo Guidance Computer had an interpreter, implemented in software, that modeled a completely different hardware architecture). If you're interested in the Apollo program or want a deeper understanding of computer technology, then this book is for you. Just be prepared to focus because it isn't an easy read.
Reviewed August 2016