2024 reads

Posted on 2025-01-25

Here’s a list of what I believe to be the best literature I read in 2024.

Books:

  1. A Philosophy of Software Design, by John Ousterhout
    Probably one of the best books on software design I’ve ever read. I wish I had read this in 2018, when I first began seriously building my skill at programming systems. I realized that most of my aversion to languages like Python came from dealing with other people’s badly designed code, not from the languages’ features themselves. When evaluating how sustainable it would be for me to work on a programming project, my focus shifted from language features to the design of the codebase, and I became much more willing to work in arbitrary languages.
  2. Probability Theory: The Logic of Science, by E. T. Jaynes
    While I only read the first eight or nine chapters (and by ‘read’ I mean watching lecture videos, reading the chapter closely, and using Claude’s help when stuck trying to understand something – which overall took about 1.5 weeks per chapter, doing this almost full-time), I believe I got a much better understanding of probability theory from it. If I had the free time, I’d love to get through the rest of the book one day.
  3. A Mathematician’s Lament and Measurement, by Paul Lockhart
    Helped me wrap my mind around mathematics (or at least the Platonist view of mathematics), which in turn has helped me study difficult mathematical topics.
  4. There is no Antimemetics Division, by qntm
    This novel is a celebration of what humanity can accomplish, even when facing utterly impossible odds. It is also a celebration of scientists and researchers and engineers, as key players in steering the world onto better trajectories. It is not a masterpiece – the second part of the novel feels messier than the first – but it’s still the only science fiction novel I have found myself rereading parts of, again and again.
  5. Simulation-Based Optimization, by Gosavi
    I wanted to understand Markov Decision Processes, and found Richard Sutton’s RL book vaguely written, with significantly worse explanations of RL than this book offers.

Essays:

  1. Gwern, On Having Enough Socks
  2. Gwern, Evolution as backstop for reinforcement learning
  3. Gwern, Timing Technology: Lessons from the Media Lab
  4. Tsvi Benson-Tilsen, Koan: divining alien datastructures from RAM activations
  5. Will Larson, Layers of Context
  6. Nuño Sempere, Unflattering aspects of Effective Altruism
  7. Yosefk, Fun won’t get it done
  8. Valentine, Here’s the exit.

Papers:

  1. Ousterhout, Always Measure One Level Deeper
    The paper that introduced me to measuring performance and benchmarking, and gave me a basic idea of why and how I’d want to do so.
  2. Ousterhout, Scripting
    This paper clarified my muddled thoughts on languages and whether the safety provided by static typing (with or without type inference) is irreplaceable. I am now more willing to use scripting languages and less worried about the ‘footguns’ that dynamic typing enables.
  3. Chollet, On the Measure of Intelligence
    Until I read this paper, I had a vague notion that there was a specific phase shift between a model that doesn’t have general intelligence and a model that does – that general intelligence was binary: you either had it or you didn’t. Reading this paper made me less certain of that belief. I now consider it more plausible that generalizability is a gradient, and that ‘superintelligence’ involves a mind that can generalize much farther than humans can, given the same amount of information.
  4. Repenning and Sterman, Nobody Ever Gets Credit for Fixing Problems That Never Happened
    A valuable paper, but more of an incremental update in helping me build a coherent model of operations.
  5. Weatherson, The End of Decision Theory
    This paper proposes one way to think about how decision theory research is useful. In essence, decision theories are idealizations, and are descriptive, not prescriptive. This point matters: a lot of people seem to treat decision theories as prescriptive theories that would help them decide what to do in real life. Unfortunately, most decision theories make extreme assumptions about the capabilities of the agent involved. It is therefore a mistake to read an FDT paper and then murder people on the theory that doing so would retroactively prevent your having suffered because of them in the first place (this is a real example).
  6. Barnett and Gillen, Without fundamental advances, misalignment and catastrophe are the default outcomes of training powerful AI
    One of the best modern attempts at trying to describe the MIRI model of why AI alignment is difficult.