Posted on 2024-05-16

[2024-04-30 Tue]

  1. Curiosity as the fundamental tool / orientation / weapon that underlies deconfusion

    1. Not a desperation-based seeking, and not a blind attention-flow-based consumption, but a sort of grounded and reflective state from which one is curious and seeks to understand

  2. Succinct thought experiments as fundamental tools for deconfusion

    1. Concrete thought experiments to separate out conflicting intuitions

  3. Reading The Sense Of Physical Necessity: A Naturalism Demo (Introduction) - LessWrong…

    1. Yes, I consider Naturalism a skill intertwined with, or a core facet of, deconfusion

    2. Logan talks about the idea of making something ‘truly part of you’.

      1. It seems pretty unlikely that most of the cognitive tools we have and use can be efficiently made ‘truly part of us’. Alternately, it seems likely to me that the amount of cognitive scaffolding necessary for us to orient and steer reality is immense, and trying to make things truly part of ourselves in general seems massively inefficient, much more so than using spaced repetition in general

      2. But yes. For a few things? It makes a lot of sense. I’d count my investigation into moral cognition, and my desire to investigate surrogation, and my time spent in 2023 investigating deconfusion, as instances of me attempting this

        1. Not that I succeeded to an extent that I feel satisfied about

    3. This is actually an interesting field report on attempting to investigate (but not necessarily deconfuse) an inchoate concept

[2024-05-01 Wed]

  1. Deconfusion is hard

    1. You have access to very little information, and are working on the edges of human knowledge, dealing with things that people may have been confused about for millennia

    2. One huge common bottleneck for deconfusion likely is math and theoretical computer science understanding. A lot of times, understanding a concept (why does an apple fall down to the ground?) relies on a vast edifice of observations and mathematical scaffolding such that you can build a theory on top of it that makes sense and gives you predictions

      1. Until then, it is likely that you’ll be absorbed in a form of investigation (of the sort that Logan points at in their Naturalism sequence)

      2. And even then, this investigation is not enough without the mathematical scaffolding

        1. Even if you have observed all these disparate pieces of evidence for ‘a thing called gravity’, you won’t be able to coherently describe it in understandable terms without some mathematical scaffolding

        2. You can have non-formalist theoretical scaffolding (like Newton’s three ‘laws’), and that does constitute progress in our understanding of the situation, as long as you have some causal explanation for why the heck things are the way they are (otherwise you are just doing a linear regression and summarizing the resulting model)

        3. So to a certain extent you do have incremental progress in understanding, but progressing to the point where mathematical scaffolding is involved results in a massive increase in the level of understanding and in the quality and precision of the predictions you can make based on your theory

          1. See Faraday’s notebooks, which Maxwell read and turned into Maxwell’s equations

            1. In general distillation is incredibly valuable and also incredibly undervalued

              1. Maybe I could carve out a niche where what I do is essentially distillation work
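As an aside on the regression point above, here is a minimal sketch of the difference between summarizing data and having a theory. All numbers and the setup are invented for illustration: fitting noisy free-fall measurements gives good predictions, and even a constant close to g, while containing no causal story about gravity.

```python
import numpy as np

# Hypothetical free-fall data: drop times (s) and distances (m),
# generated here from d = 0.5 * g * t^2 plus a little noise.
rng = np.random.default_rng(0)
t = np.linspace(0.5, 3.0, 20)
d = 0.5 * 9.81 * t**2 + rng.normal(0.0, 0.05, t.size)

# A pure regression: fit d = a*t^2 + b*t + c with no theory of why.
a, b, c = np.polyfit(t, d, 2)

# The fit predicts well, and 2*a even approximates g, but the model
# is just a summary of this dataset; it explains nothing.
print(round(2 * a, 1))  # close to 9.81
```

The regression interpolates and extrapolates locally, but a causal theory (universal gravitation) is what tells you the same constant governs apples, the Moon, and Jupiter's moons.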

    3. Holdable thought experiments in general serve as an advancement in deconfusion research, because they are communicable to others, who can then see what you are pointing at

      1. Also, such a thought experiment serves as a powerful waypoint as you try to build up scaffolding and understanding about things

      2. In decision theory, for example, Newcomb’s problem served as one such tool

      3. Other such tools may be the blegg/rube question in Yud’s Words sequence, the “Do I look fat in these jeans?” question (for surrogation), maybe the questions in Thinking Physics (although I’m not sure and would like to check), and the mirror asymmetry question at the beginning of Good and Real

    4. The mentorship bottlenecks faced by the kind of people usually attracted to this work can exacerbate this difficulty too

  2. Butterfly Ideas posits that the source of your curiosity (your new idea, your inspiration, your confusion) is fragile: exposing it to other people’s opinions will usually crush the butterfly, and you thereby lose the ability to follow the butterfly idea until you have a more coherent, communicable, and robust-to-scrutiny version of it

    1. I think it may be better for individuals trying to learn and master deconfusion to build more resilience in the face of outside scrutiny of their intuitions and feelings

    2. This is important because having such an ability seems like a proxy for being able to notice such intuitions, to return to them again and again because you find yourself continually interested in them, to allow them the space to unfurl in your mind

    3. A fundamental orientation of curiosity instead of rejection (when it comes to claims or beliefs) seems useful – although instead of setting yourself up to be psyopped by memeplexes that eat up massive amounts of your time and attention, you can attempt to use whatever strategy you want to investigate what is going on

      1. But even then, overall, an orientation of curiosity in general is very much the antithesis of a sort of rushed or hurried attempt at investigation

      2. After all, it is kind of like introspection, or Naturalism. You are trying to understand, and sense-making cannot be rushed.

  3. Building gears-based models is expensive and involves what is effectively a capital investment

    So, there’s this inherent problem with deep gearsy models, where you have to convey a bunch of upstream gears (and the evidence supporting them) before talking about the downstream questions of interest, because if you work backwards then peoples’ brains run out of stack space and they lose track of the whole multi-step path. But if you just go explaining upstream gears first, then people won’t immediately see how they’re relevant to alignment or timelines or whatever, and then lots of people just wander off. Then you go try to explain something about alignment or timelines or whatever, using an argument which relies on those upstream gears, and it goes right over a bunch of peoples’ heads because they don’t have that upstream gear in their world-models.

    Yudkowsky and Christiano discuss “Takeoff Speeds” - LessWrong 2.0 viewer (quoted from a gwern comment)

    1. This matters a lot for deconfusion-style research

    2. I couldn’t make headway into decision theory, or even understand what the hell it is all about and why one would care about any of it, until I had enough “upstream gears” to build on

    3. This was also the case for deconfusion – I read some Adam Shimi essays, but actually understanding and valuing deconfusion came only after I had some mentorship that built up the foundational experiences I would use to have the intuition to bootstrap my own understanding and vision for deconfusion

    4. So, what could we conclude from this?

      1. Tentatively, I’d believe that the most valuable work one can do as part of deconfusion research, aside from continuing to track and work on the confusions one is trying to understand, is to spend time learning critical gearsy models that seem core or relevant to what one is trying to do

        1. You usually get evidence that some field is relevant, and therefore you actually care about understanding what is going on – if you find yourself using ‘shoulds’ to motivate yourself, that is an indication that perhaps this isn’t really relevant (and that you should investigate why you feel this way, and perhaps figure out how you ended up in this situation in the first place)

      2. For example, one could want to significantly improve their understanding of computability theory or mathematical logic, because they feel like it is relevant to the things they are thinking about and investing in improved understanding here would be quite valuable

        1. This can be modeled as a form of investment, because it is one, given the opportunity costs and the almost discrete nature of the investment (learning basic computability theory badly, or learning only half of it, is not as useful as learning a ‘discrete’ chunk of it that is useful in its entirety)

        2. In general, I’d recommend spending some time every day maintaining your gears, some time every day investing in learning fundamental gears, and some time every day actually attempting deconfusion (assuming you are attempting deconfusion full time)

[2024-05-06 Mon]

  1. There’s an interesting dichotomy here in the MIRI research style, which seems to involve, on the one hand, going as concrete as possible with your proposals for solutions, and on the other hand, a focus on mathematical abstractions

    Fundamentally, the whole problem here is, “You’re allowed to look at floating-point numbers and Python code, but how do you get from there to trustworthy nanosystem designs?” So saying “Well, we’ll look at some thoughts we can understand, and then from out of a much bigger system will come a trustworthy output” doesn’t answer the hard core at the center of the question. Saying that the humans will have AI support doesn’t answer it either. – 2021 MIRI conversations

    1. Why is that?

      1. I get that concreteness helps with proposals and with being able to grasp onto things and think about them, especially when it comes to deconfusion

      2. And I guess the mathematical abstractions are another set of thinking tools intended to help think about this, that have high value

    2. This style of research seems not very amenable to the sort of incremental strategy that academia tends to foster, I think

    3. I want to read Arbital – it likely has some writing on concreteness I want to reread

      1. I didn’t find anything relevant, which is interesting and strange

      2. Okay I got it – Executable philosophy

        • Many academic philosophers haven’t learned the programmers’ discipline of distinguishing concepts that might compile. If we imagine rewinding the state of understanding of computer chess to what obtained in the days when Edgar Allen Poe proved that no mere automaton could play chess, then the modern style of philosophy would produce, among other papers, a lot of papers considering the ‘goodness’ of a chess move as a primitive property and arguing about the relation of goodness to reducible properties like controlling the center of a chessboard.

        There’s a particular mindset that programmers have for realizing which of their own thoughts are going to compile and run, and which of their thoughts are not getting any closer to compiling. A good programmer knows, e.g., that if they offer a 20-page paper analyzing the ‘goodness’ of a chess move in terms of which chess moves are ‘better’ than other chess moves, they haven’t actually come any closer to writing a program that plays chess. (This principle is not to be confused with greedy reductionism, wherein you find one thing you understand how to compute a bit better, like ‘center control’, and then take this to be the entirety of ‘goodness’ in chess. Avoiding greedy reductionism is part of the skill that programmers acquire of thinking in effective concepts.)

        Many academic philosophers don’t have this mindset of ‘effective concepts’, nor have they taken as a goal that the terms in their theories need to compile, nor do they know how to check whether a theory compiles. This, again, is one of the foundational reasons why despite there being a very large edifice of academic philosophy, the products of that philosophy tend to be unuseful in AGI.

      3. Here’s another, from Methodology of unbounded analysis

        The pitfall of residual terms.

        Besides “simplifying away the confusing part of the problem”, another way that unbounded thinking can “bounce off” a confusing problem is by creating a residual term that encapsulates the confusion. Currently, there are good unbounded specifications for Cartesian non-self-modifying agents: if we allow the agent to use unlimited computing power, don’t allow the environment to have unlimited computing power, don’t ask the agent to modify itself, separate the agent from its environment by an impermeable barrier through which only sensory information and motor outputs can pass, and then ask the agent to maximize a sensory reward signal, there’s AIXI. If we then introduce permeability into the Cartesian boundary and allow for the possibility that the agent can take drugs or drop an anvil on its own head, nobody has an unbounded solution to that problem any more.

        So one way of bouncing off that problem is to say, “Oh, well, my agent calculates the effect of its motor actions on the environment and the expected effect on sensory information and the reward signal, plus a residual term γ which stands for the expected utility of all effects of the agent’s actions that change the agent’s processing or destroys its hardware”. How is γ to be computed? This is left unsaid.

        In this case you haven’t omitted the confusing part of the problem, but you’ve packed it into a residual term you can’t give an effective specification for calculating. So you no longer have an unbounded solution—you can’t write down the Python program that runs given unlimited computing power—and you’ve probably failed to shed any important light on the confusing part of the problem. Again, one of the warning signs here is that the paper is very easy to write, and reading it does not make the key problem feel less like a hard opaque ball.

        1. So unbounded analysis allows for concreteness, which makes things easier to think about

      4. This is interesting, and I think I would like to dive deeper into this theme of concreteness as it relates to deconfusion, but perhaps not immediately

  2. I think an important realization is that math is a thinking tool, an aid for clear thought, similar to other thinking tools such as unbounded analysis or concreteness

    1. or thought experiments

    2. Basically, Daniel Dennett’s thoughts on thinking tools seem quite appropriate here

[2024-05-07 Tue]

  1. Dropped Dennett’s Intuition Pumps book after reading this quote

    A young child is asked what her father does, and she answers, “Daddy is a doctor.” Does she believe what she says? In one sense, of course, but what would she have to know to really believe it? (What if she’d said, “Daddy is an arbitrager” or “Daddy is an actuary”?) Suppose we suspected that she was speaking without understanding, and decided to test her. Must she be able to produce paraphrases or to expand on her claim by saying her father cures sick people? Is it enough if she knows that Daddy’s being a doctor precludes his being a butcher, a baker, a candlestick maker? Does she know what a doctor is if she lacks the concept of a fake doctor, a quack, an unlicensed practitioner? For that matter, how much does she need to understand to know that Daddy is her father? (Her adoptive father? Her “biological” father?) Clearly her understanding of what it is to be a doctor, as well as what it is to be a father, will grow over the years, and hence her understanding of her own sentence, “Daddy is a doctor,” will grow. Can we specify—in any nonarbitrary way—how much she must know in order to understand this proposition “completely”? If understanding comes in degrees, as this example shows, then belief, which depends on understanding, must come in degrees as well, even for such mundane propositions as this. She “sorta” believes her father is a doctor—which is not to say she has reservations or doubts, but that she falls short of the understanding that is an important precondition for any useful concept of belief.

    1. The issue here is that it seems pretty obvious to me that the words uttered are not necessarily the things communicated. The child is stating something to the effect of “the person I relate to in these ways, who I call Daddy, said that he is a ‘doctor’, although I don’t know what that means”.

    2. There’s something coherent being communicated here, but Dennett’s thought experiment implies that we use an arbitrary standard for what the sentence is supposed to communicate or intend to communicate.

    3. The words are not the thoughts behind them. It is a surrogation error.

[2024-05-14 Tue]

  1. Deconfusion: concreteness, unbounded analysis / toy models, thought experiments, sub-problems and factorization

  2. Deconfusion seems more a grab bag of thinking tools than I had thought before

    1. Although on the other hand it also seems significantly more… integrating? There’s possibly a pattern underlying this

  3. Deconfusion: paradoxes, conflicting intuitions, confusions

  4. Deconfusion: philosophy, math, logic, epistemology

[2024-05-15 Wed]

  1. Is deconfusion bottlenecked on object-level theoretical and empirical work?

    1. That is, the sort of research that involves Darwinian collection and cataloguing of specimens?

    2. Recall that it took astronomy and physics advances for Laplace to posit a mechanical (clockwork) universe, and since then we’ve had more and more evidence that this is the case

    3. Quantum mechanics would have stayed stuck on ‘superposition’ / wave collapse without Everett’s proposal – and even then, few people really believed him

      1. On that note, let’s say you succeed at deconfusion work. You figured out an answer.

      2. What’s the likelihood that you can steer the world to actually rely on your answer?

        1. Not that this question matters much, since MIRI hasn’t solved corrigibility or limited-pivotal-act-level alignment, but it is still impressive just how difficult it has been for people to understand and wrap their heads around MIRI’s model of things

      3. Lots of deconfusion progress seems to have been downstream of philosophical progress, regardless of the field: Everett, Laplace, Daniel Dennett (consciousness), Judea Pearl (causality), Nick Bostrom (existential risk), Eliezer Yudkowsky (most of ASI alignment stuff)

      4. Although they also seemed to have been doing a lot of engineering stuff beforehand?

        1. Judea Pearl (B.S. electrical engineering, M.S. electrical engineering, M.S. physics, Ph.D. electrical engineering)

        2. Eric Drexler (B.S. interdisciplinary sciences, M.S. astro/aerospace engineering, Ph.D. MIT Media Lab (nanotech stuff))

        3. Nick Bostrom (B.A. ?, M.A. philosophy and physics, M.Sc. computational neuroscience, Ph.D. philosophy)

        4. Daniel Dennett (B.A. philosophy, D.Phil. philosophy)

      5. Aside from Daniel Dennett, most of the other people seem to have been collecting academic credentials

      6. Anyway, the point is that maybe deconfusion progress is bottlenecked on more object level work?

        1. On one hand, Judea Pearl and Eric Drexler seemed to have needed the object level work to coalesce their philosophical insights, to support them

        2. On the other hand, how much of deconfusion really is bottlenecked on such empirical evidence? The point of deconfusion really is that you understand, and understanding usually seems to involve having causal models that don’t rely on concrete pieces of evidence you see in reality

        3. And yet, it must have been really hard for people to grok and argue for heliocentrism before the mathematician and astronomer Kepler started finding evidence to support heliocentrism

          The discovery of the phases of Venus was one of the more influential reasons for the transition from geocentrism to heliocentrism.[10] Sir Isaac Newton’s Philosophiæ Naturalis Principia Mathematica concluded the Copernican Revolution. The development of his laws of planetary motion and universal gravitation explained the presumed motion related to the heavens by asserting a gravitational force of attraction between two objects.

          Copernican Revolution - Wikipedia

        4. Copernicus proposed a heliocentric model, which was rough, and most likely incorrect in many ways, and people built on top of it, refining it.

          1. The evidence that Kepler saw that convinced him of heliocentrism over geocentrism worked because he had competing theories in his mind

          2. So to a certain extent I think it still is important to have philosophical progress made farther than empirical progress such that empirical evidence is used to select between plausible hypotheses

          3. On the other hand, it is probably stupidly hard to make progress on the philosophical models you propose without some valuable pieces of empirical data backing things up
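The “evidence selects between hypotheses already on the table” point above can be sketched as a toy Bayesian update. All the numbers here are illustrative, not historical:

```python
# Toy Bayesian update: evidence moves probability between competing
# hypotheses, but only ones that have already been formulated.
def update(priors, likelihoods):
    """Return the posterior P(H|E) for each hypothesis via Bayes' rule."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}

priors = {"heliocentrism": 0.5, "geocentrism": 0.5}
# P(observing the full set of Venus phases | H): high under heliocentrism,
# low under the Ptolemaic model, which predicts mostly crescent phases.
likelihoods = {"heliocentrism": 0.9, "geocentrism": 0.1}

posterior = update(priors, likelihoods)
print(posterior["heliocentrism"])  # 0.9
```

The update can only shift weight between hypotheses you have already written down, which is the sense in which philosophical progress has to run ahead of the empirical evidence that adjudicates between the candidates.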

  2. I’m tempted to let other people do the mathematical logic and deconfusion work (main body) and perhaps switch to a more aggressive strategy of being generally capable of making things happen in ML and software engineering, given the rather fucked status quo of the frontier labs

    1. We seem to lack people who are generally competent and are not confused and are relatively aligned with the notion of trying to stay alive

    2. To be fair, we seem to have an even greater lack of people doing deconfusion and mathematical logic style theoretical research.

[2024-05-16 Thu]

  1. Book recommendation: Physics Avoidance, Mark Wilson

    1. Rec by Adam Shimi