Diagrams of linear regression

I made a big diagram describing some assumptions (MLR1-6) that are used in linear regression. In my diagram, there are categories (in rectangles with dotted lines) of mathematical facts that follow from different subsets of MLR1-6. References in brackets are to Hayashi (2000).


A couple of comments about the diagram are in order.

  • $U, Y$ are $n \times 1$ vectors of random variables. $X$ may contain numbers or random variables. $\beta$ is a $K \times 1$ vector of numbers.
  • We measure: realisations of $Y$, (realisations of) $X$. We do not measure: $\beta$, $U$. We have one equation and two unknowns: we need additional assumptions on $U$.
  • We make a set of assumptions (MLR1-6) about the joint distribution $f(U,X)$. These assumptions imply some theorems relating the distribution of $b$ to $\beta$.
  • In the diagram, I stick to the brute mathematics, which is entirely independent of its (causal) interpretation.1
  • Note the difference between MLR4 and MLR4’. The point of using the stronger MLR4 is that, in some cases, provided MLR4, MLR2 is not needed. To prove unbiasedness, we don’t need MLR2. For finite sample inference, we also don’t need MLR2. But whenever the law of large numbers is involved, we do need MLR2 as a standalone condition. Note also that, since MLR2 and MLR4’ together imply MLR4, clearly MLR2 and MLR4 are never both needed. But I follow standard practice (e.g. Hayashi) in including them both, for example in the asymptotic inference theorems.
  • Note that since $X'X$ is a symmetric (and positive semidefinite) square matrix, $Q$ has full rank $K$ iff $Q$ is positive definite; these are equivalent statements (see Wooldridge 2010 p. 57). Furthermore, if $X$ has full rank $K$, then $X'X$ has full rank $K$, so MLR3* is equivalent to MLR3 plus the fact that $Q$ is finite (i.e. actually converges).
  • Note that given MLR2 and the law of large numbers, $Q$ could alternatively be written $E[X'X]$.
  • Note that whenever I write a $\operatorname{plim}$ and set it equal to some matrix, I am assuming the matrix is finite. Some treatments will explicitly say $Q$ is finite, but I omit this.
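  • Note that by the magic of matrix inversion, when $x_k$ is uncorrelated in sample with the other regressors (and $X$ includes a constant), $((X'X)^{-1})_{kk} = \frac{1}{\sum_{i=1}^n (x_{ki} - \bar x_k)^2}$; in general the denominator is multiplied by $1 - R_k^2$, where $R_k^2$ comes from regressing $x_k$ on the other regressors. 2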
  • Note that these expressions are equal: $\frac{b_k - \beta_k}{se(b_k)} = \frac{(b_k - \beta_k)\sqrt{n-K}}{\sqrt{\hat{U}'\hat{U}\,((X'X)^{-1})_{kk}}}$. Seeing this helps with intuition.
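
The last two bullets can be checked numerically. What follows is a minimal sketch on simulated data, not part of the diagram: the design matrix, the coefficient values and the seed are all made up for illustration. It assumes $X$ contains a constant column and uses $se(b_k) = \sqrt{s^2 ((X'X)^{-1})_{kk}}$ with $s^2 = \hat{U}'\hat{U}/(n-K)$. With correlated regressors, $((X'X)^{-1})_{kk}$ equals $1/\big(\sum_i (x_{ki} - \bar x_k)^2 (1 - R_k^2)\big)$, where $R_k^2$ comes from regressing $x_k$ on the other columns; the simpler expression $1/\sum_i (x_{ki} - \bar x_k)^2$ is the special case $R_k^2 = 0$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 200, 3

# Hypothetical design: a constant plus two correlated regressors.
z = rng.normal(size=n)
x1 = z + rng.normal(size=n)
x2 = 0.5 * z + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

beta = np.array([1.0, 2.0, -1.0])  # illustrative "true" coefficients
U = rng.normal(size=n)
Y = X @ beta + U

# OLS estimate b and residuals U_hat.
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
U_hat = Y - X @ b

k = 1  # look at the coefficient on x1

# Total variation of x_k around its mean.
sst_k = np.sum((X[:, k] - X[:, k].mean()) ** 2)

# R_k^2 from regressing x_k on the remaining columns (including the constant).
others = np.delete(X, k, axis=1)
coef, *_ = np.linalg.lstsq(others, X[:, k], rcond=None)
resid_k = X[:, k] - others @ coef
r2_k = 1.0 - (resid_k @ resid_k) / sst_k

diag_kk = XtX_inv[k, k]
general = 1.0 / (sst_k * (1.0 - r2_k))  # holds for any full-rank X with a constant
simple = 1.0 / sst_k                    # special case R_k^2 = 0

# Two ways of writing the t-ratio for b_k.
s2 = (U_hat @ U_hat) / (n - K)
se_k = np.sqrt(s2 * diag_kk)
t_lhs = (b[k] - beta[k]) / se_k
t_rhs = (b[k] - beta[k]) * np.sqrt(n - K) / np.sqrt((U_hat @ U_hat) * diag_kk)

print(np.isclose(diag_kk, general), np.isclose(t_lhs, t_rhs))
```

Because `x1` and `x2` share the component `z`, `simple` comes out smaller than `diag_kk` here; the gap is exactly the factor $1 - R_k^2$.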

The second diagram gives the asymptotic distribution of the IV and 2SLS estimators.3


I made this in PowerPoint, not knowing how to do it better. Here is the file.

  1. But of course what really matters is the causal interpretation.

    As Pearl (2009) writes, “behind every causal claim there must lie some causal assumption that is not discernible from the joint distribution and, hence, not testable in observational studies”. If we wish to interpret $\beta$ (and hence $b$) causally, we must interpret MLR4 causally; it becomes a (strong) causal assumption.

    As far as I can tell, when econometricians give a causal interpretation it is typically done thus (they are rarely explicit about it):

    • MLR1 holds in every possible world (alternatively: it specifies not just actual, but all potential outcomes), hence $U$ is unobservable even in principle.
    • yet we make assumption MLR4 about $U$

    This talk of the distribution of a fundamentally unobservable “variable” is a confusing device. Pearl’s method is more explicit: replace MLR$\{1 \quad 4\}$ with the causal graph below, where $:=$ is used to make it extra clear that the causation only runs one way. MLR1 corresponds to the expression for $Y$ (and, redundantly, the two arrows towards $Y$), MLR4 corresponds to the absence of arrows connecting $X$ and $U$. We thus avoid “hiding causal assumptions under the guise of latent variables” (Pearl). (Because of the confusing device, econometricians, to put it kindly, don’t always sharply distinguish the mathematics of the diagram from its (causal) interpretation.)

  2. Think about it! This seems intuitive when you don’t think about it, mysterious when you think about it a little, and presumably becomes obvious again if you really understand matrix algebra. I haven’t reached the third stage. 

  3. For IV, it’s even clearer that the only reason to care is the causal interpretation. But I follow good econometrics practice and make only mathematical claims. 

January 10, 2018

The expected value of the long-term future, and existential risk

I wrote an article describing a simple model of the long-term future. Here it is:


A number of ambitious arguments have recently been proposed about the moral importance of the long-term future of humanity, on the scale of millions and billions of years. Several people have advanced arguments for a cluster of related views. Authors have variously claimed that shaping the trajectory along which our descendants develop over the very long run (Beckstead, 2013), or reducing extinction risk, or minimising existential risk (Bostrom, 2002), or reducing risks of severe suffering in the long-term future (Althaus and Gloor, 2016) are of huge or overwhelming importance. In this paper, I develop a simple model of the value of the long-term future, from a totalist, consequentialist, and welfarist (but not necessarily utilitarian) point of view. I show how the various claims can be expressed within the model, clarifying under which conditions the long-term becomes overwhelmingly important, and drawing tentative policy implications.

December 28, 2017

Relationships between the axiomatic systems of modal propositional logic

I made a diagram of this, based on Sider’s Logic for Philosophy. An orange arrow from system S to system S’ means anything that is provable (and hence valid) in S is provable (and valid) in S’. I don’t add labels to the orange arrows since their meanings are clear. A green arrow from one axiom schema to another says that the second schema is provable from the first in a particular system which I label.
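
Since "provable in S implies provable in S’" is transitive, the orange arrows can be chained. Here is a small sketch of that idea. The arrow list is my assumption of the standard inclusions among the systems K, D, T, B, S4 and S5 rather than a transcription of the diagram, and the names `ARROWS` and `extensions` are made up for illustration.

```python
# Assumed orange arrows: each system points to its immediate extensions.
ARROWS = {
    "K": ["D"],
    "D": ["T"],
    "T": ["B", "S4"],
    "B": ["S5"],
    "S4": ["S5"],
    "S5": [],
}


def extensions(system):
    """Return every system reachable from `system` by following orange arrows."""
    seen = set()
    stack = [system]
    while stack:
        current = stack.pop()
        for nxt in ARROWS[current]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


print(sorted(extensions("K")))
print("B" in extensions("S4"), "S4" in extensions("B"))
```

Under these assumed arrows, everything provable in K is provable in all the other systems, while B and S4 are not connected in either direction.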


December 26, 2017