Philosophy success stories

Philosophical problems are never solved for the same reason that treasonous conspiracies never succeed: as successful conspiracies are never called “treason,” so solved problems are no longer called “philosophy.”

— John P. Burgess


  1. The consequences of defeatism
  2. My approach
    1. Identifiable successes
    2. From confusion to consensus
    3. No mere disproofs
  3. Successes: my list so far
  4. Related posts

In this new series of essays, I aim to collect some concrete examples of success stories of philosophy (more below on quite what I mean by that). This is the introductory essay in the series, where I describe why and how I embarked on this project.

Most academic disciplines love to dwell on their achievements. Economists will not hesitate to tell you that the welfare theorems, or the understanding of comparative advantage, were amazing achievements. (In Economics Rules, Dani Rodrik explicitly talks about the “crown jewels” of the discipline). Biology has the Nobel Prize to celebrate its prowess, and all textbooks duly genuflect to Watson and Crick and other heroes. Physics and Mathematics are so successful that they needn’t brag for their breakthroughs to be widely admired. Psychologists celebrate Kahneman, linguists Chomsky.

Philosophy, on the other hand, like a persecuted child that begins to internalise its bullies’ taunts, has developed an unfortunate inferiority complex. As if to pre-empt those of the ilk of Stephen Hawking, who infamously pronounced philosophy dead, philosophers are often the first to say that their discipline has made no progress in 3000 years. Russell himself said in The Problems of Philosophy:

Philosophy is to be studied not for the sake of any definite answers to its questions, since no definite answers can, as a rule, be known to be true, but rather for the sake of the questions themselves.

This view is very much alive today, as in van Inwagen (2003):

Disagreement in philosophy is pervasive and irresoluble. There is almost no thesis in philosophy about which philosophers agree.

Among some writers, one even finds a sort of perverse pride that some topic is “one of philosophy’s oldest questions” and “has been discussed by great thinkers for 2000 years”, as if this were a point in its favour.

The consequences of defeatism

This state of affairs would be of no great concern if the stakes were those of a mere academic pissing contest. But this defeatism about progress has real consequences for how the discipline is taught.

The first is history-worship. A well-educated teenager born this century would not commit the fallacies that litter the writings of the greats. The first sentence of Nicomachean Ethics is a basic quantificational fallacy. Kant’s response to the case of the inquiring murderer is an outrageous howler. Yet philosophy has a bizarre obsession with its past. In order to teach pre-modern texts with a straight face, philosophers are forced to stretch the principle of charity beyond recognition, and to retrofit newer arguments onto the fallacies of old. As Dustin Locke writes here, “The principle of charity has created the impression that there is no progress in philosophy by preserving what appear to be the arguments and theories of the great thinkers in history. However, what are being preserved are often clearly not the actual positions of those thinkers. Rather, they are mutated, anachronistic, and frankensteinian reconstructions of those positions.” Much time is wasted subjecting students to this sordid game, and many, I’m sure, turn their backs on philosophy as a result.

The second, related consequence is the absence of textbooks. No one would dream of teaching classical mechanics out of Principia or geometry out of Euclid’s Elements. Yet this is what philosophy departments do. Even Oxford’s Knowledge and Reality, which is comparatively forward-looking, has students read from original academic papers, some as old as the 1950s, as you can see here. It’s just silly to learn about counterfactuals and causation from Lewis 1973 (forty-four years ago!). Thankfully, there is the Stanford Encyclopedia, but it’s incomplete and often pitched at too high a level for beginners. And even if Stanford can be counted as a sort of textbook, why just one? There should be hundreds of textbooks, all competing for attention by the clarity and precision of their explanations. That’s what happens for any scientific topic taught at the undergraduate level.

My approach

Identifiable successes

In this series, I want to focus on success stories that are as atomic, clear-cut, and precise as possible. In the words of Russell:

Modern analytical empiricism […] differs from that of Locke, Berkeley, and Hume by its incorporation of mathematics and its development of a powerful logical technique. It is thus able, in regard to certain problems, to achieve definite answers, which have the quality of science rather than of philosophy. It has the advantage, in comparison with the philosophies of the system-builders, of being able to tackle its problems one at a time, instead of having to invent at one stroke a block theory of the whole universe. Its methods, in this respect, resemble those of science.

Some of the greatest philosophical developments of the modern era, both intellectually speaking and social-impact wise, were not of this clear-cut kind. Two examples seem particularly momentous:

  • The triumph of naturalism, the defeat of theism, and the rise of science, a.k.a. “natural philosophy”.
  • The expanding circle of moral consideration: to women, children, those of other races, and, to some extent, to non-human animals. (See Pinker for an extended discussion).

These changes are difficult to pin down to a specific success story. They are cases of society’s worldview shifting wholesale, over the course of centuries. With works such as Novum Organum or On the Subjection of Women, philosophising per se undoubtedly deserves a share of the credit. Yet the causality may also run the other way, from societal circumstances to ideas; technological and political developments surely had their role to play, too.

Instead, I want to focus on smaller but still significant success stories, whose causal story should be easier to disentangle.

From confusion to consensus

The successes need to be actual successes of the discipline, not just theories I think are successful. For example, consequentialism or eliminativism about causation don’t count, since there is considerable debate about them still1. Philosophers being a contrarian bunch, I won’t require complete unanimity either, but rather a wide consensus, perhaps something like over 80% agreement among academics at analytic departments.

Relatedly, there needs to have been actual debate and/or confusion about the topic, previous to the success story. This is often the hardest desideratum to intuitively accept, since philosophical problems, once solved, tend to seem puzzlingly unproblematic. We think “How could people possibly have been confused by that?”, and we are hesitant to attribute basic misunderstandings to great thinkers of the past. I will therefore take pains to demonstrate, with detailed quotes, how each problem used to cause real confusion.

No mere disproofs

In order to make the cases I present as strong as possible, I will adopt a narrow definition of success. Merely showing the fallacies of past thinkers does not count. Philosophy has often been able to conclusively restrict the space of possible answers by identifying certain positions as clearly wrong. For example, no-one accepts Mill’s “proof” of utilitarianism as stated, or Anselm’s ontological argument. And that is surely a kind of progress2, but I don’t want to rely on that here. When physics solved classical mechanics, it did not just point out that Aristotle had been wrong, rather it identified an extremely small area of possibility-space as the correct one. That is the level of success we want to be gunning for here. For the same reason, I also won’t count coming up with new problems, such as Goodman’s New Riddle of Induction, as progress for my purposes.

Successes: my list so far

Here are the individual success stories, in no particular order:

  1. Predicate logic: arguably launched analytic philosophy, clarified ambiguities that had held back logic for centuries
  2. Computability: a rare example of an undisputed, non-trivial conceptual analysis
  3. Modal logic and its possible world semantics: fully clarified the distinction between sense and reference, dissolved long-standing debates arising from modal fallacies.
  4. The formalisation of probability: how should we reason under uncertainty? Before the 1650s, everyone from Plato onwards got this wrong.
  5. Bayesianism: the analysis of epistemic rationality and the solution to (most of) philosophy of science.
  6. Compatibilism about free will (forthcoming)

It’s very important to see these stories as illustrations of what success looks like in philosophy. The list is not meant to be exhaustive. Nor are the stories all supposed to follow the same pattern of discovery; on the contrary, they are examples of different kinds of progress.

Related posts

These posts don’t describe success stories, but are related:

  1. Over the course of writing this series, I have frequently found to my consternation that topics I thought were prime candidates for success stories were in fact still being debated copiously. Perhaps one day I’ll publish a list of these, too. In case it wasn’t clear, by the way, this series should not be taken to mean that I am a huge fan of philosophy as an academic discipline. But I do think that, in some circles, the pendulum has swung too far towards dismissal of philosophy’s achievements. 

  2. In fact, there’s likely been far more of this kind of progress than you would guess from reading contemporary commentaries of philosophers of centuries past, as Dustin Locke argues here.

December 3, 2017

Modesty and diversity: a concrete suggestion

In online discussions, the number of upvotes or likes a contribution receives is often highly correlated with the social status of the author within that community. This makes the community less epistemically diverse, and can contribute to feelings of groupthink or hero worship.

Yet both the author of a contribution and its degree of support contain Bayesian evidence about its value. If the author is a widely respected expert, the amount of evidence is arguably so large that it should overwhelm your own inside view.

We want each individual to invest the socially optimal amount of resources into critically evaluating other people’s writing (which is higher than the amount that would be optimal for individual epistemic rationality). Yet we also each want to give sufficient weight to authority in forming our all-things-considered views.

As Greg Lewis writes:

The distinction between ‘credence by my lights’ versus ‘credence all things considered’ allows the best of both worlds. One can say ‘by my lights, P’s credence is X’ yet at the same time ‘all things considered though, I take P’s credence to be Y’. One can form one’s own model of P, think the experts are wrong about P, and marshall evidence and arguments for why you are right and they are wrong; yet soberly realise that the chances are you are more likely mistaken; yet also think this effort is nonetheless valuable because even if one is most likely heading down a dead-end, the corporate efforts of people like you promises a good chance of someone finding a better path.

Full blinding to usernames and upvote counts is great for critical thinking. If all you see is the object level, you can’t be biased by anything else. The downside is that you lose a lot of relevant information. A second downside is that anonymity reduces the selfish incentives to produce good content (we socially reward high-quality, civil discussion, and punish rudeness).

I have a suggestion for capturing (some of) the best of both worlds:

  • first, do all your reading, thinking, upvoting and commenting with full blinding
  • once you have finished, un-blind yourself and use the new information to
    • form your all-things-considered view of the topic at hand
    • update your opinion of the people involved in the discussion (for example, if someone was a jerk, you lower your opinion of them).

To enable this, there are now two user scripts which hide usernames and upvote counts on (1) the EA forum and (2) LessWrong 2.0. You’ll need to install the Stylish browser extension to use them.

November 8, 2017

Why don't we like arguments from authority?


  1. A tension between Bayesianism and intuition
  2. Attempting to reconcile the tension
    1. Argument screens off authority
    2. Ain’t nobody got time for arguments
    3. Free-riding on authority?
  3. What to do?

A tension between Bayesianism and intuition

When considering arguments from authority, there would appear to be a tension between widely shared intuitions about these arguments, and how Bayesianism treats them. Under the Bayesian definition of evidence, the opinion of experts, of people with good track records, even of individuals with a high IQ, is just another source of data. Provided the evidence is equally strong, there is nothing to distinguish it from other forms of inference such as carefully gathering data, conducting experiments, and checking proofs.
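To make the Bayesian treatment concrete, here is a minimal sketch. The numbers and function names are my own illustrative assumptions, not anything from the text; the point is only that testimony and experiment with the same likelihood ratio move a Bayesian by the same amount.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H | E) from a prior and the two likelihoods of E."""
    num = prior * p_e_given_h
    return num / (num + (1 - prior) * p_e_given_not_h)

# An expert asserting H, where such experts are right 90% of the time,
# carries a likelihood ratio of 0.9 / 0.1 = 9 ...
from_expert = update(0.5, 0.9, 0.1)

# ... and so does a careful experiment with the same error profile.
from_experiment = update(0.5, 0.9, 0.1)

assert from_expert == from_experiment  # the source of the evidence doesn't matter
```

Provided the likelihood ratios really are equal, the posterior is identical whichever way the evidence arrived.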

Yet we feel that there would be something wrong about someone who entirely gave up on learning and thinking, in favour of the far more efficient method of unquestioningly adopting all expert views. Personally, I still feel embarrassed when, in conversation, I am forced to say “I believe X because Very Smart Person Y said it”.

And it’s not just that we think it unvirtuous. We strongly associate arguments from authority with irrationality. Scholastic philosophy went down a blind alley by worshipping the authority of Aristotle. We think there is something epistemically superior about thinking for yourself, enough to justify the effort, at least sometimes.1

Attempting to reconcile the tension

Argument screens off authority

Eliezer Yudkowsky has an excellent post, “Argument screens off authority”, about this issue. You should read it to understand the rest of my post, which will be an extension of it.

I’ll give you the beginning of the post:

Scenario 1: Barry is a famous geologist. Charles is a fourteen-year-old juvenile delinquent with a long arrest record and occasional psychotic episodes. Barry flatly asserts to Arthur some counterintuitive statement about rocks, and Arthur judges it 90% probable. Then Charles makes an equally counterintuitive flat assertion about rocks, and Arthur judges it 10% probable. Clearly, Arthur is taking the speaker’s authority into account in deciding whether to believe the speaker’s assertions.

Scenario 2: David makes a counterintuitive statement about physics and gives Arthur a detailed explanation of the arguments, including references. Ernie makes an equally counterintuitive statement, but gives an unconvincing argument involving several leaps of faith. Both David and Ernie assert that this is the best explanation they can possibly give (to anyone, not just Arthur). Arthur assigns 90% probability to David’s statement after hearing his explanation, but assigns a 10% probability to Ernie’s statement.

I think Yudkowsky’s post gets things conceptually right, but ignores the important pragmatic benefits of arguments from authority. At the end of the post, he writes:

In practice you can never completely eliminate reliance on authority. Good authorities are more likely to know about any counterevidence that exists and should be taken into account; a lesser authority is less likely to know this, which makes their arguments less reliable. This is not a factor you can eliminate merely by hearing the evidence they did take into account.

It’s also very hard to reduce arguments to pure math; and otherwise, judging the strength of an inferential step may rely on intuitions you can’t duplicate without the same thirty years of experience.

And elsewhere:

Just as you can’t always experiment today, you can’t always check the calculations today. Sometimes you don’t know enough background material, sometimes there’s private information, sometimes there just isn’t time. There’s a sadly large number of times when it’s worthwhile to judge the speaker’s rationality. You should always do it with a hollow feeling in your heart, though, a sense that something’s missing.

These two quotes, I think, overstate how often checking for yourself2 is a worthwhile option, and correspondingly underjustify the claim that you should have a “hollow feeling in your heart” when you rely on authority.

Ain’t nobody got time for arguments

Suppose you were trying to decide which diet is best for your long-term health. The majority of experts believe that the Paleo diet is better than the Neo diet. To simplify, we can assume that either Paleo provides V units more utility than Neo, or vice versa. The cost of research is C. If you conduct research, you act according to your conclusions; otherwise, you do what the experts recommend. We can calculate the expected value of research using this value of information diagram:

EV(research) simplifies to Vpq - Vkp + Vk - C.

If we suppose that

  • the probability that the experts are correct is p = 0.75
  • conditional on the experts being correct, your probability of getting the right answer is q = 0.9
  • conditional on the experts being incorrect, your probability of correctly overturning the expert view is k = 0.5

How long would it take to do this research? For a 50% chance of overturning the consensus, conditional on it being wrong, a realistic estimate might be several years to acquire PhD-level knowledge of the field. But let’s go with one month, as a lower bound. We can conservatively estimate that time to be worth $5000. Then you should do the research if and only if V > 80,000. That number is high. This suggests it would likely be instrumentally rational to just believe the experts.
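Since the diagram itself is not reproduced here, the stated simplification can at least be evaluated directly. Here is a minimal sketch using the parameter names from the text; the function name and the sample stakes are my own illustrative choices:

```python
def ev_research(V, C, p=0.75, q=0.9, k=0.5):
    """Expected value of doing your own research, per the
    simplification above: EV(research) = Vpq - Vkp + Vk - C."""
    return V * p * q - V * k * p + V * k - C

# With the research cost from the text (C = 5000), the stakes V
# have to be large before research is worth much:
print(ev_research(V=100_000, C=5_000))  # ≈ 75000
print(ev_research(V=5_000, C=5_000))    # ≈ -1000
```

Plugging in your own parameter estimates is a one-line change, which is the spirit of the invitation below.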

Of course, this is just one toy example with very questionable numbers. (In a nascent field, such as wild animal suffering research, the “experts” may be people who know little more than you. Then p could be low and k could be higher.) I invite you to try your own parameter estimates.

There are also a number of complications not captured in this model:

  • If the relevant belief is located in a dense part of your belief-network, where it is connected to many other beliefs, adopting the views of experts on individual questions might leave you with inconsistent beliefs. But this problem can be avoided by choosing belief-nodes that are relatively isolated, and by adopting entire world-views of experts, composed of many linked beliefs.
  • In reality, you don’t just have a point probability for the parameters p, q, k, but a probability distribution. That distribution may be very non-robust or, in other words, “flat”. Doing a little bit of research could help you learn more about whether experts are likely to be correct, tightening the distribution.
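One way to picture the second point is to treat your uncertainty about p as a distribution and watch a small amount of research tighten it. A toy sketch, assuming a conjugate Beta prior; the prior and the spot-check numbers are my own illustrative choices, not anything specified above:

```python
def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var

# A deliberately "flat" prior over p (the chance the experts are
# right), centred on 0.75:
prior_mean, prior_var = beta_mean_var(3, 1)

# Spot-check 8 expert claims and find 6 correct: the conjugate update
# keeps the mean at 0.75 but tightens the distribution around it.
post_mean, post_var = beta_mean_var(3 + 6, 1 + 2)

assert post_mean == prior_mean and post_var < prior_var
```

A little research need not change your best guess about the experts at all to still be informative: it shrinks the spread of the distribution.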

Still, I would claim that the model is not sufficiently wrong to reverse my main conclusion.

At least given numbers I find intuitive, this model suggests it’s almost never worth thinking independently instead of acting on the views of the best authorities. Perhaps thinking critically should leave me with a hollow feeling in my heart, the feeling of goals ill-pursued? Argument may screen off authority, but in the real world, ain’t nobody got time for arguments. More work needs to be done if we want to salvage our anti-authority intuitions in a Bayesian framework.

Free-riding on authority?

Here’s one attempt to do so. From a selfish individual’s point of view, V is small. But not so for a group.

Assuming that others can see when you pay the cost to acquire evidence, they come to see you as an authority, to some degree. Every member of the group thus updates their beliefs slightly based on your research, in expectation moving towards the truth.

More importantly, the value of the four outcomes from the diagram above can differ drastically under this model. In particular, the value of correctly overturning the expert consensus can be tremendous. If you publish your reasoning, the experts who can understand it may update strongly towards the truth, leading the non-experts to update as well.

It is only if we consider the positive externalities of knowledge that eschewing authority becomes rational. For selfish individuals, it is rational to free-ride on expert opinion. This suggests that our aversion to arguments from authority can partially be explained as the epistemic analogue of our dislike for free-riders.

This analysis also suggests that most learning and thinking is not done to personally acquire more accurate beliefs. It may be out of altruism, for fun, to signal intelligence, or to receive status in a community that rewards discoveries, like academia.

Is the free-riding account of our anti-authority intuitions accurate? In a previous version of this essay, I used to think so. But David Moss commented:

Even in a situation where an individual is the only non-expert, say there are only five other people and they are all experts, I think the intuition against deferring to epistemic authority would remain strong. Indeed I expect it may be even stronger than it usually is. Conversely, in a situation where there are many billions of non-experts all deferring to only a couple of experts, I expect the intuition against deferring would remain, though likely be weaker. This seems to count against the intuition being significantly driven by positive epistemic externalities.

This was a great point, and convinced me that at the very least, the free-riding picture can’t fully explain our anti-authority intuitions. However, my intuitions about more complicated cases like David’s are quite unstable; and at this point my intuitions are heavily influenced by Bayesian theory as well. So it would be interesting to get more thoughtful people’s intuitions about such cases.

What to do?

It looks like the common-sense intuitions against authority are hard to salvage. Yet this empirical conclusion does not imply that, normatively, we should entirely give up on learning and thinking.

Instead the cost-benefit analysis above offers a number of slightly different normative insights:

  • The majority of the value of research is altruistic value, and is realised through changing the minds of others. This may lead you to: (i) choose questions that are action-guiding for many people, even if they are not for you (ii) present your conclusions in a particularly accessible format.
  • Specialisation is beneficial. It is an efficient division of labour if each person acquires knowledge in one field, and everyone accepts the authority of the specialists over their magisterium.
  • Reducing C can have large benefits for an epistemic community by allowing far more people to cheaply verify arguments. This could be one reason formalisation is so useful, and has tended to propel formal disciplines towards fast progress. To an idealised solitary scientist, translating into formal language arguments he already knows with high confidence to be sound may seem like a waste of time. But the benefit of doing so is that it replaces intuitions others can’t duplicate without thirty years of experience with inferential steps that they can check mechanically with a “dumb” algorithm.

A few months after I wrote the first version of this piece, Greg Lewis wrote (my emphasis):

Modesty could be parasitic on a community level. If one is modest, one need never trouble oneself with any ‘object level’ considerations at all, and simply cultivate the appropriate weighting of consensuses to defer to. If everyone free-rode like that, no one would discover any new evidence, have any new ideas, and so collectively stagnate. Progress only happens if people get their hands dirty on the object-level matters of the world, try to build models, and make some guesses - sometimes the experts have gotten it wrong, and one won’t ever find that out by deferring to them based on the fact they usually get it right.

The distinction between ‘credence by my lights’ versus ‘credence all things considered’ allows the best of both worlds. One can say ‘by my lights, P’s credence is X’ yet at the same time ‘all things considered though, I take P’s credence to be Y’. One can form one’s own model of P, think the experts are wrong about P, and marshall evidence and arguments for why you are right and they are wrong; yet soberly realise that the chances are you are more likely mistaken; yet also think this effort is nonetheless valuable because even if one is most likely heading down a dead-end, the corporate efforts of people like you promises a good chance of someone finding a better path.

I probably agree with Greg here; and I believe that the bolded part was a crucial and somewhat overlooked part of his widely-discussed essay. While Greg believes we should form our credences entirely based on authority, he also believes it can be valuable to deeply explore object-level questions. The much more difficult question is how to navigate this trade-off, that is, how to decide when it’s worth investigating an issue.

  1. This is importantly different from another concern about updating based on other people’s beliefs, that of double counting evidence or evidential overlap. Amanda Askell writes: “suppose that as I’m walking down the street I meet six people in a row who all tell me that a building four blocks away is on fire. I reasonably assume that some of these six people have seen the fire themselves or that they’ve heard that there’s a fire from different people who have seen it. I conclude that I’ve got good testimonial evidence that there’s a fire four blocks away. But suppose that none of them have seen the fire: they’ve all just left a meeting in which a charismatic person Bob told them that there is a fire four blocks away. If I knew that there wasn’t actually any more evidence for the fire claim than Bob’s testimony, I would not have been so confident that there’s a fire four blocks away.

    In this case, the credence that I ended up with was based on the testimony of those six people, which I reasonably assumed represented a diverse body of evidence. This means that anyone asking me what makes me confident that there’s a fire will also receive misleading evidence that there’s a diverse body of evidence for the fire claim. This is a problem of evidential overlap: when several people independently tell me that they have some credence in P, I have a reasonable prior about how much overlap there is in their evidence. But in cases like the one above, that prior is incorrect.”

    The problem of evidential overlap stems from reasonable-seeming but incorrect priors about the truth of a proposition, conditional on (the conjunction of) various testimonies. The situations I want to talk about concern agents with entirely correct priors, who update on testimony the adequate Bayesian amount. In my case, the ideal Bayesian behaves counterintuitively; in Amanda’s example, Bayesianism and intuition agree, since bad priors lead to bad beliefs. 

  2. In this post, I use “checking for yourself”, “thinking for yourself”, “thinking and learning”, etc., as a stand-in for anything that helps evaluate the truth-value of the “good argument” node in Yudkowsky’s diagram. This could include gathering empirical evidence, checking arguments and proofs, as well as acquiring the skills necessary to do this. 

November 8, 2017