Grant that “animals with a kidney” and “animals with a heart” designate the same set. They have the same extension. Yet their meaning is clearly different.1 Frege had already noticed this in “On Sense and Reference” (“Über Sinn und Bedeutung”, 1892).
Classical predicate logic’s achievement was to give a precise and universal account of how the designation of a sentence depends on the designation of its parts. It was a powerful tool for both deduction and clarification, revealing the ambiguity of ordinary language. I discuss this in detail in the first success story.
Classical logic was developed to model the reasoning needed in mathematics, where the difference between meaning and designation is unimportant. Outside of mathematics, where meaning and designation can come apart, classical logic was inadequate. A formal account of meaning was lacking. Frege called it sense (“Sinn”). According to Sam Cumming, “Frege left his notion of sense somewhat obscure”. Frege appeared to endorse the criterion of difference for senses:
Two sentences S and S* differ in sense if and only if some rational agent who understood both could, on reflection, judge that S is true without judging that S* is true.
This is not adequately formal. Letting meaning depend on the conclusions of some “rational agent” leaves it at the level of intuition. The criterion does not even attempt to give a formal model of meaning; it simply gives a condition for meanings to differ.
Meaning began to seem metaphysically suspect, like a ghostly “extra” property tacked on to every predicate. SEP tells us:
Intensional entities have of course featured prominently in the history of philosophy since Plato and, in particular, have played natural explanatory roles in the analysis of intentional attitudes like belief and mental content. For all their prominence and importance, however, the nature of these entities has often been obscure and controversial and, indeed, as a consequence, they were easily dismissed as ill-understood and metaphysically suspect “creatures of darkness”2 (Quine 1956, 180) by the naturalistically oriented philosophers of the early- to mid-20th century.
The contribution of possible worlds semantics was to give a precise formal description of these “creatures of darkness”, bringing them into the realm of respectability.
Simply: intensions are extensions across possible worlds.
Sider (Logic for Philosophy p.290) writes:
we relativize the interpretation of predicates to possible worlds. The interpretation of a two-place predicate, for example, was in nonmodal predicate logic a set of ordered pairs of members of the domain; now it is a set of ordered triples, two members of which are in the domain, and one member of which is a possible world. When ⟨u1,u2,w⟩ is in the interpretation of a two-place predicate R, that represents R’s applying to u1 and u2 in possible world w. This relativization makes intuitive sense: a predicate can apply to some objects in one possible world but fail to apply to those same objects in some other possible world. These predicate-interpretations are known as “intensions”. The name emphasizes the analogy with extensions, which are the interpretations of predicates in nonmodal predicate logic. The analogy is this: the intension of a predicate can be thought of as determining an extension within each possible world.
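Sider’s world-relativized interpretations can be sketched in a few lines of Python. This is only an illustrative toy model; the worlds, individuals, and the predicate are invented for the example:

```python
# Toy model of world-relativized predicate interpretations.
# Worlds, individuals, and the predicate are invented for illustration.
worlds = {"w1", "w2"}
domain = {"alice", "bob"}

# Nonmodal interpretation of a two-place predicate: a set of ordered pairs.
extension_loves = {("alice", "bob")}

# Modal interpretation ("intension"): a set of triples (u1, u2, w),
# meaning the predicate applies to u1 and u2 at world w.
intension_loves = {("alice", "bob", "w1"), ("bob", "alice", "w2")}

def extension_at(intension, world):
    """The intension determines an extension within each possible world."""
    return {(u1, u2) for (u1, u2, w) in intension if w == world}

assert extension_at(intension_loves, "w1") == {("alice", "bob")}
assert extension_at(intension_loves, "w2") == {("bob", "alice")}
```

The slogan “intensions are extensions across possible worlds” is literally visible here: `extension_at` recovers an ordinary extension once a world is fixed.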
Aristotle famously used the case of a sea-battle to (seemingly) argue against the law of the excluded middle:
Let me illustrate. A sea-fight must either take place to-morrow or not, but it is not necessary that it should take place to-morrow, neither is it necessary that it should not take place, yet it is necessary that it either should or should not take place to-morrow. Since propositions correspond with facts, it is evident that when in future events there is a real alternative, and a potentiality in contrary directions, the corresponding affirmation and denial have the same character.
This is the case with regard to that which is not always existent or not always nonexistent. One of the two propositions in such instances must be true and the other false, but we cannot say determinately that this or that is false, but must leave the alternative undecided. One may indeed be more likely to be true than the other, but it cannot be either actually true or actually false. It is therefore plain that it is not necessary that of an affirmation and a denial one should be true and the other false. For in the case of that which exists potentially, but not actually, the rule which applies to that which exists actually does not hold good. The case is rather as we have indicated.
People appear to have been confused about this for many centuries. It doesn’t help that Aristotle wrote very ambiguously. Colin Strang (1960) tells us:
VERY briefly, what Aristotle is saying in De Interpretatione, chapter ix is this: if of two contradictory propositions it is necessary that one should be true and the other false, then it follows that everything happens of necessity; but in fact not everything happens of necessity; therefore it is not the case that of two contradictory propositions it is necessary that one should be true and the other false; the propositions for which this does not hold are certain particular propositions about the future.
The reader is warned that what Aristotle is saying is ambiguous (cf. Miss Anscombe, loc. cit. p. 1).
The interpretative problems regarding Aristotle’s logical problem about the sea-battle tomorrow are by no means simple. Over the centuries, many philosophers and logicians have formulated their interpretations of the Aristotelian text (see Øhrstrøm and Hasle 1995, p. 10 ff.).
The SEP article is very long, and features Leibniz and some pretty funky-looking graphs. I recommend it if you want to experience some confusion.
Aristotle could be taken to reason thus:
1. If Battle, then it cannot be that No Battle
2. If it cannot be that No Battle, then necessarily Battle
3. Therefore: if Battle, then necessarily Battle
But this is an obvious modal fallacy, drawing on the ambiguity of (1) between
The true statement □(B∨¬B) which implies □(B→¬¬B)
The false statement (B→□¬¬B)⟺(B→□B)
Philosophy is littered with variations on this confusion between necessity of the consequence and necessity of the consequent.
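The fallacy can be made vivid in a toy Kripke model (the two worlds and the accessibility relation are invented for the example): the necessity of the consequence, □(B∨¬B), holds everywhere, while the necessity of the consequent, B→□B, fails at a world where the battle occurs but might not have.

```python
# A minimal Kripke model: two worlds, each accessible from the other.
# World names and the valuation are invented for illustration.
worlds = {"w1", "w2"}
access = {"w1": {"w1", "w2"}, "w2": {"w1", "w2"}}
B = {"w1"}  # the worlds where "a sea-battle occurs" is true

def holds_B(w):
    return w in B

def box(pred, w):
    """Necessity: pred holds at every world accessible from w."""
    return all(pred(v) for v in access[w])

# Necessity of the consequence: □(B ∨ ¬B) is true at every world...
assert all(box(lambda v: holds_B(v) or not holds_B(v), w) for w in worlds)

# ...but necessity of the consequent, B → □B, fails at w1:
# the battle occurs there, yet a battle-free world is accessible.
assert holds_B("w1") and not box(holds_B, "w1")
```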
Modality de dicto vs modality de re
As the SEP page on Medieval theories of modality will amply demonstrate, confusion reigned long after Aristotle’s day. Quine (Word and Object) was baffled by talk of a difference between necessary and contingent attributes of an object, but used some quite fallacious arguments in attacking that difference:
Perhaps I can evoke the appropriate sense of bewilderment as follows. Mathematicians may conceivably be said to be necessarily rational and not necessarily two-legged; and cyclists necessarily two-legged and not necessarily rational. But what of an individual who counts among his eccentricities both mathematics and cycling? Is this concrete individual necessarily rational and contingently two-legged or vice versa?

Just insofar as we are talking referentially of the object, with no special bias towards a background grouping of mathematicians as against cyclists or vice versa, there is no semblance of sense in rating some of his attributes as necessary and others as contingent. Some of his attributes count as important and others as unimportant, yes, some as enduring and others as fleeting; but none as necessary or contingent.
“Most philosophers are now convinced, however, that Quine’s “mathematical cyclist” argument has been adequately answered by Saul Kripke (1972), Alvin Plantinga (1974) and various other defenders of modality de re.”
Sentences like (15) in which properties are ascribed to a specific individual in a modal context are said to exhibit modality de re (modality of the thing). Modal sentences that do not, like
Necessarily, all dogs are mammals: □∀x(Dx→Mx)
are said to exhibit modality de dicto (roughly, modality of the proposition).
As Plantinga writes, Quine has us confused:
The essentialist, Quine thinks, will presumably accept

(35) Mathematicians are necessarily rational but not necessarily bipedal

and

(36) Cyclists are necessarily bipedal but not necessarily rational.

But now suppose that

(37) Paul J. Swiers is both a cyclist and a mathematician.

From these we may infer both

(38) Swiers is necessarily rational but not necessarily bipedal

and

(39) Swiers is necessarily bipedal but not necessarily rational

which appear to contradict each other twice over. This argument is unsuccessful as a refutation of the essentialist. For clearly enough the inference of (39) from (36) and (37) is sound only if (36) is read de re; but, read de re, there is not so much as a ghost of a reason for thinking that the essentialist will accept it.
But possible worlds semantics also illuminates the intuition that was likely behind Quine’s dismissal of de re modality. SEP:
Possible world semantics provides an illuminating analysis of the key difference between [modality de re and modality de dicto]: The truth conditions for both modalities involve a commitment to possible worlds; however, the truth conditions for sentences exhibiting modality de re involve in addition a commitment to the meaningfulness of transworld identity, the thesis that, necessarily, every individual (typically, at any rate) exists and exemplifies (often very different) properties in many different possible worlds.
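The contrast can be made concrete in a toy model (worlds, predicates, and individuals invented for the example): the de dicto claim quantifies within each world, while the de re claim follows one individual across worlds, and so presupposes transworld identity.

```python
# Toy model: two worlds sharing individuals (transworld identity assumed).
worlds = ["w1", "w2"]
domain = ["fido", "rex"]

# World-relativized extensions of "is a dog" and "is a mammal".
dog = {"w1": {"fido", "rex"}, "w2": {"fido"}}
mammal = {"w1": {"fido", "rex"}, "w2": {"fido", "rex"}}

# De dicto: □∀x(Dx → Mx) — at every world, every dog there is a mammal there.
de_dicto = all(dog[w] <= mammal[w] for w in worlds)

# De re: ∀x□(Dx → Mx) — for each individual, at every world, if *it* is a
# dog there, *it* is a mammal there. This only makes sense if "fido" at w1
# and "fido" at w2 are the same individual.
de_re = all(all((x not in dog[w]) or (x in mammal[w]) for w in worlds)
            for x in domain)

assert de_dicto and de_re
```

Here both readings come out true; the point is the difference in their truth conditions: only `de_re` had to track an individual from world to world.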
Ordinary-language predicates can be ambiguous between sense and reference. Ordinary-language names can also be ambiguous in the same way, as with “Hesperus = Phosphorus”. But Kripke himself (!) didn’t appear to see this, and it took the development of two-dimensional semantics to address it (Stanford, see also Sider’s Logic for Philosophy, chapter 10, and Chalmers). I don’t count this as a success story because 2D semantics has yet to gain consensus approval. ↩
In Quantifiers and Propositional Attitudes (1956) Quine wrote: “Intensions are creatures of darkness, and I shall rejoice with the reader when they are exorcised, but first I want to make certain points with help of them.” My understanding is that Quine had a pre-possible worlds understanding of “intensions”, equivalent to Frege’s senses and hence still informal. So in today’s usage the quote would be rendered as “Meanings are creatures of darkness”. Quine was writing in 1956. Kripke published Semantical Considerations on Modal Logic in 1963. ↩
Wittgenstein wrote that “philosophy is a battle against the bewitchment of our intelligence by means of language”. Ordinary language developed to work in ordinary contexts. When we deal with philosophically tricky issues, however, ordinary language rarely coincides with the underlying concepts in a one-to-one mapping. Sometimes ordinary language will use two different words for the same concept. This case rarely leads to problems. But when instead ordinary terms are ambiguous between two or more meanings, this is fertile ground for confusion. A lot of good philosophy disambiguates between these meanings to dissolve apparent paradoxes.
“I decided to do that of my own free will” (Could-have-been-otherwise vs unconstrained)
“We should expect humans to behave selfishly” (Is vs ought)
Sometimes people find my purported success stories mathematical rather than philosophical. I’ve even been accused of lumping the whole of mathematics into philosophy. I see why this intuition is compelling. Logic, the analysis of computability, Bayesianism and so on just look mathsy. It seems natural to cluster them with maths rather than philosophy. And that definitely makes sense in some contexts.
Here, I’m trying to understand how philosophy works, and what it can do for us when it’s successful. In that context, I claim, these stories should be clustered with philosophy. We should look beyond superficial patterns, like what the work looks like on the printed page, and instead ask: what kind of cognitive work is being done?
Now is a good time to ask: what do we call mathematics? In primary school, you might get away with defining mathematics as that which deals with quantity or number. But modern mathematics goes far beyond that. Wikipedia tells us: “Starting in the 19th century, when the study of mathematics increased in rigour and began to address abstract topics such as group theory and projective geometry, which have no clear-cut relation to quantity and measurement, mathematicians and philosophers began to propose a variety of new definitions. Some of these definitions emphasize the deductive character of much of mathematics, some emphasize its abstractness, some emphasize certain topics within mathematics”.
I want to emphasise that whenever something is sufficiently formal, we tend to call it mathematical. Mathematics uses the form of strings to manipulate them according to perfectly precise rules. (I hope this is uncontroversial. I take no view on whether mathematics is only formalism).
Before we knew how to reason about the trajectories of medium-sized objects, we speculated and used vague verbiage. Since classical mechanics was solved, we use coordinates and derivatives. Object trajectories have been mathematised. But nothing about the subject matter of trajectories has changed, or (I claim) was distinctive in the first place. Formalisation is just what it looks like to fully solve a conceptual problem. Once we fully understood trajectories, they “became part of mathematics”.
Here’s another example. Logic has nothing to do with quantity or number, but is often called mathematical, and ‘→’ and ‘¬’ are said to be mathematical symbols. Sider (Logic for Philosophy) writes: “Modern logic is called “mathematical” or “symbolic” logic, because its method is the mathematical study of formal languages. Modern logicians use the tools of mathematics (especially, the tools of very abstract mathematics, such as set theory) to treat sentences and other parts of language as mathematical objects.” But logic is just the culmination of a long-standing project: to distinguish good from bad arguments. Formal logic means we have succeeded fully. We have wholly clarified certain kinds of deductive reasoning.
I don’t mean to claim that all of mathematics should be clustered with philosophy. I just mean the initial mathematisation of a previously informal area of study. Once the formal cornerstones have been laid, philosophy really does hand off to mathematics. My rough picture of intellectual progress is the following:
1. Confusion reigns. People get lost in vague verbiage, and there is no standard way to adjudicate disagreements.
2. Much work is done in the service of clarification. Ultimately, maximal clarification is achieved through formalisation.
3. With a formal system at hand, people go to town with it, proving things left and right, extending the system, and so on.
4. We begin to view this area of study as mathematical or even part of mathematics.
Stage (1) is what most people think philosophy looks like. I say: it’s philosophy when it’s still failing. Stage (2) is successful philosophy (or at least one kind of it). But the philosophical nature of the contribution in (2) is often forgotten in the subsequent wave of mathematical enthusiasm for steps (3) and (4).
I hope I’ve now built the intuition enough to move on to the success stories that people have found most counter-intuitive.
With the analysis of computability, the philosophical work of clarification was to formalise the notion of effective calculability with a Turing machine. This allowed mathematical work to be done with the formal notion. In this case, Turing did step (3) immediately: in the same paper, he went on to prove many results about Turing machines. So Turing’s paper is, in some sense, first some philosophy, then some mathematics. Wikipedia tells us that Hilbert’s problems ranged greatly in precision. Some of them were propounded precisely enough to enable a clear affirmative or negative answer, while others had to be substantially clarified. The Entscheidungsproblem was more philosophical because it involved significant work of clarification. And it’s a particularly cool story, because the precisification proposed by Turing turned out to (i) gain virtually universal approval and (ii) have wide philosophical significance and applicability.
In the case of the development of probability theory, it’s emphatically not the case that, pre-Pascal, people were disagreeing on a point of mathematics. They were much more deeply confused. They just had no appropriate notion of probability or expected value, and were trying to cobble together solutions to particular problems using ad-hoc intuitions. Because Pascal launched probability theory, it seems only natural to view his first step as part of probability theory. But in an important sense the first step is very different. It’s much more philosophical.
The point has been made often and well (Wittgenstein, Ramsey, Muehlhauser, Yudkowsky), that conceptual analysis is doomed by resting on falsified assumptions about human cognition, and a mistaken view of the nature of empirical categories.
A first problem is with necessary and sufficient conditions:
Category-membership for concepts in the human brain is not a yes/no affair, as the “necessary and sufficient conditions” approach of the classical view assumes. Instead, category membership is fuzzy. (Muehlhauser)
This first problem could be solved with a new type of conceptual analysis, one admitting of degree. However, a deeper problem arises from the requirement that an analysis admit of no intuitive counterexamples:
[…] most of our empirical concepts are not delimited in all possible directions. Suppose I come across a being that looks like a man, speaks like a man, behaves like a man, and is only one span tall – shall I say it is a man? Or what about the case of a person who is so old as to remember King Darius? Would you say he is an immortal? Is there anything like an exhaustive definition that finally and once for all sets our mind at rest? ‘But are there not exact definitions at least in science?’ Let’s see. The notion of gold seems to be defined with absolute precision, say by the spectrum of gold with its characteristic lines. Now what would you say if a substance was discovered that looked like gold, satisfied all the chemical tests for gold, whilst it emitted a new sort of radiation? ‘But such things do not happen.’ Quite so; but they might happen, and that is enough to show that we can never exclude altogether the possibility of some unforeseen situation arising in which we shall have to modify our definition. (Waismann)
Waismann called this feature of our language open texture.
Is all conceptual analysis therefore useless? The view has some appeal. If all we want is to dissolve philosophical confusions through clarification of ambiguities, this can be achieved by stipulating definitions that allow us to be as precise as we want, after which we can abandon other verbiage. Hence SEP tells us:
Another view, held at least in part by Gottlob Frege and Wilhelm Leibniz, is that because natural languages are fraught with vagueness and ambiguity, they should be replaced by formal languages. A similar view, held by W. V. O. Quine, is that a natural language should be regimented, cleaned up for serious scientific and metaphysical work.
My view is the following: abandoning ambiguous terms in favour of more precise, stipulatively defined ones, i.e. regimentation, is always a legitimate philosophical move. Pragmatically, however, there are costs to doing so. Technical texts with a lot of jargon are difficult to read for a reason. It takes time to communicate the definitions of one’s terms to others. And it takes longer still until our audience gains intuitive familiarity with the new terminology, and can manipulate it with speed and accuracy.
When deciding which words to use, we face a trade-off between precision on the one hand, and agreement with intuitive terminology on the other.
Programming languages are an example of the fully regimented extreme. There is no ambiguity, but coding must be learnt the hard way. The language of small children or pre-scientific civilisations (“a whale is heavier than a bowling ball”), on the other hand, is completely intuitive.
The old view of conceptual analysis, requiring necessary and sufficient conditions, and admitting of no counter-examples, was an attempt to achieve both complete precision and complete intuitiveness. But from its failure it does not follow that all our old words should be regimented away. In some cases that may be the best we can do; some unsalvageable concepts, like what it means for a storm-cloud to be angry, are to be consigned to the dustbin of language. But for other terms, like “causation”, it’s not a foregone conclusion that the optimal way to navigate the trade-off is to abandon the word. We may do better to keep the word, along with its “good enough” definition. Conceptual analysis, on a more modest and fruitful view, is a tool that can help us to find such opportunities.
In general, therefore, I don’t find conceptual analysis particularly exciting. If the use of regimented language dissolves a controversy of analysis, it’s clear that nothing of “philosophical” importance was hanging in the balance in the first place. However, conceptual analyses can be pragmatically useful, and indeed there have been a number of examples I enjoyed. In what follows I list some intellectual phenomena I consider examples of conceptual analysis, and comment on them.
The “analysis of knowledge merry-go-round” (Weatherson 2003), has rightly been much derided.
Thirty years ago this journal published the most influential paper of modern analytic epistemology - Edmund Gettier’s ‘Is Justified True Belief Knowledge?’. In it Gettier refuted a classic theory of propositional knowledge by constructing thought experiments to test the theory. A cottage industry was born. Each response to Gettier was quickly met by a new Gettier-style case. In turn there would be a response to the case, a further Gettier scenario, and a reiteration of the process. The industry’s output was staggering. Its literature became so complicated, its thought experiments so baroque, that commonsense was stretched beyond limit.
This is a clear example where regimentation is appropriate. Our epistemic state can be fully described by our beliefs and our evidence. What about “knowledge”? Commit it then to the flames!
Quoting from an essay I wrote in 2017:
We want a theory of when it is rational to have an outright belief. It seems like we might easily get this from our theory of when it is rational to have a graded belief. Simply say, “it is rational to believe something simpliciter iff it is rational to believe it with a probability p > y.” Let’s call this the threshold view. We won’t be able to put an exact number on y. This merely points to the fact that outright belief-language is somewhat vague. Similarly, in “a person is bald iff they have fewer than z hairs on their head”, z is imprecisely specified, but we still understand what it means to be bald, and we know that 10 < z < 10⁶.
But the cases of preface and lottery appear to show that the threshold view is false.
Consider the lottery: “Let the threshold y required for belief be any real number less than 1. For example, let y = 0.99. Now imagine a lottery with 100 tickets, and suppose it is rational for you to believe with full confidence that the lottery is fair and that as such there will be only one winning ticket. […] So, it is rational for you to have 0.99 confidence that ticket #1 will not win, 0.99 confidence that ticket #2 will not win, and so on for each of the other tickets. According to the [threshold view], it is rational for you to believe each of these propositions, since it is rational for you to have a degree of confidence in each that is sufficient for belief. But given that rational belief is closed under conjunction, it is also rational for you to believe that (ticket #1 will not win and ticket #2 will not win . . . and ticket #100 will not win)” (Foley 1992). However, this is a contradiction with your belief that the lottery is fair, i.e., that exactly one ticket will win the lottery. Thus y cannot be 0.99. The same conclusion can be reached for any probability y<1: simply create a lottery with 1/(1-y) tickets, and argue as before. Thus the threshold cannot be any less than 1. This clearly will not do, as it violates our intuitions about everyday uses of ‘believe’, as in “I believe it will rain tomorrow”.
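The arithmetic of the lottery case can be checked directly, using exact rational arithmetic to avoid floating-point noise (the numbers are those from the quoted example):

```python
from fractions import Fraction

n = 100                       # tickets, with exactly one winner
y = Fraction(99, 100)         # the belief threshold, y = 0.99

# Each proposition "ticket i will not win" has probability 1 - 1/n,
# which individually clears the threshold.
p_single = 1 - Fraction(1, n)
assert p_single >= y

# But the conjunction "no ticket wins" contradicts the fairness
# assumption that exactly one ticket wins: its probability is 0.
p_conjunction = Fraction(0)
assert p_conjunction < y

# And for any threshold y < 1, a lottery with 1/(1-y) tickets
# recreates the problem.
assert 1 / (1 - Fraction(999, 1000)) == 1000
```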
Similarly, consider now the preface: “You write a book, say a history book. In it you make many claims, each of which you can adequately defend. In particular, suppose it is rational for you to have a degree of confidence x in each of these propositions, where x is sufficient for belief but less than 1. Even so, you admit in the preface that you are not so naive as to think that your book contains no mistakes. You understand that any book as ambitious as yours is likely to contain at least a few errors. So, it is highly likely that at least one of the propositions you assert in the book, you know not which, is false. Indeed, if you were to add appendices with propositions whose truth is independent of those you have defended previously, the chances of there being an error somewhere in your book becomes greater and greater. Nevertheless, given that rational belief is closed under conjunction, it cannot be rational for you to believe that your book contains any errors” (Foley 1992). Thus, if it is rational to believe each of the propositions that make up your book, then it is also rational to believe their conjunction, despite your having a low degree of confidence in that conjunction. Indeed, as before, your degree of confidence in the conjunction can be made arbitrarily low by adding more chapters to the book.
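The preface case shows the same divergence quantitatively: with high confidence in each individual claim, confidence in the conjunction of independent claims decays geometrically with the length of the book (the confidence level here is an illustrative number):

```python
# Confidence of 0.99 in each of n independent claims in the book.
p = 0.99

def p_no_errors(n):
    """Probability that all n independent claims are true."""
    return p ** n

assert p_no_errors(1) == 0.99
assert p_no_errors(300) < 0.05    # a 300-claim book is probably wrong somewhere
assert p_no_errors(1000) < 0.0001
```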
“After all, what reasons do we have to be interested in an [invariantist] theory of rational belief [simpliciter] if we have an adequate [invariantist] theory of rational degrees of belief? Does the former tell us anything useful above and beyond the latter? Is it really needed for anything? It doesn’t seem to be needed for the theory of rational decision making.” (Foley 1992). The fact that our ordinary-language usage of ‘belief’ cannot fully account for the laws of probability is simply a kink of ordinary language. Ordinarily, we do not speak about things like very long conjunctions concerning lottery tickets. The shorthand word ‘belief’ deals well with most cases we do ordinarily encounter. In other cases, we can simply retreat to using the language of degrees of belief.
Yeah, we don’t need to conceptually analyse ‘belief’. It’s probably outright harmful to keep using that word.
Humans have long understood that animals come in relatively sharply delineated clusters. By using a word for, say, “pig” and another for “dog”, we are making use of these categories. More recently, modern biology has developed the concept of “species”. Wikipedia explains that “a species is often defined as the largest group of organisms in which two individuals can produce fertile offspring, typically by sexual reproduction”.
This definition can be viewed as a proposed conceptual analysis of our pre-scientific, or folk-biological, concept of “type of animal”.
This analysis does really well, on hundreds of folk-biological categories! We are by now so used to the concept of species that this remarkable fact may appear obvious. There are some problem cases, too: elephants comprise three species, while a caterpillar and a butterfly can be the same species.
What is more, even the more regimented concept of species is too imprecise for some use cases. Wikipedia says: “For example, with hybridisation, in a species complex of hundreds of similar microspecies, or in a ring species, the boundaries between closely related species become unclear.”
The definition of temperature as mean molecular kinetic energy can be viewed as a conceptual analysis. Wikipedia says that temperature is “a physical quantity expressing the subjective perceptions of hot and cold”. And by and large it does excellently. However, it fails with spicy (“hot”) food.
Does this exception mean we need to regiment away common-sense notions of hot and cold? No! This illustrates how analyses that admit of exceptions can still be useful.
Speed and acceleration
The concepts of classical physics are just a refinement of the concepts of daily life and are an essential part of the language which forms the basis of all natural science.
Speed is the first derivative of location with respect to time, and acceleration is the second derivative.
This is a conceptual analysis so successful that the definition resulting from the analysis has replaced, in most adult speakers, the intuitive notion. (Something we have already seen to some extent with species). For this reason, it’s hard to see that it was, in fact, a conceptual analysis.
Where can we find remnants of the pre-scientific, ur-intuitive notion of speed? The theories of Aristotle and small children seem like a good place to start in search of this pristine naiveté.
Per Macagno 1991, Aristotle had piecemeal correct intuitions about speed, but he did not see the more general point:
Although Aristotle discusses in detail when a motion is faster than another by considering the space traversed and the corresponding time, he never arrived at what is so elementary for us: V=S/T. He considers several cases; for instance, in the case S2=S1, velocity V2 is larger than V1 if T2<T1. To divide a distance by a time was not an acceptable operation, if it was considered at all […]
Children were shown two parallel train tracks with a locomotive on each of them. The two locomotives could start from the same or different points, could stop at the same or different points, and could go the same or different distances. They could start at the same or different times, could stop at the same or different times, and could travel for the same or different total time. Finally, they could go at the same or different speeds. […]

On the time concept, the state before full mastery seemed to be one in which time and distance were only partially differentiated. This was evident in the use of the distance rule to judge time by a large number of 11-[…]
Similarly, I would expect (although citation needed) that many children who have a good grasp of the difference between position and speed (i.e. they would not say that whichever train ended farther ahead on the tracks travelled for the longer time, or the faster speed, or the greater distance), still do not clearly distinguish speed from acceleration. For instance, they might say: “whoa, the car went so fast just then - I was really pressed up against my seat.”
Once we have conceptually analysed speed and acceleration as the first and second derivatives of position with respect to time, we have a powerful new formal tool. We can use this tool to create new concepts which did not exist in natural language. For example, the third time-derivative of position is jerk. Understanding jerk has many uses, for instance to build quadcopters and other drones.
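These definitions can be sketched numerically with finite differences, on an invented trajectory s(t) = t³ (for which speed is 3t², acceleration is 6t, and jerk is the constant 6):

```python
def derivative(f, t, h=0.01):
    """Central finite-difference approximation to f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

def position(t):
    return t ** 3  # an invented trajectory for illustration

speed = lambda t: derivative(position, t)        # first derivative
acceleration = lambda t: derivative(speed, t)    # second derivative
jerk = lambda t: derivative(acceleration, t)     # third derivative

t = 2.0
assert abs(speed(t) - 3 * t ** 2) < 1e-2
assert abs(acceleration(t) - 6 * t) < 1e-2
assert abs(jerk(t) - 6) < 1e-2
```

Nothing in the natural-language concept of “how fast” suggested a third derivative; the formal tool generates it for free.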
Children are likely to judge the speed from temporal precedence and say, “It went faster because it arrived earlier.” In the Japanese language, the two words meaning fast in speed and early in temporal precedence, respectively, have the same pronunciation, i.e., hayai. On the other hand, in the Thai language, these two words are differentiated in their pronunciation as well as in meaning; the one that means high speed is rew and the other one that means temporal precedence is khon. […]
The Japanese and Thai children were shown the same visual displays of moving objects and asked to compare the speed of those moving objects. The results significantly indicate that Thai children’s concept of speed is further advanced than that of Japanese children.
The Japanese children are to the Thai children like Aristotle is to a modern student armed with the formal notion of acceleration. It’s possible to go beyond ordinary English with the formal language of physics, but it’s also possible to lag behind ordinary English with (children’s understanding of) Japanese. Similarly, “the Pirahã language and culture seem to lack not only the words but also the concepts for numbers, using instead less precise terms like “small size”, “large size” and “collection”.”
The epsilon-delta definition of a limit
See my other post on the success story of predicate logic.
See my other post on the success story of computability.
The conceptual analysis of causation fills many a textbook. Here I’ll focus on just the counterfactual analyses, that is, analyses of causal claims in terms of counterfactual conditionals.
A first attempt might be:
Where c and e are two distinct actual events, c causes e if and only if, if c were not to occur, e would not occur.
But cases of Preemption offer a counter-example (SEP):
Suppose that two crack marksmen conspire to assassinate a hated dictator, agreeing that one or other will shoot the dictator on a public occasion. Acting side-by-side, assassins A and B find a good vantage point, and, when the dictator appears, both take aim. A pulls his trigger and fires a shot that hits its mark, but B desists from firing when he sees A pull his trigger. Here assassin A’s actions are the actual cause of the dictator’s death, while B’s actions are a preempted potential cause.
To deal with Preemption, we can move to the following account:
[Lewis’s] truth condition for causal dependence becomes:
(3) Where c and e are two distinct actual events, e causally depends on c if and only if, if c were not to occur e would not occur.
He defines a causal chain as a finite sequence of actual events c, d, e,… where d causally depends on c, e on d, and so on throughout the sequence. Then causation is finally defined in these terms:
(5) c is a cause of e if and only if there exists a causal chain leading from c to e.
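Definition (5) is just graph reachability: treat causal dependence as directed edges between events and ask whether a chain leads from c to e. A minimal sketch (the dependence relation here is supplied by hand as a toy input, not derived from counterfactuals):

```python
def causes(dependence, c, e):
    """True iff a finite chain c, d, ..., e links each step by causal dependence.

    `dependence[x]` is the set of events that causally depend on x.
    """
    seen = set()
    frontier = list(dependence.get(c, ()))
    while frontier:
        d = frontier.pop()
        if d == e:
            return True
        if d not in seen:
            seen.add(d)
            frontier.extend(dependence.get(d, ()))
    return False

# Hypothetical toy chain: spark -> fire -> alarm.
deps = {"spark": {"fire"}, "fire": {"alarm"}}
```

On this encoding `causes(deps, "spark", "alarm")` holds even though the alarm does not directly depend on the spark, which is exactly the work the ancestral (chain) construction is doing for Lewis.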
But take the following case:
A person is walking along a mountain trail, when a boulder high above is dislodged and comes careering down the mountain slopes. The walker notices the boulder and ducks at the appropriate time. The careering boulder causes the walker to duck and this, in turn, causes his continued stride. (This second causal link involves double prevention: the duck prevents the collision between walker and boulder which, had it occurred, would have prevented the walker’s continued stride.) However, the careering boulder is the sort of thing that would prevent the walker’s continued stride and so it seems counterintuitive to say that it causes the stride.
Some defenders of transitivity have replied that our intuitions about the intransitivity of causation in these examples are misleading. For instance, Lewis (2004a) points out that the counterexamples to transitivity typically involve a structure in which a c-type event generally prevents an e-type but in the particular case the c-event actually causes another event that counters the threat and causes the e-event. If we mix up questions of what is generally conducive to what, with questions about what caused what in this particular case, he says, we may think that it is reasonable to deny that c causes e. But if we keep the focus sharply on the particular case, we must insist that c does in fact cause e.
Aha, but we simply need to modify the marksman case to get a case of late preemption:
Billy and Suzy throw rocks at a bottle. Suzy throws first so that her rock arrives first and shatters the glass. Without Suzy’s throw, Billy’s throw would have shattered the bottle. However, Suzy’s throw is the actual cause of the shattered bottle, while Billy’s throw is merely a preempted potential cause. This is a case of late preemption because the alternative process (Billy’s throw) is cut short after the main process (Suzy’s throw) has actually brought about the effect.
Lewis’s theory cannot explain the judgement that Suzy’s throw was the actual cause of the shattering of the bottle. For there is no causal dependence between Suzy’s throw and the shattering, since even if Suzy had not thrown her rock, the bottle would have shattered due to Billy’s throw. Nor is there a chain of stepwise dependences running cause to effect, because there is no event intermediate between Suzy’s throw and the shattering that links them up into a chain of dependences. Take, for instance, Suzy’s rock in mid-trajectory. Certainly, this event depends on Suzy’s initial throw, but the problem is that the shattering of the bottle does not depend on it, because even without it the bottle would still have shattered because of Billy’s throw.
To be sure, the bottle shattering that would have occurred without Suzy’s throw would be different from the bottle shattering that actually occurred with Suzy’s throw. For a start, it would have occurred later. This observation suggests that one solution to the problem of late preemption might be to insist that the events involved should be construed as fragile events. Accordingly, it will be true rather than false that if Suzy had not thrown her rock, then the actual bottle shattering, taken as a fragile event with an essential time and manner of occurrence, would not have occurred. Lewis himself does not endorse this response on the grounds that a uniform policy of construing events as fragile would go against our usual practices, and would generate many spurious causal dependences. For example, suppose that a poison kills its victim more slowly and painfully when taken on a full stomach. Then, the victim’s eating dinner before he drinks the poison would count as a cause of his death since the time and manner of the death depend on the eating of the dinner.
Lewis then further modifies his theory:
The central notion of the new theory is that of influence.
(7) Where c and e are distinct events, c influences e if and only if there is a substantial range of c1, c2, … of different not-too-distant alterations of c (including the actual alteration of c) and there is a range of e1, e2, … of alterations of e, at least some of which differ, such that if c1 had occurred, e1 would have occurred, and if c2 had occurred, e2 would have occurred, and so on.
Where one event influences another, there is a pattern of counterfactual dependence of whether, when, and how upon whether, when, and how. As before, causation is defined as an ancestral relation.
(8) c causes e if and only if there is a chain of stepwise influence from c to e.
One of the points Lewis advances in favour of this new theory is that it handles cases of late as well as early pre-emption. (The theory is restricted to deterministic causation and so does not address the example of probabilistic preemption described in section 3.4.) Reconsider, for instance, the example of late preemption involving Billy and Suzy throwing rocks at a bottle. The theory is supposed to explain why Suzy’s throw, and not Billy’s throw, is the cause of the shattering of the bottle. If we take an alteration in which Suzy’s throw is slightly different (the rock is lighter, or she throws sooner), while holding fixed Billy’s throw, we find that the shattering is different too. But if we make similar alterations to Billy’s throw while holding Suzy’s throw fixed, we find that the shattering is unchanged.
At this point, I’m hearing distinct echoes of the knowledge merry-go-round. After over forty years of analyses of causation, it’s a good time to ask ourselves: what would be the value of success in this enterprise? What would be the use of a conceptual analysis that captured all these strange edge cases?
I think the value would be very limited. We are able to fully describe any situation without making use of the word “causation” (see below). Why then spend all this time considering baroque thought experiments? In the case of Suzy and Billy’s bottle, I honestly haven’t got that strong an intuition of what was the cause of the shattering. I think it’s playing games with open texture.
How is it that we can eliminate1 causation from our language? To describe what actually happens in the world, including in the above cases, we only need to describe each counterfactual situation. Brian Tomasik describes one way of doing so:
But if we had a complete physical model of the multiverse (e.g., a giant computer program that specified how the multiverse evolved), [we could] change the program to remove X in some way and see if Y still happens.
Alternatively, you could specify your model using a causal graph. Once the causal graph is fully specified, it’s an empty question what truly caused the bottle to shatter.
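A minimal sketch of the Suzy-and-Billy case as structural equations (my own encoding, loosely following the structural-model treatment of actual causation; the variable names are assumptions):

```python
def model(suzy_throws, billy_throws):
    # Suzy's rock arrives first, so her hit depends only on her throw.
    suzy_hits = suzy_throws
    # Billy's rock hits only if Suzy's didn't get there first.
    billy_hits = billy_throws and not suzy_hits
    shatters = suzy_hits or billy_hits
    return suzy_hits, billy_hits, shatters

# Actual world: both throw, Suzy's rock does the work.
assert model(True, True) == (True, False, True)

# Remove Suzy's throw: the bottle still shatters via Billy, so there is
# no simple counterfactual dependence on Suzy's throw.
assert model(False, True) == (False, True, True)

# But hold Billy's hit fixed at its actual value (False) while removing
# Suzy's throw, and the shattering disappears.
_, billy_hits, _ = model(True, True)
assert (False or billy_hits) is False
```

Once the equations are written down, every counterfactual and every intervention has a determinate answer, and nothing further is settled by asking which event “truly” caused the shattering.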
How the most successful conceptual analyses become definitions
The analysis of limit has become a universally accepted definition. The same thing is in the (largely completed) process of happening for the analysis of computability. Soare 1996 draws the analogy beautifully:
In the early 1800’s mathematicians were trying to make precise the intuitive notion of a continuous function, namely one with no breaks. What we might call the “Cauchy-Weierstrass Thesis” asserts that a function is intuitively continuous iff it satisfies the usual formal epsilon-delta definition found in elementary calculus books.
Similarly, what we might call the “Curve Thesis” asserts that the intuitive notion of the length of a continuous curve in 2-space is captured by the usual definition as the limit of sums of approximating line segments. [Kline 1972: “Up to about 1650 no one believed that the length of a curve could equal exactly the length of a line. In fact, in the second book of La Geometrie, Descartes says the relation between curved lines and straight lines is not nor ever can be known.”]
The “Area Thesis” asserts that the area of an appropriate continuous surface in 3-space is that given by the usual definition of the limit of the sum of the areas of appropriate approximating rectangles.
These are no longer called theses, rather they are simply taken as definitions of the underlying intuitive concept.
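For reference, a modern-notation gloss (mine, not Soare’s) of the first two definitions these theses point at. A function f is continuous at a point a iff

$$\forall \varepsilon > 0 \; \exists \delta > 0 \; \forall x: \; |x - a| < \delta \implies |f(x) - f(a)| < \varepsilon,$$

and, for a continuously differentiable curve y = f(x) on [a, b], the length picked out by the Curve Thesis is the limit of the sums of approximating line segments over ever finer partitions $a = x_0 < x_1 < \dots < x_n = b$:

$$L = \lim_{n \to \infty} \sum_{i=1}^{n} \sqrt{(x_i - x_{i-1})^2 + (f(x_i) - f(x_{i-1}))^2} = \int_a^b \sqrt{1 + f'(x)^2} \, dx.$$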
This idea has a good pedigree: in the words of Russell: “The law of causation, […] is a relic of a bygone age, surviving, like the monarchy, only because it is erroneously supposed to do no harm. […] In the motions of mutually gravitating bodies, there is nothing that can be called a cause, and nothing that can be called an effect; there is merely a formula.” For more discussion see Stanford and Judea Pearl.