As Elijah Millgram reports, the idea of coherence has received widespread use in many areas of philosophy.(1) He argues that coherentist approaches have suffered from a woeful lack of specification of what coherence is and of how we can tell whether one theory in science, or ethics, or everyday life is more coherent than another. The exception he recognizes is the computational treatment of coherence problems that Karsten Verbeurgt and I have developed.(2) On this account, a coherence problem consists of a set of elements connected by positive and negative constraints, and a solution consists of partitioning the elements into two sets (accepted and rejected) in a way that maximizes satisfaction of the constraints. Algorithms have been developed that efficiently compute coherence by maximizing constraint satisfaction.
Millgram, however, doubts that this characterization of coherence is fully adequate for philosophical purposes. His main objection is that it is not appropriate for epistemology because it provides no guarantee that the most coherent available theory will be true. He contends that if a philosopher wants to invoke coherence, the price of the ticket is to provide a specification as concrete and precise as the constraint-satisfaction one, but he surmises that alternative characterizations will suffer from flaws similar to the ones he thinks he has identified.
I will argue that the constraint satisfaction account of coherence is not at all flawed in the ways that Millgram describes, and in fact satisfies the philosophical, computational, and psychological prerequisites for the development of epistemological and ethical theories. Not only have Verbeurgt and I paid the price of the ticket, it's the right price.
First it is necessary to clear up some confusions about the relation between the general characterization of coherence and the more particular coherence theories that can be applied to philosophical problems. Verbeurgt and I gave the following specification of the general problem of coherence:
COHERENCE. Let E be a finite set of elements {ei} and C be a set of constraints on E understood as a set {(ei, ej)} of pairs of elements of E. C divides into C+, the positive constraints on E, and C-, the negative constraints on E. With each constraint is associated a number w, which is the weight (strength) of the constraint. The problem is to partition E into two sets, A (accepted) and R (rejected), in a way that maximizes compliance with the following two coherence conditions:
1. if (ei, ej) is in C+, then ei is in A if and only if ej is in A.
2. if (ei, ej) is in C-, then ei is in A if and only if ej is in R.
Let W be the weight of the partition, that is, the sum of the weights of the satisfied constraints. The coherence problem is then to partition E into A and R in a way that maximizes W.
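To make this abstract characterization concrete, here is a minimal sketch in Python. The element names, constraints, and weights are invented for illustration, and the exhaustive search at the end is feasible only for very small sets of elements, which is why the approximation algorithms discussed below matter.

```python
# A minimal sketch of the COHERENCE problem as constraint satisfaction.
# The elements, constraints, and weights below are invented for illustration.
from itertools import product

elements = ["e1", "e2", "e3", "e4"]

# Positive constraints are satisfied when both elements land in the same set.
positive = {("e1", "e2"): 1.0, ("e2", "e3"): 0.5}
# Negative constraints are satisfied when the elements land in different sets.
negative = {("e3", "e4"): 1.5, ("e1", "e4"): 0.5}

def weight(accepted):
    """W: the sum of the weights of the constraints satisfied by the partition
    in which 'accepted' is A and every other element is in R."""
    w = 0.0
    for (ei, ej), wt in positive.items():
        if (ei in accepted) == (ej in accepted):   # coherence condition 1
            w += wt
    for (ei, ej), wt in negative.items():
        if (ei in accepted) != (ej in accepted):   # coherence condition 2
            w += wt
    return w

# Exhaustive search over all 2^|E| partitions, feasible only for tiny E.
best = max((frozenset(e for e, keep in zip(elements, bits) if keep)
            for bits in product([True, False], repeat=len(elements))),
           key=weight)
print("accepted:", sorted(best), "W =", weight(best))
```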
By itself, this characterization has no philosophical or psychological applications, because it does not state the nature of the elements, the nature of the constraints, or the algorithms to be used to maximize satisfaction of the constraints. In my new book, I propose that there are six main kinds of coherence: explanatory, deductive, conceptual, analogical, perceptual, and deliberative, each with its own array of elements and constraints.(3) Once these elements and constraints are specified, then the algorithms that solve the general coherence problem can be used to compute coherence in ways that apply to philosophical problems. Epistemic coherence is a combination of the first five kinds of coherence, and ethics involves deliberative coherence as well.
Millgram's main objection to coherence as constraint satisfaction concerns the acceptance of scientific theories, which is largely a matter of explanatory coherence.(4) The theory of explanatory coherence is stated informally by the following principles:
Principle E1. Symmetry. Explanatory coherence is a symmetric relation, unlike, say, conditional probability. That is, two propositions p and q cohere with each other equally.
Principle E2. Explanation. (a) A hypothesis coheres with what it explains, which can either be evidence or another hypothesis; (b) hypotheses that together explain some other proposition cohere with each other; and (c) the more hypotheses it takes to explain something, the lower the degree of coherence.
Principle E3. Analogy. Similar hypotheses that explain similar pieces of evidence cohere.
Principle E4. Data priority. Propositions that describe the results of observations have a degree of acceptability on their own.
Principle E5. Contradiction. Contradictory propositions are incoherent with each other.
Principle E6. Competition. If P and Q both explain a proposition, and if P and Q are not explanatorily connected, then P and Q are incoherent with each other. (P and Q are explanatorily connected if one explains the other or if together they explain something.)
Principle E7. Acceptance. The acceptability of a proposition in a system of propositions depends on its coherence with them.
I will not elucidate these principles here, but merely note how they add content to the abstract specification of coherence as constraint satisfaction. For explanatory coherence, the elements are propositions; and the negative constraints are between propositions that are incoherent with each other, either because they contradict each other or because they compete to explain some other proposition. Positive constraints involve propositions that cohere with each other, linking hypotheses with evidence and also tying together hypotheses that together explain some other proposition. In addition, there is a positive constraint encouraging but not demanding the acceptance of propositions describing the results of observation. Explanatory coherence can be computed using a variety of algorithms that maximize constraint satisfaction, including ones using artificial neural networks.
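To illustrate how these principles add content to the abstract specification, the following sketch generates positive and negative constraints from explanation relations, contradictions, and observation reports. It is a simplification for exposition, not the ECHO program itself: the weight parameters, the special EVIDENCE element, and the handling of competition are assumptions made here for the sake of a runnable example.

```python
# A simplified sketch of how principles E2, E4, E5, and E6 generate constraints.
# This is not the ECHO program; the weights and the treatment of competition
# are illustrative assumptions.
from itertools import combinations

EXCITATION, DATA_PRIORITY, INHIBITION = 0.04, 0.05, 0.06

def build_constraints(explanations, contradictions, evidence):
    """explanations: list of (list_of_hypotheses, explained_proposition) pairs."""
    positive, negative = {}, {}

    def add(table, p, q, w):
        key = frozenset((p, q))
        table[key] = table.get(key, 0.0) + w

    # E2a-E2c: each explainer coheres with what it explains and with its
    # co-hypotheses; the weight is divided by the number of hypotheses used,
    # so less simple explanations yield weaker constraints.
    for hyps, prop in explanations:
        w = EXCITATION / len(hyps)
        for h in hyps:
            add(positive, h, prop, w)
        for h1, h2 in combinations(hyps, 2):
            add(positive, h1, h2, w)

    # E4 (data priority): observation reports get a positive constraint to a
    # special element assumed to be accepted, which encourages but does not
    # demand their acceptance.
    for e in evidence:
        add(positive, e, "EVIDENCE", DATA_PRIORITY)

    # E5 (contradiction): contradictory propositions get a negative constraint.
    for p, q in contradictions:
        add(negative, p, q, INHIBITION)

    # E6 (competition, simplified here): hypotheses that explain the same
    # proposition but never explain anything together are treated as incoherent.
    for (hyps1, p1), (hyps2, p2) in combinations(explanations, 2):
        if p1 == p2:
            for h1 in hyps1:
                for h2 in hyps2:
                    if h1 != h2 and not any(h1 in hs and h2 in hs
                                            for hs, _ in explanations):
                        add(negative, h1, h2, INHIBITION)

    return positive, negative

# Example: two rival hypotheses competing to explain one observation.
pos, neg = build_constraints(explanations=[(["H1"], "E1"), (["H2"], "E1")],
                             contradictions=[], evidence=["E1"])
```

Constraints built in this way can then be handed to any algorithm for the general coherence problem, connectionist or otherwise.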
Now we can consider Millgram's argument that "coherentism does not make a whole lot of sense as an approach to science" (p. 90). To evaluate this claim, we need to look not merely at the abstract characterization of coherence as constraint satisfaction, but also at the theory of explanatory coherence, which shows how competing explanatory theories can be evaluated. Millgram's worry is this: Verbeurgt proved that the general problem of coherence is computationally intractable, so the algorithms for computing coherence are approximations; hence the algorithms do not guarantee that we will get the most coherent theory, and the theory that is really the most coherent one might be very different from the one chosen by the algorithms.
That this is not a fatal objection to explanatory coherence is evident from the fact that it would apply to any serious attempt to understand scientific inference. Consider the view that scientific theory choice is based, not on explanatory coherence, but on Bayesian reasoning.(5) On this view, the best theory is the one with the highest subjective probability given the evidence, as calculated by Bayes' theorem. It is well known that the general problem of Bayesian inference is computationally intractable, so the algorithms used for computing posterior probabilities have to be approximations.(6) Hence the Bayesian algorithms do not guarantee that we will get the most probable theory, and the theory that is really the most probable one might be very different from the one chosen by the algorithms. If Millgram's argument were effective against the explanatory coherence account of theory choice, it would also be effective against the Bayesian account, and against any other method based on approximation.
Millgram could naturally reply that this argument just shows that Bayesianism is as epistemologically problematic as coherentism and so should also be rejected. But he is implicitly throwing out any procedure rich enough to model scientific theory choice. A model of inference that evaluates competing explanations in a much simpler manner than the Bayesian and coherentist models has been shown to be computationally intractable,(7) so we can be confident that any algorithm to evaluate competing scientific theories will involve some approximation to picking the best theory. Naturally, this adds to uncertainty about whether the theory is true, but uncertainty is inherent in any kind of nondeductive inference.
What kind of method would come with a guarantee that it produces true theories? Here are two options:
Rationalist: Start only with a priori true propositions, and deductively derive their consequences.
Empiricist: Start only with indubitable sense data, and only believe what follows directly from them.
Both these methods would guarantee truth, but we know that neither can even begin to account for scientific knowledge. Millgram's argument is not a criticism of a particular theory of coherence, but it is rather an unsatisfiable demand to avoid the inherent uncertainty that accompanies nondeductive inference.
Truth is not the only aim of science, which also values explanatory unification and practical application. Consider the following two propositions:
1. Species evolve by natural selection.
2. Lemurs have toes.
Both of these propositions are true, but the first is far more important to science because, as Darwin himself stated, it explains a great number of previously unconnected facts. It is clear that explanatory coherence favors theories that achieve explanatory unification, because its principle of data priority, E4, encourages the acceptance of observation propositions, and its principle of explanation, E2, encourages the acceptance of hypotheses that explain many observations.
Does explanatory coherence also further the adoption of true theories? I think the answer is yes, but the argument has to be indirect:
1. Scientists have achieved theories that are at least approximately true.
2. Scientists use explanatory coherence.
3. So explanatory coherence leads to theories that are at least approximately true.
This is not the place to defend premises 1 and 2, which I have done at length elsewhere.(8) Premise 1 is supported primarily by the technological applicability of theories in the natural sciences, and premise 2 is supported by computational modeling of many important historical cases of scientific reasoning. We should not expect a direct defense of the truth achievement of any non-trivial inductive method, so the indirectness of the defense of explanatory coherence as a generator of scientific truths does not undermine coherentism.
Let me now address several of Millgram's minor objections to coherence construed as constraint satisfaction. He contends (p. 85) that COHERENCE is insensitive to the internal structure of scientific theories, but ignores the relevant principle of explanatory coherence. According to principle E2b, hypotheses that together explain a proposition cohere with each other, so there is a positive constraint between them. Hence algorithms that maximize constraint satisfaction will tend to accept such hypotheses together or reject them together, making the coherence calculation sensitive to internal structure. Millgram's mistake is misconstruing COHERENCE as a general theory of inference, rather than as the mathematical scaffolding that needs to be filled in with an account of the nature of the elements and constraints that are relevant to inference of a particular sort.
Suppose we have two competing theories, T1 consisting of hypotheses H1 and H2, and T2 consisting of hypotheses H3 and H4. Both theories explain two pieces of evidence, E1 and E2, but T2 provides a more unified explanation: H1 explains E1 on its own, and H2 explains E2 on its own; but H3 and H4 together explain E1, and together explain E2. Intuitively, T2 is more coherent than T1 because of its tighter internal structure. The formal characterization of coherence as constraint satisfaction does not in itself provide guidance about which theory to accept, because accepting the first non-unified theory satisfies as many constraints as accepting the second unified one. The program ECHO, however, which uses artificial neural network algorithms to maximize coherence, does prefer T2 to T1, because spreading activation through the network of nodes produces a kind of resonance between H3 and H4 that enables them to become more activated than their competitors, H1 and H2. Whether the other algorithms that maximize coherence more directly behave in similar fashion depends on the impact of the simplicity principle, E2c, according to which the more hypotheses it takes to explain something, the lower the degree of coherence. If the degree of coherence between each of two hypotheses that explain a piece of evidence is exactly half the degree of coherence between a hypothesis and a piece of evidence that it explains all by itself, then the coherence algorithms show no preference for internal structure. But if degree of coherence is reduced by a factor less than the number of hypotheses that together do the explaining, then the coherence algorithms take this into account and prefer theories such as T2 with more internal structure.
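The dependence on the simplicity factor can be checked with a small worked calculation. The weights below are illustrative, and constraints that both partitions satisfy equally (the data-priority constraints on E1 and E2, and the competition constraints between the rival hypotheses) are omitted, since they cancel out of the comparison.

```python
# A worked check of the simplicity point: compare the weight of satisfied
# positive constraints when T1 = {H1, H2} is accepted with the weight when
# T2 = {H3, H4} is accepted, as the divisor from principle E2c varies.
# The numerical weights are illustrative assumptions, not ECHO's parameters.
EXCITATION = 0.04

def partition_weight(accepted, positive):
    """Sum the weights of positive constraints whose elements are both
    accepted or both rejected (coherence condition 1)."""
    return sum(w for (p, q), w in positive.items()
               if (p in accepted) == (q in accepted))

def compare(divisor):
    w = EXCITATION / divisor                  # E2c: joint explanations are divided
    positive = {
        ("H1", "E1"): EXCITATION,             # H1 alone explains E1
        ("H2", "E2"): EXCITATION,             # H2 alone explains E2
        ("H3", "E1"): w, ("H4", "E1"): w,     # H3 and H4 together explain E1
        ("H3", "E2"): w, ("H4", "E2"): w,     # H3 and H4 together explain E2
        ("H3", "H4"): w,                      # co-hypothesis constraint (E2b)
    }
    t1 = partition_weight({"H1", "H2", "E1", "E2"}, positive)
    t2 = partition_weight({"H3", "H4", "E1", "E2"}, positive)
    return t1, t2

for divisor in (2.0, 1.5):
    t1, t2 = compare(divisor)
    print(f"divisor {divisor}: accept T1 -> W = {t1:.3f}, accept T2 -> W = {t2:.3f}")
```

With a divisor of exactly 2 the two partitions come out with the same weight, because the H3-H4 constraint is satisfied either way (both elements accepted in one partition, both rejected in the other); with a divisor of 1.5, accepting T2 yields greater total constraint satisfaction, in line with the preference for internal structure just described.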
Millgram states (p. 89) that "approximating an index of coherence (the weight of the best partition) does not mean approximating the best partition." Mathematically he is correct, but practically we can acquire evidence that coherence algorithms, including ones proven to approximate an index, do in fact approximate the best partition. This is what computational experiments are for. We can run different coherence algorithms on many different examples, large and small, and see whether they yield partitions that seem to be reasonable given the input. Explanatory coherence has been evaluated for many important cases in the history of science by seeing whether it captures the judgments of important scientists. Given the explanatory relations recognized by scientists such as Copernicus, Newton, Lavoisier, and Darwin, a simulation of explanatory coherence should yield the same judgment as the scientist, and it does. The simulations show that the theory of explanatory coherence is coherent with important cases in the history of science.
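One simple form of such an experiment can be sketched as follows. The local-search heuristic below is merely a stand-in for the approximation algorithms discussed in the text, and the random instances are illustrative; the point is only to show how an approximate partition can be checked against the exhaustively computed optimum on problems small enough to permit the comparison.

```python
# A sketch of a computational experiment: on small random coherence problems,
# compare a simple local-search heuristic (a stand-in for the approximation
# algorithms, not one of the algorithms from the paper) with the true optimum.
import random
from itertools import product

def weight(accepted, positive, negative):
    """W for the partition with 'accepted' as A and everything else in R."""
    w = sum(wt for (p, q), wt in positive.items() if (p in accepted) == (q in accepted))
    w += sum(wt for (p, q), wt in negative.items() if (p in accepted) != (q in accepted))
    return w

def random_instance(n, rng):
    """Random elements with a mix of weighted positive and negative constraints."""
    elems = [f"e{i}" for i in range(n)]
    positive, negative = {}, {}
    for i, p in enumerate(elems):
        for q in elems[i + 1:]:
            r = rng.random()
            if r < 0.3:
                positive[(p, q)] = rng.random()
            elif r < 0.5:
                negative[(p, q)] = rng.random()
    return elems, positive, negative

def local_search(elems, positive, negative, rng, steps=500):
    """Start from a random partition and keep single-element flips that do not lower W."""
    accepted = {e for e in elems if rng.random() < 0.5}
    for _ in range(steps):
        e = rng.choice(elems)
        flipped = accepted ^ {e}
        if weight(flipped, positive, negative) >= weight(accepted, positive, negative):
            accepted = flipped
    return accepted

def exhaustive(elems, positive, negative):
    """The optimal partition, feasible only because the instances are small."""
    return max((frozenset(e for e, keep in zip(elems, bits) if keep)
                for bits in product([True, False], repeat=len(elems))),
               key=lambda a: weight(a, positive, negative))

rng = random.Random(0)
for trial in range(5):
    elems, pos, neg = random_instance(10, rng)
    approx = weight(local_search(elems, pos, neg, rng), pos, neg)
    best = weight(exhaustive(elems, pos, neg), pos, neg)
    print(f"trial {trial}: heuristic W = {approx:.2f}, optimal W = {best:.2f}")
```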
For complex theories, the demand to simulate the reasoning of scientists is computationally non-trivial. Chris Eliasmith and I did an explanatory-coherence analysis of the acceptance of the wave theory of light, and challenged advocates of Bayesian inference to produce an analysis that is as historically detailed and computationally feasible.(9) As far as I know, this challenge has not been met, and serious practical limitations on Bayesian inference make me doubt that it can be met.(10)
There are thus good reasons to believe that a coherentist approach that employs both the abstract characterization of coherence as constraint satisfaction and its concrete instantiation in terms of principles for explanatory coherence is adequate for understanding scientific theory choice. Millgram's arguments do not undermine coherentist epistemology: the price of the ticket for invoking coherence has already been paid.
Notes
*I am grateful to Chris Eliasmith and Elijah Millgram for comments on a previous draft. This research is supported by the Natural Sciences and Engineering Research Council of Canada.
(1) E. Millgram, "Coherence: The Price of the Ticket," Journal of Philosophy, 97 (2000): 82-93.
(2) P. Thagard and K. Verbeurgt, "Coherence as Constraint Satisfaction," Cognitive Science 22 (1998): 1-24. Numerous papers on coherence can be found on my Web site, http://cogsci.uwaterloo.ca, which also contains a version of the explanatory coherence program ECHO that can be used over the Web.
(3) P. Thagard, Coherence in Thought and Action (Cambridge, MA: MIT Press, in press; publication in fall, 2000).
(4) P. Thagard, Conceptual Revolutions (Princeton: Princeton University Press, 1992).
(5) P. Maher, Betting on Theories (Cambridge: Cambridge University Press, 1993). C. Howson and P. Urbach, Scientific Reasoning: The Bayesian Approach (La Salle, IL: Open Court, 1989).
(6) G. Cooper, "The Computational Complexity of Probabilistic Inference Using Bayesian Belief Networks," Artificial Intelligence 42 (1990): 393-405. J. Pearl, Probabilistic Reasoning in Intelligent Systems (San Mateo: Morgan Kaufmann, 1988).
(7) T. Bylander, D. Allemang, M. Tanner, and J. Josephson, "The Computational Complexity of Abduction," Artificial Intelligence 49 (1991): 25-60. Judgment that a problem is computationally intractable assumes that all members of a specified set of computationally equivalent problems require computations that grow exponentially with problem size. For a discussion of why this assumption is generally accepted, see P. Thagard, "Computational Tractability and Conceptual Coherence: Why Do Computer Scientists Believe that P ≠ NP?" Canadian Journal of Philosophy 23 (1993): 349-364. For an argument that the Bylander et al. model is too simple for abductive reasoning, see P. Thagard and C. P. Shelley, "Abductive Reasoning: Logic, Visual Thinking, and Coherence," in M. L. Dalla Chiara, K. Doets, D. Mundici, and J. van Benthem, eds., Logic and Scientific Methods (Dordrecht: Kluwer, 1997), pp. 413-427. Of course, if a simple model of abductive inference is computationally intractable, then a more complex model that includes the simple one as a special case will be as well.
(8) P. Thagard, Computational Philosophy of Science (Cambridge, MA: MIT Press/Bradford Books, 1988). P. Thagard, Conceptual Revolutions (Princeton: Princeton University Press, 1992). P. Thagard, How Scientists Explain Disease (Princeton: Princeton University Press, 1999).
(9) C. Eliasmith and P. Thagard, "Waves, Particles, and Explanatory Coherence," British Journal for the Philosophy of Science 48 (1997): 1-19.
(10) Thagard, Coherence in Thought and Action, ch. 8. Bayesian inference is what Daniel Dennett calls a "cognitive wheel": an elegant computational procedure that is psychologically and biologically implausible. In contrast, coherence models have had extensive psychological applications. On analogical coherence, see K. J. Holyoak and P. Thagard, Mental Leaps: Analogy in Creative Thought (Cambridge, MA: MIT Press, 1995). On explanatory coherence, see S. Read and A. Marcus-Newhall, "The Role of Explanatory Coherence in the Construction of Social Explanations," Journal of Personality and Social Psychology 65 (1993): 429-447. On conceptual coherence, see Z. Kunda and P. Thagard, "Forming Impressions from Stereotypes, Traits, and Behaviors: A Parallel-Constraint-Satisfaction Theory," Psychological Review 103 (1996): 284-308.