1. Ramón y Cajal, c. 1900: identification of neurons.
2. 1950s: some of the first computational models were inspired by ideas about neural structure.
E.g. Minsky, 1951.
3. Perceptrons: Rosenblatt, 1962
Simple learning machines
4. Minsky & Papert 1969
- showed inherent limitations in what perceptrons could learn
- AI in the 1970s emphasized knowledge representation
- only a few diehards continued doing neuronal modelling
5. But: Hinton and Anderson's 1981 book, Parallel Models of Associative Memory.
Revival of connectionism: units + links.
6. Ideas picked up by Rumelhart and McClelland, who started simulating psychological results using networks
7. c. 1985: rediscovery of backpropagation by Rumelhart et al.: a more powerful learning method using networks with multiple layers.
Parallel Distributed Processing: PDP: 1986 books.
8. Claims for revolution in cognitive science:
symbolism versus connectionism?
Two styles of representation:
1. Local: each unit (neuron) represents something cognitive, like a concept or a proposition.
2. Distributed: concepts, propositions, images, etc. are distributed over multiple units.
Key idea: parallel constraint satisfaction. Coming up with a coherent interpretation requires integrating positive and negative constraints.
Use neuron-like units (nodes) to represent hypotheses about which vertices of the ambiguous (Necker) cube are in front and which in back.
Link compatible hypotheses by excitatory links (positive constraints).
Link incompatible hypotheses by inhibitory links (negative constraints).
Spread activation around the network until it settles.
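A minimal sketch of such a constraint network in Python. The four-corner cube, the weights, and the update rule are illustrative inventions, not the lecture's model:

```python
import itertools

# Hypotheses: each corner of a simplified ambiguous cube is "front" or "back".
units = [f"{c}-{side}" for c in "ABCD" for side in ("front", "back")]
w = {}
for u, v in itertools.combinations(units, 2):
    cu, su = u.split("-")
    cv, sv = v.split("-")
    if cu == cv:                      # rival hypotheses about the same corner
        w[(u, v)] = w[(v, u)] = -0.2  # inhibitory link (negative constraint)
    elif su == sv:                    # hypotheses from the same interpretation
        w[(u, v)] = w[(v, u)] = 0.1   # excitatory link (positive constraint)

act = dict.fromkeys(units, 0.01)
act["A-front"] = 0.1                  # slight initial bias toward one reading

for _ in range(200):                  # spread activation until it settles
    new = {}
    for u in units:
        net = sum(w.get((v, u), 0.0) * max(act[v], 0.0) for v in units)
        grow = net * (1 - act[u]) if net > 0 else net * (act[u] + 1)
        new[u] = max(-1.0, min(1.0, act[u] * 0.95 + grow))
    act = new

for u in units:
    print(f"{u}: {act[u]:+.2f}")      # "front" units end high, rivals negative
```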
1. Problem: how to decide what to do?
2. Complicated because you have to learn what your goals are.
3. Positive constraints: actions facilitate goals
4. Negative constraints: can't do everything at once.
1. Problem: pick the best of competing sets of hypotheses. Criteria:
- explain the most evidence
- be explained themselves (e.g. by a motive)
- simplicity: avoid overly complex explanations
- analogy: prefer hypotheses like ones already known
- explanatory coherence: choose hypotheses on the basis of how well they cohere with the others
2. Create a network by representing each proposition by a unit. Coherence relations are represented by symmetric links.
Evidence units are connected to a special unit whose activation is held fixed.
3. E.g. who killed the Duke? The butler? The Duchess?
4. The three bottom (evidence) units are all connected to the special unit. Activation spreads from the evidence units to the competing hypothesis units, and is adjusted until all units are stable.
5. Note: weights on links are not learned; this is a local representation (sketched below).
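A sketch of the murder example as a small ECHO-style network. The propositions, weights, and update constants are invented for illustration; they are not ECHO's actual parameters:

```python
# "Who killed the Duke?" as explanatory coherence.
evidence = ["duke-is-dead", "butler-prints-on-knife"]
hypotheses = ["butler-did-it", "duchess-did-it"]
units = evidence + hypotheses + ["SPECIAL"]

links = {}
def link(u, v, wt):                        # coherence relations: symmetric links
    links[(u, v)] = links[(v, u)] = wt

for e in evidence:
    link("SPECIAL", e, 0.05)               # evidence units tied to special unit
link("butler-did-it", "duke-is-dead", 0.1)            # explains the evidence
link("butler-did-it", "butler-prints-on-knife", 0.1)  # explains more of it
link("duchess-did-it", "duke-is-dead", 0.1)           # explains less of it
link("butler-did-it", "duchess-did-it", -0.2)         # competitors inhibit

act = dict.fromkeys(units, 0.01)
act["SPECIAL"] = 1.0                       # special unit clamped at 1

for _ in range(200):                       # adjust until all units are stable
    new = {"SPECIAL": 1.0}
    for u in units:
        if u == "SPECIAL":
            continue
        net = sum(wt * act[v] for (v, t), wt in links.items() if t == u)
        grow = net * (1 - act[u]) if net > 0 else net * (act[u] + 1)
        new[u] = max(-1.0, min(1.0, act[u] * 0.95 + grow))
    act = new

for h in hypotheses:
    print(h, round(act[h], 2))             # the butler, who explains more, wins
```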
Distributed representations:
Visual analogy: Kosslyn's octopus network.
Input layer: octopi at the bottom, noting fish.
Hidden layer: octopi in the middle.
Output layer: octopi that inform the seagulls.
Note on terminology: neural networks = connectionism = PDP
1. Each unit has an interpretation.
2. Solution reached by spread of activation
3. Simple example: DECO.
4. This assumes you already know the major constraints.
1. Inspired by brain: no grandmother neuron.
2. Visual analogy: Kosslyn's octopi.
3. How a network gets trained:
Backpropagation: use errors at the output layer to go back and change the weights (see the sketch after this list).
4. Distributed differs from local:
- weights are learned
- hidden units acquire an interpretation
- concepts are represented by patterns of activation
5. E.g. NETtalk: Sejnowski & Rosenberg.
Input: English words.
Output: pronunciations of English words.
Hidden: the mapping from words to their pronunciations; mostly not rules.
6. These networks learn to recognize complex patterns. Good for detecting submarines, bombs, etc.
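A minimal backpropagation sketch, far smaller than NETtalk and with invented architecture, data, and learning rate: a network with one hidden layer learns XOR by propagating output errors back to adjust the weights:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)   # input -> hidden
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)   # hidden -> output
lr = 0.5
sigmoid = lambda z: 1 / (1 + np.exp(-z))

for epoch in range(5000):
    # Forward pass: activation flows input -> hidden -> output.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: output error propagates back through the layers.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)

print(out.round(2).ravel())   # typically close to [0, 1, 1, 0] after training
```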
1. Neural plausibility: brain-style computation.
2. Integration of representation and learning (distributed).
3. Ability to do complex problem solving with soft constraints (localist): analogy (e.g. ACME), hypothesis evaluation.
| Problem | Elements | Positive constraints | Negative constraints | Accepted as |
|---|---|---|---|---|
| Truth | propositions | entailment, etc. | inconsistency | true |
| Epistemic justification | propositions | entailment, explanation, etc. | inconsistency, competition | known |
| Mathematics | axioms, theorems | deduction | inconsistency | known |
| Logical justification | principles, practices | justify | inconsistency | justified |
| Ethical justification | principles, judgments | justify | inconsistency | justified |
| Legal justification | principles, court decisions | justify | inconsistency | justified |
| Practical reasoning | actions, goals | facilitation | incompatibility | desirable |
| Perception | images | connectedness, parts | inconsistency | seen |
| Discourse comprehension | meanings | semantic relatedness | inconsistency | understood |
| Analogy | mapping hypotheses | similarity, structure, purpose | 1-1 mappings | corresponding |
| Cognitive dissonance | beliefs, attitudes | consistency | inconsistency | believed |
| Impression formation | stereotypes, traits | association | negative association | believed |
| Democratic deliberation | actions, goals, propositions | facilitation, explanation | incompatible actions and beliefs | joint action |
See Thagard and Verbeurgt, "Coherence as Constraint Satisfaction," Cognitive Science, 1998.
4. Graceful degradation: not as fragile as symbolic systems; chunks of the network can be removed and an answer still found.
5. Content-addressable memory: retrieve things based on their meaning, not their location as in a conventional computer. (Both properties are illustrated in the sketch after this list.)
6. Modelling of human performance: psychological validity. E.g. networks learning the past tenses of verbs fit human data.
7. Provides a framework for discussion of innateness. See Jeff Elman et al., Rethinking Innateness, 1996.
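Graceful degradation and content-addressable memory (points 4 and 5 above) can both be illustrated with a small Hopfield-style associative memory. The patterns, noise, and damage levels below are arbitrary illustrative choices, not a model from the lecture:

```python
import numpy as np

rng = np.random.default_rng(1)
patterns = rng.choice([-1, 1], size=(3, 64))     # three stored "memories"

# Hebbian storage: co-active units get excitatory links.
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0)

def recall(state, steps=10):
    s = state.copy()
    for _ in range(steps):                       # settle to the nearest memory
        s = np.where(W @ s >= 0, 1, -1)
    return s

cue = patterns[0].copy()
cue[:16] = rng.choice([-1, 1], size=16)          # corrupt a quarter of the cue
print("retrieved by content:", np.array_equal(recall(cue), patterns[0]))

damage = np.triu(rng.random(W.shape) < 0.3, 1)   # knock out ~30% of the links
W[damage] = 0
W[damage.T] = 0                                  # keep the damage symmetric
print("after damage:", np.array_equal(recall(cue), patterns[0]))
# At this low memory load, both prints are typically True.
```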
1. Incremental learning: backpropagation.
2. Distributed representations.
3. Parallel constraint satisfaction.
Note that there are non-connectionist ways of doing constraint satisfaction too.
1. Local representations do not have the full set of logical operators. AND = symmetric excitatory link. NOT-BOTH = symmetric inhibitory link. IF-THEN = asymmetric excitatory link. But OR and other connectives are not easily represented.
2. Distributed representations have difficulty representing relations, e.g. (bite (dog, boy)) versus (bite (boy, dog)). There are, however, attempts to overcome this using temporal (synchrony) representations, but the "variable binding problem" is not solved.
Conclusion: connectionist models are currently less powerful informationally than logic, rules, frames, images.
1. Neural networks are Turing complete, i.e. they can do everything a Turing machine (logical model of computing) can do, so they can do any computation.
But this is irrelevant to the question of what computations they can do easily and naturally.
2. While connectionist models are very good for learning patterns and doing parallel constraint satisfaction, they have not yet been applied to complex sequential or analogical problem solving. Need for hybrid models, e.g. CARE.
1. The brain is much more complex than current networks.
2. Neurons are not like units: neuronal influences depend on more than summed activations (e.g. neurotransmitter chemistry and spike timing).
3. Some connectionist mechanisms are neurally implausible (e.g. backpropagation of error).
4. But artificial neural network models are becoming more neurologically realistic, e.g. by using spiking neurons.
1. Backpropagation learning is too slow and takes too many trials. Lots of learning is much easier and faster than McClelland's learning principle would allow:
"Adjust the parameters of the mind in proportion to the extent to which their adjustment can produce a reduction in the discrepancy between expected and observed events."
(in VanLehn, ed., Architectures for Intelligence, p. 45).
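McClelland's principle amounts to error-driven adjustment along the lines of the delta rule. A toy sketch with invented data and constants shows how such learning converges only gradually, over many trials, which is the point of the criticism:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 3))        # 20 observed events with 3 features
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w                      # the outcomes to be predicted

w = np.zeros(3)
lr = 0.05
for trial in range(1000):
    i = trial % len(X)
    error = y[i] - X[i] @ w         # discrepancy: expected vs. observed
    w += lr * error * X[i]          # adjust in proportion to the discrepancy
    if trial in (19, 199, 999):
        print(trial + 1, "trials:", w.round(2))   # approaches true_w slowly
```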
2. Some phenomena really are better described in terms of high-level rules:
See Smith et al. in Cognitive Science, 1992. E.g. modus ponens, causal rules, statistical rules, linguistic rules (see next week).
3. Connectionist models of learning may not simulate human flexibility very well: e.g. getting anchored in biased positions from which they cannot recover.
a) Connectionism is the new dominant paradigm in cognitive science.
b) Connectionism is a big mistake.
c) There are lots of important ideas in connectionism that need to be integrated with a broader view of the mind that includes explicit, symbolic representations.
There is no single unified theory of cognition. All approaches have definite merits and definite limitations.
The representational/computational view of mind has been very productive, but no particular dominant view has emerged.
1. Proselytize: defend one approach (e.g. rules, connections) to the limit.
2. Hybridize: develop computational models that combine different techniques. Much work like this is going on.
E.g. PI: rules + concepts + analogs
ARCS: logic + connectionism
Neuro-SOAR: rules + connectionism.
3. Synthesize: produce a unified theory of mind that truly unifies the different approaches. Look to theoretical neuroscience to provide a general theory that encompasses symbolic, connectionist, and dynamic systems approaches. Eliasmith: Moving Beyond Metaphors.
4. Criticize: interpret the incompleteness of all current approaches as a sign of the inherent inadequacy of the representational/computational view of mind.