PHIL/PSYCH 256, Weeks 7-8

Connectionism: Introduction

History

1. Cajal, c. 1900: identification of neurons

2. 1950's: some of the first computational models were inspired by ideas about neural structure

E.g. Minsky, 1951

3. Perceptrons: Rosenblatt, 1962

Simple learning machines

4. Minsky & Papert 1969

- showed inherent limitations in what perceptrons could learn

- AI in the 1970's emphasized knowledge representation

- only a few diehards continued doing neuronal modelling

5. But: Hinton and Anderson, 1981 book.

Parallel Models of Associative Memory

Revival of connectionism: units + links

6. Ideas picked up by Rumelhart and McClelland, who started simulating psychological results using networks

7. c. 1985: Rediscovery of back propagation by Rumelhart et al.: more powerful learning method using networks with multiple layers.

Parallel Distributed Processing: PDP: 1986 books.

8. Claims for revolution in cognitive science:

symbolism versus connectionism?

9. Deep learning

Local ("localist") versus distributed representations:

1. Local: each unit (neuron) represents something cognitive like a concept or a proposition.

2. Distributed: concepts, propositions, images, etc. are distributed over multiple units

Examples of local representations:

Necker cube

Key idea: parallel constraint satisfaction. Coming up with a coherent interpretation requires integrating positive and negative constraints.

Use neuron-like units (nodes) to represent hypotheses about what vertices of the cube are front and back.

Link compatible hypotheses by excitatory links (positive constraints).

Link incompatible hypotheses by inhibitory links (negative constraints).

Spread activation around network until it settles.
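The settling process can be sketched in a few lines of Python. This is a toy version under stated assumptions, not the model used in lecture: the cube is reduced to four ambiguous vertices, each with one hypothesis per reading (A or B), and the weights, decay, starting activations and update rule are all illustrative choices. The names build_links, settle and interpret are hypothetical.

```python
from itertools import combinations

VERTICES = range(4)  # toy cube reduced to four ambiguous vertices (assumption)

def build_links(excite=0.1, inhibit=-0.2):
    """Return symmetric link weights keyed by (unit, neighbour)."""
    links = {}

    def connect(u, v, weight):
        links[(u, v)] = weight
        links[(v, u)] = weight

    for reading in "AB":
        # Hypotheses belonging to the same reading support each other.
        for i, j in combinations(VERTICES, 2):
            connect(f"{reading}{i}", f"{reading}{j}", excite)
    for i in VERTICES:
        # Rival hypotheses about the same vertex suppress each other.
        connect(f"A{i}", f"B{i}", inhibit)
    return links

def settle(links, start, decay=0.05, steps=200):
    """Update all units in parallel for a fixed number of steps."""
    act = dict(start)
    for _ in range(steps):
        updated = {}
        for unit, a in act.items():
            net = sum(w * act[src] for (dst, src), w in links.items() if dst == unit)
            # Positive input pushes toward +1, negative input toward -1.
            room = (1.0 - a) if net > 0 else (a + 1.0)
            updated[unit] = max(-1.0, min(1.0, a * (1 - decay) + net * room))
        act = updated
    return act

def interpret(favoured):
    """Start with a slight head start for one reading, then settle."""
    start = {f"{r}{i}": (0.10 if r == favoured else 0.05)
             for r in "AB" for i in VERTICES}
    return settle(build_links(), start)

for favoured in "AB":
    final = interpret(favoured)
    winners = sorted(u for u, a in final.items() if a > 0)
    print(favoured, "favoured at start ->", winners)
```

A small head start for one reading is enough for the whole reading to win and the rival reading to be suppressed, which is the network analogue of the cube flipping between two coherent interpretations.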

DECO

1. Problem: how to decide what to do?

2. Complicated because you have to learn what your goals are.

3. Positive constraints: actions facilitate goals

4. Negative constraints: can't do everything at once.

Paper on DECO.

ECHO

1. Problem: pick the best representation between competing sets of hypotheses

- explain the most evidence

- be explained: motive

- simplicity: avoid overly complex explanations

- analogy: prefer hypotheses like the ones already known

- explanatory coherence: choose hypotheses on the basis of how well they cohere with others

2. Create network by representing each proposition by a unit. Coherence relations represented by symmetric links.

Evidence units connected to special unit.

3. E.g. who killed the Duke? Butler? Duchess?

4. 3 bottom units all connected to special unit. Activation spreads from evidence units to others that compete. Activation adjusted until all units are stable.

5. Note: weights on links are not learned. Local representation.
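A rough sketch of steps 2 to 4, for illustration only. It is not the ECHO program itself: the evidence items E1 to E3, the choice of which hypothesis explains which evidence, the numeric parameters, and the update rule are all assumptions made up for this example, and the sketch runs for a fixed number of steps instead of testing for stability.

```python
# Hypothetical case: which hypothesis explains which evidence is invented here.
EXPLAINS = {"butler": ["E1", "E2", "E3"], "duchess": ["E3"]}
EVIDENCE = ["E1", "E2", "E3"]

def build_network(excite=0.04, inhibit=-0.06, data_excite=0.05):
    """One unit per proposition; symmetric links stored as nested dicts."""
    links = {}

    def connect(u, v, weight):
        links.setdefault(u, {})[v] = weight
        links.setdefault(v, {})[u] = weight

    for e in EVIDENCE:
        connect("SPECIAL", e, data_excite)    # evidence tied to the special unit
    for hypothesis, explained in EXPLAINS.items():
        for e in explained:
            connect(hypothesis, e, excite)    # explaining = cohering
    connect("butler", "duchess", inhibit)     # rival hypotheses compete
    return links

def run_echo(links, decay=0.05, steps=300):
    """Spread activation with the special unit clamped at 1.0."""
    act = {unit: 0.01 for unit in links}
    act["SPECIAL"] = 1.0
    for _ in range(steps):
        updated = {"SPECIAL": 1.0}            # the special unit never changes
        for unit, neighbours in links.items():
            if unit == "SPECIAL":
                continue
            a = act[unit]
            net = sum(w * act[v] for v, w in neighbours.items())
            room = (1.0 - a) if net > 0 else (a + 1.0)
            updated[unit] = max(-1.0, min(1.0, a * (1 - decay) + net * room))
        act = updated
    return act

activation = run_echo(build_network())
for unit in ("butler", "duchess"):
    verdict = "accepted" if activation[unit] > 0 else "rejected"
    print(f"{unit}: {activation[unit]:+.2f} ({verdict})")
```

Note that nothing is learned here: the weights are fixed by hand from the explanatory relations, and only the activations change.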

Paper on ECHO.

Java implementation of ECHO.

Distributed representations:

Kosslyn octopus analogy

Visual analogy: octopus network.

Input layer: at bottom, noting fish.

Hidden layer

Output layer: inform seagulls

Note on terminology: neural networks = connectionism = PDP

Connectionism: Strengths

Review of Local Representations

1. Each unit has an interpretation.

2. Solution reached by spread of activation

3. Simple example: DECO.

Excitatory links indicate facilitation.
Inhibitory links indicate incompatibility.
Solution is chosen by relaxing network.

4. This assumes you already know the major constraints.

Distributed representations

1. Inspired by brain: no grandmother neuron.

2. Visual analogy: Kosslyn's octopi.

3. How a network gets trained:

Backpropagation: Use errors at the output level to go back and change the weights.

4. Distributed differs from local:

- weights are learned

- hidden units acquire an interpretation

- concepts are represented by patterns of activation

5. E.g. NETtalk: Sejnowski & Rosenberg

Input: English words

Output: Pronunciation of English words

Hidden: mapping from words to their pronunciation: not rules, mostly.

6. These networks learn to recognize complex patterns. Good for detecting submarines, bombs, etc.
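To make the backpropagation idea in point 3 concrete, here is a minimal sketch in plain Python. Everything specific in it is an assumption chosen for illustration: the XOR task, three hidden units, sigmoid activations, the learning rate, the number of epochs and the random seed. It is far smaller than NETtalk and is not how that system was implemented.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_network(n_in, n_hidden, rng):
    # Each weight list ends with a bias term.
    hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output

def forward(net, x):
    hidden, output = net
    h = [sigmoid(sum(w * xi for w, xi in zip(ws[:-1], x)) + ws[-1]) for ws in hidden]
    y = sigmoid(sum(w * hi for w, hi in zip(output[:-1], h)) + output[-1])
    return h, y

def total_error(net, data):
    return sum((target - forward(net, x)[1]) ** 2 for x, target in data)

def train(net, data, rate=0.5, epochs=5000):
    hidden, output = net
    for _ in range(epochs):
        for x, target in data:
            h, y = forward(net, x)
            # Error signal at the output unit.
            delta_out = (target - y) * y * (1 - y)
            # Send the error back through the output weights to the hidden units.
            delta_hid = [delta_out * output[j] * h[j] * (1 - h[j])
                         for j in range(len(h))]
            # Change the weights in the direction that reduces the error.
            for j in range(len(h)):
                output[j] += rate * delta_out * h[j]
            output[-1] += rate * delta_out
            for j, ws in enumerate(hidden):
                for i in range(len(x)):
                    ws[i] += rate * delta_hid[j] * x[i]
                ws[-1] += rate * delta_hid[j]

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
net = make_network(2, 3, random.Random(0))
error_before = total_error(net, XOR)
train(net, XOR)
error_after = total_error(net, XOR)
print(f"squared error before: {error_before:.3f}, after: {error_after:.3f}")
```

The weights start out random and are learned from examples, and no single hidden unit is assigned a meaning in advance. The thousands of passes through four patterns also give a feel for why backpropagation is sometimes criticized as slow.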

Strengths of connectionism:

1. Neural plausibility: brain-style computation.

2. Integration of representation and learning (distributed).

3. Ability to do complex problem solving with soft constraints (localist): analogy (e.g. ACME), hypothesis evaluation.

Problem	Elements	Positive constraints	Negative constraints	Accepted as
Truth	propositions	entailment, etc.	inconsistency	true
Epistemic justification	propositions	entailment, explanation, etc.	inconsistency, competition	known
Mathematics	axioms, theorems	deduction	inconsistency	known
Logical justification	principles, practices	justify	inconsistency	justified
Ethical justification	principles, judgments	justify	inconsistency	justified
Legal justification	principles, court decisions	justify	inconsistency	justified
Practical reasoning	actions, goals	facilitation	incompatibility	desirable
Perception	images	connectedness, parts	inconsistency	seen
Discourse comprehension	meanings	semantic relatedness	inconsistency	understood
Analogy	mapping hypotheses	similarity, structure, purpose	1-1 mappings	corresponding
Cognitive dissonance	beliefs, attitudes	consistency	inconsistency	believed
Impression formation	stereotypes, traits	association	negative association	believed
Democratic deliberation	actions, goals, propositions	facilitation, explanation	incompatible actions and beliefs	joint action

See Thagard and Verbeurgt, Coherence as Constraint Satisfaction.

4. Graceful degradation: not so fragile as symbolic systems: chunks of network can be removed and answer still found.

5. Content addressable memory: retrieve things based on their meaning, not their location as in a computer.

6. Modelling of human performance: Psychological validity. E.g. learning the past tenses of verbs - fit human data.

7. Provides framework for discussion of innateness. See Jeff Elman et al., Rethinking Innateness, 1996.
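Content-addressable memory (strength 5 above) can be illustrated with a small Hopfield-style network. This is a swapped-in textbook technique, not something specified in these notes, and the two stored patterns, the Hebbian storage rule and the synchronous update are assumptions picked to keep the sketch short.

```python
def train_hebbian(patterns):
    """Store +1/-1 patterns by strengthening links between co-active units."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                     # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights

def recall(weights, cue, sweeps=5):
    """Start from a partial or noisy cue and let the network clean it up."""
    state = list(cue)
    n = len(state)
    for _ in range(sweeps):
        state = [1 if sum(weights[i][j] * state[j] for j in range(n)) >= 0 else -1
                 for i in range(n)]
    return state

# Two hypothetical memories, chosen to be orthogonal to each other.
MEMORY_1 = [1, 1, 1, 1, -1, -1, -1, -1]
MEMORY_2 = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hebbian([MEMORY_1, MEMORY_2])

noisy_cue = list(MEMORY_1)
noisy_cue[0] = -noisy_cue[0]                   # corrupt one element of the cue
print(recall(weights, noisy_cue) == MEMORY_1)
```

The memory is retrieved from part of its own content, not from an address, and both memories are stored in the same set of weights.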

Connectionism: Weaknesses

Connectionists' best ideas:

1. incremental learning: backpropagation

this is not so different from behaviorism

2. distributed representations

this is novel and powerful

3. parallel constraint satisfaction

this is novel, powerful, and more like Gestalt psychology than like behaviorism.

Note that there are non-connectionist ways of doing constraint satisfaction too.

Representational limitations

1. local representations do not have a full set of logical operators. And = symmetric excitatory link. Not-both = symmetric inhibitory link. If-then = asymmetric excitatory link. But OR and other possibilities are not easily represented.

2. distributed representations have difficulty representing relations, e.g. (bite (dog, boy)) versus (bite (boy, dog)). There are, however, attempts to overcome this, using temporal representations. But the "variable binding problem" is not solved.

Conclusion: connectionist models are currently less powerful informationally than logic, rules, frames, images.

Computational limitations

1. Neural networks are Turing complete, i.e. they can do everything a Turing machine (logical model of computing) can do, so they can do any computation.

But this is irrelevant to the question of what computations they can do easily and naturally.

2. While connectionist models are very good for learning patterns and doing parallel constraint satisfaction, they have not yet been applied to complex sequential or analogical problem solving. Need for hybrid models, e.g. CARE.

Neurophysiological implausibility

1. The brain is much more complex than current networks:

far more neurons, connections
more than just connections: hormonal action
modularity

2. Neurons are not like units. Neuronal influences depend on:

Changes in fidelity of synaptic transmission.
Changes in type of neurotransmitter.
Changes in signal-noise ratio.
Changes in neuronal excitability.
Importance of neural synchrony.

3. Some connectionist mechanisms are neurally implausible:

Combinations of excitatory and inhibitory links.
Nothing like backpropagation is known to occur in the brain.

4. But artificial neural network models are becoming more neurologically realistic, e.g. by using spiking neurons.

Psychological implausibility

1. Backpropagation learning too slow, too many trials. Lots of learning is much easier and faster than McClelland's learning principle would allow:

"Adjust the parameters of the mind in proportion to the extent to which their adjustment can produce a reduction in the discrepancy between expected and observed events."

(in VanLehn, ed., Architectures for Intelligence, p. 45).

2. Some phenomena really are better described in terms of high-level rules:

See Smith et al. in Cognitive Science, 1992. E.g. modus ponens, causal rules, statistical rules, linguistic rules (see next week).

3. Connectionist models of learning may not simulate human flexibility very well: e.g. getting anchored in biased positions that they cannot recover from.

Conclusion

Possible positions

a) Connectionism is the new dominant paradigm in cognitive science.

b) Connectionism is a big mistake.

c) There are lots of important ideas in connectionism that need to be integrated with a broader view of the mind that includes explicit, symbolic representations.

My conclusion so far:

There is no single unified theory of cognition. All approaches have definite merits and definite limitations.

The representational/computational view of mind has been very productive, but no particular dominant view has emerged.

How to react to this situation?

1. Proselytize: defend one approach (e.g. rules, connections) to the limit.

2. Hybridize: develop computational models that combine different techniques. Much work like this is going on.

E.g. PI: rules + concepts + analogs

ARCS: logic + connectionism

Neuro-SOAR: rules + connectionism.

3. Synthesize: produce a unified theory of mind that truly unifies the different approaches. Look to theoretical neuroscience to provide a general theory that encompasses symbolic, connectionist, and dynamic systems approaches. Eliasmith: Moving beyond metaphors.

4. Criticize: interpret the incompleteness of all current approaches as a sign of the inherent inadequacy of the representational/computational view of mind.

Key points in Rumelhart

Brain-style computing differs from previous computational approaches to cognition: knowledge is in the connections.
A connectionist system includes: connected processing units, activations, activation rule, learning rule.
Connectionist models perform constraint satisfaction naturally.
Connectionist models are good for pattern matching and content-addressable memory.
Simple learning mechanisms allow connectionist networks to adapt to their environments.

Computational Epistemology Laboratory.

Paul Thagard

This page updated Feb. 23, 2015