PHIL/PSYCH 256, Weeks 7-8

Connectionism: Introduction

History

1. Cajal, c. 1900: identification of neurons

2. 1950's: some of the first computational models were inspired by ideas about neural structure

E.g. Minsky, 1951

3. Perceptrons: Rosenblatt, 1962

Simple learning machines

4. Minsky & Papert 1969

- showed inherent limitations in what perceptrons could learn

- AI in the 1970's emphasized knowledge representation

- only a few diehards continued doing neuronal modelling

5. But: Hinton and Anderson, 1981 book.

Parallel Models of Associative Memory

Revival of connectionism: units + links

6. Ideas picked up by Rumelhart and McClelland, who started simulating psychological results using networks

7. c. 1985: Rediscovery of back propagation by Rumelhart et al.: more powerful learning method using networks with multiple layers.

Parallel Distributed Processing: PDP: 1986 books.

8. Claims for revolution in cognitive science:

symbolism versus connectionism?

9. Deep learning

Local ("localist") versus distributed representations:

1. Local: each unit (neuron) represents something cognitive like a concept or a proposition.

2. Distributed: concepts, propositions, images, etc. are distributed over multiple units

Examples of local representations:

Necker cube

Key idea: parallel constraint satisfaction. Coming up with a coherent interpretation requires integrating positive and negative constraints.

Use neuron-like units (nodes) to represent hypotheses about what vertices of the cube are front and back.

Link compatible hypotheses by excitatory links (positive constraints).

Link incompatible hypotheses by inhibitory links (negative constraints).

Spread activation around network until it settles.
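A minimal sketch of how such a network might settle, assuming a simplified cube with two rival readings (A and B) of four hypothesis units each. The unit names, link weights, decay rate, starting activations, and the particular update rule are illustrative assumptions, not values taken from any published Necker cube model.

```python
from itertools import combinations

def build_weights(units, excitatory, inhibitory, excite=0.1, inhibit=-0.2):
    """Symmetric links: positive for compatible pairs, negative for rivals."""
    weights = {u: {} for u in units}
    for a, b in excitatory:
        weights[a][b] = weights[b][a] = excite
    for a, b in inhibitory:
        weights[a][b] = weights[b][a] = inhibit
    return weights

def settle(weights, activation, decay=0.05, max_steps=500, tol=1e-4):
    """Update all units in parallel until no activation changes by more than tol."""
    act = dict(activation)
    for _ in range(max_steps):
        new = {}
        for unit in act:
            net = sum(w * act[other] for other, w in weights[unit].items())
            # Positive input pushes toward the ceiling (1), negative toward the floor (-1).
            room = (1.0 - act[unit]) if net > 0 else (act[unit] + 1.0)
            new[unit] = max(-1.0, min(1.0, act[unit] * (1 - decay) + net * room))
        if max(abs(new[u] - act[u]) for u in act) < tol:
            return new
        act = new
    return act

# Hypothetical units: A1..A4 belong to one reading of the cube, B1..B4 to the other.
reading_a = ["A1", "A2", "A3", "A4"]
reading_b = ["B1", "B2", "B3", "B4"]
units = reading_a + reading_b
excitatory = list(combinations(reading_a, 2)) + list(combinations(reading_b, 2))
inhibitory = list(zip(reading_a, reading_b))  # A1 vs B1, A2 vs B2, ...
weights = build_weights(units, excitatory, inhibitory)

# Give reading A a slight head start; the network then amplifies the difference.
start = {u: 0.1 for u in reading_a}
start.update({u: 0.05 for u in reading_b})
final = settle(weights, start)
```

With these assumed settings the units of reading A end up strongly active and those of reading B strongly suppressed. Reversing the head start would flip the outcome, which is one way to picture the two stable interpretations of the cube.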

DECO

1. Problem: how to decide what to do?

2. Complicated because you have to learn what your goals are.

3. Positive constraints: actions facilitate goals

4. Negative constraints: can't do everything at once.

Paper on DECO.

ECHO

1. Problem: pick the best representation between competing sets of hypotheses

- explain the most evidence

- be explained: motive

- simplicity: avoid overly complex explanations

- analogy: prefer hypotheses like the ones already known

- explanatory coherence: choose hypotheses on the basis of how well they cohere with others

2. Create network by representing each proposition by a unit. Coherence relations represented by symmetric links.

Evidence units connected to special unit.

3. E.g. who killed the Duke? Butler? Duchess?

4. 3 bottom units all connected to special unit. Activation spreads from evidence units to others that compete. Activation adjusted until all units are stable.

5. Note: weights on links are not learned. Local representation.
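The steps above can be sketched in a few lines, assuming a toy version of the Duke example. Which hypothesis explains which piece of evidence is invented here for illustration (the butler hypothesis explains E1 and E2, the duchess hypothesis explains only E3), and the weights, decay, and starting activations are assumed values rather than the parameters of the actual ECHO program.

```python
def settle(weights, activation, clamped, decay=0.05, max_steps=1000, tol=1e-4):
    """Spread activation over symmetric links until all units are stable."""
    act = dict(activation)
    for _ in range(max_steps):
        new = {}
        for unit in act:
            if unit in clamped:  # the special unit keeps its activation
                new[unit] = act[unit]
                continue
            net = sum(w * act[other] for other, w in weights[unit].items())
            room = (1.0 - act[unit]) if net > 0 else (act[unit] + 1.0)
            new[unit] = max(-1.0, min(1.0, act[unit] * (1 - decay) + net * room))
        if max(abs(new[u] - act[u]) for u in act) < tol:
            return new
        act = new
    return act

# Links are set by hand, not learned (assumed weights).
links = [
    ("SPECIAL", "E1", 0.05), ("SPECIAL", "E2", 0.05), ("SPECIAL", "E3", 0.05),
    ("butler", "E1", 0.04), ("butler", "E2", 0.04),   # butler explains E1, E2
    ("duchess", "E3", 0.04),                          # duchess explains E3
    ("butler", "duchess", -0.06),                     # rival hypotheses compete
]
units = ["SPECIAL", "E1", "E2", "E3", "butler", "duchess"]
weights = {u: {} for u in units}
for a, b, w in links:
    weights[a][b] = weights[b][a] = w

start = {u: 0.01 for u in units}
start["SPECIAL"] = 1.0
final = settle(weights, start, clamped={"SPECIAL"})
```

In this toy network the hypothesis that explains more of the evidence settles at a positive activation (accepted) and its rival at a negative one (rejected).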

Paper on ECHO.

Java implementation of ECHO.

Distributed representations:

Kosslyn octopus analogy

Visual analogy: octopus network.

Input layer: at bottom, noting fish.

Hidden layer

Output layer: inform seagulls

Note on terminology: neural networks = connectionism = PDP

Connectionism: Strengths

Review of Local Representations

1. Each unit has an interpretation.

2. Solution reached by spread of activation

3. Simple example: DECO.

4. This assumes you already know the major constraints.

Distributed representations

1. Inspired by brain: no grandmother neuron.

2. Visual analogy: Kosslyn's octopi.

3. How a network gets trained:

Backpropagation: Use errors at the output level to go back and change the weights.

4. Distributed differs from local:

- weights are learned

- hidden units acquire an interpretation

- concepts are represented by patterns of activation

5. E.g. NetTalk: Rosenberg & Sejnowski

Input: English words

Output: Pronunciation of English words

Hidden: mapping from words to their pronunciation: not rules, mostly.

6. These networks learn to recognize complex patterns. Good for detecting submarines, bombs, etc.
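A rough sketch of backpropagation as described in point 3, assuming a tiny network (2 inputs, 3 hidden units, 1 output) trained on XOR, a task that a single-layer perceptron cannot learn. The network size, learning rate, random seed, and number of epochs are arbitrary choices made for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Compute hidden and output activations; each weight row ends with a bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output

def train_step(x, target, w_hidden, w_out, rate=0.5):
    hidden, output = forward(x, w_hidden, w_out)
    # Error at the output level.
    delta_out = (target - output) * output * (1.0 - output)
    # Send the error back through the output weights to the hidden units.
    delta_hidden = [delta_out * w_out[j] * h * (1.0 - h) for j, h in enumerate(hidden)]
    # Change the weights in proportion to their share of the error.
    for j, h in enumerate(hidden + [1.0]):
        w_out[j] += rate * delta_out * h
    for j, row in enumerate(w_hidden):
        for i, v in enumerate(x + [1.0]):
            row[i] += rate * delta_hidden[j] * v

def total_error(data, w_hidden, w_out):
    return sum((t - forward(x, w_hidden, w_out)[1]) ** 2 for x, t in data)

data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
w_out = [rng.uniform(-1, 1) for _ in range(4)]

error_before = total_error(data, w_hidden, w_out)
for _ in range(5000):
    for x, t in data:
        train_step(x, t, w_hidden, w_out)
error_after = total_error(data, w_hidden, w_out)
```

Nothing in the code says what each hidden unit stands for. Whatever interpretation the hidden units have is acquired through the weight changes, which is the contrast with local representations drawn in point 4.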

Strengths of connectionism:

1. Neural plausibility: brain-style computation.

2. Integration of representation and learning (distributed).

3. Ability to do complex problem solving with soft constraints (localist): analogy (e.g. ACME), hypothesis evaluation.

Problem                 | Elements                     | Positive constraints           | Negative constraints             | Accepted as
------------------------+------------------------------+--------------------------------+----------------------------------+--------------
Truth                   | propositions                 | entailment, etc.               | inconsistency                    | true
Epistemic justification | propositions                 | entailment, explanation, etc.  | inconsistency, competition       | known
Mathematics             | axioms, theorems             | deduction                      | inconsistency                    | known
Logical justification   | principles, practices        | justify                        | inconsistency                    | justified
Ethical justification   | principles, judgments        | justify                        | inconsistency                    | justified
Legal justification     | principles, court decisions  | justify                        | inconsistency                    | justified
Practical reasoning     | actions, goals               | facilitation                   | incompatibility                  | desirable
Perception              | images                       | connectedness, parts           | inconsistency                    | seen
Discourse comprehension | meanings                     | semantic relatedness           | inconsistency                    | understood
Analogy                 | mapping hypotheses           | similarity, structure, purpose | 1-1 mappings                     | corresponding
Cognitive dissonance    | beliefs, attitudes           | consistency                    | inconsistency                    | believed
Impression formation    | stereotypes, traits          | association                    | negative association             | believed
Democratic deliberation | actions, goals, propositions | facilitation, explanation      | incompatible actions and beliefs | joint action

See Thagard and Verbeurgt, Coherence as Constraint Satisfaction.

4. Graceful degradation: not so fragile as symbolic systems: chunks of network can be removed and answer still found.

5. Content addressable memory: retrieve things based on their meaning, not their location as in a computer.

6. Modelling of human performance: Psychological validity. E.g. learning the past tenses of verbs - fit human data.

7. Provides framework for discussion of innateness. See Jeff Elman et al., Rethinking Innateness, 1996.
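Content-addressable memory (point 5) can be illustrated with a small sketch. The notes do not name a specific model, so this assumes a Hopfield-style network with Hebbian weights, and the two stored patterns are arbitrary. The memory is given a damaged copy of a pattern and retrieves the whole pattern from its content, with no address or location involved.

```python
def store(patterns):
    """Hebbian storage: units that are active together get a positive link."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights

def recall(weights, cue, sweeps=5):
    """Update units one at a time until the state stops changing."""
    state = list(cue)
    for _ in range(sweeps):
        changed = False
        for i, row in enumerate(weights):
            field = sum(w * s for w, s in zip(row, state))
            new = state[i] if field == 0 else (1 if field > 0 else -1)
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state

# Two arbitrary patterns over 8 units, coded as +1 / -1.
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
weights = store([p1, p2])

# A cue that matches p1 except for its first element.
cue = [-1, 1, 1, 1, -1, -1, -1, -1]
retrieved = recall(weights, cue)
```

Here the retrieved state is the complete first pattern, even though the cue was only a partial match.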

Connectionism: Weaknesses

Connectionists' best ideas:

1. incremental learning: backpropagation

2. distributed representations

3. parallel constraint satisfaction

Note that there are non-connectionist ways of doing constraint satisfaction too.

Representational limitations

1. local representations do not have a full set of logical operators. And = symmetric excitatory link. Not-both = symmetric inhibitory link. If-then = asymmetric excitatory link. But OR and other possibilities not easily represented.

2. distributed representations have difficulty representing relations, e.g. (bite (dog, boy)) versus (bite (boy, dog)). There are, however, attempts to overcome this, using temporal representations. But the "variable binding problem" is not solved.

Conclusion: connectionist models are currently less powerful informationally than logic, rules, frames, images.

Computational limitations

1. Neural networks are Turing complete, i.e. they can do everything a Turing machine (logical model of computing) can do, so they can do any computation.

But this is irrelevant to the question of what computations they can do easily and naturally.

2. While connectionist models are very good for learning patterns and doing parallel constraint satisfaction, they have not yet been applied to complex sequential or analogical problem solving. Need for hybrid models, e.g. CARE.

Neurophysiological implausibility

1. The brain is much more complex than current networks:

2. Neurons are not like units. Neuronal influences depend on:

3. Some connectionist mechanisms are neurally implausible:

4. But artificial neural network models are becoming more neurologically realistic, e.g. by using spiking neurons.

Psychological implausibility

1. Backpropagation learning too slow, too many trials. Lots of learning is much easier and faster than McClelland's learning principle would allow:

"Adjust the parameters of the mind in proportion to the extent to which their adjustment can produce a reduction in the discrepancy between expected and observed events."

(in VanLehn, ed., Architectures for Intelligence, p. 45).
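One common way to read this principle formally is as gradient descent on an error measure. The symbols below are assumptions introduced for illustration and do not appear in the quoted text: e_k is an expected event, o_k the corresponding observed event, \theta_i a parameter such as a connection weight, and \eta a small learning rate.

```latex
E = \tfrac{1}{2} \sum_k (o_k - e_k)^2,
\qquad
\Delta \theta_i = -\eta \, \frac{\partial E}{\partial \theta_i}
```

On this reading each adjustment is small and proportional to its effect on the discrepancy E, which is why learning under the principle takes many trials.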

2. Some phenomena really are better described in terms of high-level rules:

See Smith et al. in Cognitive Science, 1992. E.g. modus ponens, causal rules, statistical rules, linguistic rules (see next week).

3. Connectionist models of learning may not simulate human flexibility very well: e.g. getting anchored in biased positions that they cannot recover from.

Conclusion

Possible positions

a) Connectionism is the new dominant paradigm in cognitive science.

b) Connectionism is a big mistake.

c) There are lots of important ideas in connectionism that need to be integrated with a broader view of the mind that includes explicit, symbolic representations.

My conclusion so far:

There is no single unified theory of cognition. All approaches have definite merits and definite limitations.

The representational/computational view of mind has been very productive, but no particular dominant view has emerged.

How to react to this situation?

1. Proselytize: defend one approach (e.g. rules, connections) to the limit.

2. Hybridize: develop computational models that combine different techniques. Much work like this is going on.

E.g. PI: rules + concepts + analogs

ARCS: logic + connectionism

Neuro-SOAR: rules + connectionism.

3. Synthesize: produce a unified theory of mind that truly unifies the different approaches. Look to theoretical neuroscience to provide a general theory that encompasses symbolic, connectionist, and dynamic systems approaches. Eliasmith: Moving beyond metaphors.

4. Criticize: interpret the incompleteness of all current approaches as a sign of the inherent inadequacy of the representational/computational view of mind.

Key points in Rumelhart



Computational Epistemology Laboratory.

Paul Thagard

This page updated Feb. 23, 2015