Cognition is the mental faculty by which we know the world, and cognitive psychology is concerned with the acquisition, representation, transformation, and utilization of knowledge by humans (and animals). Learning is the first step in that process.
of human information processing, the mind performs a
sequence of activities:
Some of these associations are innate or inborn, part of the organism's native biological endowment.
A reflex is the simplest possible connection between an
environmental stimulus and an organismic response.
The 19th-century French physiologist Marie-Jean-Pierre Flourens
conducted a series of classic studies of reflexes in the
decorticate pigeon. He removed both lobes of the cerebral
cortex in the bird, and then attempted to determine which
patterns of behavior remained in the repertoire. Certain
behaviors were preserved:
Human beings also come "prewired" with a repertoire of reflexes:
automatic responses to stimulation that appear soon after
birth, before the infant has had any opportunity for learning.
Some of these are reflexes of approach, which are elicited by weak stimuli and have the effect of increasing contact with the stimulus. Among these is rooting: when the infant's cheek is touched, it will turn its head in the direction of the touch and open its mouth; if its mouth makes contact with any object, it will close and begin to suck (this reflex will occur even if the infant is asleep or comatose).
Another reflex of approach is grasping: if the palm of the
hand is touched, the fingers will flex and close around
the object. The grasping reflex can be very strong.
Similarly, if the sole of the foot is touched, the response will be "plantarflexion": the toes will stretch and turn downward.
Other stimulus-response patterns are reflexes of
avoidance, which are elicited by intense or
noxious stimuli, and have the effect of decreasing
contact with the stimulus.
For example, the infant's eyes will close automatically
in response to a bright light, and the mouth will
close at the introduction of an unpleasant taste.
If the palms or soles are scratched, pinched, or pricked, there will be spreading of the fingers or toes, and withdrawal of the hands or feet (in the case of the feet, the toes will also show "dorsiflexion", or turning upward -- the "Babinski reflex").
A very interesting set of behaviors is the stepping reflex. Infants appear to "learn to walk", but this appearance is deceiving. If the infant's body is supported, and it is moved forward along a flat surface, it will show synchronized stepping. If its toes strike the riser of a set of stairs, it will lift its feet. Neonates don't learn to walk: they can't walk because their skeletal musculature has not yet matured enough for them to support themselves.
Despite the large repertoire of reflexes, infants do not show much initiation of directed activity. The behaviors of the young infant are pretty much confined to reflexes, which are gradually replaced with voluntary action.
Reflexes are an important part of the organism's behavioral repertoire,
but they have their limitations.
Reflexes involve relatively small portions of the nervous system. In
principle, the reflex arc requires only three neurons
-- though in practice, spinal reflexes involve entire afferent
and efferent nerves, as well as the spinal cord. Other innate
stimulus-response connections consist of more complicated
action sequences that involve larger portions of the nervous
system and skeletal musculature.
A taxis (plural, taxes) is a gross orientation response:
after presentation of a stimulus, the whole organism turns and
moves. Taxes come in two forms:
There are actually lots of other taxes, which can be observed mostly
at the cellular level:
Taxes and Reflexes in the Neonate Kangaroo
The behavior of the newborn kangaroo illustrates an effective combination of reflexes and taxes. The kangaroo, like all marsupials (e.g., the opossum), has no placenta. The female gives birth after one month of gestation, and carries the developing fetus in a pouch. But how does the fetus get into the pouch?
Immediately after birth, the newborn climbs up the mother's abdomen -- perhaps by virtue of a negative geotaxis. If it reaches the opening of the pouch, it reverses its behavior and climbs in -- maybe a positive geotaxis. If it does not encounter the opening of the pouch, it will continue climbing until it reaches the top, stop -- or maybe fall off -- and eventually die. The mother kangaroo has no way of helping the infant -- the appropriate behaviors simply aren't in her instinctual repertoire, and -- a point I'll expand on later -- she has no opportunity to learn them through trial and error.
Once in the pouch, if the neonate encounters a nipple, it will attach to it and begin to nurse -- probably a variant on the rooting reflex. If not, it will simply stop at the bottom of the pouch and eventually die.
Assuming that all goes well, the baby kangaroo emerges from the pouch after about six more months of gestation.
Note that the neonate gets in the pouch by its own automatic actions, with no assistance from its mother. The behavior is entirely under stimulus control, and if it fails to contact the appropriate stimulus it will simply die.
Other innate behaviors involve more complicated action sequences,
and more specific, discriminating responses. These are known
as instincts or fixed action patterns. Instincts have
several important properties. As a rule, they are:
different levels at which behavior can be analyzed.
A Nobel Prize for Ethology
Three important ethologists, Konrad Lorenz, Nikolaas (Niko) Tinbergen, and Karl von Frisch, won the 1973 Nobel Prize in Physiology or Medicine for their pioneering research on instincts (four years earlier, in 1969, Tinbergen's brother Jan had shared the first Nobel Prize in Economics for his pioneering research on econometrics). For an intellectual biography of Tinbergen, see Niko's Nature: A Life of Niko Tinbergen and His Science of Animal Behaviour by H. Kruuk (2003).
The concept of instinct is well illustrated in Konrad Lorenz's
research on imprinting in newly hatched ducks and
geese. Once out of the egg, the hatchling follows the first
moving object it sees. This is usually the mother, but the
hatchling will also follow a wooden decoy, block of wood on
wheels, or even a human -- provided that it is the first
moving object that the bird sees. The emphasis on the "first"
moving object is somewhat overstated, because there is a
critical period for imprinting: the imprinted object
must be present soon after hatching; if exposure to a moving
object is delayed for several hours or days, imprinting may
not occur at all. If imprinting occurs, the imprinted object
will be followed even under adverse circumstances, over or
around barriers, etc. When the imprinted object is removed
from the bird's field of vision, the bird will emit a distress
call. If imprinting has occurred to an unusual object, that
object will be preferred to the bird's actual parent, or any
other conspecific animal.
The power and perils of imprinting are vividly illustrated by an incident that occurred in Spokane, Washington, in 2009. George Armstrong, a banker, had been watching a female duck nesting on a ledge outside his office window. In the usual course of events, the ducklings would hatch, imprint on their mother, and then follow her as she led them to water. But -- they're on a ledge! And they can't fly yet! The mother duck knew nothing of this. She's built to wait until her eggs have hatched, and then go to water; and ducklings are built to follow her. The mother jumped off the ledge and -- she's built for that, too -- flew down to the street. The chicks were stranded. Armstrong went out on the street, stood below the ledge, and caught each of the ducklings as they stepped off the ledge, instinctually following their mother (actually, he had to collect a couple from the ledge). Then he served as a crossing guard while the mother collected her young and led them to water. The power of imprinting is that the ducklings will follow their mother -- or Konrad Lorenz -- everywhere. The peril of imprinting is that the behavior has been selected for a particular environmental niche -- in the case of ducks, the grassy area near water where they usually nest; if that environment changes, for whatever reason, the instinctive behavior may be very maladaptive.
Link to a video of Armstrong catching the ducks: http://blogs.abcnews.com/theworldnewser/2009/05/the-duck-parade.html (sorry about the ad).
There are actually two kinds of imprinting.
Imprinting is extremely indiscriminate: basically, the bird imprints on the first object that moves within the critical period. However, other instincts are much more discriminating.
Another good example of an instinct is the alarm reaction in some birds subject to predation by other birds (studied by Tinbergen). If an object passes overhead, the birds will emit a distress call and attempt to escape. However, these birds do not show alarm to just any stimulus: it must have a birdlike appearance; moreover, birdlike figures with short (hawklike) necks elicit alarm, while those with long (gooselike) necks do not (the length and shape of the tail and wings are largely irrelevant).
Imprinting and the alarm reaction involve, basically, only one organism. Other instincts involve the coordinated activities of two (or more) species members.
A good example is food-begging in herring gulls (studied by Tinbergen). Hatchling birds don't forage for their own food, but must be fed a predigested diet by their parents. But the parents do not do this of their own accord. Rather, the chick must peck at the parent's bill: the parent then regurgitates food, and presents it to the chick; the chick then grasps the food and swallows it. But the chick will not peck at any bird-bill. Rather, the bill must have a patch of contrasting color on the lower mandible. The precise colors involved do not matter much, so long as the contrast is salient. Food-begging exemplifies the coordination of instinctive behaviors: the patch is the releasing stimulus for the hatchling to peck; and the peck is the releasing stimulus for the parent to present food.
An excellent example of a complex, coordinated sequence of instinctual behaviors is provided by the "zig-zag" dance, part of the mating ritual of the stickleback fish (Tinbergen).
A male stickleback, when it is ready to mate, develops a red coloration on its belly.
It then establishes its territory by fighting off other sticklebacks. But he fights only sticklebacks, not other species of fish; and only males; and only males who display red bellies and enter his territory in the head-down "threat posture" (other colorations indicate that the other male is not ready to mate; other postures indicate that the other male is only passing through the territory; in either case, there is no territorial fighting).
Experiments by Tinbergen, employing "dummy" models of fish, show that it actually doesn't matter much whether the other fish looks like a stickleback, so long as it has a red-colored belly. Sticklebacks without red bellies may enter this fish's territory, because they don't constitute threats.
Other experiments, in which fish were enclosed in capsules to control their orientation, show that a male who elicits aggression when it enters a territory with its head down will not elicit aggression if it enters the territory with its head level -- perhaps indicating that it is just "passing through".
After the territory has been cleared of threatening males, the male builds a nest out of weeds.
Then he entices a female into the nest -- but only a female stickleback who enters his territory with a swollen abdomen, and in the head-up "receptive posture".
The female enters the nest only if the male displays a red belly, and performs a "zig-zag" dance.
Once in the nest, the female spawns eggs -- but only if she is stimulated at her hind quarters.
Once the eggs are laid, the female leaves the nest and the territory.
The male fertilizes the eggs, fans them to maintain an adequate oxygen supply around them, and cares for the young after hatching (until they're ready to go off to school).
When the young are hatched, the red belly fades, and the male no longer incites males or attracts females -- until the next mating cycle starts.
Notice the serial organization to this pattern of stickleback behaviors. It is as if each act is the releasing stimulus for the next one. There is no flexibility in this sequence: once initiated, it does not stop, provided that the appropriate releasing stimulus is present. If any element in the sequence is left out, the entire sequence will stop abruptly. All three parties go through this pattern of behaviors, even if one of them doesn't remotely resemble a stickleback. For example, a female, ready to mate, will enter the nest if she observes a tongue depressor, painted red on one half, imitate the zig-zag dance!
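The serial organization just described can be sketched as a simple chain in which each act is released only if its releasing stimulus is present. This is a minimal illustration, not a model from Tinbergen's work: the names `MATING_SEQUENCE` and `run_sequence` are hypothetical, and the stimulus labels are paraphrased from the description above.

```python
# Each pair is (releasing stimulus, act it releases), in serial order.
# Labels are paraphrases of the stickleback description, chosen for illustration.
MATING_SEQUENCE = [
    ("swollen abdomen, head-up posture", "male performs zig-zag dance"),
    ("red belly and zig-zag dance", "female enters nest"),
    ("stimulation at hind quarters", "female spawns eggs"),
    ("eggs in nest", "male fertilizes eggs"),
]

def run_sequence(sequence, stimuli_present):
    """Return the acts performed before the chain is broken."""
    performed = []
    for releasing_stimulus, act in sequence:
        if releasing_stimulus not in stimuli_present:
            break  # releasing stimulus missing: the whole sequence stops here
        performed.append(act)
    return performed

all_stimuli = {stimulus for stimulus, _ in MATING_SEQUENCE}
complete = run_sequence(MATING_SEQUENCE, all_stimuli)

# Leave out one element and everything downstream of it fails to occur
interrupted = run_sequence(
    MATING_SEQUENCE, all_stimuli - {"stimulation at hind quarters"}
)
print(complete)
print(interrupted)
```

Note that the check is only on whether the stimulus is present, not on who or what supplies it, which is why a painted tongue depressor can stand in for the male.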
Taxes and instincts are important elements in behavior, especially of invertebrates, birds, and reptiles. Some psychologists and behavioral biologists argue that much human behavior is also instinctual in nature. One of the first to make this argument was McDougall, who argued that human behavior was rooted in instinctual behaviors related to biological motives. One of his examples, which is offered here without comment (except to note that similar descriptions could be made of the behavior of men), is reminiscent (at least in tone) of what Tinbergen discovered in sticklebacks:
The flirting girl first smiles at the person to whom the flirt is directed and lifts her eyebrows with a quick, jerky movement upward so that the eye slit is briefly enlarged. Flirting men show the same movement of the eyebrows. After this initial, obvious, turning toward the person, in the flirt there follows a turning away. The head is turned to the side, sometimes bent toward the ground, the gaze is lowered, and the eyelids are dropped. Frequently, but not always, the girl may cover her face with a hand and she may laugh or smile in embarrassment. She continues to look at the partner out of the corners of her eyes and sometimes vacillates between looking at, and looking away.
Among modern biological and social scientists, this point of view is expressed most strongly by the practitioners of sociobiology, especially E.O. Wilson, who argue that much human social behavior is instinctive, and part of our genetic endowment. More recently, similar ideas have been expressed by proponents of evolutionary psychology such as Leda Cosmides, John Tooby, and David Buss. At their most strident, evolutionary psychologists claim that our patterns of experience, thought, and action evolved in an environment of evolutionary adaptedness (EEA) -- roughly the African savanna of the Pleistocene epoch, where Homo sapiens first emerged about 300,000 years ago -- and have changed little since then. Although this assertion is debatable, to say the least, the literature on instincts makes it clear that evolution shapes behavior as well as body morphology. Many species possess innate behavior patterns that were shaped by evolution, permitting them to adapt to a particular environmental niche. Given the basic principle of the continuity of species, it is a mistake to think that humans are entirely immune from such influences -- although humans have other characteristics that largely free us from evolutionary constraints. For a discussion of evolutionary psychology, see the lectures on Psychological Development.
Meanings of "Instinct"
The concept of instinct has had a difficult history in psychology, in part because early usages of the term were somewhat circular: some theorists seemed to invoke instincts to explain some behavior, and then to use that same behavior to define the instinct. But, in the restricted sense of a complex, discriminative, innate response to some environmental stimulus, the term has retained some usefulness. For example, the psychologist Steven Pinker has referred to language as a human instinct.
Nevertheless, the term instinct has evolved a number of different meanings, as outlined by the behavioral biologist Patrick Bateson (Science, 2002):
Bateson correctly notes that one meaning of the term does not necessarily imply the others. Taken together, however, the various meanings capture the essence of what is meant by the term "instinct".
Innate response tendencies such as food-begging can be very powerful behavioral mechanisms, especially for invertebrates and nonmammalian vertebrate species. In their natural environment, some species seem to live completely by virtue of reflex, taxis, and instinct.
But at the same time, these innate behavioral mechanisms are extremely limited. They have been shaped by evolution to enable the species to fit a particular environmental niche, which is fine so long as the niche doesn't change. When the environment does change, evolution requires an extremely long time to change behavior (or body morphology, for that matter) accordingly -- much longer than the lifetime of any individual species member.
Consider, for example, the behavior of newborn sea turtles. Female turtles lay their eggs on the beach above the tide line, and these eggs hatch at night in the absence of the parents. As soon as they have hatched, the hatchlings begin walking toward the water (what you might call a "positive aquataxis"): when they reach it, they begin to swim (another innate behavior), and live independently. However, the young turtles are not really walking toward the water: they are walking toward the reflection of the moon on the water (thus, a positive phototaxis). This hatching behavior evolved millions of years ago. Since then, however, the beaches where the turtles hatch have become crowded with hotels, marinas, oil refineries, and other light sources. Accordingly, these days, the hatchling turtles will also move toward these light sources, and die before they ever reach water. The animals' behavior evolved when the only light in the environment was from the sun and the moon, and they just don't know any better. In order to prevent a disaster, beachside hotels and oil refineries now take steps to employ different kinds of light, or block their lights entirely.
Now, perhaps there is some subtle
difference (like polarization) between moonlight and
electrical light. If so, individual animals who can make
this distinction, moving toward one and not the other, will
survive, reproduce, and, over time, generate more individuals
who can make this distinction. But again this takes time
-- assuming that any individual can make the distinction in
the first place. But even so, each individual gets only
one chance. If it makes the right "choice", this
behavioral tendency will pass on to successive generations,
and the species may eventually come to distinguish between
"good" and "bad" light -- provided that the species doesn't go
extinct first. But that just illustrates the point that
evolved behavior patterns take a very long time to change.
In June 2011, a group of diamondback terrapins caused the temporary shutdown of Runway 4 Left at New York's Kennedy International Airport. And it's happened before. The runway crosses a path that the turtles take from Jamaica Bay on one side to lay their eggs on the sandy beach on the other side. Usually, in egg-laying season, the runway is not in frequent use, due to prevailing winds. But that day was an exception, and the turtles brought takeoffs and landings to a halt for about an hour until they could be moved to their destination (we don't know what happened when they tried to get back in the water). It's another example of the difficulty that animals have in adjusting evolved patterns of behavior to rapidly changing environmental circumstances. (See "Delays at JFK? This Time, Blame the Turtles" by Andy Newman, New York Times 06/30/2011).
Here's another example: seabirds, like albatrosses, feed their young through the same sort of instinctual food-begging shown by herring gulls. Adult albatrosses forage over open water, dive to catch fish swimming near the surface, and then regurgitate the fish into the mouths of their young. But it's not only fish that are near the surface. There's a lot of garbage in the ocean, as well. The birds don't know the difference -- they're operating solely on reflex. That garbage is of relatively recent vintage, so there hasn't been enough time -- assuming it were even possible -- for the birds to evolve a distinction between fish and garbage. The result is that adult albatrosses pick up garbage and regurgitate it into the bills of their chicks, who promptly die of starvation -- such as this albatross chick photographed on Midway Atoll in the Pacific.
Here's yet another example, a little closer to home.
Wind farms like the one in Altamont Pass produce a large
amount of electrical energy for California, reducing carbon
emissions from coal-fired plants, and our dependence on Middle
East oil. But they also create a hazard for birds,
especially raptors, who like to forage for small mammals over
open areas. Never mind that wind farms are built where
there is strong, steady wind, and therefore often on migratory
flight paths. The result is that a large number of
raptors and other birds are killed every year because they run
into the blades of the windmills.
In general, we can identify several limitations on innate
Ecologists and evolutionary biologists are becoming increasingly aware of the problems caused by rapid environmental change. The United Nations Summit on Sustainable Development, held in Johannesburg, South Africa, in 2002, drew international attention to the fact that "nature", far from being "natural", has in fact been remade by human hands. According to Andrew C. Revkin, "People have significantly altered the atmosphere, and are the dominant influence on ecosystems and natural selection" (see his article, "Forget Nature. Even Eden is Engineered", and other articles in a special section on "Managing Planet Earth", New York Times, 08/20/02). Even in the early part of the 20th century, Revkin notes, the geochemist Vladimir I. Vernadsky had suggested that "people had become a geological force, shaping the planet's future just as rivers and earthquakes had shaped its past". Now in the 21st century, with the growth of megacities, the increase in population, and the disappearance of the forests, to name just a few trends, we are beginning to recognize, and deal with, the impact of human activity on the environment.
The human impact on the environment doesn't just affect the conditions of human existence. Nature is a system, and what we do affects animal and plant life as well, and sometimes in nonobvious ways.
In a recent paper in Trends in Ecology & Evolution (10/02), Paul W. Sherman and his colleagues, Martin A. Schlaepfer and Michael C. Runge, detail a number of "evolutionary traps", mostly caused by the impact of human activity which alters the natural environment -- activity which goes beyond the simple destruction of habitat, which would be bad enough. More subtle changes alter the environment in such a way that a species' evolved patterns of behavior are no longer adaptive, reducing the chances of individual survival and reproduction, and eventually leading to the decline and extinction of the species as a whole. As Sherman puts it, "Evolved behaviors are there for adaptive reasons. If we [disrupt] the normal environment, we can drive a population right to extinction" ("Trapped by Evolution" by Lila Guterman, Chronicle of Higher Education, 10/18/02).
The concept of evolutionary trap is a variant on the more established notion of an ecological trap, in which animals are misled, through human environmental change, to live in less-than-optimal habitats, even though more suitable habitats are available to them. For example, Florida's manatees have progressively moved north, attracted by the warm water discharged by power plants; but when the plant goes down for maintenance, the water cools to an extent that they can no longer survive in it.
Some examples of evolutionary traps:
In vertebrates, and especially mammalian species, everyday action goes beyond such innate behavior patterns. These organisms can also acquire new patterns of behavior through learning.
Psychologists define learning as:
a relatively permanent change in behavior that occurs as a result of experience.
This definition excludes changes in behavior that occur as a result of insult, injury, or disease, the ingestion of drugs, or maturation. Learning permits individual organisms, not just entire species, to acquire new responses to new circumstances, and thereby to add behaviors to the repertoire created by evolution. In addition, social learning permits one individual species member to share learning with others of the same species (this is one definition of culture). The pace of social learning far outstrips that of evolution, so that learning provides a mechanism for new behavioral responses to spread quickly and widely through a population. Although all species are capable of learning, at least to some degree, learning is especially important in the natural lives of vertebrate species, and especially in mammalian vertebrates. Like us. And, it turns out, most human learning is social learning: we learn from each other's experiences, and we have even developed institutions, like libraries and schools, that enable us to share our knowledge with each other.
For a good treatment of instinctual behavior, see N. Tinbergen, The Study of Instinct (1969).
For a positive treatment of sociobiology, see E.O. Wilson, Sociobiology: The New Synthesis (1975).
For extensions of sociobiology to psychology, see The Adapted Mind: Evolutionary Psychology and the Generation of Culture edited by Jerome H. Barkow, Leda Cosmides, and John Tooby (1992), and Evolutionary Psychology: The New Science of the Mind (1999) by David M. Buss.
One important form of learning, classical conditioning, was accidentally discovered by Ivan P. Pavlov, a Russian physiologist who was studying the physiology of the digestive system in dogs (work for which he won the Nobel Prize in Physiology or Medicine in 1904). Pavlov's method was to introduce dry meat powder to the mouth of the dog, and then measure the salivary reflex which occurs as the first step in the digestive process. Initially, Pavlov's dogs salivated only when the meat powder was actually in their mouths. But shortly, they began to salivate before the powder was presented to them -- just the sight of the powder, or the sight of the experimenter, or even the sound of the experimenter walking down the hallway, was enough to get the dogs to salivate. In some sense, this premature salivation was a nuisance. But Pavlov had the insight that the dogs were salivating to events that were somehow associated with the presentation of the food. Thus, Pavlov moved away from physiology and initiated the deliberate study of the psychic reflex -- not, as the term might suggest, something out of the world of parapsychology, but rather a situation where the idea of the stimulus evokes a reflexive response. Pavlov called these responses conditioned (or conditional) reflexes.
In honor of Pavlov's discovery, this form
of learning is now called "classical" conditioning. A
classical conditioning experiment involves the repeated
pairing of two stimuli, such as a bell and food powder.
One of these stimuli naturally elicits some reflex, while the
other one doesn't. With repeated pairings, the
previously neutral stimulus gradually acquires the power to
evoke the reflex. Thus, classical conditioning is a
means of forming new associations between events (such as the
ringing of a bell and the presentation of meat powder) in the
environment. The apparatus for Pavlov's experiments included a special harness
to restrict the dog's movement; a tube (or fistula)
placed in its mouth to collect saliva; a mechanical device for
introducing meat powder to its mouth; and some kind of signal
such as a bell. (Some writers have questioned whether
Pavlov actually used a bell, as the myth has it. Pavlov
was actually unclear on this detail in his own writing.
But a 1997 article by the American psychologist R.K. Thomas
documented this historical tidbit conclusively).
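The procedure just described illustrates the basic vocabulary of classical conditioning: the unconditioned stimulus (US) and unconditioned response (UR), and the conditioned stimulus (CS) and conditioned response (CR).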
On later trials, we begin to observe a response that resembles the UR, occurring after presentation of the CS but before presentation of the US. This is the first appearance of the CR.
Even later, we may observe the CR immediately after the presentation of the CS, well before the presentation of the US.
The characteristic curve portraying the acquisition of the CR is an ogive, in which there is a slow increase in response strength on the initial trials, followed by a rapid increase in middle trials, and a further slow increase in the last trials.
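The ogive shape can be made concrete with a small numerical sketch. The code below assumes a logistic function as the S-shaped curve; the function name `cr_strength` and the parameter values (asymptote, midpoint, steepness) are hypothetical choices for illustration, not values taken from Pavlov's data.

```python
import math

def cr_strength(trial, asymptote=1.0, midpoint=10.0, steepness=0.5):
    """Hypothetical S-shaped (logistic) acquisition curve for CR strength."""
    return asymptote / (1.0 + math.exp(-steepness * (trial - midpoint)))

# CR strength on acquisition trials 0 through 20
curve = [cr_strength(t) for t in range(21)]

# Trial-to-trial gains in CR strength
gains = [curve[t + 1] - curve[t] for t in range(20)]
early, middle, late = gains[0], gains[9], gains[19]

print(f"early gain {early:.3f}, middle gain {middle:.3f}, late gain {late:.3f}")
```

The gain per trial is small at the start, largest in the middle trials, and small again at the end, which is the slow-fast-slow pattern that defines the ogive.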
Actually, learning can occur even before a CS is paired
with a US. When a novel
stimulus (NS), such as Pavlov's bell,
is presented for the very first time, the organism will show a
reflexive orienting response (OR) -- perhaps a startle
response -- to that stimulus. But if that stimulus is
presented repeatedly, all by itself, the
magnitude of the OR will progressively diminish. This is
known as habituation. It counts as
learning because there is a change in
behavior -- in this case, a change in the OR -- that occurs
as a result of experience. Habituation
is the very simplest form of learning, and
has been observed in animals as simple as protozoa (Penard,
1947) -- and since protozoa are
one-celled creatures, you can't get any simpler than that!
If the NS is now paired with a US, so that the NS becomes a CS, conditioning will occur. However, the CR will be acquired at a slower rate than if there had been no prior habituation trials. This phenomenon is known as latent inhibition (Lubow & Moore, 1959).
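Habituation and latent inhibition can be illustrated together in a short sketch. Two assumptions are built in, and neither comes from the studies cited above: the orienting response is assumed to shrink by a fixed proportion on each unpaired presentation, and the rate of conditioning is assumed to be scaled by how much orienting response remains. The function names and all numbers are hypothetical.

```python
def habituate(or_magnitude, presentations, decay=0.8):
    """Shrink the orienting response on each unpaired presentation (assumed rule)."""
    history = []
    for _ in range(presentations):
        history.append(or_magnitude)  # response shown on this presentation
        or_magnitude *= decay
    return or_magnitude, history

def acquire(attention, trials, base_rate=0.3):
    """Grow CR strength toward 1.0, with a learning rate scaled by attention (assumption)."""
    cr = 0.0
    for _ in range(trials):
        cr += base_rate * attention * (1.0 - cr)
    return cr

# Group 1: no pre-exposure, so the stimulus is still novel at the first pairing
novel_cr = acquire(attention=1.0, trials=5)

# Group 2: ten unpaired pre-exposures habituate the orienting response first
remaining_or, or_history = habituate(1.0, presentations=10)
preexposed_cr = acquire(attention=remaining_or, trials=5)

print(f"CR after 5 pairings, novel CS: {novel_cr:.2f}")
print(f"CR after 5 pairings, pre-exposed CS: {preexposed_cr:.2f}")
```

After the same number of pairings, the pre-exposed group ends up with the weaker CR, which is the pattern that latent inhibition describes.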
Extinction is the process by which the CS loses the power to evoke the CR. Extinction occurs by virtue of unreinforced presentations of the CS -- that is, presentation of the CS alone, without subsequent presentation of the US. When the CS is no longer paired with the US, the CR loses strength relatively rapidly.
On the first extinction trial, there is a strong CR: after all, the organism does not yet "know" that the US has been omitted.
On later trials, the magnitude of the CR falls off,
until the CR disappears entirely.
Although the CR loses strength rapidly on extinction trials, it is not lost entirely: it is possible to demonstrate that the CR is still present, in a sense, even after it seems to have disappeared.
Habituation can be
thought of as a special case of extinction, in that the organism
learns not to respond to the NS.
Spontaneous recovery is the unreinforced revival of the conditioned response. If, after extinction has been completed, we allow the animal a period of inactivity, unreinforced presentation of the CS will evoke a CR. This CR will be smaller in magnitude than that observed at the end of the acquisition phase, but CR strength will increase with the length of the "rest" interval.
If we continue with unreinforced presentations of the CS, the spontaneously recovered CR will diminish in strength -- it is extinction all over again.
If we continue with new reinforced presentations of the CS, the CR will grow in strength. The reacquisition of a previously extinguished CR is typically faster than its original acquisition, a difference known as savings in relearning.
During extinction, formal extinction trials can continue after the CR has disappeared, a situation known as extinction below zero. Of course, there is no further visible effect on the CR -- it is already at zero strength. However, extinction below zero has two palpable consequences: spontaneous recovery is reduced (though not eliminated), and reacquisition is slower (but still possible).
Spontaneous recovery, savings in relearning, and extinction below zero have important implications for our understanding of the nature of extinction. Extinction is not the passive loss of the CR: the organism does not "forget" the original association between CS and US, and extinction does not return the organism to the state it was in before conditioning occurred. Spontaneous recovery and savings in relearning are expressions of memory, and they show clearly that the association between CS and US has been retained, even though it is not always expressed in a CR. It seems clear, then, that the CR is retained but actively suppressed. Extinction does not result in a loss of the CR, but rather imposes an inhibition on the CR. The strength of the inhibition grows with trials, producing the phenomenon of extinction below zero. The inhibition also dissipates over time, producing spontaneous recovery. Thus, reacquisition isn't really relearning. Rather, it is a sort of disinhibition. Both acquisition and extinction, learning and unlearning, are active processes by which the organism learns the circumstances under which the CS and the US are linked.
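The inhibition account can be made concrete with a toy model. What follows is only a minimal sketch, not a model taken from the conditioning literature: it assumes that the expressed CR equals a retained excitation minus an inhibition (floored at zero), that each unreinforced trial adds a fixed step of inhibition, and that a fixed fraction of the inhibition dissipates during a rest period. All function names and parameter values are illustrative assumptions.

```python
def expressed_cr(excitation, inhibition):
    """Observable CR: the retained association minus inhibition, never below zero."""
    return max(0.0, excitation - inhibition)


def extinguish(inhibition, trials, step=0.2):
    """Each unreinforced CS presentation adds a fixed amount of inhibition (assumed)."""
    return inhibition + step * trials


def rest(inhibition, fraction_lost=0.5):
    """A rest interval lets part of the inhibition dissipate (assumed fraction)."""
    return inhibition * (1.0 - fraction_lost)


excitation = 1.0  # the CS-US association, left intact by extinction

# Ordinary extinction: with these values the CR reaches zero after 5 trials.
inhibition = extinguish(0.0, trials=5)
cr_after_extinction = expressed_cr(excitation, inhibition)
cr_after_rest = expressed_cr(excitation, rest(inhibition))

# Extinction below zero: 3 further trials produce no visible change in the CR...
deep_inhibition = extinguish(inhibition, trials=3)
cr_below_zero = expressed_cr(excitation, deep_inhibition)
# ...but leave more inhibition to dissipate, so recovery after rest is smaller.
cr_after_rest_deep = expressed_cr(excitation, rest(deep_inhibition))

print("after extinction:", cr_after_extinction, "after rest:", cr_after_rest)
print("below zero:", cr_below_zero, "after rest:", cr_after_rest_deep)
```

In this sketch the excitation never changes; only the inhibition does. That is the sense in which extinction suppresses the CR rather than erasing it.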
Other major phenomena of classical conditioning can be observed once the conditioned response has been established. For example, the organism may show generalization of the CR to new test stimuli, other than the original CS, even though there have been no acquisition trials on which these new stimuli have been associated with the US. The extent to which generalization occurs is a function of the similarity between the test stimulus and the original CS.
The generalization gradient is an orderly arrangement of stimuli along some physical dimension (such as the frequency of an auditory stimulus). The more closely the test stimulus resembles the original CS, the greater the CR will be. The generalization gradient provides one check on generalization: having been conditioned to respond to one stimulus, the organism will not respond to any and all stimuli. Response is greatest to test stimuli that most closely resemble the original CS.
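A generalization gradient can be sketched numerically. The example below assumes, purely for illustration, that CR strength falls off as a Gaussian function of the distance in frequency between the test tone and the CS. The bell shape, the 250-cycle CS, and the width parameter are assumptions of this sketch; real gradients are determined empirically.

```python
import math


def generalized_cr(test_hz, cs_hz=250.0, peak=1.0, width_hz=60.0):
    """CR strength to a test tone, assuming a Gaussian gradient centered on the CS."""
    distance = test_hz - cs_hz
    return peak * math.exp(-(distance ** 2) / (2.0 * width_hz ** 2))


# The CR is strongest at the CS itself and weakens as the test tone moves away.
for tone in (250, 300, 350):
    print(tone, round(generalized_cr(tone), 3))
```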
Generalization, Frequency, and Musical Pitch
In discussing generalization of response among stimuli, it is easiest to use the example of the frequency of tones, because differences in frequency -- whether a tone is high or low -- are easy to appreciate. And the example is accurate so far as it goes. If you condition an animal to a tone CS of 250 cycles per second (cps; also known as hertz, abbreviated Hz, after the physicist Heinrich Rudolf Hertz, 1857-1894), it will emit a stronger conditioned response to a tone of 300 Hz than to one of 350 Hz -- because a tone of 300 Hz more closely resembles a tone of 250 Hz than does a tone of 350 Hz.
With humans, though, things
can get a little more complicated, because musical
pitch is also related to the frequency of tones,
but similarity among pitches is not just a matter
of relative frequency.
Thus, when tones are presented in the context of the diatonic scale familiar in Western music, the generalization gradient may be distorted by the vicissitudes of pitch similarities.
Consider an experiment in which a subject is initially conditioned to respond to a tone of 262 Hz, roughly corresponding to Middle C. Such a subject may well show larger conditioned responses to tones of 524 Hz (roughly 3rd-space C), 392 Hz (second-line G), and 330 Hz (1st-line E), than to either B-flat (233 Hz) or D (292 Hz), even though the former tones are more distant from the original CS, in terms of frequency, than the latter.
However, this may only occur if we establish a musical context for the tones in the first place -- for example, by embedding the C in the other pitches of the diatonic scale, or by beginning the experiment by playing a tune in the key of C major. There are some experiments here....
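The contrast between distance in frequency and distance in pitch can be checked with a little arithmetic. The sketch below assumes equal-tempered tuning, in which an interval in semitones is 12 times the base-2 logarithm of the frequency ratio, and it takes the E above Middle C to be roughly 330 Hz. Both are assumptions of this example rather than details of any particular experiment.

```python
import math


def frequency_distance(test_hz, cs_hz=262.0):
    """Plain physical distance from the CS, in cycles per second."""
    return abs(test_hz - cs_hz)


def semitones_from_cs(test_hz, cs_hz=262.0):
    """Musical interval from the CS in equal-tempered semitones (assumed tuning)."""
    return 12.0 * math.log2(test_hz / cs_hz)


TONES = {"C (octave above)": 524.0, "G": 392.0, "E": 330.0, "B-flat": 233.0, "D": 292.0}

for name, hz in TONES.items():
    print(name, frequency_distance(hz), round(semitones_from_cs(hz), 1))
```

B-flat and D are the nearest tones in frequency, but the C, E, and G stand in the octave, major-third, and fifth relations that make up the C major triad. That is the kind of pitch similarity that could distort a purely frequency-based gradient.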
Discrimination provides a further check on generalization. Consider an
experiment in which we present two previously neutral
stimuli: one, the CS+, is always reinforced by
the unconditioned stimulus; the other, the CS-, is
never reinforced. As conditioning proceeds, the CS+ will
come to elicit the CR, but the CS- will not acquire this
power. If the CS+ and CS- are close to each other on the
generalization gradient, both will initially elicit a
conditioned response. But as conditioning proceeds, the
CR to the CS+ will grow in strength, while the CR to the CS-
will extinguish. The CR is only elicited by CSs that are
actually associated with the US.
Habituation is a very primitive form of learning.
responses can also appear even if they are very dissimilar to
the original conditioned stimulus. Consider the
phenomenon known as sensory preconditioning, which
occurs before acquisition trials in which a CS is paired with
happens in higher-order conditioning, except that the
first two phases are reversed, so that higher-order
conditioning occurs after acquisition trials in which
CS is paired with US.
By means of acquisition, extinction, generalization, discrimination, sensory preconditioning, and higher-order conditioning, stimuli come to evoke and inhibit reflexive behavior even though they may not have been directly associated with an unconditioned stimulus. By means of classical conditioning processes in general, reflexive responses come under the control of environmental events other than the ones with which they are innately associated.
The phenomena of classical conditioning are ubiquitous in nature, occurring in organisms as simple as the sea mollusk and as complicated as the adult human being. Pavlov himself thought that all learning entailed classical conditioning, but this position is too extreme. Still, classical conditioning is important because, in a very real sense,
Classical conditioning underlies many of our emotional responses to events -- our fears and aversions, our joys and our preferences.
The Physiological Basis of Learning
The ability to learn -- to change one's behavior as a result of experience -- obviously must reflect changes in the organism's nervous system, and indeed the ability to learn is an important example of the plasticity of the nervous system -- the ability of the nervous system to be modified. But what exactly is going on in the nervous system when an organism learns something?
The fact that at least some phenomena of classical conditioning can be observed in every organism that has a nervous system has allowed behavioral neuroscientists to gain important insight into precisely how the nervous system is modified when organisms learn something. In work that won the Nobel Prize in Physiology or Medicine in 2000 (shared with Arvid Carlsson and Paul Greengard), Eric Kandel of Columbia University examined synaptic changes in the marine mollusk, Aplysia, as it acquired a simple conditioned response.
The most important of these changes is long-term potentiation, an increase in the sensitivity of a postsynaptic neuron as a result of repeated stimulation by a presynaptic neuron. This is the neural representation of a simple association -- an association between neurons that is created as a result of repeated pairing of CS and US.
At roughly the same
time as Pavlov was beginning to study classical conditioning,
E.L. Thorndike, an American psychologist at Columbia
University, was beginning to study yet another form of
learning -- what has come to be known as instrumental
conditioning. Beginning in 1898, Thorndike
reported on a series of studies of cats in "puzzle
boxes". The animals were confined in cages whose doors
were rigged to a latch which could be operated from
inside the cage. The animal's initial response to this
situation was agitation -- particularly if it was hungry and a
bowl of food was placed outside the cage. Eventually,
though, it would accidentally trip the latch, open the door,
and escape -- at which point it would be captured and placed
back in the cage to begin another trial.
Over successive trials, Thorndike observed that the latency of the escape response progressively diminished. Apparently, the animals were learning how to open the door -- a learning which seemed to be motivated by reward and punishment.
On the basis of his studies of cats in puzzle boxes, Thorndike
formulated a set of 8 Laws of Learning, of which three
are particularly important for our purposes:
Beginning in the 1930s, the study of instrumental conditioning was taken up by B.F. Skinner, a radical behaviorist. Behaviorism was a school of psychology founded by John B. Watson, then at Johns Hopkins University, who believed that psychology could become a legitimate science only by eliminating references to hypothetical mental states (which cannot be publicly observed) and confining the analysis to the relations between publicly observable behavior and the publicly observable environmental conditions under which it is observed. (Watson was forced to resign from Hopkins over a sexual scandal, and went on to a career in advertising. He invented the notion of the "coffee break" as a promotion for Maxwell House Coffee.) Like Watson, Skinner thought that behavior could be, and should be, explained solely in terms of the associations between stimuli and responses, and without reference to hypothetical states (such as hunger) existing in a hypothetical mind of an organism (including humans). Thus the term S-R behaviorism. Skinner was something of a visionary, and he is famous for his utopian novel, Walden Two, which describes a community organized along behaviorist lines (he was an English major in college, contemplated a career as a writer, and indeed wrote some very beautiful stuff); and for his meditation on human nature, Beyond Freedom and Dignity. Both are very provocative books. A collection of Skinner's scientific papers, most of which are very readable, is entitled Cumulative Record.
A Note on Two "Functionalisms"
Tracing the relations between environmental stimuli (inputs) and organismic responses (outputs) is often called functional behaviorism, or simply functionalism, but this brand of functionalism (which is currently popular among some philosophers of mind and some theorists in artificial intelligence, a branch of cognitive science) should be clearly distinguished from the 19th-century "Chicago functionalism" of John Dewey and James Rowland Angell (Angell was, however, Watson's graduate mentor), which had its roots in the work of William James and which underlies this course.
Skinner transformed Thorndike's apparatus into what has become known as the
Skinner box, though Skinner himself did not use the term
and actually disliked it. He preferred the term operant
chamber. A generic operant chamber, intended to house an
animal during learning trials, includes lights for presenting
signals, levers or keys for collecting responses, a hopper for
presenting food pellets, and a floor grid for presenting
The "Superstition" Experiment
B.F. Skinner demonstrated the power of Thorndike's Law of Effect with the following classic "superstition" experiment. A food-deprived (remember, if you're a behaviorist you can't say hungry) pigeon was placed in an operant chamber. As pigeons are wont to do, it displayed a variety of random pigeon behaviors: it wandered around the chamber, it groomed itself, it flapped its wings and stretched its neck, it cooed, and it pecked at various locations. Every 30 seconds, a food pellet was dropped into the hopper of the operant chamber; this occurred regardless of the pigeon's behavior. Over trials, each bird developed a stereotyped pattern of behavior, but the precise nature of this pattern was different for each bird. The only regularity was this: whatever behavior had been emitted at the time that the first pellet dropped now began to occur more frequently.
This is a classic illustration of the Law of Effect. Initially, the association between behavior and reward was purely accidental. Nevertheless, following the principle that rewarded responses are strengthened, while unrewarded and punished responses are weakened, that particular behavior began to occur more frequently. Therefore, the bird was more likely to be displaying that behavior the next time a food pellet dropped into the hopper. So, that behavior was strengthened even more. Eventually, whatever behavior had originally coincided with reinforcement came to dominate the behavior of that individual bird -- all because of an initially accidental link between behavior and reward.
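The feedback loop described here can be sketched as a small simulation. This is an illustrative toy, not a reconstruction of Skinner's procedure: it assumes five equally likely behaviors, and it assumes that whichever behavior happens to be under way when a pellet drops gains a fixed amount of strength. The behavior names, the `boost` value, and the number of pellets are all hypothetical.

```python
import random

BEHAVIORS = ["wander", "groom", "flap", "coo", "peck"]


def run_bird(pellets=200, boost=5.0, seed=0):
    """Simulate one bird and return the final strength of each behavior."""
    rng = random.Random(seed)
    strength = {b: 1.0 for b in BEHAVIORS}  # all behaviors start equally likely
    for _ in range(pellets):
        # Whatever the bird happens to be doing when the pellet drops...
        weights = [strength[b] for b in BEHAVIORS]
        current = rng.choices(BEHAVIORS, weights=weights)[0]
        # ...is strengthened, even though it did not cause the pellet.
        strength[current] += boost
    return strength


# Different seeds stand in for different birds.
for seed in range(3):
    final = run_bird(seed=seed)
    total = sum(final.values())
    print(seed, {b: round(s / total, 2) for b, s in final.items()})
```

Because early accidents are amplified, different "birds" tend to end up with different behavior profiles, although pellet delivery never depended on what any of them did.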
And the "Air Crib"
There's a kind of urban legend circulating that Skinner raised his children in an infant-sized Skinner box: it's not true. Skinner, an inveterate tinkerer, did invent what he called the "Air Crib", a climate-controlled environment which he hoped would ease some of the burdens of child-rearing and foster child development. The Air Crib looked like a regular, if somewhat large, crib. It had a ceiling, three opaque walls, and a glass pane which could be opened to move the infant in and out. There were controls for temperature and humidity, a canvas floor, and sheeting which could be removed and washed when soiled. In this way, the infant had considerable freedom of movement. Skinner publicized the Air Crib in an article in the Ladies' Home Journal entitled "Baby in a Box: The Mechanical Baby-Tender" (1945). It has been estimated that at least 300 infants were raised in a version of the Air Crib (see Robert Epstein, "Babies in Boxes", Psychology Today, 1995). And contrary to rumors that Skinner's daughter Deborah eventually sued her father and committed suicide, she was alive and well in 2004, when she wrote a newspaper Op-Ed piece in the (Manchester) Guardian that was very positive about both Skinner and the device.
The experiment described above illustrates the basic vocabulary of instrumental conditioning, whose terms largely parallel those of classical conditioning -- though be careful, because their meaning sometimes changes slightly.
Reinforcement (Rft) is an event which increases the strength
(probability) of the behavior (the conditioned response) which
A conditioned response (CR) is the behavior which is strengthened by reinforcement. The strength of the CR is usually indicated by response rate, or the frequency with which the organism displays the behavior.
A conditioned stimulus (CS) is an environmental event which leads to the performance of a conditioned response. Put another way, the CS is a signal or cue that the CR will be reinforced. Sometimes, as in Phase 2 of the typical experiment described above, the CS is the operant chamber itself. That is, the presence of the pigeon in the chamber is a cue that key-pecking will produce food. Other times, as in Phase 3 described above, the CS is some discrete feature of the environment -- such as a lighted key, or a buzzer or tone.
These technical definitions of CS and CR give us the term stimulus-response (or S-R) learning theory. The animal learns that emitting the CR (key-pecking) in the presence of the CS (the illuminated key) leads to reinforcement (food in the hopper). Or, to be a strict, radical, Skinnerian, functional behaviorist, reinforcement of the CR in the presence of the CS leads to an increase in the rate of the CR.
Classical conditioning can also be described in S-R terms. The key is to remember how instrumental conditioning defines reinforcement -- as any stimulus that increases the likelihood of the conditioned response. Thus, in classical conditioning, the CR (e.g., salivating) is reinforced by the US (meat powder) in the presence of the CS (the bell). By virtue of this reinforcement, the CR comes to be emitted in the presence of the CS.
Note that in instrumental conditioning there is no discussion of unconditioned
stimuli or unconditioned responses. This is
because the behaviors in question are not reflexive in nature,
as they are in classical conditioning. Rather, these behaviors
are emitted spontaneously by the organism. They are what we
ordinarily call voluntary, as opposed to the involuntary
behaviors involved in classical conditioning -- except that
radical behaviorists like Skinner didn't like to talk about
"voluntary" responses, or anything else that smacked of "free
will", because they felt that all behaviors were under control
of environmental stimuli and reinforcements.
To a great degree, the major phenomena of instrumental conditioning parallel those observed in the classical case: acquisition, extinction, generalization, and discrimination. However, studies of instrumental conditioning also illustrate a new concept: schedules of reinforcement, each schedule resulting in a different pattern of behavior.
The term refers to the contingent relationship between the organism's emission of its response and the environment's delivery of reinforcement. In the continuous case, reinforcement is delivered after every CR. In the partial case, reinforcement is occasionally withheld. Partial reinforcement retards acquisition, but it also increases resistance to extinction.
Continuous and partial reinforcement are also terms that occur
in the vocabulary of classical conditioning, and they have the
same effects. But there is another category of
reinforcement schedules, intermittent reinforcement,
that is unique to instrumental conditioning. There are four
general types of intermittent schedules of reinforcement.
The Cumulative Record
In textbook figures that depict the effects of various schedules of reinforcement, the organism's cumulative responses are plotted as a function of time (plotted on the horizontal or X axis). This is known as a cumulative record of responses. Every time the organism makes a response, the line moves up a notch on the vertical (Y) axis. Thus, a horizontal tracing means that the organism has made no responses. The slope of the tracing indicates the response rate: shallow slopes indicate a slow rate of response (relatively few responses per unit time), while steep slopes indicate a relatively rapid response rate (relatively many responses per unit time).
B.F. Skinner invented the cumulative record technique, and the term served as the title for a collection of his scientific papers.
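The logic of a cumulative record is easy to reproduce. The following is a minimal sketch that assumes responses are logged as timestamps and tallied in one-second bins; the function names and the sample data are hypothetical.

```python
def cumulative_record(response_times, duration, bin_width=1.0):
    """Running total of responses at the end of each time bin."""
    n_bins = int(duration / bin_width)
    counts = [0] * n_bins
    for t in response_times:
        index = int(t / bin_width)
        if 0 <= index < n_bins:
            counts[index] += 1
    record, total = [], 0
    for count in counts:
        total += count          # the tracing only ever moves up or stays flat
        record.append(total)
    return record


def response_rate(record, start_bin, end_bin, bin_width=1.0):
    """Slope of the tracing between two bins: responses per unit time."""
    return (record[end_bin] - record[start_bin]) / ((end_bin - start_bin) * bin_width)


# Slow responding, then a pause, then a burst of rapid responding.
times = [0.5, 1.5, 2.5, 6.1, 6.3, 6.5, 6.7, 7.2, 7.4, 7.6, 7.8]
record = cumulative_record(times, duration=8.0)

print(record)
print(response_rate(record, 2, 5))  # flat stretch: no responses
print(response_rate(record, 5, 7))  # steep stretch: rapid responding
```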
Each schedule of reinforcement produces its own characteristic
pattern of behavior. For example, DRL schedules typically
produce a string of "ritualistic" responses, that are
ineffective in terms of controlling reinforcement but
nevertheless effectively fill the long interval between
The Matching Law and the Monty Hall Problem
Animals (and humans) can also be put on concurrent schedules of reinforcement. For example, pecking a green key might be reinforced on a VI5 schedule, while pecking a red key might be reinforced on a VI10 schedule. In such cases, the organism will distribute its responses between the two keys in proportion to their rate of reinforcement -- for example, pecking the green key (which pays off about twice as often) about twice as frequently as the red key. The fact that animals will distribute their responses in proportion to the rate at which those responses are reinforced is called the matching law, which was first announced by Richard Herrnstein (1970), B.F. Skinner's protege at Harvard; see also the review by Peter de Villiers (1977) -- who was, in turn, Herrnstein's protege.
The matching law, in turn, was one of the first contacts between experimental psychology and neoclassical economic theory, as it seemed to reveal a fundamental, perhaps universal, law governing rational choice.
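In its simplest form, the matching law says that the share of responses given to an alternative equals that alternative's share of the reinforcements. The sketch below applies this to two concurrent variable-interval schedules. It idealizes the reinforcement rate on each key as 1 divided by the schedule's mean interval, and the function and key names are hypothetical.

```python
def matching_shares(mean_intervals):
    """Predicted share of responses on each key under strict matching.

    mean_intervals maps a key name to the mean interval of its VI schedule;
    the reinforcement rate is idealized as 1 / interval.
    """
    rates = {key: 1.0 / interval for key, interval in mean_intervals.items()}
    total = sum(rates.values())
    return {key: rate / total for key, rate in rates.items()}


# A VI5 key is reinforced about twice as often as a VI10 key.
shares = matching_shares({"green": 5.0, "red": 10.0})
print(shares)
```

Under these assumptions the key with the shorter mean interval draws about two-thirds of the responses, and the other key about one-third.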
An interesting illustration of the matching law is provided when pigeons are confronted with a version of the Monty Hall problem, popularized by Let's Make a Deal, a television game show. The show's host, Monty Hall, would offer a contestant a valuable prize, such as a car or a vacation, which is hidden behind one of three closed curtains; behind another curtain is nothing; but behind the third curtain is a booby-prize, like a goat. After the contestant makes his choice, Hall opens one of the curtains to reveal nothing, and then offers the contestant the opportunity to change his mind. Note that, at this point, the prize lies behind one of the remaining curtains, while the goat is behind the other one.
Most contestants choose to stick with their original choice (pose this to your friends, and see what they do). But this is the wrong choice. The prior probability that the prize lies behind the contestant's original choice is 1/3, the same as for any one of the three curtains, and nothing Hall does changes it: he can always open a losing curtain, whatever the contestant picked. The remaining 2/3 therefore attaches to the other closed curtain -- the one that the contestant did not originally choose -- whose probability has now doubled from 1/3 to 2/3. Many people don't get this, even after multiple trials with the problem. But it turns out that pigeons catch on pretty quickly -- they're really good at matching responses to reinforcement rates, perhaps because they don't overanalyze the problem, using erroneous theories that lead them to misestimate probabilities. We'll return to the liabilities of estimation later, in the lectures on "Thought and Language".
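The 1/3 versus 2/3 split can be verified by brute force. The sketch below enumerates every equally likely combination of prize location and first choice, under the standard assumption that the host always opens an unchosen curtain that does not hide the prize. The function and variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

CURTAINS = (0, 1, 2)


def win_probabilities():
    """Exact chances of winning by staying and by switching."""
    stay_wins = 0
    switch_wins = 0
    cases = list(product(CURTAINS, CURTAINS))  # (prize, first choice) pairs
    for prize, choice in cases:
        # The host opens an unchosen curtain that does not hide the prize.
        opened = next(c for c in CURTAINS if c != choice and c != prize)
        # Switching means taking the one curtain that is neither chosen nor opened.
        switched = next(c for c in CURTAINS if c != choice and c != opened)
        stay_wins += (choice == prize)
        switch_wins += (switched == prize)
    return Fraction(stay_wins, len(cases)), Fraction(switch_wins, len(cases))


stay, switch = win_probabilities()
print("stay:", stay, "switch:", switch)
```

When the first choice is already correct, the host could open either of the other two curtains; the sketch simply takes the first, which does not affect the result.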
By means of instrumental conditioning in general, and schedules of reinforcement in particular, voluntary behaviors come under the control of environmental events. The phenomena of instrumental conditioning are ubiquitous, or nearly so: every vertebrate organism, and some invertebrates as well, is capable of acquiring behaviors under conditions of reward and punishment.
Thorndike and Skinner believed that most adaptive behavior is the product of instrumental conditioning. Again, their position is probably too extreme. But the laws of instrumental conditioning do appear to account for the acquisition, maintenance, and loss of both adaptive and maladaptive voluntary behavior -- habitual behaviors of all sorts, and actions performed under conditions of incentive.
In several respects, classical and instrumental conditioning appear to represent two quite different forms of learning.
|Classical Conditioning||Instrumental Conditioning|
|Reinforcement is not contingent on the organism's behavior. The US is delivered following the CS, no matter what the organism does.||Reinforcement is contingent on the organism's behavior. The "reward" or punishment is not delivered unless the organism makes the response to be conditioned.|
|The response to be conditioned is elicited involuntarily by the US.||The response to be conditioned is spontaneously emitted by the organism as a "voluntary" behavior.|
|The response being conditioned is "involuntary" (or reflexive) in nature.||The response being conditioned is a "voluntary" (or spontaneous) response.|
|Because classical conditioning is limited to involuntary, reflexive responses, relatively few responses can be conditioned.||Because instrumental conditioning is open to any behavior (or combination of behaviors) the organism is capable of emitting, a large, possibly infinite, number of responses can be conditioned.|
the two forms of conditioning represent quite different procedures
for studying learning:
On the other hand, it seems equally likely that in instrumental conditioning the organism is forming an association between two stimuli -- between the CS and the reinforcement.
Ultimately, as Donahoe and Vegas argue, it may be that classical and instrumental conditioning are simply two forms of the same underlying learning process. But for now, the procedural differences between them are great enough that we will continue to consider them to be different forms of learning. As will be argued later, in classical conditioning the organism learns to predict events; in instrumental conditioning the organism learns to control them.
Although classical and instrumental conditioning appear (to me, anyway) to represent two different forms of learning, most examples of adaptive behavior appear to involve combinations of classical and instrumental conditioning. That is, through classical conditioning the organism learns to anticipate some future event; through instrumental conditioning it learns to cope with that event.
This sort of
combination has been studied in the laboratory in the form of
avoidance learning. The procedure in a typical
avoidance learning experiment is as follows:
The theory of avoidance learning proposed by O. Hobart
Mowrer (1947) illustrates how avoidance combines classical and
instrumental conditioning. According to Mowrer, by
virtue of the pairing of the tone CS with the shock US two
kinds of learning occur.
In Hilgard's view, these theories were distinguished by three theoretical preferences:
Hilgard points out that all of these theories were "behaviorist" in nature, in that they took behavior, rather than introspections, as their data. There's a difference between methodological and radical behaviorism.
One might also suggest that S-R and cognitive theorists differ
in their choice of experimental subjects -- S-R theorists
preferring nonhuman animals, and cognitive theorists preferring
humans, as subjects. But this is a false
In any event, Hilgard noted that all learning theorists must accept all of the same facts discovered through research; they differ in terms of interpretation. And all learning theorists seek to answer the same small set of questions:
Note what is missing here: there is nothing about the brain
(the term barely appears in Hilgard's index). Partly, of
course, this reflected the primitive state of neuroscience at
the time. But the reasons went deeper than that.
For the most part, the classical learning theories have been
consigned to the dustbin of history. But it's worth
reviewing at least some of them, for their relevance to the
modern cognitive psychology of learning and memory.
Herewith are some summary notes, based mostly on the 3rd edition
of Hilgard's Theories of Learning, published in 1966,
before the cognitive revolution really took hold in
psychology. This edition was the first to be co-authored
with Gordon H. Bower, his Stanford colleague. Bower, for
his part, had begun his career as a mathematical psychologist
focused on animal learning, and became a distinguished
first-generation cognitive psychologist whose most famous
research focused on verbal learning and memory.
First things first: We have to start with Pavlov, whose studies
of classical conditioning got the whole ball
rolling. Of course, Pavlov wasn't a psychologist at
all. He was a physiologist, who worked first on the
cardiovascular and circulatory systems, and then on the
gastrointestinal system. I usually give the beginning of
Pavlov's work on conditioned reflexes as 1898, the same year as
Thorndike, with the first publication being Wolfson's
dissertation published in 1899.
First, a couple of notes on terminology:
By the time he published Conditioned Reflexes (1927) and Lectures on Conditioned Reflexes (1928), Pavlov had developed pretty much the entire vocabulary of conditioning and learning.
You could take Pavlov out of physiology and into psychology, but you couldn't take physiology out of Pavlov. Of all the classical learning theorists, Pavlov is the only one to have taken specific positions on the neural basis of conditioning.
|For an appreciation of Pavlov's
contributions to psychology, written by a leading
psychologist of the Soviet era, see Razran, G.
(1965). Russian physiologists' psychology and
American experimental psychology. Psychological
While Pavlov dominated learning theory in the Soviet Union, Thorndike's theory dominated in the United States. Thorndike called his theory connectionism, because learning was held to strengthen the associations between sensory stimuli and motor responses. In order to avoid confusing Thorndike's "connectionism" with the "modern" connectionism initiated by Rumelhart and McClelland, it's probably best to think of Thorndike's theory as the "mother" of all stimulus-response (S-R) theories of learning. I've already listed Thorndike's eight laws of learning, which I'll just list here again without much further comment.
There are three primary laws
And the subordinate laws:
These laws were set out fairly early in Thorndike's career, and subsequent research led to the revision or abandonment of some of them.
Watson, the founder of behaviorism, never developed a
full-fledged theory of learning.
That job fell to B.F. Skinner, most prominently in his Behavior of Organisms (1938). Skinner's is an S-R theory, but he rejected the idea of "no stimulus, no response", by which earlier behaviorists had assumed that every response was preceded by some stimulus, even if that stimulus couldn't be identified. Instead, Skinner focused on two types of response:
And in another counterintuitive move, Skinner distinguished between two types of primary reinforcers:
While Thorndike's Law of Effect gives rise to the impression that positive reinforcers are pleasant, or satisfy some biological motive, while negative reinforcers are unpleasant, Skinner is a true behaviorist, rejecting all reference to mental states. Reinforcements are known only by their effects: something is reinforcing if it increases the probability of the response with which it is paired.
As noted earlier, Skinner and his students and colleagues placed great emphasis on the schedule of reinforcement -- that is, the precise relationship between response and reinforcement (e.g., Ferster & Skinner, 1957). Each of these schedules produced a corresponding pattern of behavior.
Hull also classifies as an S-R theorist, as in his famous formulation:
${}_{S}E_{R} = {}_{S}H_{R} \times D$.
But that little element, D, distinguishes Hull from the other S-R behaviorists, because it posits that learning is a function of an internal physiological (if not mental) state, drive. And it's the presence of this drive state that makes reinforcements reinforcing. So, by postulating an internal state, Hull makes it clear that learning isn't just a matter of associating stimuli and responses. And it offers a non-circular definition of reinforcement: reinforcements reduce physiological drives.
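The multiplicative form of Hull's formula matters. In Hull's notation, sEr is usually read as reaction (excitatory) potential and sHr as habit strength; these glosses, and the arbitrary numbers below, are supplied here for illustration and are not spelled out in the text above. A minimal sketch:

```python
def reaction_potential(habit_strength, drive):
    """Hull's multiplicative rule: sEr = sHr x D."""
    return habit_strength * drive


# A well-learned habit is not expressed when there is no drive...
sated = reaction_potential(habit_strength=0.9, drive=0.0)
# ...and the same habit is expressed more strongly as drive increases.
mild = reaction_potential(habit_strength=0.9, drive=1.0)
strong = reaction_potential(habit_strength=0.9, drive=2.0)

print(sated, mild, strong)
```

Because the two terms multiply rather than add, neither learning alone nor drive alone is enough to produce behavior.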
Hull's mathematico-deductive theory of
learning (1940) is, in some ways, a masterpiece of
quantitative psychological theory, expressly inspired by, and
explicitly modeled on, Newton's Principia and Whitehead
and Russell's Principia Mathematica, with each of its
elements stated verbally, then translated into symbolic logic,
followed by experimental tests conducted on a variant of the
verbal-learning paradigm known as rote learning --
essentially, an extension of Ebbinghaus's method.
(Earlier, Hull had adapted Ebbinghaus's method for the study of
concept acquisition -- ever the tinkerer, inventing the memory
drum in the process and creating a whole industry of
makers of equipment for university psychology laboratories.)
Hull's research gave rise to the standard, ogival, form of the learning curve showing the acquisition of a response over time. Actually, there has been some confusion over the shape of the learning curve. Often, the curve is described as negatively accelerated, with large gains on initial trials followed by smaller gains as learning approaches asymptote. But Culler & Girden (1951), in an exhaustive analysis of published learning curves (following Culler, 1928), determined that it is ogival after all.
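The difference between the two shapes is easiest to see in the trial-by-trial gains. The sketch below uses an exponential approach to asymptote as a stand-in for a negatively accelerated curve and a logistic function as a stand-in for an ogive. Both functional forms and all parameter values are assumptions chosen for illustration, not Hull's or Culler and Girden's equations.

```python
import math


def negatively_accelerated(trial, rate=0.3):
    """Largest gains come first; later gains shrink toward the asymptote."""
    return 1.0 - math.exp(-rate * trial)


def ogival(trial, midpoint=10.0, steepness=0.5):
    """S-shaped: slow start, rapid middle, slow approach to the asymptote."""
    return 1.0 / (1.0 + math.exp(-steepness * (trial - midpoint)))


def gains(curve, trials):
    """Improvement from each trial to the next."""
    return [curve(t + 1) - curve(t) for t in trials]


na_gains = gains(negatively_accelerated, range(20))
og_gains = gains(ogival, range(20))

print([round(g, 3) for g in na_gains[:5]])   # shrinking from the first trial on
print([round(g, 3) for g in og_gains[:12]])  # growing until about the midpoint
```

The negatively accelerated curve shows its biggest improvement on the very first trial, whereas the ogive shows small early gains that grow before they shrink.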
Hull's system attracted a great number of adherents, and he
gained additional fame after leaving Wisconsin (where he got his
PhD, with Joseph Jastrow as his advisor) for Yale, where his
colleagues at the Institute of Human Relations applied his
drive-reduction theory to a wide variety of issues in
personality and social behavior -- most famously, Miller and
Dollard's work on frustration and aggression and on conflict
(approach-approach, approach-avoidance, and
avoidance-avoidance). Together, these two lines of
research laid the foundation for a translation of Freudian
psychoanalytic theory into the vocabulary of Hull's S-R theory
Unfortunately, the mathematical rigor of his theory proved its undoing. In a famous paper, Gleitman, Nachmias, and Neisser (1954) showed that Hull's theory of extinction was simply wrong. It contained a number of internal, logical contradictions, and its empirical predictions did not hold up. A theory that can't explain extinction isn't a very good theory of learning, after all. And by this time, anyway, Skinnerian functional behaviorism was at its apex -- soon to be overthrown itself, by the cognitive revolution in psychology.
The cognitive revolution was foreshadowed by the genuinely
cognitive theory of learning proposed by E.C. Tolman (who had
been Gleitman's teacher at Berkeley). As a learning
theorist, Tolman was the chief competitor to both Hull and
Tolman is best known for his studies of latent learning, discussed later, which cast doubt on the role of reinforcement. Here, I'll talk in general terms about his theoretical approach.
Everybody's got their method. Pavlov had dogs in harnesses; Thorndike had cats in puzzle-boxes; Skinner had his operant chamber. Tolman had the maze -- a series of alleys and choice points where his rats could -- well, make choices. In fact, Tolman used the same maze throughout his career. It was a thing of real beauty, with lots of alleys and choice points, which could be walled off with curtains to create different pathways from start box to goal box (diagram courtesy of UCB Prof. Donald Riley, who was one of Tolman's students).
Tolman's research program focused on three aspects of learning.
Tolman considered himself a behaviorist, but it is clear that
he was a behaviorist of quite a different stripe than
When psychology was ready for the cognitive revolution, Tolman
and a few others (like Jerome Bruner) had pointed the way.
A final note: There's a reason that the Education/Psychology Building at UCB is named after him. Along with Brunswik, Tolman was probably Berkeley's most famous psychologist: his experiments, from almost 100 years ago, are still described in introductory textbooks. But Tolman's contributions to the University go far beyond the experiments on latent learning. In the late 1940s and early 1950s, at the height of the McCarthy Period in American politics, the Regents of the University of California (there was only Berkeley and UCLA then) required all UC faculty to sign a loyalty oath. Tolman viewed this as an infringement of academic freedom, and (along with some other faculty) refused to sign. He was then dismissed from his post, and took a visiting position at Harvard (where he had gotten his PhD under Münsterberg). He then sued the University for reinstatement. In Tolman v. Underhill (1952), the California Supreme Court overturned the loyalty oath, and required the University to reinstate him and the other plaintiffs.
A Note on Functionalism
This is as good a place as any to make some remarks about a general trend in learning theory known as functionalism, and to clear up some misunderstandings about it.
As a "school" of psychology, functionalism was skeptical of the structuralist claim that we can understand mind in the abstract. Based on Charles Darwin's (1809-1882) theory of evolution, which argued that biological forms are adapted to their use, the functionalists focused instead on what the mind does, and how it works. While the structuralists emphasized the analysis of complex mental contents into their constituent elements, the functionalists were more interested in mental operations and their behavioral consequences. Prominent functionalists were:
Psychological functionalism is often called "Chicago functionalism", because its intellectual base was at the University of Chicago, where both Dewey and Angell were on the faculty (functionalism also prevailed at Columbia University). It is to be distinguished from the functionalist theories of mind associated with some modern approaches to artificial intelligence (e.g., the work of Daniel Dennett, a philosopher at Tufts University), which describe mental processes in terms of the logical and computational functions that relate sensory inputs to behavioral outputs.
The functionalist point of view can be
summarized as follows:
So where's the confusion? The confusion comes
from another form of functionalism, "philosophical"
functionalism, which holds a prominent position in
cognitive science -- in particular, among proponents of
what John Searle calls "strong artificial
intelligence". Essentially, functionalists
identify mental states with certain input-output
functions, irrespective of the medium which performs
those functions. It then follows that any physical system
which performs those functions has mental states --
regardless of whether that physical system is a brain, a
computer, or -- to take a vivid image -- a bunch of beer
cans connected by string and powered by windmills.
The connection to Stimulus-Response theory is
obvious. Philosophical functionalism does have one
advantage over behaviorism, in that it at least
acknowledges the existence and causal power of mental
So don't get confused. When somebody identifies himself as a "functionalist", these days, he's likely to be a philosopher who identifies mind with certain functions, and who thinks that computers can have minds. And he's also likely to be inclined toward something like stimulus-response behaviorism.
But, as Dewey and his friends understood, functionalism doesn't have to stand for any such thing. In the American tradition of Dewey and James, functionalism can just be an umbrella term for a particular approach to learning, memory, and other aspects of mind and behavior:
So far, we have simply described the phenomena of conditioning -- acquisition, extinction, generalization, discrimination, reinforcement, and the like. But what actually happens in learning? Or, put another way, what is the organism learning from experience?
Learning was once thought to be as automatic as reflexes, taxes, and instincts. Just as these are innate stimulus-response associations, part of the organism's biological endowment, so classical and instrumental conditioning were thought to represent acquired stimulus-response connections, formed as a result of experience but no less automatic.
name implies, S-R learning theory holds that what is learned
in conditioning is an association between a stimulus and a
response -- an association that is strengthened by
One important line of research challenged the arbitrariness assumption that organisms could learn to attach any response in their repertoire to any stimulus in the environment, by showing that some conditioned responses are easier to acquire than others.
research begins with work by the American psychologist John
Garcia and his colleagues on a phenomenon known as taste-aversion
learning (or bait shyness). Before
Garcia became a student
of Tolman's at UC Berkeley, he grew up on
a sheep ranch in the American southwest, where ranchers
routinely used poison to control coyotes and other
predators. Garcia knew from this experience that when
animals eat poisoned food or drink poisoned liquids, and
nonetheless survive, they will avoid that substance later
(hence the term, "bait-shyness"). Garcia and his
associates developed a laboratory analogue of bait-shyness
in an attempt to study the anticipatory nausea which
some cancer patients develop in the course of receiving
chemotherapy. Garcia's paradigm was a variant on
classical fear conditioning:
In other words, the animals formed associations between shock and sight and sound, and between nausea and taste; but they made no connection between nausea and sight and sound, or between shock and taste. This outcome violates the arbitrariness assumption of traditional S-R theories of learning, because all elements of the compound CS occur at precisely the same time and place. Thus, they all have precisely the same spatial and temporal contiguity with respect to the US. Therefore, under the assumption of arbitrariness or equipotentiality, they should all have been equally powerful as CSs. But they were not.
This experimental outcome is commonly interpreted as indicating that the potency of a stimulus is related to the evolutionary history of the species. Rats are nocturnal animals, and under ordinary circumstances choose their food according to its taste. Therefore, their evolution has disposed them to form associations between the taste of food and its gastrointestinal consequences, but not between sight or sound and nausea. The explanation is supported by experiments on birds (like quail), who are sight-feeders. They quickly form associations between nausea and visual stimuli, but not between nausea and taste.
From Coyotes to Sheep to Wolves
Garcia became interested in bait shyness because of its use by sheepherders and other ranchers in the natural control of coyotes and other predators, but you don't have to be a predator to be susceptible to bait shyness.
In 2007, Morgan Doran, a farm advisor with the University of California Agricultural and Natural Resources Cooperative Extension, based in Davis, began a program of research on bait-shyness in sheep. Sheep and goats are often used for brush control and weed abatement -- you can see them, for example, in the Oakland and Berkeley Hills in an attempt to prevent wildfires from spreading through dry overgrowth. And vintners have been interested in using this same technique for weed control in vineyards.
That's all very good on paper, but the practical problem is how to get the sheep to eat the weeds, and not the very tasty tender shoots of young grapevines!
In Doran's study, a group of sheep are allowed to feed freely on vine leaves, and then they are fed a capsule filled with lithium chloride -- which, while not lethal, induces pretty severe nausea. A control group is also allowed to feed on the grape leaves, but gets a placebo capsule. Results from a pilot study indicate that the treated sheep will, in fact, avoid the grape leaves in the field, and focus their feeding on the weeds.
A similar project is underway in Marin County's dairyland, where cattle have been trained to prefer a particular kind of thistle.
Turning the tables, bait-shyness (and preparedness) have been enlisted in the effort to protect the Mexican wolf, which was hunted to near extinction by ranchers seeking to protect their cows and sheep from predation. An experiment with captive Mexican wolves shows promise in getting the animals to avoid sheep, and might be effective in wildlife management as well.
Who says that animal research has no practical significance!? Or that it's bad for the animals?
In addition to violating the assumption of arbitrariness, the outcome of the Garcia experiment also violates the assumption of association by contiguity. Recall that all the elements of the compound CS were presented simultaneously. Therefore, all elements of the compound CS were equally contiguous with the US -- temporally contiguous, because they occurred close together in time; and spatially contiguous, because they occurred close together in space. But despite being equally contiguous, not all potential CSs acquired the power to evoke a CR.
Moreover, consider the special circumstances of the X-ray condition. X-rays require a long time to take effect, about 30 minutes, by which time the rats may well be in another part of the compartment, some distance from the food source. Therefore, the bright, noisy sweet water was separated from nausea by an appreciable distance in time (and, for that matter, space). Even so, an association was formed between taste and nausea. Apparently, then, conditioning is possible even in the absence of temporal (or spatial) contiguity, a point to which we will return later.
Finally, just for good measure, the outcome of the Garcia experiment violates Thorndike's Law of Exercise. Thorndike concluded that stimulus-response associations were strengthened with repetition, but Garcia's rats formed strong taste-nausea associations after only a single trial. If evolution has predisposed rats to form associations between the taste of food and its gastrointestinal consequences, it has also predisposed rats to form these associations quickly and over long delays.
The arbitrariness assumption was also challenged by research by Robert C. Bolles on species-specific defense reactions. Bolles noted that pigeons could quickly learn to flap their wings or stretch their necks (both behaviors preparatory to flight) to avoid shock, but could not learn to peck a key to avoid shock (even though, as Skinner discovered, pigeons quickly learn to peck a key to obtain food). Moreover, rats quickly learn to jump and run (both part of the "flight" reaction to stress), but are slower to learn to press a key, to avoid shock. Again, outcomes like these violate the arbitrariness assumption: because all the behaviors in question are in the species' repertoire, and reinforcement is equally contingent on each type of response, all the stimulus-response connections should be equally associable. Nevertheless, some S-R connections are easier to form than others. Bolles concluded that the ease of conditioning depends on the natural defense reactions of each species. Pigeons learn to hop, flap their wings, and stretch their necks, because these behaviors are preparatory to flight. Rats quickly learn to jump or run, because these behaviors are part of their innate response to threat.
both classical and instrumental conditioning, learnable
associations are not arbitrary. It is not the case
that just any CS can be attached to just any
CR. Taken together, these results illustrate what
Martin E.P. Seligman, Paul Rozin, and James Kalat have
called the preparedness principle.
The biological constraints on learning are important, but in order to understand what the organism is really learning we also have to understand the organism's internal cognitive workings -- what the organism's mental states are, the nature of its internal, mental representation of the world, and what it is trying to do over the course of learning. These aspects of the learning process are revealed by studies of the cognitive constraints on learning.
example, the principle of association by contiguity,
already challenged by Garcia's experiments on taste-aversion
learning, is further undermined by certain peculiarities of
kinds of results highlight the distinction between contiguity
Put another way, conditioning occurs when the CS acts as a signal that the US is forthcoming. In backwards conditioning, however, the CS signals that the US is not forthcoming. In backwards fear conditioning, the CS actually serves as a safety signal -- informing the animal that the shock will not be forthcoming for a while. The CS has value as a signal only when there is a contingent relationship between the CS and the US, regardless of whether the CS and US are temporally and spatially contiguous. The conclusion is that contingency is more important than contiguity: conditioning occurs only when the CS predicts the US. When the CS is uninformative about the US, no conditioning occurs. And when the CS predicts the absence of the US, as in extinction or backwards conditioning, the CR is actually inhibited.
A compelling demonstration of the role of contingency in classical conditioning was provided in a classic experiment by Robert Rescorla (1967), for his doctoral dissertation at the University of Pennsylvania (after many years at Yale, Rescorla returned to his alma mater in a faculty role). In this experiment, Rescorla varied the predictability of a shock US, given the presentation of a tone CS.
In one condition of the experiment, the CS was a perfect predictor of the US, in that the CS always immediately preceded the US (that is, within 1 second or so). No CS was ever presented that was not immediately followed by a US; and no US was ever presented that was not immediately preceded by a CS. Thus, expressed in terms of probabilities:
p(US | CS) = 1.0; and
[Read this as "the probability that the US will occur given the prior occurrence of the CS is 1".]
p(US | no CS) = 0.0.
[Read this as "the probability that the US will occur given no prior occurrence of the CS is 0".]
This condition resulted in very good conditioning.
In another condition of the experiment, the CS was a less-than-perfect predictor, because Rescorla interspersed a number of unreinforced CSs -- that is, CSs that were not immediately followed by USs. Thus, of all the CSs that were presented, half were not followed by USs. However, the US never occurred unless it was immediately preceded by a CS. Again, expressed in terms of probabilities:
p(US | CS) = 0.5 and p(US | no CS) = 0.0.
This condition still resulted in fairly good conditioning.
In a third condition of the experiment, the CS was rendered ineffective as a predictor of the US, because Rescorla interspersed a number of unsignalled USs -- in fact, half of the USs -- that is, USs that were not immediately preceded by CSs. Now, the situation was that CSs and USs occurred randomly, independently of each other. Expressed in terms of probabilities:
p(US | CS) = 0.5 and p(US | no CS) = 0.5.
Under these conditions, no conditioning occurred, even though the CS and US were frequently presented together in the same place at the same time.
The upshot of Rescorla's experiment, which stands as a modern classic in psychology, is that conditioning is not simply the formation of an association between spatially and temporally contiguous stimuli. Rather, conditioning occurs only when the CS provides information about the US. The amount of information provided may be estimated as the difference between two probabilities:
p(US | CS) - p(US | no CS).
Conditioning occurs only if, and to the degree that, the CS is a reliable predictor of the US. Put another way, conditioning occurs only if the US is more likely following a CS than in the absence of the CS. What's amazing about this is that it appears that even organisms as simple as the white rat, or simpler, are in some sense computing the conditional probabilities involved. The computation is not necessarily conscious, of course -- the rats haven't taken Statistics 2, after all. But it is a computation nonetheless.
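The contingency measure can be sketched in a few lines of Python. This is only an illustration: the function name `delta_p` and the interval counts are hypothetical, chosen to reproduce the three pairs of probabilities given above. It also assumes that the session has been divided into equal time intervals, so that "no CS" intervals can be counted.

```python
def delta_p(cs_us, cs_no_us, no_cs_us, no_cs_no_us):
    """Return p(US | CS) - p(US | no CS) from counts of time intervals.

    cs_us       -- intervals with a CS followed by a US
    cs_no_us    -- intervals with a CS but no US
    no_cs_us    -- intervals with a US but no preceding CS
    no_cs_no_us -- intervals with neither CS nor US
    """
    p_us_given_cs = cs_us / (cs_us + cs_no_us)
    p_us_given_no_cs = no_cs_us / (no_cs_us + no_cs_no_us)
    return p_us_given_cs - p_us_given_no_cs


# Hypothetical counts, chosen only to match the probabilities in the text.
conditions = {
    "perfect predictor": (20, 0, 0, 20),    # p = 1.0 vs. 0.0
    "partial predictor": (10, 10, 0, 20),   # p = 0.5 vs. 0.0
    "random control":    (10, 10, 10, 10),  # p = 0.5 vs. 0.5
}

for name, counts in conditions.items():
    print(f"{name}: {delta_p(*counts):.1f}")
# perfect predictor: 1.0
# partial predictor: 0.5
# random control: 0.0
```

The ordering of the three values matches the ordering of the outcomes: very good conditioning, fairly good conditioning, and none at all.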
The importance of the predictive relationship between the CS and the US is underscored by two other phenomena discovered by Leo Kamin.
first experiment concerned the phenomenon of overshadowing.
Consider two standard conditioning preparations:
But what happens if, after we condition the organism to the compound, we test the two elements separately? When we do, we get a good CR to the light, but not to the tone. This is not a problem of differential preparedness, as in the Garcia experiment, because neither light nor tone is particularly prepared or contraprepared to serve as a signal for shock. Instead, once more, the result violates the assumption of association by contiguity. Both the light and the tone were equally contiguous with the shock. But it appears that the more salient, noticeable CS, in this case the bright light, overshadows the less salient or noticeable one. Both are contiguous with the shock, and both are good predictors of the shock as well, but conditioning occurs to the CS that is more salient.
The second experiment concerned the phenomenon of blocking. As background to this research, recall that in standard classical fear conditioning, a footshock US is preceded by a tone or light CS. Under these conditions, we get good conditioning of fear, as represented by such conditioned emotional responses as heart-rate acceleration, in response to previously neutral CSs.
We now give an animal acquisition trials
with a compound CS, consisting of a tone and a
light presented simultaneously, followed by a shock US in
the usual manner. After 16 pairings of tone and light
followed by shock, we test the animal's response to a
variety of stimuli:
p(shock | noise) = p(shock | noise + light) = 1.0.
the outcome would be different under different conditions.
Conditioning occurs only when the CS signals a change in the US.
Kamin concluded, further, that conditioning only occurs when the US surprises the organism. In the presence of a surprising event, the organism then searches the environment for possible predictors of that event. Among these, it will pay attention to the most reliable predictor, which becomes the effective CS. If there is more than one reliable predictor, it will attend to the most salient predictor, leading to the phenomena observed in the "overshadowing" experiment. And it will ignore stimuli that lack predictive power, leading to the phenomena observed in the "blocking" experiment.
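One way to make the idea of "surprise" concrete is a simple prediction-error rule of the kind later formalized by Rescorla and Wagner (1972). That model is not part of the discussion above, and the sketch below is an assumption-laden illustration rather than Kamin's own analysis: the learning rate, the salience values, and the trial counts are all hypothetical. On each trial, every cue that is present gains strength in proportion to its salience and to how surprising the shock is, given everything already learned.

```python
def train(strengths, cues, n_trials, salience, rate=0.3, max_strength=1.0):
    """Pair the given cues with shock for n_trials, updating strengths in place."""
    for _ in range(n_trials):
        expected = sum(strengths[c] for c in cues)   # how strongly shock is predicted
        surprise = max_strength - expected           # prediction error
        for c in cues:
            strengths[c] += rate * salience[c] * surprise
    return strengths


equal = {"noise": 1.0, "light": 1.0}  # hypothetical, equally salient cues

# Blocking group: noise alone predicts shock, then noise + light compound.
blocked = train({"noise": 0.0, "light": 0.0}, ["noise"], 16, equal)
blocked = train(blocked, ["noise", "light"], 8, equal)

# Control group: compound trials only, with no pretraining.
control = train({"noise": 0.0, "light": 0.0}, ["noise", "light"], 8, equal)

# Overshadowing: a salient light and a much less salient tone in compound.
unequal = {"tone": 0.2, "light": 1.0}  # hypothetical salience values
overshadow = train({"tone": 0.0, "light": 0.0}, ["tone", "light"], 16, unequal)

print("light after blocking:", round(blocked["light"], 3))
print("light in control    :", round(control["light"], 3))
print("overshadowing       :", {k: round(v, 3) for k, v in overshadow.items()})
```

In the blocking group the shock is already fully predicted by the noise when the light is added, so there is almost no surprise left and the light gains almost nothing. In the control group the light shares the available strength with the noise.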
Kamin's experiments are important because they
simultaneously undermine three assumptions of classical S-R
Similar considerations apply to instrumental conditioning. The behaving organism is searching for predictability, but it is also searching for control. It wants to know what to do about forthcoming events, not just where and when to expect them. In instrumental conditioning, the organism is acquiring these expectancies of control.
role of these expectations can be seen clearly in the
phenomenon of learned helplessness, discovered by
Martin E.P. Seligman, Steven Maier, and Bruce Overmier when
they were graduate students at the University of
Pennsylvania, working under Richard L. Solomon.
Mowrer's two-factor theory of avoidance learning, discussed
above, predicts that avoidance learning will be facilitated
if the organism has already undergone fear conditioning. The
idea is that the organism already knows to fear the CS, and
all it has to do is to learn to avoid the US. To test
Mowrer's theory, they performed the following experiment:
Proper avoidance responding can be established in dogs who have been pretreated with inescapable shock, but only by forcibly dragging the dogs from one side of the shuttlebox to the other.
Why does this happen? Seligman and his associates reasoned that learned helplessness reflects the acquisition of negative expectations of control. In classical fear conditioning, the shock is both inescapable and unavoidable. Tone is followed by shock, and there is nothing the animal can do about it, because in classical conditioning reinforcement is not contingent on the subject's behavior. It is only contingent on the CS. Accordingly, the animal in such a situation acquires a negative expectation that nothing can be done about the shock. This negative expectation, in turn, generalizes to the avoidance learning situation.
Learned helplessness is significant because it may underlie certain forms of clinical depression. But it also has great theoretical significance, because it shows that instrumental behavior is determined by the organism's expectancies, not by environmental events.
Helplessness at the World Trade Center
In the aftermath of the terrorist attacks of September 11, 2001, emergency-service workers at the World Trade Center employed "search and rescue" dogs to locate victims, living and dead, who might have been buried under the rubble. These animals were trained through instrumental conditioning procedures to sniff out human bodies: basically, when they found a body they received a reward (a similar training procedure is used for the "drug-sniffing" dogs employed by the police). At the WTC, however, there were very few such bodies to be found -- not because there weren't any victims, of course, but because the victims' bodies had been pulverized into dust by the collapse of the building. As a result, the search-and-rescue dogs became obviously depressed -- because they were not able to do the job they were trained to do. In the language of learned helplessness, the animals were not able to engage in behaviors that controlled reward. In order to maintain the animals' motivation for the job, emergency-service workers would sometimes lie down in the rubble -- just to give the dogs somebody to find -- or, in the language of learned helplessness, to maintain their sense of control.
bottom line is that conditioning is the wrong
metaphor for learning. A better metaphor might be computing.
The learning organism is trying to figure things out, and it
does this by, in some sense, computing conditional
Predictability and controllability are central to
conditioning, but they also have
clinical implications. I referred earlier to a body of
research, initiated in Pavlov's laboratory, on experimental
neurosis. Inspired by Seligman's
learned helplessness model of depression, which focused on uncontrollable
aversive events, Mineka and
Kihlstrom (1978) proposed that experimental neurosis was
caused by exposure to unpredictable
Whereas a history of uncontrollable aversive events can lead to
depression, as Seligman argued, Mineka and Kihlstrom suggested
that unpredictable aversive events are a source of anxiety.
A similar point can be made with respect to the role of reinforcement in learning. The conventional view, expressed in Thorndike's Law of Effect, is that nothing is learned in the absence of reinforcement. In classical conditioning, the CS must be followed by a reinforcing US. In instrumental conditioning, the CR must be followed by reward or punishment. However, a number of experiments now make clear that reinforcement is not necessary for learning to occur.
for example, two phenomena of classical conditioning
A similar point is
made with respect to instrumental learning by classic
studies on latent learning performed by Edward C.
Tolman of the University of California, Berkeley (after whom
the Education/Psychology Building at Berkeley is
named). Tolman's experiments involved a maze-learning
procedure, in which hungry rats were placed in the start box
of a maze, and food placed in the goal box. Over
trials, the rats would learn, through trial and error, the
route through the maze. In theory, these responses --
turn left here, turn right there, go straight, whatever --
were reinforced by the delivery of food in the goal
box. Intuitively, this makes sense, but Tolman asked
whether the reinforcement was really necessary for learning
The experiment, by Tolman and Honzik,
involved three groups of rats:
Tolman concluded that the animals in this group learned how to get from the start box to the goal box on the first 10 trials, but just needed a reason to do it. This reason was provided on Trial 11 and subsequent trials. In other words, Tolman's animals learned the maze without any reinforcement. Over 10 trials of exploration, they developed a "mental map" of their environment, which was subsequently available for use for a variety of purposes. However, they didn't perform a goal-directed response until the introduction of reinforcement established a goal.
Put another way,
Reinforcement controls performance rather than learning.
A similar point was made in research on rhesus monkeys published in the early 1950s by Harry Harlow of the University of Wisconsin (later to become famous for his studies of "monkey love" and "motherless monkeys"). In one set of studies, Harlow presented his monkeys with a wooden "puzzle lock" consisting of a series of latches which, when moved in the right order, would open a door. Some animals were rewarded with food (rhesus monkeys love Froot Loops) for making correct moves; others received no reward at all. Harlow observed no difference in the monkeys' problem-solving behavior. In fact, if they were hungry, hunger appeared to interfere with solving the puzzle. If they were not hungry, but were "rewarded" with food anyway, they usually stored the food for later consumption. Harlow concluded that the monkeys were simply curious about the puzzle. In his view, curiosity is an aspect of intrinsic motivation, or the desire to perform an activity without the promise or prospect of reward. This is not to say that animals are not also motivated by extrinsic considerations such as hunger and thirst, only that these are not the only rewards. Considering only extrinsic motivation such as hunger, Harlow's monkeys learned whether they were rewarded or not.
The point of all of this is that organisms are built to learn from experience, and they do this naturally, in the ordinary course of everyday living, without requiring reinforcement, by computing the contingent probabilities among the objects and events that they observe in their environments. This learning mechanism is sometimes known as statistical learning, because the organism samples the environment and then makes probabilistic inferences about what is going on in it -- what are technically known as the transitional probabilities from one thing to another (Aslin & Newport, 2012).
Here's an example of statistical learning in the domain of language. As we'll see later in the lectures on Language and Communication, an early phase in language learning occurs when an infant learns to recognize the particular phonemes -- basic sound units -- and combinations of phonemes that occur in his or her native language. Saffran, Aslin, and Newport (1996) presented eight-month-old human infants with a steady stream of speech-like sounds consisting of four randomly ordered three-syllable nonsense words, such as:
Note that, in such a string, the transitional probability of syllables within words (e.g., pabi within the word pabiku) is a perfect 1.0, while the transitional probability of syllables across words (e.g., tuda between the words golatu and daropi) is only 0.33. They then tested the infants' recognition of individual words by presenting them with "legal" real words, like pa bi ku, and non-legal "part-words" like tu da ro.
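To see where these numbers come from, here is a rough Python sketch that builds such a stream and estimates the transitional probabilities from it. Several things are assumptions made for illustration: the fourth word (tibudo), the rule that no word immediately repeats, the stream length, and all of the function names. The words pabiku, golatu, and daropi are the ones mentioned above.

```python
import random
from collections import Counter

# Three words come from the text; "tibudo" is assumed as the fourth.
WORDS = ["pabiku", "tibudo", "golatu", "daropi"]


def syllables(word):
    """Split a word into its two-letter syllables, e.g. pabiku -> pa, bi, ku."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]


def make_stream(n_words, seed=0):
    """Concatenate randomly ordered words, never repeating a word back to back."""
    rng = random.Random(seed)
    stream, previous = [], None
    for _ in range(n_words):
        word = rng.choice([w for w in WORDS if w != previous])
        stream.extend(syllables(word))
        previous = word
    return stream


def transitional_probabilities(stream):
    """Estimate p(next syllable | current syllable) for every adjacent pair."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


stream = make_stream(3000)
tp = transitional_probabilities(stream)

print("within word,  pa -> bi:", tp[("pa", "bi")])
print("across words, tu -> da:", round(tp[("tu", "da")], 2))
```

Within a word, each syllable is always followed by the same syllable, so the estimate is exactly 1.0. At a word boundary any of the three other words can follow, so the estimate hovers around one in three.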
How do you test word-recognition in infants? One way is to give them an artificial nipple to suck on, and measure the rate at which they do so: when they're surprised, they stop sucking for a moment. In this experiment, the infants were placed in front of a blinking light, and changes in their looking behavior were used as an index of surprise.
Anyway, the upshot of the experiment was that, after only two minutes of exposure, the infants were able to discriminate between legal and non-legal words. Learning occurred, and very sophisticated learning at that, just by listening to the audio stream, without any reinforcement at all.
Other experiments have shown similar learning effects with sequences of musical tones as well as syllables; and in the visual domain, as infants learned the spatial arrangements of shapes in scenes.
And it's been shown that statistical learning extends to neonates as well as to infants.
Moreover, infants can generalize from the stimulus materials to which they've been exposed to novel stimulus materials. For example, infants who have been exposed to one set of pseudowords in a pattern such as dadapi or pabibi also recognized novel pseudowords arranged in the same AAB or ABB pattern, such as kikino or golala. In other words, they acquired something like a concept or a rule that went beyond the specific instances to which they had been exposed to cover novel elements or combinations of elements.
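What it means to acquire a rule "beyond the specific instances" can be sketched in Python. This is only an illustration of the abstraction involved, not a model of what infants do; the function names and the two-letter syllable split are assumptions. The idea is to replace each syllable with a label that records only whether it repeats an earlier one.

```python
def split_syllables(word, size=2):
    """Split a pseudoword into two-letter syllables, e.g. dadapi -> da, da, pi."""
    return [word[i:i + size] for i in range(0, len(word), size)]


def abstract_pattern(word):
    """Return the repetition pattern of a word's syllables, such as AAB or ABB."""
    labels = {}
    pattern = []
    for syllable in split_syllables(word):
        if syllable not in labels:
            # First new syllable becomes A, the next new one B, and so on.
            labels[syllable] = chr(ord("A") + len(labels))
        pattern.append(labels[syllable])
    return "".join(pattern)


for word in ["dadapi", "pabibi", "kikino", "golala"]:
    print(word, abstract_pattern(word))
# dadapi AAB
# pabibi ABB
# kikino AAB
# golala ABB
```

The novel items kikino and golala share no syllables with the training items, yet they map onto the same patterns. That is the sense in which the rule goes beyond the instances.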
In statistical learning, infants are doing exactly what Pavlov's dogs and Thorndike's cats and Rescorla and Kamin's rats were doing: learning the structure of the world, acquiring expectations about what goes with what and what is going to happen next, simply through observation.
Learning occurs naturally in most behaving organisms. Some species are so well adapted to their environmental niches, and their environmental niches are so stable, that they have little need (or opportunity) to learn much more than where they are likely to find food. For other species, a capacity for altering behavior through learning is itself an important adaptation. Through the experience of various contingencies, organisms acquire information about events in their environment, and about the outcomes of behavior. Reinforcement merely motivates the organism to act on what it learns, in order to achieve certain outcomes, and avoid others.
plays a particularly limited role in language
learning. Babies do not learn their native language
through trial and error, mediated by reinforcement.
Rather, they simply pick up language by being exposed to
it. Human babies seem to be innately programmed to
learn natural language, merely through exposure to a
Reinforcement may not be necessary for learning, but practice is. Hardly anything is learned in a single trial, and that is especially true for complex motor and cognitive skills like playing a musical instrument or reading music. In a famous paper, Anders Ericsson and his colleagues (1993) interviewed musicians and determined that, by age 20, the best violinists had engaged in deliberate practice for a cumulative total of more than 10,000 hours, compared to 7,800 hours for merely "good" violinists, and 4,600 hours for the least-accomplished group. Assuming that they began playing the violin at 5 years of age, that comes to more than 666 hours per year, or nearly two hours per day, every day, week in and week out. Findings such as these led Ericsson (2007) to conclude that "extended and intense practice" was the feature that most distinguished elite performers from "normal adults". Ericsson's research, in turn, formed the basis of the "10,000 Hour Rule" popularized by Malcolm Gladwell in his book, Outliers (2008). That is, it appears to take about 10,000 hours to become an expert at something. And indeed, when you examine the histories of elite performers, 10,000 hours seems about right -- the equivalent of about 250 40-hour weeks.
Of course, talent matters, too. A twin study by Mosing et al. found that individual differences in musical ability -- defined as the ability to make subtle discriminations of pitch, rhythm, and melody -- had a substantial genetic component, accounting for about 50% of population variance (for more on how such calculations are made, see the lectures on Psychological Development). Most of the remaining variance was accounted for by the nonshared environment. Somewhat surprisingly, Mosing et al. reported that music practice had no effect on musical ability. That is to say, there was no difference in test performance between monozygotic twins who differed in the amount of musical practice (e.g., between two twins, one of whom became an orchestra musician, and the other of whom became a brain surgeon). Interestingly, Mosing et al also found a substantial genetic contribution to the amount of practice that their subjects engaged in, explaining about 69% of population variance.
Further doubt on the 10,000 Hour Rule was cast by a meta-analysis of expertise studies by Macnamara et al. (2014). These investigators surveyed a large number of studies of the effects of practice on skilled performance, covering games, music, sports, education, and professional activities. Across 88 studies involving more than 11,000 subjects, they found that the average correlation between deliberate practice and performance was .35, explaining about 12% of total variance. This outcome, they argue, is inconsistent with Ericsson's claim that individual differences in performance are mostly explained by individual differences in practice.
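For readers who want to see where the 12% comes from: the proportion of variance in one variable accounted for by another is the square of the correlation between them. A minimal sketch of the arithmetic, using the average correlation reported in the meta-analysis (the variable names are just for illustration):

```python
r = 0.35  # average correlation between deliberate practice and performance

variance_explained = r ** 2          # proportion of variance accounted for
variance_unexplained = 1 - variance_explained

print(f"explained: about {variance_explained:.0%}")
print(f"not explained by practice: about {variance_unexplained:.0%}")
```

So a correlation that sounds fairly substantial still leaves the great bulk of the variance in performance to be explained by something other than the amount of deliberate practice.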
It has to be said that the claim that practice has no effect on expertise, and that all the action is in the genes -- which is what Mosing et al. expressly state in the title of their paper -- is implausible on the face of it.
Usually, we think of learning as entailing the direct experience of environmental events, organismal responses, and their outcomes. In classical conditioning, Pavlov's dog gets the food after hearing the bell. In instrumental conditioning, Thorndike's cats get freedom after pressing the latch. But can animals learn from the experience of other animals? This is the question of vicarious or observational learning.
The phenomenon of observational learning was first demonstrated convincingly in the laboratory by Susan Mineka, who was then at the University of Wisconsin (she is now at Northwestern University), in a study of snake fear in rhesus monkeys. Rhesus monkeys born and raised in the wild are universally afraid of snakes. This is quite adaptive: after all, the monkeys live in an environment where there are lots of deadly snakes, vipers as well as constrictors. Therefore, traditional theory has held that the fear of snakes in rhesus monkeys is innate, programmed by evolution in much the same way that instincts are. The only problem with the theory is that monkeys who are born and raised in laboratory conditions do not fear snakes. When exposed to a snake, they show no signs of fear. Therefore, it seems that snake-fear must be acquired through experience. But, if you think about it, it's not entirely clear how you learn from experience to fear a deadly snake. Because after the first encounter, you're dead (snakes are like that). Therefore, Mineka proposed that monkeys acquire their fear of snakes vicariously, from observing the reactions of other monkeys when they encounter snakes. Thus, snake fear is not innate, but a learned part of what might be thought of as "monkey-culture".
Mineka conducted an ingenious series of experiments to investigate the social learning of snake fear in rhesus monkeys. For her test of fear, she employed a piece of equipment known as the Wisconsin General Test Apparatus (WGTA), in which the monkey is seated in a restraining chair, something like a baby's high chair, while being presented with various stimuli and making responses. Mineka offered the monkey a highly desirable food treat (Froot Loops are dandy for this purpose), but in order to obtain the treat it had to reach past a snake or some other object. Response latency, or the time it took the animal to reach past the object, was the measure of fear: the longer the latency, the more fear.
Mineka's initial study compared monkeys reared in the wild and in the lab in their response to various test stimuli such as real, toy, and model snakes (the real snake was a small boa constrictor), black and yellow cords, and a painted wood block. As expected, the wild-reared monkeys were more afraid of the snakes than were the lab-reared monkeys.
In her first vicarious conditioning study, Mineka paired a (snake-phobic) wild-reared adult with a (non-snake-phobic) lab-reared adolescent (the adult was actually the parent of the adolescent).
You've got to be taught to hate and fear
You've got to be taught from year to year
It's got to be drummed in your dear little ear
You've got to be carefully taught
You've got to be taught to be afraid
Of people whose eyes are oddly made
And people whose skin is a different shade
You've got to be carefully taught
You've got to be taught before it's too late
Before you are six or seven or eight
To hate all the people your relatives hate
You've got to be carefully taught
You've got to be carefully taught
Mineka performed a number of variants on this basic experiment, with increasingly sophisticated methods, to explore the parameters of observational conditioning.
The answer is no: vicarious fear conditioning occurs only to
snakes and snakelike objects. It does not occur to the
In humans, perhaps the most powerful and dramatic example of observational learning occurs in the domain of language. By the age of 4 or 5, every normal human child has become a fluent speaker of his or her native language -- that is, whatever language the child's parents and others speak in his or her presence.
By contrast, even our closest primate relatives, chimpanzees, have no ability to learn language. They may learn some "words" in the form of symbols, spoken or visual, that represent things like bananas. But even after years of effortful training, they have essentially no ability to use syntactical rules to form and understand meaningful sentences. When it comes to language, the "smartest" chimpanzee can't hold a candle to the dullest human 5-year-old.
In fact, human language learning is so effortless and automatic that many linguists speculate that there is an innate capacity for language -- a "language acquisition device" that is a product of evolution, and which is a unique feature of human nature. Knowledge of English or Chinese or Swahili or Farsi isn't innate, but the mechanism that allows children to learn these languages does appear to be.
Put another way, language acquisition is highly prepared in humans. Just as rhesus monkeys are highly prepared to learn to fear snakes, so human beings are prepared to learn language. In chimpanzees, the best we can say is that language learning is unprepared, and it may even be contraprepared -- which is why chimpanzees can't learn syntax no matter how much training they receive.
Social interaction is critical to language acquisition: the child needs models. Not only does the child require exposure to spoken language (and thus to the people who speak it), but the child needs to be exposed to what others are doing, and looking at, when they speak. You can't just play a CD of spoken English under the child's crib and expect it to learn semantics and syntax (though it will learn the basic sound patterns). The child has to interact with other people. And these people don't even have to speak. Deaf children whose parents and teachers use sign language will effortlessly pick up the semantics and syntax of sign language, just as hearing children pick up whatever language their parents speak.
And this interaction has to occur within a particular interval of time -- roughly, before the onset of puberty. "Wild" children, who are raised in isolation from others until they reach adolescence, never really "get" language. Within the more normal range of human experience, children who are raised in a bilingual environment -- say, with parents who speak both English and Spanish -- will effortlessly learn both languages, and speak both without an accent. But if the learning of one language is delayed -- say, until high school or college -- it is very hard to gain facility in the second language, and the person is likely to speak it with a decided accent. So, as with imprinting, there appears to be a critical period in language learning.
The capacity to learn language appears to be innate, a gift of human evolution. And there is a critical period in language learning. But despite this innate component, language acquisition requires exposure to a linguistic environment. In this sense, it fits the true definition of learning as a change in knowledge that occurs as a result of experience. And instead of being taught deliberately, through the direct experience of rewards and punishments, it occurs vicariously -- just by virtue of observation, without any particular reinforcement.
As language acquisition illustrates, observational learning is particularly important in humans. If you think about it, we do not learn all that much through the direct experience of trial and error, reward and punishment. Rather, most of our learning comes through interactions with others. To take a somewhat extreme example, physicians don't learn how to perform surgery by trial and error. Rather, they learn surgery by watching experienced surgeons perform, and by being taught by them. When a surgeon takes a scalpel to his or her first patient, he or she already knows what to do and how to do it.
Albert Bandura, of Stanford University, argues that human social learning takes two forms:
Consciousness also plays an important role in learning by precept. To deliberately teach someone something presupposes that you are aware of it yourself. Without conscious awareness, there could be no conscious intent, and so no sponsored teaching of the sort that is critical to learning by precept.
Although most studies of learning performed before 1950 employed lower animals such as rats, dogs, and pigeons for subjects, the ultimate object of inquiry was humans. The major theories of learning assumed, explicitly or implicitly, that the same principles of learning adduced to explain simple behavior in these species would also be found relevant to complex human behavior. This program of application to the human case was pursued most prodigiously by B.F. Skinner, in his analyses of personality and social behavior (1953) and language (1957). According to Skinner, human behavior is performed under the conditions of stimulus control. Rather than focusing on internal dispositions such as traits and motives, or cognitive constructs such as expectation, a proper analysis of personality will focus on the individual's reinforcement history, as well as on discriminative stimuli and reinforcement contingencies present in the current environment. Human behavior is complex only insofar as the stimulus conditions in which it occurs are complex.
Other investigators also took up the Skinnerian program. For example, Staats and Staats (1963) attempted to apply the principles of learning to problems in personality, motivation, and social interaction, among other topics. Their work is not exactly Skinnerian in nature, because it attempts to come to grips with certain aspects of language that are outside the scope of Skinner's analysis. Nevertheless, the list of psychologists whom they cite as the inspiration for their efforts begins with Skinner, and includes most of the major figures identified with the behaviorist analysis of learning. Staats' most recent statement of his theory, in fact, is entitled Social Behaviorism (1975).
At the same time, it became clear that certain aspects of complex human behavior resisted conventional behavioral analysis. As one example, already discussed, language does not seem to be acquired through the principles of conditioning and reinforcement that are central to behaviorist analyses. The same is true of many human social behaviors. The problem of accounting for learning without direct experience of reinforcement ultimately led to the development of a different, cognitive theory of personality: cognitive social learning theory.
A step in this new direction was taken with the social learning theory of Miller and Dollard (1941). According to Miller and Dollard, personality consists of habits formed through learning. The learning process, in turn, is described in terms of a version of S-R learning theory proposed by Clark L. Hull. According to Hull, a habit represents a strong connection between some stimulus and some response. This association is acquired by virtue of drive-reduction: in the presence of the stimulus, the behavior has led to the satisfaction of some drive (you can see the connection to Thorndike's Law of Effect).
Although Hull conceived of these drives as biological in nature, Miller (1951) later added the concept of acquired (or secondary) drive. That is, through conditioning some external stimuli come to possess some of the properties of an internal drive state. For example, while fear is an innate drive, elicited by noxious stimulation, it can also be conditioned to previously neutral stimuli. Habits can be learned because they lead to fear reduction (a primary drive), and also because they eliminate fear stimuli (secondary drives). Drive-reduction theory thus provides the basic elements of personality viewed as a system of habits, in the form of principles of learning. A drive is any need which activates behavior. It can be innate, or it can be acquired through experience. However, drive itself does not give any particular direction to behavior. This directionality is given by the operation of other principles. Hull's theory, like Freud's, assumes that people are motivated to maintain homeostasis, eliminating states of tension. Drive-reduction serves to reward behavior. Responses are behaviors that lead to rewards. Finally, cues are stimuli that determine the selection of responses. Thus, personality can be viewed as a system of habits acquired and maintained through drive-reduction. Individual differences in habitual responses to environmental stimulation comprise the whole of personality.
Miller and Dollard argued that in order to understand human personality, it was necessary to understand the principles of learning. However, because the habits that comprise personality are social behaviors, it is also important to understand the social circumstances in which that learning takes place. Thus, Miller and Dollard called their approach social learning theory. In this regard, it is interesting to note that the theory represents the collaboration between Miller, a psychologist, and Dollard, a sociologist. Thus, personality becomes an interstitial field, combining different levels of analysis.
Like Skinner's stricter behavioral approach, social learning theory as stated would seem to imply that the person must have direct experience with reinforcement in order to establish habits. As noted, this is unlikely to be the case. In order to cope with this problem, Miller and Dollard postulated a drive of imitation. Imitation is a process by which similar actions are performed by two individuals in response to appropriate cues. At the start, imitation is a behavior which can be reinforced by the environment, just as other behaviors are. When rewarded regularly, however, it takes on the properties of an acquired drive. Thereafter, the individual is motivated to imitate the behavior of others -- to copy their behavior in order to obtain the same rewards that they receive from their actions. Imitation is widespread because the culture reinforces it strongly, as a means of maintaining social conformity and discipline. For this reason, although imitation is an acquired drive (and therefore optional in principle), it is almost a necessary consequence of socialization.
Miller and Dollard
discussed two principal forms of imitation. In both forms,
one person matches another's behavior.
Although some social-learning theorists continued to embrace the tradition of functional behaviorism into the 1960s and 1970s, the break from the behaviorist view of social learning was apparent in Rotter's Social Learning and Clinical Psychology, which appeared in 1954 (see also Rotter, 1955, 1960; Rotter, Chance, & Phares, 1972). Where Staats and Staats (1963), writing almost a decade later, were still acknowledging the primary influence of Skinner and other functional behaviorists, Rotter (1954) acknowledged the influence of no behaviorists at all. Rather, he aligned himself with the dynamic psychologist Adler and the gestalt psychologists Kantor and Lewin (see also Rotter, Chance, & Phares, 1972, p. 1). From the beginning, Rotter intended his theory as a fusion of the drive-reduction, reinforcement learning theories of Thorndike and Hull with the cognitive learning theories of Tolman and Lewin. Although Rotter's version of social learning theory often uses behaviorist vocabulary, it is with a clear cognitive twist.
In the first place,
Rotter is less interested in behavior than in choice,
an internal mental state which obviously manifests itself in
behavior. Rotter's cognitive-social learning theory
employs three basic concepts:
Rotter labeled his approach a social learning theory, and employed some of the concepts and principles of reinforcement theory in it. Nevertheless, his approach is less a theory of learning than it is a theory of choice. That is to say, Rotter is primarily concerned with how expectancies and values govern the choices we make among available behaviors. However, the theory has relatively little to say about how those expectancies, values, and behavioral options are acquired -- except to say that they are acquired through learning. It remained for another social learning theorist, Albert Bandura (Bandura, 1971, 1977, 1985; Bandura & Walters, 1963) to add to the concept of expectancies an explicit theory of the social learning process. Like Miller and Dollard, Bandura stressed the role of imitation in social learning. However, his concept of imitation departs radically from theirs in that it no longer functions as a secondary drive. By emphasizing cognitive processes over reinforcement, observation over direct experience, and self-regulation over environmental control, Bandura took a giant step away from the behaviorist tradition and offered the first fully cognitive theory of social learning.
Bandura's behaviorist roots are seen most clearly in his earliest statement of social learning theory, Social Learning and Personality Development (Bandura & Walters, 1963). On the surface, this book seems to draw heavily on Skinnerian analyses of instrumental conditioning. For example, there is a great deal of attention paid to the role of reinforcement schedules in the maintenance of behavior. Bandura and Walters argued that most social systems operate on some combination of fixed- and variable-interval schedules of reinforcement, so that most social reinforcements are delivered on an intermittent schedule. Family routines such as dining, parent-child interactions, shopping trips, and the like occur in a relatively unchanging cycle. Insofar as these activities can take on reinforcing properties, then, they are delivered on a fixed-interval schedule: the child cleans his plate at dinnertime during the week, and then gets to sit on his mother's lap during the family television hour on Saturday night. Other social reinforcements, however, seem to be delivered on a variable-interval schedule. When a child seeks her mother's attention, she may get it immediately, or at some time in the future when her mother doesn't have her hands full. Still other situations seem to involve the differential reinforcement of high or low rates of behavior. If a father pays attention to his child only when she kicks and screams, he is virtually guaranteeing that she will misbehave when she wants attention.
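The difference between fixed-interval and variable-interval schedules can be made concrete with a toy simulation. What follows is only a sketch under stated assumptions: time is measured in arbitrary units, the function names are invented for this example, and drawing the variable intervals from a uniform distribution is a convenience, not anything specified by Bandura and Walters. Under an interval schedule, the first response made after the interval has elapsed is reinforced, and then the clock starts over.

```python
import random


def reinforced_responses(response_times, next_interval):
    """Return the response times that earn reinforcement on an interval schedule.

    next_interval() supplies the wait before reinforcement is available again.
    """
    reinforced = []
    available_at = next_interval()
    for t in sorted(response_times):
        if t >= available_at:          # first response after the interval elapses
            reinforced.append(t)
            available_at = t + next_interval()  # the clock restarts
    return reinforced


def fixed_interval(length):
    """Every interval has the same length."""
    return lambda: length


def variable_interval(mean_length, rng):
    """Intervals vary unpredictably around a mean (uniform draw is an assumption)."""
    return lambda: rng.uniform(0, 2 * mean_length)


responses = list(range(1, 61))  # one response per time unit
fi = reinforced_responses(responses, fixed_interval(10))
vi = reinforced_responses(responses, variable_interval(10, random.Random(0)))

print("fixed interval:   ", fi)
print("variable interval:", vi)
```

On the fixed-interval schedule the reinforced responses fall at perfectly regular points, like the Saturday-night television hour. On the variable-interval schedule they come at irregular points that cannot be predicted from the last one, like a busy mother's attention.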
For a number of reasons, Bandura and Walters argued, most social reinforcements are dispensed on complex schedules combining variable ratios and variable intervals. In some respects, this complexity reflects the unreliability of social reinforcement. Often, the reinforcing agent is simply not present when the target behaviors occur -- in such a case, reinforcement must be deferred to a later time. And because humans are not automated machines, they will sometimes simply fail to deliver reinforcements that are due. Perhaps more important, the complexity of social reinforcement schedules reflects the complexity of social demands. It is rarely enough simply to perform a certain social behavior: it must be done in a particular way. A child asked to set the dinner table will not be rewarded simply for piling dishes and utensils; the forks have to be on the left side of the plate, and the blade of the knife turned inward. As Bandura and Walters note, effective social learning entails both adequate generalization and fine discriminations.
Social learning is also complex because of the wide variety of factors that affect the effectiveness of social reinforcements. For example, Bandura and Walters noted that children with strong dependency habits (note the phrase) are more susceptible to social reinforcement. Moreover, the prestige of the reinforcing agent is important, as is the match between the person and the agent on such attributes as gender. The person's internal states of deprivation, satiation, and emotional arousal are also important. The point is that social reinforcement is complex but not chaotic or haphazard. Social behavior is maintained by virtue of schedules of reinforcement, even if the precise nature of that schedule is sometimes hard to discern.
Although Miller's theory gained impressive support from analyses of animal behavior, Bandura and Walters were critical of its application to the case of human social behavior. For example, they argued that deliberate social learning also played a role in displacement. Thus, parents often direct their children's aggressive behaviors towards some targets rather than others, and displacement itself is maintained by contingencies of reinforcement. Clear examples of this may be found in scapegoating and other examples of prejudice towards minorities and other outgroups. By and large, these sorts of aggressive behaviors are not simply selected by the vicissitudes of the generalization gradient. Rather, children get their prejudices from their parents: as Rodgers and Hammerstein wrote in South Pacific, "You've got to be carefully taught" whom to hate and fear.
While agreeing on the importance of reinforcement in the control of behavior, Bandura and Walters differed most from their behaviorist predecessors over the manner in which behavior was acquired in the first place. Taken at their word, Skinner and other functional behaviorists actually appear to deny that new behaviors are learned at all. Rather, responses already in the organism's repertoire come to be elicited by certain environmental cues by virtue of the law of effect. What are acquired are new patterns of behavior, by virtue of shaping and successive approximations. That is, a piece of behavior is synthesized from more elementary behaviors already in the organism's repertoire. Bandura and Walters, while agreeing that shaping procedures can be effective, doubted that they were responsible for the acquisition of most complex human social behaviors. Like Miller and Dollard, Bandura argues that social learning is largely mediated by imitation.
On the basis of anthropological studies as well as informal observation, Bandura and Walters argued that socialization -- the acquisition of socially sanctioned beliefs, values, and patterns of behavior -- was largely mediated by imitative learning. In some cultures, for example, young boys and girls are provided with miniature replicas of the tools used by their parents, and they spend a great deal of time tagging along with their parents practicing their use -- thus preparing for their adult roles. Similarly, children in the United States (and other developed societies) are given toys that the child can use to imitate adult behavior. In this way, for example, children in all cultures acquire behaviors consistent with the occupational roles deemed appropriate by their culture for persons of their gender.
Gender-role socialization is far from the only example of learning by imitation. In some tribal cultures, children even obtain their sex education by watching adults engage in various aspects of mating behavior. Certain aspects of language acquisition, such as the meanings and pronunciation of words, are learned largely through observation and imitation of other people. In addition, certain complex motor and cognitive skills appear to be acquired in this manner. Medical residents do not learn to perform surgery through a trial-and-error process. Rather, they learn by watching skilled practitioners operate, and by reading about the procedures in textbooks. In a very real sense, a surgeon knows how to do surgery before he or she ever puts a scalpel to a patient -- that is, before there can be any direct experience of trial and error. On a more mundane level, driver education courses in high schools make sure that students have acquired basic skills in handling an automobile before they ever take to the road.
In tribal cultures, parents and older siblings are probably the models for most imitation. They are, after all, the primary agents of socialization. However, this purpose may also be served by exemplary models sanctioned by the parents: children are constantly being encouraged to emulate various national heroes and mythological figures, as well as the children next door. In technologically advanced societies, models for imitation are provided by books, television, movies, and other media as well as by real life. One of the sources of the constant controversy over children's television viewing concerns the kinds of models presented to children in cartoons and action series. A major function of written and oral language is this kind of cultural transmission. By virtue of linguistic communication, we can tell someone what to do in a particular situation -- describe the behavior, and indicate when it should be performed -- instead of letting the person discover the relations between cues, acts, and outcomes for him- or herself. For this reason, social learning by imitation is highly efficient. In a complex, highly developed society, it also seems necessary.
While agreeing with Miller and Dollard that imitation is an important source of social learning, Bandura and Walters took issue with the theory that imitation -- either as a general tendency or of a specific act -- is acquired through reinforcement. For example, developmental studies show that children imitate others before they ever are reinforced for doing so. Very young infants, up to about four months of age, engage in pseudoimitation, in which they repeat some simple act (like babbling) displayed by their caretaker. However, this imitation will not occur unless the infant him- or herself had just recently performed the same act. Somewhat older infants will engage in genuine imitation of others, in circumstances where they have not just performed the same act themselves. The extent to which behavior will occur will depend on the degree to which the child's sensorimotor operations have developed. For example, children cannot reliably stick out their tongues in imitation of adults, until they have acquired some mental representation of their facial anatomy (Piaget, 1951; but see Meltzoff & Moore, 1977). Children are not reinforced for this: it simply happens, apparently as a reflection of an innate tendency to do so.
Even imitation of specific behaviors is not learned by virtue of reinforcement. The behaviorist model of imitation involves three elements: a discriminative stimulus (Sd) that serves as a cue, the response of imitating the model (R), and the reinforcing stimulus (Sr). By virtue of the law of effect, repeated reinforcement of the imitative behavior will make that behavior more likely to occur. However, a classic experiment on aggression by Bandura (1962) shows that this is not the case. Children watched a film in which a model displayed novel aggressive behaviors (that is, behaviors not previously in the children's repertoires) towards a "Bobo the Clown" doll. In one condition, the model was punished for this behavior; in another, he or she was rewarded; in a third condition, there were no consequences to the behavior of any sort. In a later test, children who viewed the punished model showed less imitative aggression than those who viewed the rewarded model; interestingly, those who viewed the unreinforced model displayed the same amount of aggression as those who saw the model rewarded. This first test was performed under conditions of no incentive. In a second test, the children were promised a reward for imitating the model: under these circumstances, the group differences disappeared. Thus, novel aggressive behaviors were acquired by the children even though they were not reinforced for imitating the behavior. However, the performance of these behaviors was under reinforcement control: those who saw the model punished were less likely to engage in the behaviors themselves, until instructed that the reinforcement contingencies had been changed.
In a later statement, Bandura (1977) argued that there are two forms of learning. Learning by response consequences is the kind of trial-and-error acquisition of knowledge familiar from the operant behaviorism of Skinner. However, this learning is given a cognitive emphasis. Direct experience provides information concerning environmental outcomes and what must be done to gain or avoid them. As a result, the person forms mental representations of experience that permit anticipatory motivation and behavioral self-control. Modeling involves learning through vicarious experience -- by observing the effects of others' actions. While a term such as "modeling" encompasses learning through example, Bandura also uses it to cover learning through precept -- deliberate teaching and learning, often mediated by linguistic communication.
Although Bandura goes beyond Rotter in discussing the process of social learning, his analysis of performance is similar to Rotter's in many respects. That is, Bandura agrees that the person's behavior is governed primarily by his or her expectancies concerning the future. Our responses to various situations are governed by information we possess concerning forthcoming events, and the outcomes of our actions. These expectancies are formed, respectively, through processes resembling classical and instrumental conditioning -- except that conditioning is given an active, cognitive interpretation as opposed to the conventional passive interpretation in terms of the laws of practice and effect. Moreover, conditioning is not the only -- or even the most important -- way that these expectancies can develop. Rather, they can be acquired vicariously through precept and example.
Expectations before the fact are, of course, subject to revision by the information gained subsequently. The actual consequences of an environmental event, for example, or of a person's actions, serve to confirm or revise the person's expectations. These consequences can be directly experienced by the person in question, or they may be experienced vicariously through observation or symbolic mediation. Moreover, in discussing the consequent determinants of behavior, Bandura stresses the role of aggregate as opposed to momentary outcomes. In his view, people are more influenced by what happens in the long run than by minor setbacks, delays, and irregularities. In large part, this is due to the cognitive capacities of humans, whose powerful memories permit them to transcend even long intervals, and integrate information from different points in time.
A unique feature of Bandura's social-learning theory is the active role played by the self. Behaviorist doctrine, of course, eschewed any reference to the self as an active organizer of experience or agent of action. Such talk was banned as mentalistic and ultimately beyond the pale of science. Insofar as the self was discussed at all, it was as (in Skinner's terms) a system of responses. As a cognitive theorist, however, Bandura (1977) permits the self to take an active, executive role in the regulation of behavior. In this way, the self plays a role as both an antecedent and a consequent determinant of behavior.
In the cognitive view offered by Tolman and by Rotter, outcome expectancies are vitally important determinants of behavior. That is, we tend to engage in behaviors that we expect will lead to outcomes we desire, and prevent outcomes we dislike. Bandura agrees that outcome expectancies are important. However, he has also added a new concept: self-efficacy expectations (Bandura, 1977, 1978). While it is obviously important that the individual expect that a particular behavior will lead to a certain outcome, it is equally important that the person have the expectancy that he or she can reliably produce the behavior in question. Note that the actual state of affairs is irrelevant here. It does not matter whether the person can, in fact, perform some particular action. What matters is whether the person thinks he or she can. Self-efficacy expectations are conceptually similar to the sense of mastery, and have important motivational properties, in that they determine whether the person will even attempt the behavior in question.
An example of self-efficacy can be found in the literature on learned helplessness. As a rule, dogs placed in a shuttlebox will acquire escape and avoidance responses fairly readily, shuttling back and forth in response to stimuli signaling forthcoming shock. However, dogs who have first received classical fear conditioning are retarded in learning escape and avoidance. In some instances, they simply sit and take the shock passively. Learned helplessness can also be produced in humans. For example, subjects who have been exposed to unsolvable anagram problems are retarded in completing subsequent problems that are solvable. Although the learned helplessness effect is quite complex, it appears to involve the subject's belief that he or she cannot master the situation. In fact, that is objectively not the case: the shock in the shuttlebox is avoidable, and the dog has in its repertoire the necessary behavior; the second set of puzzles is soluble, and the student has the intelligence to solve them. Yet, experience has taught the subject to believe otherwise (if we can speak of beliefs in lower animals), and this belief controls behavior.
Self-efficacy can serve as an example of how antecedent expectations develop through social learning. Obviously, one source of self-efficacy is performance accomplishments: the personal experience of success and failure. Repeated failure experiences will lower the person's expectancy that he or she can effectively control outcomes. But the same sorts of expectancies can be generated through vicarious experience. Observing other people's success or failure will lead to appropriate expectations about oneself -- at least to the degree that one perceives oneself to be similar to those other people. But perceived self-efficacy can also be shaped in the absence of any experiential basis whatsoever, merely through verbal persuasion. A person who is repeatedly told that he or she is incapable of accomplishing some goal, especially if that information comes from an authoritative source, may actually come to believe it about him- or herself. Perceived self-efficacy can also change on a moment-to-moment basis, depending on the person's emotional state. Feelings of elation may increase feelings of mastery (sometimes beyond all reason, as in the megalomania of a manic patient), while anxiety or depression may reduce them. Finally, self-efficacy can vary from one situation to another. Even though a person has not encountered a particular problem before, he or she may have a high degree of self-efficacy if it closely resembles some other problem that the person has been able to master in the past.
Another way in which Bandura departs radically from the behaviorist analysis of social learning is by embracing the concept of self-reinforcement. Recall that Skinner objected to self-reinforcement on the ground that it was ineffective as a means of behavioral control. However, Bandura acknowledged that people can effectively regulate their own behavior in the absence of, or in opposition to, schedules of external reinforcement. For example, a run-of-the-mill jogger can reward herself by finishing in the top half of a local road race, even though she will never get a medal for her performance. Alternatively, a college professor may feel remorse about flunking a student, even though he receives praise from his dean for upholding academic standards. It is so common to find writers, painters, and composers pursuing their own vision even though they are denied any professional recognition, that the image of the starving artist has become part of our cultural mythology. By means of goal-setting and self-reinforcement, people can free themselves from environmental control. This independence of the person from environmental control distinguishes Bandura's social learning theory from its behaviorist forebears.
In principle, self-reinforcement frees people from external control. As a practical matter, however, the essential first step in self-regulation, setting the standard, tends to be based on imitation. That is, we set standards for ourselves that are similar to those set for themselves by those we admire. These models may be our parents, teachers, or spiritual leaders. However, models may also come from other sources, such as books, films, and media. One important consequence of literacy, coupled with free access to books and magazines, is that we encounter potential models whose standards may be quite different from those whom we would otherwise meet. Modeling our standards on those individuals is another way in which we free ourselves from the constraints of our local social environment.
In addition to standard-setting, Bandura postulates three other component processes in self-regulation. The person must monitor his or her own performance, and evaluate it according to the standard set for him- or herself. The dimensions on which the performance is evaluated can vary widely, as can the precise standards. Very often, the individual will measure him- or herself against actual or assumed population norms; or, some single individual will serve as the standard of comparison; in other circumstances, the standard will be set by the person's own previous behavior. It is important, of course, not to set standards that cannot be met. Research in a variety of domains, from academic achievement to weight loss, indicates that people should set goals for themselves that are clearly specified, and of only moderate difficulty. Vague or ambiguous goals, of course, are not goals at all. Setting an unattainable goal obviously has motivational drawbacks, while setting a goal that is too easy to accomplish will yield little or no satisfaction in its accomplishment. (It should be noted that the same considerations apply to goals set by others, as when parents enforce standards for their children's behavior.)
Once the evaluation has been made, the person will reinforce his or her performance appropriately. These rewards come in two forms, tangible and symbolic. A student who aces an exam may reward herself with a movie, while one who fails may punish herself by canceling a date; or she may just praise or censure herself. The effectiveness of self-praise or self-reproach, in the absence of tangible consequences, is currently subject to considerable debate. However, research clearly shows that people -- even young children -- who fail to meet their own performance standards will deny themselves reward. Apparently, such internal states as self-esteem and self-efficacy have their own motivating properties. While behavior that is controlled only by external contingencies will be unreliable in the absence of those contingencies, our selves are always with us. Thus, in principle self-reinforcement should lead to more effective behavioral regulation, because it is less subject to situational variation.
Moreover, human intelligence and consciousness permit us to project the consequences of our actions far into the future. Traditional behavioral theories, of course, assert that present behavior is under the control of past events, and that future prospects that have no parallel in the past are very weak determinants of behavior. However, this is clearly not the case. The emergence of political movements supporting environmental protection and nuclear disarmament is a clear example of the control of behavior by the future. We have had no experience of the greenhouse effect or nuclear winter, but the prospects of them in the future led us to try to protect the ozone layer, and reduce the number of nuclear warheads, today. The behaviorist analysis of future determinants is largely correct when it is applied to lower animals, with their limited cognitive capacities. Bandura's openness to such determinants is another mark of the extent to which social learning theory has embraced cognitivism, and abandoned its behaviorist roots.
Social learning is the cognitive basis of culture, which anthropologists define as the customary beliefs, social forms, and material traits of a racial, ethnic, or social group, transmitted through informal learning and formal training from one generation to the next. This intergenerational transmission cannot be accomplished through the genes: there is no inheritance of acquired characteristics. Instead, it must be accomplished by learning -- which is to say, social learning, through example and precept. It is through social learning, both through informal modeling and in formal institutions (such as schools and libraries) organized for the purpose, that we pass down our knowledge, beliefs, and attitudes from one generation to the next. In this way, each generation builds on the advances made by those who went before, and doesn't have to start "from scratch".
This raises the question of whether nonhuman animals have "culture" as well. Observations of animals behaving in their natural environment suggest that animals do indeed learn vicariously from observing the experiences of others, and in this respect possess sets of cultural traditions that are passed from one generation to the next.
Along with consciousness, language, and culture, the capacity for learning, and especially for social learning, is one of the greatest gifts of evolution to the human species.
For more on social learning, go to the Appendix: The Evolution of Cognitive Social Learning Theory.
Behaving organisms are not just machines, operating by reflex, taxis, or instinct. Rather, even organisms with very simple nervous systems are able to modify their behavior in accordance with what they have learned. Much learning can be described in terms of classical and instrumental conditioning, and combinations thereof. But not all learning is of this sort: language learning is a particularly salient example of learning merely through exposure to others, without any reinforcement.
What is learned is not a simple connection between stimulus and response. Rather, the learning organism forms a mental representation of the world and its relation to it: of objects, events, its own behavior, and the contingent relations between them.
In light of modern experiments on predictability, controllability, and social learning, we should revise our definition of learning.
For a comprehensive survey of the psychology of learning, see The Psychology of Learning and Behavior by B. Schwartz and S.J. Robbins (Norton, 1978), and subsequent editions. The most up-to-date of these is Learning and Memory by B. Schwartz and D. Reisberg (1991).
For a thorough discussion of behaviorism, see Behaviorism, Science, and Human Nature by B. Schwartz and H. Lacey (Norton, 1982).
For a comprehensive survey of theories of learning, see the various editions of Theories of Learning by E.R. Hilgard and G.H. Bower (1st ed. by E.R. Hilgard, published by Appleton-Century-Crofts, 1948; 5th ed. by G.H. Bower and E.R. Hilgard, published by Prentice-Hall, 1981).
Mostly, however, when we think about memory we mean episodic memory, which raises the question: can nonhuman animals have episodic memory, in the sense of an ability to remember specific experiences as such? Some theorists (like Tulving, 1983) think not -- that the ability to remember specific episodes of experience is a uniquely human faculty. But we've long since learned to accept the Darwinian principle of evolutionary continuity, so it would not be surprising if at least some nonhuman species, most likely primates or other animals, had the ability to remember specific episodes in their lives.
Let's first define the terms. An episodic memory is a memory for an episode -- an event with a unique location in space and time. So, at the very least, an episodic memory has to have been encoded after a single experience.
By this standard, any example of one-trial learning -- such as the one-trial step-down passive avoidance learning often used in animal models of traumatic retrograde amnesia (e.g., Miller & Marlin, 1979) -- might count as episodic memory. In this paradigm, a rat is perched on a platform above a floor grid which is wired to deliver an electric shock. If the animal steps down (and they always step down), it gets a footshock, at which point it jumps back up onto the shelf and won't step down again. It has learned the association between floor and shock in a single trial, and it passively avoids further shock by refusing to step down. (If the rat is given an amnesia-inducing treatment just after jumping back up, it will step back down onto the floor as if nothing had happened, apparently amnesic for the shock experience.)
Now, it might be that the rat remembers the specific experience of getting shocked when it stepped down onto the floor -- in which case the memory might count as episodic. Alternatively, it might be the case that the animal has acquired more generic knowledge that the floor delivers footshock, in which case we're talking about something more like semantic memory -- abstract knowledge about the world. A human analogue would be source amnesia, in which a subject remembers factual knowledge acquired during a learning session, but not the learning session itself. So, the occurrence of one-trial learning isn't enough to qualify as an animal model of episodic memory.
So, returning to our definition of episodic memory, it seems that, at a minimum, an episodic memory has to contain information about the target event, as well as information about the time and place at which it occurred. Call it a what-where-when structure (..., 1972). It is this W-W-W structure that makes the verbal-learning paradigm a model of episodic memory: subjects must remember what words were on a particular list studied at a particular time and in a particular place. So, a successful animal model of episodic memory would have to demonstrate, at a minimum, that an animal remembers not just what happened, but also where and when.
Such a model was introduced by Clayton and colleagues, based on cache-recovery behavior in scrub jays (note: not a primate or even a mammal).
This sort of experiment, which has been repeated many times in various species (including rats), seems to indicate that the animals have the ability to remember what was cached, where it was cached, and when it was cached, thus meeting the minimal requirements for an episodic memory.
But maybe episodic memory requires more than this. Remember James's definition of secondary memory:
Memory requires more than a mere dating of a fact in the past. It must be dated in my past. In other words, I must think that I directly experienced its occurrence.
This feature of "reminiscence, recollection, reproduction, or recall" is necessarily subjective, and would seem to be ruled out by the fact that we simply have no way of knowing what the subjective experience of remembering is like for subjects who can't talk to us about their introspections. This is one reason why Clayton and others refer to "episodic-like memory".
Related to this is Tulving's notion that episodic memory represents mental time travel (MTT), or traveling back in time to relive a prior episode. Tulving (2005) now believes that this self-referential autonoetic experience is the real hallmark of episodic memory -- and that the ability to mentally travel backward in memory is also related to our ability to project ourselves, mentally, into the future. And he also believes that this ability -- MTT in either direction -- is uniquely human. At the same time, we've known since Tolman, and certainly since the cognitive revolution in animal learning (Rescorla, Seligman, Kamin, and the others) that animals form expectations during both classical and instrumental conditioning. And the very idea of expectations implies some ability to anticipate the future.
"Episodic Memory" in Animals
For a recent overview of this research, see:
page last revised 09/16/2014.