In this fifth article concerning AGTB, I describe basic principles of learning, striving and inhibiting behaviour. Among other things, it includes the Law of Effect, which was derived from studies with cats.
“responses that produce a satisfying effect in a particular situation become more likely to occur again in that situation.”
Edward Thorndike, 1898
For at least a century from the late 1800s, theories of learning were the dominant concern of experimental psychologists. This was the era of ‘Grand Theories’ designed to bring a new dawn to the Science of Behaviour. The School of ‘Behaviourism’ would strive ultimately to explain all of behaviour. The animal laboratory became a crucible for a vast edifice of findings, with hundreds of doctoral candidates cutting their teeth on a thousand different variables. For this, we can thank Edward Lee Thorndike (1874–1949), an American psychologist who pioneered ethology, theories of learning and pedagogy. Our focus here is specifically Thorndike’s work on animal learning and the Law of Effect.
Learning is a relatively permanent change in behaviour that cannot be explained by temporary states, maturation, or innate response tendencies. An organism learns because: i) it needs to satisfy physiological and psychological needs, ii) it needs to adapt to new situations based on experience of similar situations in the past, and iii) because there is an intrinsic value in learning of and for itself.
Thorndike was born in Williamsburg, Massachusetts. He attended the oldest school in North America, Roxbury Latin School in Boston. Roxbury Latin was founded in 1645 by the Rev. John Eliot, a Puritan missionary, under charter from King Charles I of England. Eliot’s mission was “to fit [students] for public service both in church and in commonwealth in succeeding ages.” After his time at Roxbury, Thorndike took an English degree at Wesleyan, a master’s degree at Harvard under no less a person than William James, and then a doctoral degree from Columbia in 1898, supervised by James Cattell. His doctoral thesis, Animal Intelligence: An Experimental Study of Associative Processes in Animals, established a learning theory that dominated all others for nearly 50 years, a notable achievement.
Like many scientists of that era, Thorndike was a eugenicist. He argued that “Selective breeding can alter man’s capacity to learn, to keep sane, to cherish justice or to be happy. There is no more certain and economical a way to improve man’s environment as to improve his nature.” One should not jump to judgement about this, because eugenics was in the zeitgeist, but whatever his – and the majority of his colleagues’ – views, Thorndike is one of the historical giants of theoretical Psychology.
Although there are precursors, Thorndike proposed the ‘Law of Effect’ (LOE) in 1898, laying a foundation stone for theories of learning for more than a century. The law has never been rescinded.
Thorndike invented the concept of ‘reinforcement’ that was to become especially important to the study of operant conditioning, in which the effect of a response influences the likelihood of the future production of that response. The LOE applies to the entire universe of behaviour in which stimuli yielding satisfaction or pleasure are approached and those yielding dissatisfaction or pain are avoided.
The motivation system is crucially dependent on the ability to remember what leads to pain and what leads to pleasure. The organism requires a mechanism for learning, which is at the heart of all behaviour and performance.
The LOE is central to AGTB and the kernel of Principle VII:
Principle VII (Law of Effect): (A) All voluntary action is determined by the degree of pleasure or displeasure that the action provokes. (B) Any behaviour that is followed by pleasant consequences is likely to be repeated. (C) Any behaviour that is followed by unpleasant consequences is unlikely to be repeated.
Thorndike was best known for his work with cats inside his famous “puzzle box”. A hungry cat was confined in a box with a ‘manipulandum’ (i.e. a lever) that allowed the cat to escape by opening a door and to receive a food morsel outside the door. Initially, the cat would engage in the semi-random exploratory behaviours that characterise many animals in confinement, such as clawing, biting, meowing, rubbing, and so on. Ultimately the cat would accidentally activate the release mechanism and escape the box to consume the food. When returned to the box, the cat would again engage in a series of exploratory behaviours and eventually, once again, accidentally activate the release mechanism. Thorndike observed that the time between initial placement and escape slowly decreased over a series of trials, providing a learning curve.
By any account, Edward Lee Thorndike was a successful scientist. In 1912, he was elected president of the American Psychological Association; in 1917, he became a Fellow of the American Statistical Association; and in 1934, he was elected president of the American Association for the Advancement of Science. Thorndike also composed three ‘word books’ to assist teachers with word and reading instruction.
Thorndike’s theoretical ideas were founded on his repeated observations. He concluded that cats learn by selecting and connecting, what others called “trial and error learning”, a “stamping in” of correct responses and a “stamping out” of incorrect responses. Thorndike proposed that learning consists of connecting stimuli (bits and pieces found inside the puzzle box) with responses (pushing the lever), producing stimulus-response (S-R) ‘connections’ or ‘bonds’. The instigator of the cats’ behaviour in the puzzle box was thought to be a ‘drive’ to escape.
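The “stamping in” idea and the resulting learning curve can be illustrated with a small calculation. The sketch below is only a toy model and not Thorndike’s own formalism: the response names, the equal starting weights and the `stamp_in` increment are all assumptions made for illustration, and “stamping out” of incorrect responses is not modelled. Each response is chosen with probability proportional to the strength of its bond, so the expected number of attempts before escape is the total bond strength divided by the strength of the correct bond.

```python
# Toy model of "stamping in" (illustrative assumptions only, not Thorndike's data).

def expected_attempts(weights: dict, correct: str) -> float:
    """Expected number of attempts until the correct response is emitted,
    when each response is picked with probability proportional to its weight."""
    return sum(weights.values()) / weights[correct]


def learning_curve(n_trials: int = 10, stamp_in: float = 1.0) -> list:
    """Expected attempts per trial as the successful S-R bond is strengthened."""
    # Hypothetical response repertoire with equal initial bond strengths.
    weights = {"claw": 1.0, "bite": 1.0, "meow": 1.0, "rub": 1.0, "press_lever": 1.0}
    curve = []
    for _ in range(n_trials):
        curve.append(expected_attempts(weights, "press_lever"))
        # The satisfying effect (escape and food) strengthens the correct bond.
        weights["press_lever"] += stamp_in
    return curve


if __name__ == "__main__":
    for trial, attempts in enumerate(learning_curve(), start=1):
        print(f"trial {trial:2d}: expected attempts = {attempts:.2f}")
```

Under these assumptions the expected number of attempts falls on every trial, which is the qualitative shape of the curve Thorndike reported; the exact numbers carry no empirical weight.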
The drive concept was originally defined by Robert S. Woodworth in 1918 as an “intense internal force that motivates behaviour.” The concept became the foundation stone for Hull’s drive theory of behaviour, viz. that, whether learned or innate, drive automatically motivates behaviour (Hull, 1943). Drive was viewed as the primary instigator of behaviour, a bodily state that renders behaviour ‘reinforceable’. Unlearned or innate sources of drive include deprivation of biologically important substances such as food, water, or oxygen. Such deficits threaten survival, and the organism makes adjustments to restore the system to the normal set range via homeostasis. Drive may also be induced by aversive stimuli, such as loud noise or electric shock, that are not life threatening.
The first half-century of learning theory, culminating with Hull, generated a circle of concepts with ‘drive’ at the centre, and stimuli, responses, connections, and reinforcements on the circumference. Like all systems in the history of Psychology, however, there would be a rise and a fall. Hull’s theory fell into disfavour and the drive concept went into sharp decline. With the demise of the drive concept in the 1960s, 70s and 80s, Psychology threw out with the bathwater not only the baby but the entire bath tub. We turn to consider an alternative ‘bath tub’ in the form of the concept of striving.
“Every person on the planet (barring illness) can tell good from bad, positive from negative, pleasure from displeasure”. Not only can we tell it, we can feel it. From the pre-Socratic philosophers until the present day, the role of pleasure and pain as motivators of human behaviour has been universally accepted. Psychological hedonism, the idea that all action is determined by the degree of pleasure or displeasure that imagining the action provokes, dates back to Epicurus (341 BC – 270 BC), who is alleged to have said: “We begin every act of choice and avoidance from pleasure…”
In 1789 the English philosopher Jeremy Bentham formulated the principle of utility in which any action that promotes the greatest amount of happiness is morally right. Happiness is identified with pleasure and the absence of pain. In 1848 the German physicist Gustav Fechner used the term Lustprinzip. Fifty years later Sigmund Freud copied this idea by formulating the ‘Pleasure Principle’ which has an almost exact equivalent in Cannon’s concept of homeostasis which has the goal of tension reduction for the sake of maintaining, or restoring, the inner equilibrium.
Interestingly, pleasure and pain are both objective and subjective at the same time, a double-sided feature that carries evolutionary benefits. If subjective and objective pain could get out of step, one can only imagine the disastrous consequences. The idea that organisms strive for pleasure and the avoidance of pain has been accepted for aeons.
What exactly do we mean by ‘the degree of pleasure’ and ‘displeasure’? Michel Cabanac of Laval University in Québec suggested that the pleasure or displeasure of a sensation is directly related to the biological usefulness of the stimulus to the subject. The seeking of pleasure and the avoidance of displeasure are behaviours which have useful homeostatic consequences. [AP 019]. That is, pleasure and displeasure depend on the internal state of the stimulated subject at the particular moment of the stimulation. Pleasure indicates a useful stimulus and motivates the subject to approach it. Pain indicates a harmful stimulus and motivates the subject to avoid it.
Emerging evidence indicates similarities in the anatomical substrates of painful and pleasant sensations in the opioid and dopamine systems. The experience of positive and negative affect is based on neural circuits that evolved to ensure survival. These circuits are activated by external stimuli that are appetitive and life sustaining or by stimuli that threaten survival. Activation of the pain and pleasure circuits alerts the sensory systems to pay attention and prompts motor action.
The approach-avoidance concept has captured the imagination of many theorists and has proved pivotal. The approach-avoidance system also includes behavioural inhibition, which takes over when there is approach-avoidance conflict.
Action schemata are also necessary precursors to action, as we shall see in the next post. This leads to a four-pronged system for regulating approach-avoidance-inhibition (AAIS). Operating together with action schemata, the REF, CLOCK and AAIS regulate voluntary action (Figure 1).
Figure 1. The REF, CLOCK and AAIS interconnect with action schemata to execute voluntary action.
Two necessary conditions are required by the AAIS: a need state or drive (e.g. hunger) and the ability to reset the need by homeostasis (e.g. eating a food reward). These conditions are stated in Hull’s Law, which contains the assumption that the ‘excitatory potential’, E, or homeostatic pressure, determining the strength of a response is a multiplicative function of a learning factor, H, and a generalized drive factor, D, i.e., E = H × D. When D (drive/motivation) is zero, E automatically becomes zero also. In mature organisms, the inability to learn when drive is lacking is something that occurs in both operant and classical conditioning. Without motivation, learning does not generally happen, and behaviour is not performed.
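The multiplicative form of Hull’s Law can be made concrete with a few lines of code. This is a minimal sketch using the symbols from the text (E, H, D); the function name and the numerical values are hypothetical and carry no units.

```python
# Minimal sketch of Hull's Law as stated in the text: E = H x D.
# The function name and the example numbers are illustrative assumptions.

def excitatory_potential(habit_strength: float, drive: float) -> float:
    """Return E, the product of the learning factor H and the drive factor D."""
    if habit_strength < 0 or drive < 0:
        raise ValueError("H and D are assumed to be non-negative")
    return habit_strength * drive


if __name__ == "__main__":
    well_learned = 0.9  # hypothetical H for a well-practised response
    for drive in (0.0, 1.0, 2.0):
        e = excitatory_potential(well_learned, drive)
        print(f"H = {well_learned}, D = {drive} -> E = {e}")
```

Because the relationship is a product rather than a sum, a well-learned response (high H) is still not performed when D is zero, which is the point made in the paragraph above.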
A century of research on learning and the AAIS was conducted under laboratory conditions in which food- or drink-deprived animals were normally tested during the traditional 9-to-5 working day. We know that the reward potential of the environment varies dramatically across the LD cycle as modulated by the CLOCK system. Free-living rats and mice normally sleep during daytime hours, and so the lab research with them has imposed ‘jet lag’ on the animals’ usual rhythms. The edifice of findings has been achieved with both the Type I homeostasis and CLOCK systems fully switched on. This, and other reasons, leads one to question the generalizability of the lab findings to the behaviour of free-living animals. In spite of the many reservations, it is necessary to accept that, within certain well-known biological constraints, there can be confidence that the LOE is not purely a laboratory artefact and that free-living organisms follow it.
As we have seen, major authorities agree that a drive underlies approach and avoidance energised by a striving toward pleasure and away from pain. Every living being strives towards a fixed set range of positive well-being. [AP 020]. Organisms approach sources of potential pleasure and satisfaction and studiously avoid potentially aversive stimuli and confrontations with danger. There really isn’t much difference between striving for something and having a drive for something. Both concepts involve a felt need to satisfy an unmet need, whether biological or behavioural. When the need has been satisfied, drive is reduced, striving ceases, and the organism resets to equilibrium and can rest. For this reason, we are pleased to return the ostracised drive concept from its exile.
In encountering a threatening stimulus, the organism fights, takes flight or freezes, in which case inhibition of behaviour minimizes the risks that come with a collision of interests or confrontation.
Miller’s (1944) summary of data on approach-avoidance conflict showed that the tendency to approach is stronger far from the feared goal, while the tendency to avoid is stronger near the goal. Inhibition of action occurs when approach or avoidance is impossible, when a danger cannot be accurately predicted or when there is no previous response pattern to fall back on. In these cases, the système inhibiteur de l’action, or ‘behavioural inhibition system’ (BIS), is activated, stimulating the neuroendocrine responses described by Walter Cannon and Hans Selye.
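Miller’s observation about the two tendencies can be pictured as two gradients that cross at some distance from the goal. The sketch below assumes straight-line gradients with made-up intercepts and slopes (none of these numbers come from Miller’s data), with the avoidance gradient both higher at the goal and steeper than the approach gradient.

```python
# Toy approach-avoidance gradients; all parameter values are illustrative assumptions.

def approach_strength(distance: float, at_goal: float = 6.0, slope: float = 0.5) -> float:
    """Approach tendency, declining gently with distance from the goal."""
    return max(0.0, at_goal - slope * distance)


def avoidance_strength(distance: float, at_goal: float = 10.0, slope: float = 1.5) -> float:
    """Avoidance tendency, stronger at the goal but declining more steeply."""
    return max(0.0, at_goal - slope * distance)


def net_tendency(distance: float) -> float:
    """Positive values favour approach, negative values favour avoidance."""
    return approach_strength(distance) - avoidance_strength(distance)


def conflict_point(approach_at_goal: float = 6.0, approach_slope: float = 0.5,
                   avoid_at_goal: float = 10.0, avoid_slope: float = 1.5) -> float:
    """Distance at which the two straight-line gradients cross."""
    return (avoid_at_goal - approach_at_goal) / (avoid_slope - approach_slope)


if __name__ == "__main__":
    for d in (0.0, 2.0, 4.0, 6.0, 8.0):
        print(f"distance {d}: net tendency = {net_tendency(d)}")
    print("gradients cross at distance", conflict_point())
```

With these assumed values the organism is drawn towards the goal from far away, is repelled when close, and hovers around the crossing point, which is where conflict is greatest.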
Inhibition is a regular, everyday occurrence in the life of free-living animals. For example, consider the plains zebra (Equus quagga) drinking at a waterhole. With crocodiles always a danger, a cycle of approach, avoidance and inhibition will be repeated several times over before a zebra drinks. In many instances, the drive to drink water exceeds the drive to keep safe, and thirsty zebras are frequently killed by crocodiles. Freezing until danger passes is necessary for the zebra’s long-term survival, as long as the suspension of drinking does not continue for too long.
‘Freezing’ is an option in many commonly occurring circumstances for humans also. A worker dealing with an exploitative boss cannot fight or flee because they would be out of a job. They may be forced to let months and years go by while they inhibit their behaviour. Behavioural inhibition causes arousal and anxiety which, if unchecked, ultimately has deleterious effects on physical and mental well-being. [AP 021].
The BIS was the discovery of the French surgeon and neuropsychopharmacologist Henri Laborit (1914-1995). Laborit is known for his part in the discovery of chlorpromazine and for his work on gamma-OH (gamma-hydroxybutyrate), the antidepressant minaprine, and the sedative clomethiazole. In regard to inhibition, Laborit stated: “… this situation in which an individual can find himself, this inhibition of action, if it persists, induces pathological situations. The biological perturbations accompanying it will trigger physical diseases and all the behaviours associated with mental illness.”
Principle VIII (Behavioural Inhibition): The Behavioural Inhibition System is activated when there is conflict between competing responses to approach or avoid stimuli.
The BIS suppresses pre-potent responses and elicits risk assessment and displacement behaviours. [AP 022]. Displacement behaviours include head scratching, fidgeting and playing with the car keys when we are uncertain about what to do. Another AP relevant to both P(VI) and P(VIII) states: A primary source of behavioural inhibition is anxiety about actual or imagined failure. [AP 023].
Anxiety can lead us to foresee so many negative scenarios that we may end up doing nothing at all. To do nothing, and so keep a dream alive, may often be a better option than taking an action and falling flat on one’s face. Whichever way one looks at the oscillation of inhibition, it has a connection with the drive for equilibrium. We turn to consider an influential approach to the approach-avoidance-inhibition system.
GRAY AND MCNAUGHTON’S THEORY OF THE AAIS
If all human actions involved either approaching rewarding goals or avoiding punishing ones, life would be perfectly simple, albeit a little boring. A multitude of situations contain strongly competing goals of approach-approach, approach-avoidance or avoidance-avoidance conflict. To understand how an organism is to deal with such conflicts, we must unpack how the approach-avoidance-inhibition system might actually work in practice. In this regard, the work of Jeffrey A Gray and Neil McNaughton is of particular relevance.
Gray and McNaughton’s influential account of the approach and avoidance systems involves goal representations which have both cognitive (or identifying) and motivational (or consummatory) properties. The properties of a goal distinguish it from other kinds of stimuli and this includes the ability to be attractors (rewards) or repulsors (punishments). In the McNaughton-Gray theory, responding to attractors or repulsors brings three output systems into play: the Behavioural Approach System (BAS), the BIS, which we have already encountered, and the Freeze-Fight-Flight System (FFFS) (Gray & McNaughton, 2000).
According to McNaughton, DeYoung and Corr (2016), the “Behavioral Inhibition System” has outputs that: “inhibit the behaviour that would be generated by the positive and negative goals (without reducing the activation of the goals themselves), increases arousal and attention (generating exploration and displacement activities), and increases the strength of avoidance tendencies (i.e., increases fear and risk aversion). Increased avoidance during goal conflict is adaptive since, faced with risk, failing to obtain food or some other positive goal is likely to be easy to make up at another time, but experiencing danger could have severe consequences” (p. 30). It can be seen that a quickly taken avoidance decision may produce a false alarm, but, as the case of zebras at the waterhole illustrates, a slow response to a real threat might provide a crocodile with a hearty dinner.
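The three outputs attributed to the BIS in the quotation (inhibiting ongoing behaviour, raising arousal, and strengthening avoidance) can be laid out as a toy decision rule. This is only a sketch under stated assumptions and not Gray and McNaughton’s model: the conflict test, the threshold, the gain values and all names are hypothetical.

```python
# Toy decision rule for BAS / FFFS / BIS outputs (all values are assumptions).
from dataclasses import dataclass


@dataclass
class AAISOutput:
    bis_active: bool
    action: str       # "approach", "avoid", or "inhibit"
    arousal: float
    avoidance: float  # avoidance tendency after any BIS boost


def resolve(approach: float, avoidance: float,
            conflict_threshold: float = 0.5,
            avoidance_gain: float = 1.5,
            baseline_arousal: float = 0.2) -> AAISOutput:
    """Pick an output given the strengths of the approach and avoidance goals."""
    # Assumed conflict test: both goals are activated above the threshold.
    in_conflict = min(approach, avoidance) >= conflict_threshold
    if in_conflict:
        # BIS: suppress the prepotent response, raise arousal, boost avoidance.
        return AAISOutput(True, "inhibit", 1.0, avoidance * avoidance_gain)
    action = "approach" if approach > avoidance else "avoid"
    return AAISOutput(False, action, baseline_arousal, avoidance)


if __name__ == "__main__":
    # Hypothetical zebra at the waterhole: strong thirst, strong fear.
    print(resolve(approach=0.8, avoidance=0.7))
    # No crocodile in sight: approach wins without BIS involvement.
    print(resolve(approach=0.8, avoidance=0.1))
```

Note that the goal activations passed in are left unchanged; only the output avoidance tendency is boosted, in keeping with the quoted remark that the BIS inhibits behaviour without reducing the activation of the goals themselves.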
The approach (BAS), avoidance (FFFS=fight, freeze, flee) and conflict (BIS=behavioural inhibition) systems. The inputs to the system are classified in terms of the delivery (+) or omission (−) of primary positive reinforcers (PosR) or primary negative reinforcers (NegR) or conditional stimuli (CS) or innate stimuli (IS) that predict such primary events. The BIS is activated when it detects approach-avoidance conflict—suppressing prepotent responses and eliciting risk assessment and displacement behaviours. The systems interact homeostatically to generate behaviour. Based on this theory, it is possible to proceed with the proposal that: The voluntary behaviour of free-living organisms is coordinated by the REF, CLOCK and AAIS. [AP 024].
1) Drives, whether learned or innate, automatically motivate behaviour. Axiomatic to the General Theory of Behaviour is that organisms strive towards pleasure and away from pain.
2) Differing sources of pleasure and displeasure create conflicts, which are resolved by the approach-avoidance-inhibition system (AAIS).
3) When the AAIS activates the behavioural inhibition system, it increases arousal, attention and the strength of avoidance tendencies. The AAIS, together with the REF and CLOCK, coordinates voluntary action.
 The work of Russian physiologist, Ivan Pavlov (1849 – 1936) on classical conditioning was also hugely influential. Space restrictions prohibit discussion of the significant role of Pavlovian conditioning in this brief introduction to the General Theory. We also do not have space to go beyond a brief sketch of Thorndike’s approach to learning.
 Robert R. Mowrer & Stephen B. Klein (2001). Handbook of Contemporary Learning Theories Lawrence Erlbaum Associates. Bower G H & Hilgard E R. (1981). Theories of learning. Englewood Cliffs, NJ: Prentice-Hall.
 Bower, G. H., & Hilgard, E. R. (1981). Theories of learning. Prentice-Hall. p. 21.
 Quoted from: Thorndike, E. L. (1913). Educational Psychology: Briefer Course. p. 13. This quotation and a photograph of Thorndike are printed on the cover page of a London Conference on Intelligence held at University College London as recently as 2016. See: http://www.dcscience.net/London-conference-of-Intelligence-2016.pdf
 Thorndike, E. L. (1927). The law of effect. The American Journal of Psychology, 39(1/4), 212-222.
See: Hilgard, E. R. (1948). The century Psychology series. Theories of learning. East Norwalk, CT, US: Appleton-Century-Crofts. Hilgard, E. R., & Marquis, D. G. (1961). The century Psychology series. Hilgard and Marquis’ conditioning and learning, 2nd ed. East Norwalk, CT, US: Appleton-Century-Crofts.
 As we saw in the last post, Bernard liked to work with dogs. Thorndike showed a preference for cats.
 Animal lovers can feel more relaxed about Thorndike’s methods than Bernard’s or Pavlov’s.
 Woodworth, R.S. (1918). Dynamic Psychology. New York: Columbia University Press.
 In 1984, a paper was published defending the drive concept. See: Smith, K. (1984). “Drive”: In defense of a concept. Behaviorism, 12, 71-114.
 Quotation from the opening sentence of: Lindquist, K. A., Satpute, A. B., Wager, T. D., Weber, J., & Barrett, L. F. (2015). The brain basis of positive and negative affect: evidence from a meta-analysis of the human neuroimaging literature. Cerebral Cortex, 26(5), 1910-1922.
 Which all goes to prove that there’s nothing new under the sun.
 Cabanac, M. (1999). Pleasure and joy, and their role in human life. In Creating the productive workplace (pp. 62-72). CRC Press.
 Leknes, S., & Tracey, I. (2008). A common neurobiology for pain and pleasure. Nature Reviews Neuroscience, 9(4), 314.
 Lang, P. J., & Bradley, M. M. (2010). Emotion and the motivational brain. Biological Psychology, 84(3), 437-450.
 For a historical summary of the approach-avoidance construct, see: Elliot, A. J. (1999). Approach and avoidance motivation and achievement goals. Educational psychologist, 34(3), 169-189.
 We will give the Approach-Avoidance-Inhibition System the acronym “AAIS”.
 Tolman, E. C., & Honzik, C. H. (1930). Degrees of hunger, reward and non-reward, and maze learning in rats. University of California Publications in Psychology, 4, 241-256.
 Debold, R. C., Miller, N. E., & Jensen, D. D. (1965). Effect of strength of drive determined by a new technique for appetitive classical conditioning of rats. Journal of Comparative and Physiological Psychology, 59(1), 102.
 Possible exceptions are the innate disposition in critical periods to phase-sensitive learning or imprinting in young animals without specific reward and the learning that occurs in casual observation of others.
 Murray, G., Nicholas, C. L., Kleiman, J., Dwyer, R., Carrington, M. J., Allen, N. B., & Trinder, J. (2009). Nature’s clocks and human mood: The circadian system modulates reward motivation. Emotion, 9(5), 705.
 In addition to associative learning, animals have innate species-specific defence reactions such as fleeing, freezing, and fighting that are rapidly acquired; see Bolles, R. C. (1970). Species-specific defense reactions and avoidance learning. Psychological review, 77(1), 32. For a human example, see: Wichers, M., Kasanova, Z., Bakker, J., Thiery, E., Derom, C., Jacobs, N., & van Os, J. (2015). From affective experience to motivated action: Tracking reward-seeking and punishment-avoidant behaviour in real-life. PloS one, 10(6), e0129722.
 This is known as the “Life Dinner Principle”: it is better to sacrifice one’s dinner (or one’s drink) than one’s life. See: Dawkins R, Krebs JR. (1979). Arms races between and within species. Proc R Soc Lond B Biol Sci. 205:489–511.
 Laborit, who also discussed political philosophy, once stated: “It would be desirable to replace the republican motto ‘Liberty, Equality, Fraternity’ by ‘Conscience, knowledge, imagination’.” See: http://www.nouvellegrille.info/surlagrille.html
 Kunz, E. (2014). Henri Laborit and the inhibition of action. Dialogues in clinical neuroscience, 16(1), 113.
 Gray JA, McNaughton N. (2000). The NeuroPsychology of Anxiety: An Enquiry into the Functions of the Septo-hippocampal System. 2nd ed. Oxford: Oxford University Press; McNaughton, N., DeYoung, C. G., & Corr, P. J. (2016). Approach/avoidance. In Neuroimaging personality, social cognition, and character (pp. 25-49).
 For this purpose, we bring back the forsaken concept of drive.