Operant Conditioning Paradigms


By Mark Plonsky, Ph.D. Copyright © 2014

Operant conditioning is a type of learning that has been carefully researched over the last half century or so. Understanding this little bit of science can provide a large payback (in terms of training effectiveness) for a relatively small investment (of time to understand).

A paradigm is a model of how something works. In this article, I will describe the eight specific paradigms of operant conditioning (OC) for folks who train dogs. My goal in doing so is to give you an in-depth understanding of OC. I want to make clear that I am not advocating any technique in particular. While I personally favor techniques that manipulate when the dog gets the things it wants (pleasant events), my goal here is to spell out all of the possibilities. I will first review some material to make the paradigms themselves more understandable.

The general paradigm for OC is that a Response (symbolized with the letter R) leads to a biologically relevant Stimulus consequence (S*). The asterisk indicates that the stimulus (or event in the environment) is something that is biologically relevant or important to the organism. Food, pain, and sex are examples of biologically relevant stimuli because they influence whether the dog survives and reproduces. Common terms used for biologically relevant stimuli are reward and punishment. The ‘response’ is just a fancy way of referring to behavior or what the dog does. Thus, OC can be illustrated with symbols in the following manner:

R -----> S*

Which should be read as:

A Response (leads to) a Stimulus consequence

We say that the S* is contingent (or dependent) upon the R. A quick example is that when a dog sits (the R), we give it a food treat (the S*). The dog’s sitting behavior is called the Response (R) and it led to (----->) the food treat (S*). I want to make clear that the treat was contingent upon the dog sitting. The dog had to sit in order to get us to deliver the treat.

A stimulus which signals that a contingency is in effect is called a Discriminative Stimulus (S^D). The OC paradigm then becomes a little more complicated.

S^D -----> R -----> S*

Which should be read as:

A discriminative Stimulus tells the dog that a
Response (leads to) a Stimulus consequence

Discriminative stimuli are signals such as words, hand or body signals, people, locations, etc. So, for a quick example, the word “Sit” is a signal that tells the dog that if it puts its butt down it will be rewarded. I should note that an S^D may signal that more than one contingency is in effect. For example, the word “sit” may also signal that if the dog does not put its butt down it will be punished.

We are now in a good position to understand the eight specific paradigms of OC. Consider that there are three relevant dimensions:

S* – The stimulus consequence can be pleasant or aversive.
R – We may want to increase or decrease the response or behavior in question.
S^D – We can either use a discriminative stimulus or not.

Since each dimension has two possibilities, that leads to 2 x 2 x 2 = 8 paradigms. Let’s discuss each specific type. It is easiest to present them in two groups of four, that is, the four types without an S^D and the four types with an S^D. In each case, I will present a table illustrating the four types and then list examples for each.

The first four are when we do not use a discriminative stimulus (S^D):

	We want the R to:
S*	Increase	Decrease
Pleasant	Reward training (1)	Omission training (3)
Aversive	Escape training (2)	Punishment training (4)

Reward Training – Whenever the dog sits we give it a reward such as a food treat. Uses positive reinforcement (+R).
Escape Training – A common technique for training the dog to retrieve involves force. For example, an ear pinch may be used, and the dog escapes the pain by retrieving. Uses negative reinforcement, -R (and positive punishment, +P).
Omission Training – Whenever the dog doesn’t jump (it keeps its four paws on the floor) we give it a cookie. In other words, it is only rewarded when it omits the jumping. Uses +R (and negative punishment, -P).
Punishment Training – One method of training the dog not to jump on a person is to meet the dog with a knee. The contact is meant to be aversive, such that the dog is punished for jumping. Uses positive punishment (+P).

The next four types are when a discriminative stimulus (S^D) is used.

	We want the R to:
S*	Increase	Decrease
Pleasant	Discriminated operant (5)	Discriminated omission (7)
Aversive	Active avoidance (6)	Passive avoidance (8)

Discriminated Operant – The “sit” signal tells the dog that sitting will be rewarded. The discriminative stimulus tells the dog that the contingency is in effect.
Active Avoidance – The dog performs a behavior (is active) to avoid something unpleasant. Sometimes my dogs walk in front of me when we are walking. I say the word “move” prior to the inevitable collision. In the future, when I say the word “move”, they get out of the way and avoid the collision. In other words, they actively avoid the collision when they hear the word “move”.
Discriminated Omission – The “stay” signal tells the dog that omitting alternative behaviors will be rewarded. Hold your position without doing anything else and a reward will follow.
Passive Avoidance – The “stay” signal tells the dog that alternative behaviors will be punished. The way to avoid the punishment is to remain passive.
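
For readers who like to see a classification spelled out, the eight paradigms can be sketched as a lookup over the three dimensions. This is only an illustrative sketch: the names `PARADIGMS` and `classify`, and the labels "pleasant", "aversive", "increase", and "decrease", are my own assumptions for the example. The paradigm names and numbers are taken from the two tables above.

```python
from itertools import product

# Paradigm numbers and names as listed in the two tables above.
# Key: (type of S*, what we want the R to do, whether an S^D is used).
# The key labels are hypothetical, chosen only for this illustration.
PARADIGMS = {
    ("pleasant", "increase", False): (1, "Reward training"),
    ("aversive", "increase", False): (2, "Escape training"),
    ("pleasant", "decrease", False): (3, "Omission training"),
    ("aversive", "decrease", False): (4, "Punishment training"),
    ("pleasant", "increase", True): (5, "Discriminated operant"),
    ("aversive", "increase", True): (6, "Active avoidance"),
    ("pleasant", "decrease", True): (7, "Discriminated omission"),
    ("aversive", "decrease", True): (8, "Passive avoidance"),
}


def classify(stimulus, goal, uses_sd):
    """Return (number, name) of the paradigm for one combination."""
    return PARADIGMS[(stimulus, goal, uses_sd)]


# Two choices on each of three dimensions give 2 x 2 x 2 = 8 combinations.
combos = list(product(("pleasant", "aversive"),
                      ("increase", "decrease"),
                      (False, True)))

# Check that the lookup covers every combination exactly once.
assert len(combos) == 8
assert set(combos) == set(PARADIGMS)

for stimulus, goal, uses_sd in combos:
    number, name = classify(stimulus, goal, uses_sd)
    sd_text = "with S^D" if uses_sd else "without S^D"
    print(f"({number}) {name}: {stimulus} S*, {goal} the R, {sd_text}")
```

For example, `classify("aversive", "increase", True)` gives Active avoidance (6), which matches the “move” example: an aversive consequence, a behavior we want more of, and a signal that the contingency is in effect.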

I hope this presentation of the eight paradigms of OC has increased and deepened your understanding of OC. I also hope the time you invested in reading this article increases your effectiveness as a dog trainer.