Variable Reinforcement - Part 1.
"Consistency is contrary to nature, contrary to life. The only completely consistent people are the dead." - Aldous Huxley
While Huxley may not have known much about dog training, his point was well taken. In order for an organism to be alive, it must be able to vary its behavior according to changing circumstances. Even though most forms of dog competition require consistent performance, behavioral variability is a key ingredient in creating exceptional performance levels. Trainers who understand how to use both variability and consistency are better able to create and maintain their dog's great performance.
The first rule of variability is a simple one -- variable reinforcement causes variable behavior. Most scientific studies of behavior have avoided this aspect of behavior entirely. Instead, the experimenter sets up a pass/fail system of reinforcement called "successive approximation." Any behavior that looks like a step toward the target behavior is reinforced. Behaviors that are not similar to the target behavior are not reinforced. Eventually, the reinforced behaviors become predominant and the unreinforced behaviors disappear. Soon the animal is doing one behavior, over and over again. If you are a behavioral psychologist and only want to count the rate at which an animal performs a single unit of behavior, you are in hog heaven. If you are a dog trainer and want your dog to learn a lightning-fast finish, you have just put your dog into a behavioral straitjacket. Successive approximation depends on consistent reinforcement, and consistent reinforcement causes consistent behavior. The rather tiny increments of change from repetition to repetition that allow the animal to learn smoothly also prevent it from learning by leaps and bounds. If you want the animal to learn dramatically enhanced versions of the behavior, you must either take a very long time or look to other means of shaping behavior.
OK, here we go. I said that variable reinforcement causes variable behavior. This is really just a restatement of the prime rule of operant conditioning: operant behavior is determined by its consequences. Change the consequences and you change the behavior. If you make the consequences highly variable, you will make the behavior highly variable, i.e., you will cause it to "wiggle." As the dog tries to adapt to this new set of rules, he is likely to start offering variations of the target behavior. The variations are often harder, quicker, more forceful versions of the original behavior. Mixed in with these desirable effects of variable reinforcement, you may also see slower, lower, weaker or simply different variations of the behavior.
The central aim of this process is to invoke your dog's ability to experiment with a new behavior. The benefits of learning to vary the reinforcement in this fashion are considerable. First, you teach the dog that after a behavior is shaped with successive approximation, it is time to uncork the creativity and experiment a little. Second, this format allows the dog to bypass tedious "stair step" styles of raising criteria. While some behaviors require methodical increases in complexity, other behaviors are almost impossible to get with such linear methods of shaping. Third, varying the reinforcement teaches the animal, "If at first you don't succeed, try, try again." It is difficult to overestimate the importance of teaching a dog to learn in a persistent fashion.
Now, for fair warning -- if you have a great deal of experience training dogs, this next part is going to feel really uncomfortable. I am going to ask you to take a behavior that you have already shaped and start to make it wiggle. To get you started, I have a list of reinforcements for a series of repetitions. The idea is to take a behavior that has been shaped using consistent reinforcement and suddenly put the reinforcement schedule on a roller coaster. Your goal for this exercise is to watch closely and see what effect this change has on your dog's behavior. Ready? Here goes.
Random Reinforcement Project
NOTE: Use the cue to get the behavior to occur. Give the reinforcement described only if the animal successfully performs the behavior. If the behavior disappears, drop your standards and go back to a one-to-one rate of reinforcement for 5 repetitions, then resume this list at the point where you left off. If you want to use this list over and over again, simply start new sessions at different places in the list, or start at the end and work in reverse order.
1. Click + 10 treats, praise, affection and babytalk
2. No click, no treat
3. Click, no treat
4. Click + one treat
5. No click, no treat
6. Praise and affection only
7. Click + one treat, praise, affection and babytalk
8. No click, no treat
9. No click, no treat
10. Click, no treat
11. No click, no treat
12. Praise and affection only
13. Click + 3 treats, praise, affection and babytalk
14. No click, no treat
15. Click, no treat
16. Click + one treat
17. No click, no treat
18. Praise and affection only
19. Praise, affection and babytalk
20. No click, no treat
21. No click, no treat
22. Click + one treat
23. No click, no treat
24. Praise and affection only
25. Click + 5 treats, praise, affection and babytalk
26. No click, no treat
27. Click, no treat
28. Click + one treat
29. No click, no treat
30. Praise and affection only
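If you like to plan sessions on paper or on a computer before you pick up the clicker, the list and the NOTE above can be sketched as a short Python script. This is only a rough illustration, not part of the exercise itself. The names Step, SCHEDULE and run_session are invented for the example. The sketch also makes three assumptions that the NOTE leaves open: the behavior is treated as having "disappeared" after 3 misses in a row (the lapse_after value is a guess, so adjust it to your own judgment), a missed repetition earns nothing and does not use up a line of the list, and all 5 recovery repetitions are counted whether or not the dog performs.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Step:
    """One line of the reinforcement list (hypothetical structure)."""
    click: bool
    treats: int
    praise: bool            # praise and affection
    babytalk: bool = False


NONE = Step(False, 0, False)    # No click, no treat
CLICK = Step(True, 0, False)    # Click, no treat
ONE = Step(True, 1, False)      # Click + one treat
PRAISE = Step(False, 0, True)   # Praise and affection only


def jackpot(treats: int) -> Step:
    """Click + treats, praise, affection and babytalk."""
    return Step(True, treats, True, True)


# The 30 items of the list, in order, six per row.
SCHEDULE: List[Step] = [
    jackpot(10), NONE, CLICK, ONE, NONE, PRAISE,
    jackpot(1), NONE, NONE, CLICK, NONE, PRAISE,
    jackpot(3), NONE, CLICK, ONE, NONE, PRAISE,
    Step(False, 0, True, True), NONE, NONE, ONE, NONE, PRAISE,
    jackpot(5), NONE, CLICK, ONE, NONE, PRAISE,
]

RECOVERY_REPS = 5  # "one-to-one rate of reinforcement for 5 repetitions"


def run_session(performed: Iterable[bool], start: int = 0,
                reverse: bool = False, lapse_after: int = 3) -> List[Step]:
    """Return what to deliver on each repetition.

    performed   -- True/False per cue: did the dog do the behavior?
    start       -- index of the list item to begin with (0 = item 1)
    reverse     -- work through the list from the end instead
    lapse_after -- ASSUMPTION: this many misses in a row counts as the
                   behavior having disappeared
    """
    order = SCHEDULE[::-1] if reverse else SCHEDULE
    pos = start % len(order)
    misses = 0
    recovery_left = 0
    delivered: List[Step] = []
    for ok in performed:
        if recovery_left > 0:
            # Recovery: click + one treat for every success.
            delivered.append(ONE if ok else NONE)
            recovery_left -= 1
            continue
        if not ok:
            # ASSUMPTION: a miss earns nothing and does not advance the list.
            delivered.append(NONE)
            misses += 1
            if misses >= lapse_after:
                recovery_left = RECOVERY_REPS
                misses = 0
            continue
        misses = 0
        delivered.append(order[pos])
        pos = (pos + 1) % len(order)  # wrap around to reuse the list
    return delivered
```

Calling run_session([True] * 30) simply hands back the list in order. Passing start=12 begins at item 13, and reverse=True works from item 30 backward, which matches the two ways of reusing the list described in the NOTE.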
For the majority of dogs, this type of system will trigger some pretty interesting variations on the target behavior. As you work through this list over a series of sessions, you will start to see some really good variations and some really bad ones. You can start adding a touch of consistency to the process by giving extra recognition for better versions of the behavior, while using "wrong" for marginal performance. If you do see a really exceptional version of the behavior, give the dog an extra batch of treats, enthusiastic praise and affection. Next month we will start adding structure to this process on our way to great performance.