
Acta de investigación psicológica

On-line ISSN 2007-4719 / Print ISSN 2007-4832

Acta de Investigación Psicológica, vol. 1, no. 1, Ciudad de México, April 2011

 

Reinforcer Efficacy, Response Persistence, and Delay of Reinforcement

 

Eficacia del reforzador, persistencia de la respuesta y demora de reforzamiento

 

David P. Jarmolowicz & Kennon A. Lattal1

 

West Virginia University.

 

Please address correspondence to:
Kennon A. Lattal,
Department of Psychology,
West Virginia University,
53 Campus Drive,
1124 Life Sciences Building,
P.O. Box 6040,
Morgantown, WV, 26505.

 

Abstract

In an evaluation of the effects of delayed reinforcement on response persistence, two pigeons were exposed to a series of conditions in which reinforcement that either immediately followed or was delayed from the response that produced it alternated across blocks of sessions. Responding was maintained by a progressive-ratio schedule in which the response requirements incremented for successive reinforcers. The effects of signaled and unsignaled delay values of 1, 5, 10, and 20 s were investigated. In general, responding was more persistent, as measured by the point at which responding ceased for 300 s, with shorter delays, regardless of whether the delays were correlated with a distinct stimulus (that is, signaled) or not. The results complement earlier findings showing that reinforcement delays affect reinforcer efficacy or response persistence by demonstrating similar effects with an index of response strength that is independent of response rate. They also extend the general effects of delay of reinforcement to a schedule in which they previously have not been demonstrated.

Keywords: Reinforcer efficacy, Response persistence, Progressive-ratio schedule, Signaled delay of reinforcement, Unsignaled delay of reinforcement, Key peck, Pigeons.

 

Resumen

En una evaluación de los efectos de la demora de reforzamiento sobre la persistencia de la respuesta, se expuso a dos palomas a una serie de condiciones en las que el reforzamiento, que ya sea siguió inmediatamente a la respuesta o estuvo demorado de la respuesta que lo produjo, alternó a través de bloques de sesiones. La respuesta se mantuvo mediante un programa de razón progresiva en el que los requisitos de respuesta aumentaron para reforzadores sucesivos. Se investigaron los efectos de demoras señaladas y no señaladas de 1, 5, 10 y 20 s.

En general, el responder fue más persistente, medido como el punto en el que cesó durante 300 s, con las demoras cortas, independientemente de si las demoras estuvieron correlacionadas con un estímulo distintivo (es decir, demora señalada) o no. Los resultados complementan hallazgos previos que mostraron que las demoras de reforzamiento afectan la eficacia del reforzador o la persistencia de la respuesta, al mostrar efectos similares utilizando un índice de la fuerza de la respuesta que es independiente de la tasa de respuesta. También extienden la generalidad del efecto de la demora de reforzamiento a programas en los que previamente no se había demostrado.

Palabras Clave: Eficacia del reforzador, Persistencia de la respuesta, Programa de razón progresiva, Demora de reforzamiento señalada, Demora de reforzamiento no señalada, Picoteo a una tecla, Palomas.

 

The efficacy of a reinforcer varies as a function of its parameters. Nevin (1974) showed that responding maintained by reinforcers that were larger, more frequent, or more immediate was more persistent in the face of environmental challenges than was responding maintained by reinforcers that were relatively smaller, less frequent, or longer delayed. Using multiple variable-interval (VI) VI schedules with different reinforcement parameters in either component, Nevin (1974) examined three different challenges to responding: extinction (i.e., discontinuation of reinforcement), imposing response-independent food presentations during blackouts between the components, and rendering the reinforcer itself less effective by pre-feeding the pigeons. Cohen, Riley, and Weigle (1993) replicated Nevin's findings when multiple schedules whose components differed in response requirement (fixed-ratio [FR] schedules) or reinforcement rate (fixed-interval [FI] and variable-interval schedules) maintained responding. Neither prefeeding, extinction, nor imposing response-independent food delivery during single schedules (VI, FI, or FR), however, had systematic effects on behavior as a function of the FR requirement or the FI or VI value.

Progressively increasing the response requirement for reinforcement also has been proposed as a test of reinforcer efficacy (Hodos, 1961; see Stafford, LeSage, & Glowa, 1998, and Stewart, 1975, for reviews). Progressive-ratio (PR) schedules entail systematic increases in the number of responses required to produce successive reinforcer deliveries (see Jarmolowicz & Lattal, 2010, for a review of different progressive contingencies). The response requirement at which responding ceases for a predetermined period, labeled the break point, is the index of reinforcer efficacy. For example, reinforcers of greater magnitude result in higher break points than do their lesser-magnitude counterparts (e.g., Baron, Mikorski, & Schlund, 1992; Hodos, 1965; Hodos & Kalman, 1963; Rickard, Body, Zhang, Bradshaw, & Szabadi, 2009). Hodos and Kalman, using sweetened condensed milk in volumes ranging from .025 to .25 ml in different conditions, and Baron et al., using different concentrations of a milk solution in different conditions, found that break points of rats' lever pressing were higher with reinforcers of greater magnitude. These results are consistent with Nevin's (1974) finding that larger-magnitude reinforcement results in greater behavioral persistence than does reinforcement of lesser magnitude. The effects on PR performance of other reinforcement parameters shown to affect persistence in Nevin's paradigm, however, remain uninvestigated.
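
To make the PR contingency and the break-point measure concrete, the following sketch (Python; not part of the original report) generates the sequence of response requirements that notation such as PR 7 is commonly taken to denote and reads the break point from a record of completed ratios. Treating the progression as arithmetic, and the example session record, are our assumptions; the step sizes and the 300-s criterion appear in the Method below.

```python
def pr_requirements(step_size, n_ratios):
    """Arithmetic progressive-ratio requirements: step_size, 2*step_size, ...

    Assumes the PR 7 / PR 15 notation in the Method denotes a fixed additive
    step; other PR arrangements increment geometrically (see Jarmolowicz &
    Lattal, 2010).
    """
    return [step_size * (i + 1) for i in range(n_ratios)]


def break_point(completed_ratios):
    """The break point: the last ratio requirement completed before the
    session-ending criterion (here, 300 s without a key peck) was met."""
    return completed_ratios[-1] if completed_ratios else 0


# Hypothetical session on PR 7 in which the pigeon completed eight ratios
# before pausing for 300 s.
requirements = pr_requirements(step_size=7, n_ratios=8)   # [7, 14, ..., 56]
print(break_point(requirements))                          # 56
```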

In the present experiment we therefore examined how another parameter of reinforcement, the delay between the response that produces the reinforcer and its subsequent delivery, affects response persistence under PR schedules. As noted above, Nevin (1974) showed that more immediate reinforcers maintained more persistent responding than did more delayed reinforcers. The question was whether delay of reinforcement degrades response persistence, as measured by PR-schedule performance, in the same way it has been shown to do when imposed in different components of a multiple schedule. Differential persistence across delay durations also would indicate changes in reinforcer efficacy with those durations. Because the presence or absence of a stimulus change during the delay is an important determinant of the effects of the delay, we compared the effects of signaled (stimulus change during the delay) and unsignaled (no such stimulus change) delays.

 

Method

Subjects

Two White Carneau pigeons (977 and 723) were maintained at 80% (± 2%) of their free-feeding weights by food obtained during experimental sessions and by post-session feeding. Water and health grit were available continuously in the home cage, where a 12-hr light/12-hr dark cycle was maintained. Each pigeon had a history of responding on a variety of reinforcement schedules.

Apparatus

Sound-attenuating operant conditioning chambers (31 cm wide, 30 cm long, and 38 cm high) containing a brushed-aluminum three-key work panel were used. Only the response key located on the midline of the work panel was used. It was 2.5 cm in diameter, could be transilluminated white, and its midpoint was located 14 cm above the top rim of the food-access aperture. The food-access aperture (6 cm wide by 6.5 cm high) was located on the midline of the panel, 8 cm from the floor, and allowed access to mixed grain when a hopper was raised. During hopper presentations, a 28-V DC clear bulb illuminated the aperture and the response key was dark. A ventilation fan, located in the back right corner of the rear wall, and white noise delivered through a speaker, located in the lower left corner of the work panel, masked extraneous noise. Programming and data recording were controlled by a computer in an adjacent room using MED-PC® software and hardware (MED Associates, Inc. & Tatham, 1991).

Procedure

Pretraining, consisting of several sessions of exposure to a variable-ratio 20 schedule, was conducted prior to initiation of the experiment proper. Sessions occurred 6-7 days a week at approximately the same time each day and ended when 300 s elapsed without a key peck. As noted in the introduction, the last ratio completed before this lapse defined the break point. The sequence of conditions and the number of sessions in each are shown in Table 1. Blocks of baseline and delay-of-reinforcement (delay) sessions alternated. During baseline sessions, responding on PR schedules (PR 7 for 977 and PR 15 for 723) resulted in the immediate delivery of 3-s access to grain. Each baseline condition lasted for at least 13 days and until break points were stable. Stability required that the mean break points of the first 3 and the last 3 sessions of a 6-session block each differ from the grand mean of all 6 sessions by no more than 5%, with no systematic directional trend in the break points. Delay sessions were similar to baseline sessions, except that either blackout-signaled (chamber dark) or unsignaled (no stimulus change; see Sizemore & Lattal, 1978) delays were arranged between the response that satisfied the PR requirement and the delivery of the reinforcer. The signaled and unsignaled delays also may be described as, respectively, chained PR fixed-time (FT) and tandem PR FT schedules. All delays were non-resetting (responses during the delay had no effect), and delay durations increased across conditions from 1 s to 5 s to 10 s to 20 s. The order of delay types (signaled and unsignaled) at each delay duration differed for the two pigeons (see Table 1). Each delay condition was preceded and followed by a return to baseline, and delay durations were increased to the next value only after both signaled and unsignaled delay conditions had been conducted at a given duration.
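
Two rules in the paragraph above are easy to misread: the arrangement of events after a ratio is completed and the stability criterion for break points. The Python sketch below gives one reading of each; the function and parameter names are ours, and the "no systematic directional trend" judgment is left to inspection rather than coded. It is an illustration of the described procedure, not the authors' control program.

```python
def events_after_ratio(delay_s, signaled, access_s=3.0):
    """What follows the response that satisfies the PR requirement (one reading).

    With no delay, the hopper is raised immediately for access_s seconds.
    With a delay, a non-resetting fixed-time period intervenes first: the
    chamber is darkened if the delay is signaled (chained PR FT); otherwise
    no stimulus change occurs (tandem PR FT). Responses during the delay
    are recorded but have no scheduled consequence.
    """
    events = []
    if delay_s > 0:
        stimulus = "blackout" if signaled else "no stimulus change"
        events.append((f"{delay_s}-s delay, {stimulus}", delay_s))
    events.append(("hopper raised, keylight dark", access_s))
    return events


def break_points_stable(break_points, tolerance=0.05):
    """Stability criterion as described above: over the most recent 6-session
    block, the mean break point of the first 3 and of the last 3 sessions
    must each lie within 5% of the mean of all 6 sessions."""
    if len(break_points) < 6:
        return False
    block = break_points[-6:]
    grand_mean = sum(block) / 6
    half_means = (sum(block[:3]) / 3, sum(block[3:]) / 3)
    return all(abs(m - grand_mean) <= tolerance * grand_mean for m in half_means)
```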

 

Results

Data shown in all figures except Figure 2 are means of the final six sessions of each delay condition; baseline values are means of the final six sessions of each of the baseline conditions shown in Figure 2. Figure 1 shows mean break points for each condition. The session-by-session data from which these means were calculated are shown in Figure 2. With the exception of the 1-s unsignaled delays, break points generally were highest in the immediate-reinforcement conditions and decreased with increases in delay value. This inverse relation between delay value and break point generally held for both signaled and unsignaled delays; thus, the presence or absence of the signal did not systematically affect the break point. The data in Table 1 show that rates of reinforcement decreased with increasing delay values, as is inevitable with increases in delay duration.

Figure 3 shows the mean run rate (the number of responses emitted before each delay started, divided by session time with post-reinforcement pause time and reinforcer-access time excluded). Because break points differed between conditions, only those ratio values that recurred in all sessions were included in the analysis shown in this and all subsequent figures. There was a negative relation for both pigeons between run rate and delay duration when delays were unsignaled, but not when they were signaled.
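
As a worked example of that calculation, the sketch below computes a run rate from hypothetical session totals. The parameter names and the example numbers are ours, and the reading of the denominator (session time with pause time and hopper time excluded) is our interpretation of the parenthetical definition above.

```python
def run_rate(responses_before_delays, session_s, total_prp_s, total_access_s):
    """Run rate, per the definition given for Figure 3 (our reading):
    responses emitted before each delay began, divided by session time with
    post-reinforcement pause time and reinforcer-access time excluded.
    All times are in seconds; the returned rate is responses per minute."""
    run_time_s = session_s - (total_prp_s + total_access_s)
    return 60.0 * responses_before_delays / run_time_s


# Hypothetical totals: 840 responses in a 1,500-s session with 300 s of
# pausing and 30 s of hopper access -> 840 / 1,170 s, about 43 responses/min.
print(round(run_rate(840, 1500, 300, 30), 1))
```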

The upper graph in Figure 4 shows that responding during signaled delay periods was rare, whereas responding during unsignaled delay periods was both higher than during the equivalent signaled delays and constant across increasing delay values. Response rates during the unsignaled delays were higher than the mean response rates shown in Figure 3, probably because they were calculated from responding at the end of each ratio run rather than across the entire ratio. The lower graphs of Figure 4 show that the obtained delays (the time between the last response in a given ratio requirement and reinforcer delivery) were constant for unsignaled delays and increased with programmed delay duration for signaled delays.

Figure 5 shows mean post-reinforcement pauses (PRPs) during each condition; that is, these data are means of the PRPs at each successive response requirement across the session. The PRPs for both pigeons generally increased with increasing delays. There were no systematic differences in PRPs between the signaled and unsignaled delays across the two pigeons.

 

Discussion

The point at which both pigeons ceased completing progressively increasing ratio requirements decreased as the delay between the reinforced response and the reinforcer delivery increased. These results extend the findings of Hodos and Kalman (1963) and Baron et al. (1992), which showed that PR-schedule performance is sensitive to changes in reinforcement magnitude, to another parameter of reinforcement: its delay. The results also complement the findings of Nevin (1974) that more immediate reinforcers are more efficacious in developing persistent responding than are delayed reinforcers. The present finding extends Nevin's conclusion by suggesting that, for persistence, the presence of a signal during the delay matters less than the delay itself.

Nevin's (1974) tests of response persistence involved what Harper and McLean (1992) identified as behavioral disruption from outside the components of the schedule. That is, extinction, pre-session feeding, and imposing response-independent food during blackouts all are independent of the schedule maintaining the response. By contrast, other disruptors occur within the schedule itself. Thus, imposing response-independent food during a schedule (an internal disruptor) can have different effects on behavior than imposing it between schedules, as in Nevin's delivery of such food during the blackout between multiple-schedule components (an external disruptor). Cohen et al. (1993), for example, compared the former and latter procedures, showing that only the latter had systematic effects on response persistence.

Both the response requirement and the response-reinforcer dependency seem integral elements of the reinforcement contingency (cf. Harper & McLean, 1992). Nonetheless, in the case of both reinforcement magnitude and reinforcement delay, the effects obtained with PR schedules complement the effects of external disruptors on operant behavior. Perhaps the difference between affecting persistence by imposing response-independent food delivery during the schedule itself (Cohen et al., 1993; Lattal & Bryan, 1976; Zeiler, 1979) and affecting it by changing the parameters of the reinforcer in a PR schedule has to do with the index of persistence. In the former, the index is the very response rate that may be disrupted by the response-independent food itself; in the latter, the index of persistence, the break point, is independent of response rate.

The relation between the effects of variations in reinforcement parameters on response rates and on break points warrants comment. Reinforcer magnitude has mixed effects on response rates maintained by reinforcement schedules (Bonem & Crossman, 1988), but its effects on PR break points are systematic and replicable (e.g., Baron et al., 1992; Hodos & Kalman, 1963). Responding maintained by VI schedules generally declines with increasingly long delays of reinforcement (e.g., Richards, 1981; Sizemore & Lattal, 1978; see Lattal, 2010, for a review), a finding consistent with the present relation between break points and delay duration. Run rates (response rates calculated with the PRP omitted) showed a somewhat orderly relation to delay duration, but, unlike previous comparisons of signaled and unsignaled delays of reinforcement imposed on VI schedules, the signaled delays did not yield consistently higher response rates. Nevin (1974) suggested that reinforcement rates determine response strength, but Lattal (1989) showed that when reinforcement rates are constant, lower response rates are more resistant to change than are higher rates. In the present experiment, both reinforcement and response rates were more or less similar in the signaled and unsignaled delay conditions, and persistence also was similar.

The present results may be compared along other dimensions to those of previous investigations of the effects of delays of reinforcement imposed on other schedules of reinforcement. Morgan (1972), for example, found no systematic relation between work time (equivalent to run rates on FR schedules) and (signaled) delay duration under FR schedules. Post-reinforcement pauses, however, generally increased with increasing delay values both in the present experiment and in those reported by Morgan (1972) and Meunier and Ryman (1974).

Two potential limitations of the present method warrant comment. One is that reinforcement rates decreased with increasing delay values. This variable has been controlled in some (e.g., Lattal, 1984; Sizemore & Lattal, 1978), but by no means all (e.g., Ferster, 1953; Richards, 1981), previous investigations of delay of reinforcement. Although the effects of delay duration have been shown to be largely independent of changes in reinforcement rate in other schedules of reinforcement (e.g., Sizemore & Lattal), the lack of a control condition in the present experiment mandates that the present results be interpreted as a joint outcome of delay and potential changes in reinforcement rate. One way of isolating the effects of the progressive reinforcement-rate decreases in PR schedules would be to use a yoked-control procedure, as was done by Lattal, Reilly, and Kohn (1998). Another would be to use a chained FT PR schedule, which would equate reinforcement rate without imposing response-reinforcer delays. A second potential limitation is that the delays were increased systematically across conditions. Ferster (1953) showed that such a procedure may attenuate the effect of the delay; however, Ferster's procedure was not a truly progressively increasing delay procedure. Rather, he titrated the delay, increasing or decreasing its duration as a function of the pigeon's responding, in an attempt to demonstrate that behavior could be sustained in the presence of relatively long delays. By contrast, in the present experiment the delays were imposed at full value, each separated from the previous delay by an immediate-reinforcement baseline condition. Thus, each delay was compared to an immediately preceding immediate-reinforcement condition, making it less likely that the order of delays had any strong, systematic effect on the results.

Despite these potential limitations, the present results are generally consistent with other research on the effects of delay of reinforcement and of other reinforcement parameters on PR performance. Thus, the present results extend the general effects of delay of reinforcement to PR schedules, on which such effects have not previously been demonstrated. Delay of reinforcement challenges response persistence in PR schedules much as it does in multiple-schedule resistance-to-change tests; that is, delay of reinforcement reduces the efficacy of the reinforcer. Such a finding extends the utility and generality of the PR schedule as a method for indexing response strength or behavioral persistence. More importantly, because the break point is a measure of persistence that is independent of response rate, the present findings, in combination with the extant literature on delay of reinforcement, further suggest that delays of reinforcement not only reduce response rates but also weaken the persistence and strength of operant behavior more generally.

 

References

Baron, A., Mikorski, J., & Schlund, M. (1992). Reinforcement magnitude and pausing on progressive-ratio schedules. Journal of the Experimental Analysis of Behavior, 58, 377-388.

Bonem, M., & Crossman, E. K. (1988). Elucidating the effects of reinforcer magnitude. Psychological Bulletin, 104, 348-362.

Cohen, S. I., Riley, D. S., & Weigle, P. A. (1993). Tests of behavior momentum in simple and multiple schedules with rats and pigeons. Journal of the Experimental Analysis of Behavior, 60, 255-291.

Ferster, C. B. (1953). Sustained behavior under delayed reinforcement. Journal of Experimental Psychology, 45, 218-224.

Harper, D. N., & McLean, A. P. (1992). Resistance to change and the law of effect. Journal of the Experimental Analysis of Behavior, 57, 317-337.

Hodos, W. (1961). Progressive ratio as a measure of reward strength. Science, 134, 943-944.

Hodos, W. (1965). Motivational properties of long durations of rewarding brain stimulation. Journal of Comparative and Physiological Psychology, 59, 219-224.

Hodos, W., & Kalman, G. (1963). Effects of increment size and reinforcer volume on progressive ratio performance. Journal of the Experimental Analysis of Behavior, 6, 387-392.

Jarmolowicz, D. P., & Lattal, K. A. (2010). On distinguishing progressively increasing response requirements for reinforcement. The Behavior Analyst, 33, 119-125.

Lattal, K. A. (1984). Signal functions in delayed reinforcement. Journal of the Experimental Analysis of Behavior, 42, 239-253.

Lattal, K. A. (1989). Contingencies on response rate and resistance to change. Learning and Motivation, 20, 191-203.

Lattal, K. A. (2010). Delayed reinforcement of operant behavior. Journal of the Experimental Analysis of Behavior, 93, 129-139.

Lattal, K. A., & Bryan, A. J. (1976). Effects of concurrent response-independent reinforcement on fixed-interval schedule performance. Journal of the Experimental Analysis of Behavior, 26, 495-505.

Lattal, K. A., Reilly, M. P., & Kohn, J. P. (1998). Response persistence under ratio and interval reinforcement schedules. Journal of the Experimental Analysis of Behavior, 70, 165-183.

MED Associates, Inc., & Tatham, T. A. (1991). MED-PC Medstate notation. East Fairfield, VT: MED Associates, Inc.

Meunier, G. F., & Ryman, F. (1974). Delay of reinforcement in fixed-ratio behavior. Psychological Reports, 34, 350.

Morgan, M. J. (1972). Fixed-ratio performance under conditions of delayed reinforcement. Journal of the Experimental Analysis of Behavior, 17, 95-98.

Nevin, J. A. (1974). Response strength in multiple schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 21, 389-408.

Richards, R. W. (1981). A comparison of signaled and unsignaled delay of reinforcement. Journal of the Experimental Analysis of Behavior, 35, 145-152.

Rickard, J. F., Body, S., Zhang, Z., Bradshaw, C. M., & Szabadi, E. (2009). Effect of reinforcer magnitude on performance maintained by progressive-ratio schedules. Journal of the Experimental Analysis of Behavior, 91, 75-87.

Sizemore, O. J., & Lattal, K. A. (1978). Unsignaled delay of reinforcement in variable-interval schedules. Journal of the Experimental Analysis of Behavior, 30, 169-175.

Stafford, D., LeSage, M. G., & Glowa, J. R. (1998). Progressive-ratio schedules of drug delivery in the analysis of drug self-administration: A review. Psychopharmacology, 139, 169-184.

Stewart, W. J. (1975). Progressive reinforcement schedules: A review and evaluation. Australian Journal of Psychology, 27, 9-22.

Zeiler, M. D. (1979). Reinforcing the absence of fixed-ratio performance. Journal of the Experimental Analysis of Behavior, 31, 321-332.

 

Note

1 A version of this manuscript was presented at the 34th annual meeting of the Association for Behavior Analysis International in Chicago, IL. David Jarmolowicz is now at the University of Arkansas for Medical Sciences.
