The effects of behavioral history on response acquisition with delayed reinforcement: A parametric analysis

Pulido, Marco A.; Paz, Mariana; Sosa, Rodrigo

Revista mexicana de análisis de la conducta

Print version ISSN 0185-4534

Rev. mex. anál. conducta vol.34 n.1 México Jun. 2008

Efectos de historia en la adquisición de la respuesta con reforzamiento demorado: Un análisis paramétrico

Marco A. Pulido, Mariana Paz and Rodrigo Sosa

Laboratorio de Condicionamiento Operante Universidad Intercontinental, México

Address correspondence to the first author,
Av. Universidad 1330, A, 1102, col. Del Carmen Coyoacán, C.P. 04100 México, DF,
or by email: mpulido@uic.edu.mx

Abstract

The present study systematically assessed the effects of two independent variables on response acquisition: delay duration and the number of sessions of non–contingent food delivery. Sixty naive, male Wistar rats were exposed to an FT 60–s schedule for a different number of sessions (0, 1, 5, 15 or 30). Once exposure to non–contingent food delivery was over, subjects were exposed to one of four tandem FR 1, FT x–s schedules for 10 sessions, where FT duration was programmed at 10, 20, 40 or 60 s. Results showed that evidence of response acquisition was more apparent in those groups exposed to 1, 5 or 15 sessions of non–contingent food delivery, and less evident in those groups exposed to 0 or 30 sessions of the FT 60–s schedule. In general, obtained reinforcement rate decreased as delay duration increased. Results are discussed in terms of how history effects may make it difficult to compare experimental findings; the discussion also considers variables that might explain why reinforcement history affects response acquisition with delayed reinforcement.

Key words: History effects, delay duration, response acquisition, lever pressing, rats.

Resumen

El objetivo del presente estudio fue el de evaluar sistemáticamente el efecto de dos variables independientes sobre la adquisición de la respuesta: 1) Duración de la demora y 2) el número de sesiones de entrega de comida no contingente que recibe el sujeto antes de iniciar la fase de adquisición. Sesenta ratas ingenuas, macho de cepa Wistar Lewis fueron expuestas a un programa TF 60–s durante un número diferente de sesiones (0, 1, 5, 15 o 30). Una vez terminada la exposición a la comida no contingente, los sujetos fueron expuestos a un programa tándem RF1, TF x–s, durante 10 sesiones, en el cual la duración del TF podía programarse en 10, 20, 40 o 60–s. Los resultados mostraron que la adquisición de la respuesta fue más evidente en aquellos sujetos que fueron expuestos a 1, 5 o 15 sesiones de comida no contingente; la adquisición de la respuesta fue comparativamente menos evidente en aquellas condiciones en las cuales se expuso a los sujetos a 0 o 30 sesiones del programa de TF 60–s. En general la tasa de reforzamiento obtenida disminuyó al aumentar la duración de la demora. Los resultados se discutieron en términos de cómo los efectos de historia dificultan la comparación entre experimentos. La discusión también se centra en variables que podrían explicar los efectos de historia sobre la adquisición de la respuesta bajo condiciones de reforzamiento demorado.

Palabras clave: Efectos de historia, duración de la demora, adquisición de la respuesta, palanqueo, ratas.

Recent studies (Lattal & Gleeson, 1990; Williams, Preston & De Kervor, 1990, experiment 3) have shown that response acquisition¹ may occur under conditions of delayed reinforcement and without explicit shaping. The finding has been replicated using different procedures to program the delay interval (Wilkenfield, Nickel, Blakely & Poling, 1992; Dickinson, Watt & Griffiths, 1992) and different manipulanda (Critchfield & Lattal, 1993; Schlinger & Blakely, 1994).

Evidence of response acquisition with delayed reinforcement has been documented in rats and pigeons (Lattal & Gleeson, 1990), Siamese fighting fish (Lattal & Metzger, 1994) and rhesus monkeys (Galuzca & Woods, 2005).

The studies cited above show that a considerable number of experiments have been conducted to assess the generality of the finding; other experiments, however, have tried to identify variables that may facilitate (or hinder) response acquisition. For instance, both Critchfield & Lattal (1993) and Schlinger & Blakely (1994) exposed naive rats to a situation where the interruption of an invisible photo beam produced reinforcement after a delay interval. Some groups received brief auditory feedback when the beam was broken; others did not. Response acquisition was evident in those groups where the onset of the delay period was signaled, and comparatively less evident in the unsignaled groups. The facilitative role of signals on response acquisition with delayed reinforcement was further explored by Pulido, López & Lanzagorta (2005), who used a 32–s temporally defined schedule of fixed duration (Schoenfeld & Cole, 1972). During the first few seconds of the cycle a response produced brief, immediate, auditory feedback; food reinforcement was presented at the end of the reinforcement interval. Groups of naive rats were exposed to this schedule and the presence (or absence) of the signal was varied. Results showed that response rates rose faster and reached higher levels when the delay interval was signaled, under both response dependent and response independent cue conditions.

Other scientists have identified further variables that facilitate response acquisition. Lattal & Williams (1997), for instance, found an inverse relationship between body weight and response acquisition using naive rats and a VI 15–s DRO 30–s schedule. In another study, Le Sage, Byrne & Poling (1996) found that acquisition of lever pressing by naive rats under fixed and variable delays could be enhanced by certain doses of d–amphetamine. Snycerski, Laraway, Bradley, Huitema & Poling (2004) exposed naive rats to either CRF or a tandem FR 1, DRO 15–s schedule; however, subjects began the acquisition phase only after exposure to a different number of sessions (0, 1 or 5) of response independent water delivery under a VT 60–s schedule. Results showed that evidence of response acquisition was more robust in those subjects that had received previous exposure to the VT 60–s schedule.

The Snycerski et al. (2004) study is interesting because any review of the experiments conducted so far on response acquisition clearly shows two things. In the first place, there is an enormous amount of variability within and between studies; in the second place, the preparatory operations that precede the acquisition phase also vary (Pulido, Sosa & Valadez, 2006). To illustrate the latter point, consider the preparatory operations used by a single author who has published an important number of studies on response acquisition with delayed reinforcement. In the Lattal & Gleeson (1990) study, magazine training was accomplished by means of a single session of 25 hopper presentations under a VT 30–s schedule; in the Lattal & Metzger (1994) study, no attempt was made to expose the fish to reinforcement delivery prior to the acquisition phase; and in the Lattal & Williams (1997) study, rats were exposed for 5 sessions to an FT 15–s schedule that was contingent on the subject's distance to the food tray.

In view of the great variability in the magazine training procedures used by scientists who explore the response acquisition phenomenon, and considering the evidence produced by Snycerski et al. (2004) regarding preparatory operations, the present authors considered it important to extend the analysis of history effects. Specifically, this study explored the effects of a greater number of magazine training sessions (0, 1, 5, 15 or 30 sessions) and a larger number of delay durations (10, 20, 40 or 60 s).

METHOD

Subjects:

Sixty naive male Wistar Lewis rats were used as subjects. All subjects were approximately five months old at the beginning of the study. Each subject's weight was recorded on three consecutive days under free–feeding conditions to determine ad libitum body weight; food was then restricted until all subjects reached 80% of their free–feeding weight. Subjects were kept at their prescribed body weights throughout the experiment by means of supplementary feeding following each experimental session. Subjects were kept in the laboratory vivarium under constant temperature conditions and a twelve–hour light–dark cycle (lights on at 7:00 a.m.). All experimental subjects were kept in individual cages with free access to water.

Apparatus:

Sessions were conducted in a Med Associates (ENV 008) rodent operant conditioning chamber made of stainless steel and transparent Plexiglas. The space in which the subjects were studied measured 21 cm in height by 30.5 cm in length by 24.1 cm in depth. A 3 cm stainless steel rolled lever was placed on the left panel of the front wall of the chamber. The lever was placed 3.0 cm above the floor and 3.0 cm from a trough type pellet receptacle located on the center panel of the front wall. The pellet receptacle was placed 1 cm above the floor and consisted of a 5 cm square opening 2 cm in depth. A force of at least 30.5 grams was required to register a response. A depression of the lever produced an audible click and was counted as a response. A .45 mg pellet dispenser delivered two .20 mg pellets on each operation. Pellets were produced by remolding pulverized Purina Nutri Cubes. Two 1.1 W, 28 Vdc lights were used to illuminate the experimental chamber. One light was placed on the center panel of the back wall of the chamber and was used as a house light. The second light (a pilot light with a white translucent glass cover) was placed 5 cm above the lever and was not used during the study. The chamber was placed inside a larger, sound–attenuating wooden box equipped with a ventilating fan. Experimental events were programmed and recorded using an IBM compatible 386 microcomputer equipped with an industrial automation card (Advantech, PC–Labcard 725) coupled to a relay rack.

Procedure:

The present study may be conceptualized as a between groups factorial design with two independent variables, the number of sessions subjects were exposed to magazine training (0, 1, 5, 15 or 30 sessions) and the delay duration (10, 20, 40 or 60–s). Thus the experiment consisted of 20 different combinations of magazine training sessions and delay durations; three subjects were assigned to each experimental condition. During magazine training, with the lever absent from the chamber, subjects were exposed to an FT 60–s schedule; subjects received 60 response independent food deliveries in each session. The magazine was inspected after each session, to make sure that the subjects had consumed the food; inspections showed all animals consumed their allotted food during the sessions. Once the programmed number of magazine training sessions had been terminated for each experimental group, the lever was placed inside the chamber and a tandem schedule FR1, FT(x–s) was in effect for ten consecutive sessions. In all experimental conditions the tandem schedule was in effect 1 hour or the time necessary to produce 30 reinforcers, whichever occurred first. Experimental sessions were conducted six days a week at approximately the same time every day.
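
The factorial layout and the session–termination rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used in the study: the names `build_design`, `session_over` and the `slot` field are hypothetical, and subject labels are not reproduced because the text does not list the assignment of individual subjects to conditions.

```python
from itertools import product

# Levels of the two independent variables, as stated in the Procedure.
MAGAZINE_SESSIONS = (0, 1, 5, 15, 30)  # sessions of FT 60-s food delivery
DELAYS_S = (10, 20, 40, 60)            # FT component of the tandem FR 1, FT x-s
SUBJECTS_PER_CELL = 3


def build_design():
    """Return one record per subject slot in the 5 x 4 between-groups design."""
    design = []
    for sessions, delay in product(MAGAZINE_SESSIONS, DELAYS_S):
        for slot in range(1, SUBJECTS_PER_CELL + 1):
            # "slot" is a hypothetical placeholder, not a subject label.
            design.append(
                {"magazine_sessions": sessions, "delay_s": delay, "slot": slot}
            )
    return design


def session_over(elapsed_s, reinforcers, max_s=3600, max_reinforcers=30):
    """Acquisition-phase stopping rule: 1 hour or 30 reinforcers, whichever first."""
    return elapsed_s >= max_s or reinforcers >= max_reinforcers


design = build_design()
print(len(design))  # number of subject slots in the design
```

Enumerating the cells this way makes it easy to confirm that 20 conditions with three subjects each account for the sixty rats reported in the Subjects section.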

RESULTS

Figure 1 shows obtained reinforcement rate for all subjects in all experimental conditions. Each row shows a different delay value; programmed FT duration increases from top to bottom. Columns show the number of sessions each subject was exposed to response independent food delivery; the number of sessions of exposure to the FT 60–s schedule increases from left to right. Reinforcement rate was chosen as the dependent variable (rather than the more typical response rate) because subject Q8 produced a very high response rate during the 4th session of its experimental condition; plotted on a common response–rate scale, data from the other subjects and conditions would be compressed toward the floor and could not be properly appreciated.

In general, figure 1 shows that reinforcement rates reached their highest levels under the 10–s delay condition, were intermediate under the 20–s and 40–s delay conditions, and were low under the 60–s delay condition. Reinforcement rates decreased gradually across delays in those conditions where subjects were exposed to response independent food delivery for 1, 5 or 15 sessions; in contrast, reinforcement rates decreased abruptly in those groups that did not receive response independent food or received it for 30 sessions (reinforcement rate is relatively high in the 10–s delay condition for some subjects and then decreases abruptly when the delay is 20 s). With the exception of the 60–s delay groups, where most subjects did not earn reinforcers, the experimental conditions show that reinforcement rates were higher when non–contingent reinforcement was delivered for 1, 5 or 15 sessions.

In order to further assess the effects of the independent variables, response acquisition was arbitrarily defined as earning at least 15 reinforcers in one of the ten experimental sessions. The authors recognize this criterion as arbitrary and flawed (because subjects may reach the acquisition criterion and subsequently produce fewer reinforcers per session); however, as no consensus has been reached regarding this topic, it will be used as an objective behavioral measure for other scientists to validate (or reject). Table 1 shows these data using the following arrangement. Columns show different delay durations, with delay duration increasing from left to right; rows show exposure to non–contingent food delivery, with the number of sessions increasing from top to bottom. The table always presents the subject's label followed by "N" (did not earn at least 15 reinforcers) or "Y" (earned at least 15 reinforcers). After the subject's label, the first column of data shows whether the acquisition criterion was met during the first five experimental sessions; the second column shows whether the criterion was met during the last five experimental sessions.
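
As a rough illustration of how the criterion and the Table 1 coding could be applied to per–session reinforcer counts, consider the following sketch. It assumes that a block of five sessions is coded "Y" when at least one session in that block yielded 15 or more reinforcers; the function names, the label "S1" and the counts are hypothetical and are not data from this study.

```python
def met_criterion(reinforcers_per_session, threshold=15):
    """True if at least `threshold` reinforcers were earned in any one session."""
    return any(count >= threshold for count in reinforcers_per_session)


def table_entry(label, reinforcers_per_session):
    """Code the first five and last five sessions as Y/N, Table 1 style.

    Assumes exactly ten acquisition sessions, in chronological order.
    """
    if len(reinforcers_per_session) != 10:
        raise ValueError("expected counts for ten sessions")
    first_block = reinforcers_per_session[:5]
    last_block = reinforcers_per_session[5:]
    first_flag = "Y" if met_criterion(first_block) else "N"
    last_flag = "Y" if met_criterion(last_block) else "N"
    return f"{label} {first_flag} {last_flag}"


# Hypothetical counts: the criterion is met once in session 3 and never again.
example_counts = [0, 2, 16, 10, 4, 3, 1, 0, 2, 5]
print(table_entry("S1", example_counts))  # S1 Y N
```

The hypothetical subject illustrates the flaw acknowledged above: the overall criterion is met even though responding is not maintained in the last five sessions.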

Table 1 shows that the number of subjects that met the response acquisition criterion decreased as delay duration increased. Thirteen out of 15 subjects met the criterion with a 10–s delay; 10 subjects met the criterion with a 20–s delay; 5 subjects met the criterion with a 40–s delay and only 3 subjects with a 60–s delay. The number of subjects that met the response acquisition criterion also varied with the number of sessions the subject was exposed to non–contingent food delivery. The criterion was met only occasionally in the no exposure condition (4 out of 12 subjects) and in the 30 sessions condition (3 subjects); in contrast, the criterion was met frequently in the 1 session condition (10 subjects); in the 5 session condition (11 subjects) and in the 15 session condition (9 subjects).

In order to further assess the effects of the independent variables on obtained reinforcement rate, a two–way, between groups analysis of variance was conducted. Delay duration and the number of non–contingent food delivery sessions were used as independent variables; reinforcement rate on the 10 experimental sessions was used as the dependent variable. Both delay duration and the number of sessions of non–contingent food delivery produced significant main effects (F(3, 599) = 157.68, p < .001, and F(4, 598) = 41.5, p < .001, respectively). The interaction between the two independent variables was also significant (F(12, 579) = 10.303, p < .001). Delay duration and the number of non–contingent food delivery sessions also produced significant main effects on response rate per minute (F(3, 599) = 67.006, p < .001, and F(4, 599) = 30.719, p < .001, respectively). The interaction between the two independent variables was also significant (F(12, 579) = 7.210, p < .001). In order to assess differences between experimental groups for each independent variable, a Newman–Keuls post–hoc test was used to form homogeneous subsets based on the harmonic mean of the response rate per minute. The test showed that the group that did not receive non–contingent food and the group that received 30 sessions of non–contingent food belonged to the same, lowest mean subset (X=.182; X=.447); those groups exposed to 1 and 5 sessions belonged to the same, intermediate mean subset (X=1.118; X=1.164); lastly, the 15 sessions group belonged to an independent, highest mean subset (X=1.848). The same post–hoc test was used to form homogeneous subsets based on the harmonic mean of the response rate per minute for delay duration. The first and lowest subset was composed of the 60–s delay and the 40–s delay groups (X=.307; X=.497). The intermediate subset grouped, once again, the 40–s delay condition together with the 20–s delay condition (X=.497; X=.790). The last and highest mean subset was composed of the 10–s delay condition alone (X=2.217).

Delay of reinforcement effects were also assessed using linear regression analysis; response rate per minute, for the ten sessions, of all experimental subjects was used as the dependent variable and delay duration as the independent variable. The equation produced a negative and significant slope (p < .001) and is described as follows: y = (–.032)x + 2.004.
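
To make the regression equation concrete, the short sketch below evaluates it at the four programmed delays. Only the slope and intercept come from the analysis reported above; the function name `predicted_rate` is an illustrative assumption, and the output is simply the fitted line, not observed data.

```python
SLOPE = -0.032     # responses per minute per second of delay (reported slope)
INTERCEPT = 2.004  # predicted responses per minute at a 0-s delay (reported)


def predicted_rate(delay_s):
    """Response rate per minute predicted by y = (-.032)x + 2.004."""
    return SLOPE * delay_s + INTERCEPT


for delay in (10, 20, 40, 60):
    print(delay, round(predicted_rate(delay), 3))

# Delay at which the fitted line reaches zero responses per minute.
zero_crossing_s = INTERCEPT / -SLOPE
print(round(zero_crossing_s, 3))
```

Under this equation the predicted rate falls from about 1.68 responses per minute at 10 s to about 0.08 at 60 s, and the line crosses zero slightly beyond 62 s, just outside the range of delays studied.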

Figure 2 was designed to describe response rate per minute during 5–s bins of the delay interval. Only the last five experimental sessions are shown. The general organization of the figure is the same as that of figure 1, with two exceptions: in the first place, the vertical axis shows average response rate per minute; in the second place, the horizontal axis shows 5–s bins of the FT interval. In order to avoid floor effects, each delay value was graphed using a different maximum value on the "Y" axis; in addition, because FT duration varied across delays, the number of bins for each delay value differs.
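
A minimal sketch of how responses might be sorted into 5–s bins and converted to rates is shown below. The exact computation used for figure 2 is not described in the text, so several details are assumptions: response times are measured from the onset of each delay interval, a response at the moment of onset falls in the first bin, and each bin's rate is its count divided by the total time that bin was available across the delay intervals observed. The function name and the example values are hypothetical.

```python
def bin_rates(response_times_s, delay_s, n_intervals, bin_s=5):
    """Responses per minute in successive bins of the delay interval.

    response_times_s: seconds since delay onset for every response, pooled
                      over all observed delay intervals (assumed format).
    delay_s:          programmed FT duration (10, 20, 40 or 60 in this study).
    n_intervals:      number of delay intervals over which responses were pooled.
    """
    n_bins = delay_s // bin_s
    counts = [0] * n_bins
    for t in response_times_s:
        if 0 <= t < delay_s:
            counts[int(t // bin_s)] += 1
    # Each bin was available for bin_s seconds in every delay interval.
    minutes_per_bin = n_intervals * bin_s / 60
    return [count / minutes_per_bin for count in counts]


# Hypothetical example: 12 intervals of a 10-s delay, three responses in total.
print(bin_rates([0.5, 1.2, 7.0], delay_s=10, n_intervals=12))  # [2.0, 1.0]
```

With 5–s bins, the 10, 20, 40 and 60–s delays yield 2, 4, 8 and 12 bins respectively, which is why the number of bins differs across rows of the figure.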

Figure 2 shows that a common finding across all experimental groups and conditions is that response rate is high during the first bin and then drops sharply during the rest of the delay interval. This effect is present in subjects that respond at very high rates (for instance T17) and also in subjects that only occasionally emit a response (for instance O4). These results suggest in general that obtained delays should closely match programmed delays. Figure 2 also shows that the maximum response rate in each delay condition drops gradually as delay duration increases: nearly 7 responses per minute occur in the 10–s delay condition; response rates nearly reach 3 responses per minute in the 20–s delay condition and then drop to just under 1 response per minute in the 40–s delay condition and just under 0.5 in the 60–s delay condition.

DISCUSSION

The results of the present study suggest that exposure to non–contingent food delivery has important effects on response acquisition with delayed reinforcement. Apparently 1, 5 or 15 sessions of non–contingent food delivery may considerably enhance response acquisition. The response enhancing effects of FT exposure apparently depend on delay duration (being more evident under short response–reinforcer intervals, and less conspicuous as FT duration increases). The present findings extend and partially agree with those produced by Snycerski et al. (2004); they extend previous findings by showing that the facilitative effects of non–contingent food delivery may be produced by both VT and FT schedules. In a similar vein, they extend previous findings by showing that the facilitative effects of non–contingent food delivery may occur under longer delays of reinforcement than those assessed by Snycerski and her colleagues. The present results, however, suggest that the function that relates exposure to non–contingent food delivery to response rate is not linear (as may be deduced from the Snycerski et al. findings); instead, the present results suggest that extensive exposure to periodically delivered free food decreases response rate.

It has been suggested by some scientists that response acquisition is a problematic dependent variable because it produces highly variable results (Pulido, Lanzagorta, Morán, Reyes & Rubí, 2004; Pulido, Sosa & Valadez, 2006; Pulido & Martínez, 2008). Both the Snycerski et al. study and the present experiment suggest that some of this variability could be accounted for by the different preparatory operations used in the studies conducted so far. As was mentioned previously, magazine training procedures not only vary between scientists; they also vary between different studies conducted by the same scientists. Thus, if comparable results are to be produced in the study of response acquisition with delayed reinforcement, some sort of agreement between different study groups is needed. If no such agreement is possible or forthcoming, then at least comparisons between studies should take into consideration the similarity of the preparatory operations used in the experiments.

Methodological problems aside, how can the effects produced by the different number of sessions of non–contingent food delivery be accounted for? Catania (1979) suggested that acquisition of new behavior in an operant conditioning environment is considerably enhanced once the subject has been shaped to lever–press because the animal is familiar with the elements of the "learning set" (including the noise of the pellet dispenser, the place where food may be recovered, etc.). Thus it is possible that those subjects that had no previous experience with food delivery within the chamber had to learn a greater number of elements of the learning set in order to produce the final response (and thus lever–pressing for food inside the chamber would take longer to appear). But why is response acquisition so inconspicuous in those subjects that received 30 sessions of non–contingent food delivery (they have certainly been exposed to various elements of the learning set for a long time)? Informal observation of the subjects inside the experimental chamber suggests that the behavior of sitting beside the pellet receptacle gradually became more frequent, and food was consumed almost as soon as it appeared. Perhaps sitting next to the pellet receptacle was gradually reinforced by immediate food delivery, and thus strong "behavioral momentum" made it difficult for new behavior to appear and develop (Nevin, Mandell & Atak, 1983). Together, the "learning set" theory and "behavioral momentum" theory could probably explain why exposure to 1, 5 or 15 sessions of non–contingent food delivery had response enhancing effects. It is possible that acquiring more elements of the learning set enhances response acquisition, as long as positive reinforcement does not develop behavior stereotypy strong enough to inhibit response variability. Apparently the benefits of acquiring more elements of the learning set outweighed the inhibition of variability even after exposure to 15 sessions of an FT schedule.

Exactly where the breaking point occurs, that is, where the inhibition of response variability begins to outweigh the benefits of acquiring elements of the learning set, cannot be determined from the data produced by this study. In those subjects exposed to a 40–s delay, the benefits of acquiring elements of the learning set appear to decrease in an orderly manner as the number of sessions of FT exposure increased; however, for those subjects exposed to a 60–s delay, learning set benefits appear to peak in the 15 session condition. The authors acknowledge that learning set and behavioral momentum are not the only hypotheses that have been advanced to account for the present data. Some scientists who have had access to the data produced by this study have suggested that the low response rates produced in the 30 session condition could probably be accounted for in terms of "learned laziness" (Trapold, Carlson & Myers, 1965; Engberg, Hansen, Welker & Thomas, 1972). Perhaps future research may help clarify this issue.

A more fine grained analysis of the learning set theory suggests that if an association between the noise produced by the feeder and pellet delivery occurs, then the interval between food availability and its recovery would be brief (because the animal is likely to approach the pellet receptacle as soon as feeder noises occur). Exposure to non–contingent food delivery should produce this association and thus prevent a pellet recovery delay from being added to the programmed delay (if the pellet recovery delay is added to the programmed delay, the response–consequence association would be considerably hindered). Perhaps a study that measures pellet recovery delay under different FT histories could help determine the effects of pellet recovery delay on response acquisition with delayed reinforcement.

ACKNOWLEDGMENTS

The authors would like to thank the "Facultad de Psicología" and the "Instituto de Posgrado e Investigación" of the Universidad Intercontinental for their support in the preparation and conduct of the present study. The authors would also like to thank Israel Ogando and Enrique Bustamante for their help in the preparation of the figures. The authors are also indebted to Dr. Kennon A. Lattal for providing some hard to find experimental literature.

REFERENCES

Catania, Ch. (1979). Learning. Englewood Cliffs, N.J.: Prentice Hall.

Critchfield, T.S. & Lattal, K.A. (1993). Acquisition of a spatially defined operant with delayed reinforcement. Journal of the Experimental Analysis of Behavior, 59, 373–387.

Dickinson, A., Watt, A. & Griffiths, W.J.H. (1992). Free–operant acquisition with delayed reinforcement. The Quarterly Journal of Experimental Psychology, 45B, 241–258.

Engberg, L.A., Hansen, G.A., Welker, R.I. & Thomas, D.R. (1972). Acquisition of key–pecking via autoshaping as a function of prior experience: "Learned laziness?" Science, 178, 1002–1004.

Galuzca, C.M. & Woods, J.H. (2005). Acquisition of cocaine self–administration with unsignaled delayed reinforcement in rhesus monkeys. Journal of the Experimental Analysis of Behavior, 84, 269–280.

Lattal, K.A. & Gleeson, S. (1990). Response acquisition with delayed reinforcement. Journal of Experimental Psychology: Animal Behavior Processes, 16, 27–39.

Lattal, K.A. & Metzger, B. (1994). Response acquisition by Siamese fighting fish. Journal of the Experimental Analysis of Behavior, 61, 35–44.

Lattal, K.A. & Williams, A.M. (1997). Body weight and response acquisition with delayed reinforcement. Journal of the Experimental Analysis of Behavior, 67, 131–144.

Le Sage, M.G., Byrne, T. & Poling, A. (1996). Effects of d–amphetamine on response acquisition with immediate and delayed reinforcement. Journal of the Experimental Analysis of Behavior, 66, 349–367.

Nevin, J.A., Mandell, C. & Atak, J.R. (1983). The analysis of behavioral momentum. Journal of the Experimental Analysis of Behavior, 39, 49–59.

Pulido, M., Lanzagorta, N., Morán, E., Reyes, A. & Rubí, M. (2004). El efecto de las señales en programas de reforzamiento demorado: Una revisión contemporánea. Revista del Consejo Nacional para la Enseñanza e Investigación en Psicología, 9, 321–339.

Pulido, M., López, L. & Lanzagorta, N. (2005). Effects of contingent and non–contingent signals during delay interval on response acquisition by rats. Submitted for review to the Revista Mexicana de Análisis de la Conducta.

Pulido, M., Sosa, R. & Valadez, L. (2006). Adquisición de la operante libre bajo condiciones de reforzamiento demorado: Una revisión. Acta Comportamentalia, 14, 5–21.

Pulido, M. & Martínez, G. (2008). Effects of response–signal temporal separation on behavior maintained under temporally defined schedules of delayed signaled reinforcement. Submitted for review to The Psychological Record.

Schlinger, H.D. & Blakely, E. (1994). The effects of delayed reinforcement and a response–produced auditory stimulus on the acquisition of operant behavior in rats. The Psychological Record, 44, 391–409.

Schoenfeld, W.N. & Cole, B.K. (1972). Stimulus schedules: The t–T systems. New York: Harper and Row.

Snycerski, S., Laraway, S., Bradley, E., Huitema & Poling, A. (2004). The effects of behavioral history on response acquisition with immediate and delayed reinforcement. Journal of the Experimental Analysis of Behavior, 81, 51–64.

Trapold, M.A., Carlson, J.G. & Myers, W.A. (1965). The effect of noncontingent fixed– and variable–interval reinforcement upon subsequent acquisition of the fixed–interval scallop. Psychonomic Science, 2, 261–262.

Wilkenfield, J., Nickel, M., Blakely, E. & Poling, A. (1992). Acquisition of lever–press responding in rats with delayed reinforcement. Journal of the Experimental Analysis of Behavior, 58, 431–443.

Williams, B.A., Preston, R.A. & De Kervor, D. (1990). Blocking of the response–reinforcer association: Additional evidence. Learning and Motivation, 21, 379–398.

NOTE

¹The authors are in general agreement with the idea that some of the behavior produced in the present study is difficult to characterize as "response acquisition"; however, the term was used in this paper to facilitate communication and in recognition that a new semantic consensus regarding this concept is lacking.