Print version ISSN 0185-4534
Rev. mex. anál. conducta vol.37 n.2 México Jan. 2011
Response acquisition with Delayed Conditioned Reinforcement*
Rodrigo Sosa and Marco A. Pulido
Laboratorio de Condicionamiento Operante, Universidad Intercontinental, México, D.F., México. E-mail: email@example.com
Received: September 30, 2010
Revised: June 7, 2011
Accepted: June 30, 2011
The present study assessed whether lever pressing by rats increases when the response produces exteroceptive stimuli previously paired with primary reinforcement. Naïve rats were exposed to ten 30-minute sessions in which an FT 60-s schedule presented food together with a 3-s blackout and the operation of the food delivery magazine. After this training phase, an acquisition phase began in which subjects were exposed to one of four schedules for twenty consecutive sessions: 1) CRF; 2) FR 1, FT 2-s; 3) FR 1, FT 5-s; or 4) FR 1, FT 10-s. Nine subjects were exposed to each schedule. Results from the conditioned reinforcement conditions were compared with those from conditions in which 1) primary reinforcement was delivered during the acquisition phase, 2) the blackout and magazine operation occurred during the training phase in the absence of food, and 3) subjects remained in the experimental chamber without any programmed stimulus presentation during the training phase. Both the primary reinforcement and the conditioned reinforcement conditions produced delay gradients; the latter was considerably steeper than the former. The results are discussed in terms of their similarity to previous studies and in terms of the ongoing debate regarding the empirical validity of the conditioned reinforcement concept.
Key words: Conditioned reinforcement, reinforcement delay, response acquisition, lever pressing, rats.
The concept of conditioned reinforcement is pervasive within the analysis of behavior, in both experimental and applied contexts (Williams, 1994). Despite its extensive use as an explanatory concept, contemporary research on the subject has failed to produce convincing evidence of its existence (Fantino & Romanovich, 2007).
One approach that has frequently been used to seek evidence of conditioned reinforcement compares response rates in the initial link of tandem and chained reinforcement schedules. The basic assumption behind this approach is that the exteroceptive stimulus presented during the terminal component of the chain should help sustain response rate during the initial link. This may occur because reinforcement is delivered in the presence of the terminal signal, so that the reinforcer endows the signal with reinforcing properties. In contrast, responding in the initial links of tandem schedules should be considerably lower, because responding in the initial link does not produce a signal associated with reinforcement delivery.
In agreement with a conditioned reinforcement account of chained schedules, Gollub's (1977) review suggests that response rates in the initial link of two-component chained FI schedules are higher than those observed in equivalent tandem schedules. Gollub's conclusions, however, are not in agreement with the data produced by several studies. For instance, Malagodi, De Weese, & Johnston (1973) found that response rates in the initial link of two-component FI tandem and chained schedules were homogeneously low. In another study, Wallace, Osborne, & Fantino (1982) found that response rates in the initial link of two-component tandem schedules were higher than those produced by equivalent chained schedules. Kelleher & Fry (1962) found a similar result using three-component FI tandem and chained schedules. Response rates in the initial links of three- and five-component FR chained schedules were also lower than those produced by equivalent tandem schedules (Jwaideh, 1973).
In summary, this initial approach has failed to produce conclusive evidence. Its failure has led other scientists to develop new procedures to assess the effects of exteroceptive stimuli in chained reinforcement schedules. For instance, Royalty, Williams, & Fantino (1987) reasoned that variables that affect primary reinforcer value should similarly change conditioned reinforcer value. This reasoning led them to suggest that, if delayed food sustains lower response rates than immediate food (Skinner, 1938; Williams, 1976; Sizemore & Lattal, 1977), delayed stimulus change in chained schedules should accordingly sustain lower response rates than immediate stimulus change. To assess this possibility, Royalty et al. exposed pigeons to three-component chained schedules in which the schedule transition could occur immediately (e.g., VI 33-s, VI 33-s, VI 33-s) or after a 3-s delay (e.g., VI 30-s FT 3-s, VI 33-s, VI 33-s). The scientists assessed the effect of delaying the component transition in the initial and middle links of the chain. In agreement with their hypothesis, response rates were lower when stimulus change was delayed. This effect was consistent in both the initial and middle links of the chain for all subjects.
Royalty et al. (1987) used the temporal separation between an operant response and stimulus change in chained schedules to determine whether a cue presented within a chained schedule acquired conditioned reinforcing properties. However, response-cue separation in chained reinforcement schedules is an experimental manipulation that appears in a number of studies with different theoretical interests. The following account presents published studies that were conducted within different theoretical frameworks but share a common independent variable: response-signal temporal separation in chained schedules. These studies are relevant because they allow the reader to follow the experimental manipulation across different procedures and to assess the different results and interpretations associated with the independent variable.
Tombaugh & Tombaugh (1971) exposed naïve rats to a chained FR 1 FT 7.5-s schedule and varied the placement of a 1.5-s visual cue across the delay interval. Their results showed that response latency was high when no cue was present and low with a continuous signal; intermediate latencies were found when the signal was located at the beginning or end of the delay interval. After response acquisition was accomplished, Tombaugh & Tombaugh exposed the experimental subjects to an extinction condition. During extinction, response latencies in both the continuous and late signal conditions were considerably shorter than those obtained under the no-cue and early cue conditions. The scientists interpreted their results in Pavlovian terms, suggesting that temporal contiguity between the CS and the UCS is an important variable for both respondent and operant behavior.
Lieberman, Davidson, & Thomas (1985) exposed naïve pigeons to an RI 20-s FT 6-s chained schedule in which a 1-s change in illumination followed the RI component either immediately or 3 s after the response that initiated the FT. Reinforcement was delivered at the end of the schedule if the peck that initiated the FT component was delivered at the side of the response key selected by the experimenter as "correct." Results showed that "correct" choices under the immediate and delayed signal conditions were very similar. The scientists interpreted their finding as evidence that delayed signals do not reduce drive in pigeons.
Taken together, the studies designed to evaluate the effects of response-signal temporal separation in reinforcement schedules are difficult to assess. The Royalty et al. (1987) study shows that separating the response from the signal has clear detrimental effects on response rate maintenance. The Tombaugh & Tombaugh (1971) study suggests that a late signal may enhance resistance to extinction. Finally, the Lieberman, Davidson, & Thomas (1985) study shows that delayed cue presentation may not affect acquisition of a discrimination response.
In summary, two different approaches have tried to assess the possibility that exteroceptive stimuli that occur in close temporal contiguity with reinforcement acquire reinforcing properties. Both approaches have produced contradictory results. Thus, the purpose of the present study is to suggest an alternative strategy that may help assess the conditioned reinforcement hypothesis. Lattal & Gleeson (1990) produced data suggesting that key pecking by pigeons may be established without explicit shaping and under conditions of delayed reinforcement. Thus, if pairing an exteroceptive stimulus with primary reinforcement endows the former with reinforcing properties, it should be possible to establish an operant response using this same stimulus. Additionally, if this stimulus truly acquires reinforcing properties, then response rates maintained by it should be a decreasing function of delay duration (as occurs with response rates maintained by primary reinforcement).
The idea presented in the previous paragraph has already been proposed by other scientists (see, for instance, Hendry, 1969, or Wike, 1966). It has also been empirically assessed by Bersh (1951) and by Snycersky, Laraway, & Poling (2005). The study conducted by Bersh (1951) is difficult to evaluate because the paper omits key features of the experimental procedure. However, the study conducted by Snycersky et al. (2005) adequately describes its experimental procedure and is thus presented in some detail in this paper.
In their study, Snycersky et al. (2005) exposed rats to 3 sessions of response-independent water deliveries, using a VT 30-s schedule. Subjects were then exposed to one of four different tandem FR 1, DRO (0, 15, 30, or 45-s) schedules. During the acquisition phase, lever pressing produced immediate or delayed presentation of the empty dipper cup. Snycersky et al. (2005) found evidence of response acquisition under immediate reinforcement conditions; evidence of response acquisition decreased as delay increased. It is possible that the rather tenuous results produced by Snycersky et al. (2005) were due to the relatively short training period used by the scientists. This short training period may have failed to adequately pair water with dipper presentation. Additionally, the DRO contingencies used in the aforementioned study may favor low reinforcement rates, which may in turn complicate response-reinforcer pairing (see, for instance, Lattal, 1987).
In the present study, acquisition with delayed and immediate reinforcement was assessed using ten 30-minute sessions in the training phase (in order to "adequately" pair magazine clicking and illumination changes with food delivery); additionally, an FT schedule (instead of a DRO) was used to program delay duration. The present study also included comparison conditions that were lacking in the Snycersky et al. (2005) study. In one comparison condition, periodic exteroceptive stimuli were presented during the training phase (but without food presentation); in another comparison condition, subjects remained in the experimental chamber for ten 30-minute sessions; in yet another comparison condition, subjects received immediate or delayed food during the acquisition phase.
Seventy-two naïve male Wistar Lewis rats were used as subjects. All subjects were approximately four months old at the beginning of the study. Each subject's weight was recorded on three consecutive days under free-feeding conditions to determine ad libitum body weight; food was then restricted until all subjects reached 80% of their free-feeding weight. Subjects were kept at their prescribed body weights throughout the experiment by means of supplementary feeding following each experimental session. Subjects were kept in the laboratory vivarium under constant temperature conditions and a twelve-hour light-dark cycle (lights on at 7:00 a.m.). All experimental subjects were kept in individual cages with free access to water. Two subjects (AA5 and S16) died for unknown reasons before complete data could be collected. Their data were thus excluded from the analysis.
Sessions were conducted in a custom-built rodent operant conditioning chamber made of transparent Plexiglas. The space in which the subjects were studied measured 18.5 cm in height by 23.5 cm in length by 23.5 cm in depth. A stainless steel lever, made of a 3 cm bar topped by a 2 cm diameter metal disk, was placed on the front wall of the chamber. The lever was placed 5.5 cm above the floor and 11 cm from each wall. The lever required a force of at least 24 grams for depression. A depression of the lever produced an audible click and was counted as a response. A 5 cm diameter metal plate located 2 cm below and to the right of the lever was used as a pellet receptacle. A BRS-LVE, PDH-020 pellet dispenser delivered 4 .25 mg pellets on each operation. Pellets were produced by remolding pulverized Purina Nutri Cubes. One 1.1 W, 28 Vdc pilot light with a translucent glass cover was used to illuminate the experimental chamber. The light was located inside the box, 7 cm above the food receptacle. The conditioning chamber was housed inside a larger sound-attenuating wooden box equipped with a ventilating fan. Experimental events were programmed and recorded using an IBM-compatible 386 microcomputer equipped with an industrial automation card (Advantech PC-Labcard 725) coupled to a relay rack.
The present study may be conceptualized as a 4 x 4 between-groups experiment with two independent variables: 1) training procedure (primary reinforcement, conditioned reinforcement, periodic stimulus presentation, and permanence in chamber conditions) and 2) delay of reinforcement duration (0, 2, 5, or 10-s). Subjects in the conditioned reinforcement condition were exposed to a training phase in which a 3-s blackout was presented periodically according to an FT 60-s schedule; the blackout was paired with the operation of the food magazine and food delivery. In the acquisition phase, these same subjects were exposed to reinforcement schedules in which lever pressing produced a blackout and the sound of the operating (but empty) food magazine. Subjects in a first comparison condition (primary reinforcement) were exposed to the same training phase; however, during the acquisition phase, lever pressing produced both blackout and food. Subjects in a second comparison condition (periodic stimulus presentation) were exposed, during the training phase, to noncontingent presentations of the blackout and the operating food magazine, but without food delivery. During the acquisition phase, lever pressing produced the same exteroceptive stimuli presented during the training phase. Finally, in a third comparison condition (permanence in chamber), subjects were placed in the conditioning chamber during the training phase, but without any programmed event. During the acquisition phase, lever pressing produced both a blackout and the operation of the empty food magazine. The experimental design allocated 9 subjects to each group in the conditioned reinforcement condition, as this was the critical manipulation of the study. Only 3 subjects were assigned to each group of the comparison conditions because the results of those manipulations have been reported extensively in other studies (see Pulido, Paz, & Sosa, 2008, for a review).
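The allocation described above can be tallied with a short sketch. The dictionary keys simply restate the condition names used in the text; the variable and function names are illustrative and do not come from the original study.

```python
# Planned group sizes as described in the text: 9 subjects per delay group
# in the conditioned reinforcement condition, 3 per delay group elsewhere.
DELAYS_S = (0, 2, 5, 10)
SUBJECTS_PER_GROUP = {
    "primary reinforcement": 3,
    "conditioned reinforcement": 9,
    "periodic stimulus presentation": 3,
    "permanence in chamber": 3,
}


def build_design():
    """Return {(condition, delay_s): planned_n} for the 4 x 4 layout."""
    return {
        (condition, delay): n
        for condition, n in SUBJECTS_PER_GROUP.items()
        for delay in DELAYS_S
    }


design = build_design()
total_planned = sum(design.values())
print(len(design), "groups,", total_planned, "subjects planned")
```

The planned total agrees with the seventy-two subjects reported in the Subjects section; the two subjects that died are not removed here because the text does not state which groups they belonged to.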
A summary of the experimental procedures is shown in Table 1.
During the training phase, subjects in the "conditioned reinforcement condition" and the "primary reinforcement condition" were exposed to a 60-s fixed-time schedule (FT 60-s). This schedule delivered food paired with a 3-s blackout. Subjects in the "periodic stimulus presentation condition" were exposed to the same schedule; however, only the blackout and the operation of the empty food magazine were produced. Subjects in the "permanence in chamber condition" were placed in the experimental chamber; however, no programmed exteroceptive stimuli were presented during that time. During the training phase, the lever was removed from the chamber. This phase was in effect for 10 consecutive 30-minute sessions. Training sessions were conducted at approximately the same time each day.
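As a rough illustration of how many food-stimulus pairings this training arrangement provides, the following sketch lists the delivery times of a fixed-time schedule. It assumes that the first delivery occurs one full interval after session start and that a delivery falling exactly at the end of the session still occurs; neither detail is stated in the text, and the function name is hypothetical.

```python
def fixed_time_deliveries(interval_s, session_s):
    """Times (s from session start) at which an FT schedule delivers.

    Assumption: the first delivery occurs one full interval after session
    start, and a delivery falling exactly at session end still occurs.
    """
    return list(range(interval_s, session_s + 1, interval_s))


deliveries = fixed_time_deliveries(interval_s=60, session_s=30 * 60)
pairings_per_session = len(deliveries)
pairings_in_training = pairings_per_session * 10  # ten training sessions
```

Under these assumptions, the sketch yields 30 pairings per session and 300 pairings across the training phase.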
For subjects in the "conditioned reinforcement," "periodic stimulus presentation," and "permanence in chamber" conditions, the acquisition phase consisted of exposure to schedules in which lever pressing produced a 3-s blackout and the sound of the operating (but empty) food magazine; for subjects in the "primary reinforcement condition," lever pressing produced food and the blackout. Tandem FR 1, FT (0, 2, 5 or 10-s) schedules were in effect during the acquisition phase. As this experiment was conceptualized as a between-groups design, different groups of subjects within each condition were assigned to one of the four delay durations. During the acquisition phase, the lever was inserted into the operant chamber. This phase lasted 20 consecutive one-hour sessions. Acquisition sessions were conducted at approximately the same time each day.
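The logic of a tandem FR 1, FT schedule can be sketched as follows. This is only an illustration: it assumes that presses during the delay neither reset nor shorten it, and that presses during the 3-s stimulus do not start a new cycle. The text does not specify these details, and the function name and example press times are hypothetical.

```python
def tandem_fr1_ft(press_times_s, delay_s, stimulus_s=3.0):
    """Return the onset times of response-produced stimuli.

    Assumptions (not stated in the text): presses during the delay neither
    reset nor shorten it, and presses during the stimulus itself do not
    start a new cycle.
    """
    onsets = []
    busy_until = float("-inf")  # end of the current delay plus stimulus
    for t in sorted(press_times_s):
        if t >= busy_until:          # FR 1: this press starts a cycle
            onset = t + delay_s      # FT: stimulus follows a fixed delay
            onsets.append(onset)
            busy_until = onset + stimulus_s
    return onsets


# Hypothetical press times (s), for illustration only.
presses = [1.0, 2.5, 4.0, 20.0]
immediate = tandem_fr1_ft(presses, delay_s=0)   # CRF-like case
delayed = tandem_fr1_ft(presses, delay_s=5)     # tandem FR 1, FT 5-s
```

Because the FT component does not reset, the obtained delay between the last press and the stimulus can be shorter than the programmed delay whenever the subject presses again during the interval.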
Figure 1 shows lever press rate (responses per minute) for all sessions, subjects and conditions, as a function of exposure to the different reinforcement contingencies. The first row shows the primary reinforcement condition; the second row shows the conditioned reinforcement condition; the third row shows the periodic stimulus presentation condition; lastly, the fourth row shows the permanence in chamber condition. The first column shows the CRF schedules; the second column shows the Tandem FR 1, FT 2-s schedule conditions; the third column shows the Tandem FR 1, FT 5-s schedule conditions; the fourth column shows the Tandem FR 1, FT 10-s schedule conditions. In all graphs the "y" axis shows response rate per minute and the "x" axis shows consecutive sessions. The primary reinforcement condition produced considerably higher response rates than any other condition. Thus, in order to permit an adequate assessment of conditions with lower response rates, the primary reinforcement condition was graphed using a different scale.
In general, response rates produced by the primary reinforcement conditions are higher than those produced by the conditioned reinforcement conditions or any of the other comparison conditions. The primary reinforcement condition also shows a clear delay gradient (response rates decrease monotonically as delay duration increases). Two subjects in the primary reinforcement condition did not produce any substantial number of responses: "X19" in the CRF condition and "AA8" in the tandem FR 1, FT 5-s condition. In the conditioned reinforcement condition, response variability is high, especially in the CRF and tandem FR 1, FT 2-s groups; in both groups some subjects show very high response rates, particularly "Y2" and "Y1" in the CRF condition and "S10" in the tandem FR 1, FT 2-s condition. These subjects increase response rates so substantially that it is difficult to assess response trends in the rest of the subjects (and also in the subjects exposed to the periodic stimulus and permanence in chamber conditions). In the third and fourth conditions, response rates are homogeneously low, although some subjects occasionally show high response rates. However, these response bursts are not always correlated with short delay durations (see, for instance, subjects "T6," "W9" and "X2"). Perhaps the only consistent findings that may be appreciated in Figure 1 are the homogeneously low response rates produced by the Tandem FR 1, FT 10-s condition and the consistently high response rates produced by the primary reinforcement condition.
In order to further assess the effects of the independent variables, a 4 x 4 two-way analysis of variance was conducted using experimental condition and delay duration as independent variables, and the average response rate per minute from the experimental groups as the dependent variable. The main effect of experimental condition attained statistical significance (F(3,69)=25.064, p<.001); the main effect of delay duration also attained statistical significance (F(3,69)=3.616, p=.019); the interaction between experimental condition and delay duration also attained significance (F(9,69)=2.137, p=.042). The results of the analysis should be read with care (particularly the main effect of experimental condition), as the number of subjects differs between groups. However, the main effect of experimental condition appears reasonable, given the very high rates produced by the primary reinforcement condition and the comparatively lower rates produced in the other three conditions.
Due to the high variability in response rates between subjects and conditions, group means were calculated. Figure 2 shows mean response rates in the "Y" axis and delay duration in the "X" axis. The first graph in the figure corresponds to the primary reinforcement condition; the second graph corresponds to the conditioned reinforcement condition; the third graph corresponds to the periodic stimulus presentation condition; the last graph shows the data produced by those subjects that remained in the experimental chamber during the training phase. Each graph presents its own particular scale in the "Y" axis to facilitate trend identification.
Figure 2 shows that mean response rates were highest in the primary reinforcement condition, followed by the conditioned reinforcement condition, the periodic stimulus presentation condition and the permanence in chamber condition. Only two of the graphs show evidence of a delay gradient: the primary reinforcement condition and the conditioned reinforcement condition. The gradient produced by primary reinforcement shows that response rates decrease in a rather gradual fashion; in contrast, the gradient produced by conditioned reinforcement shows that response rates decrease abruptly (they reach very low, asymptotic levels at the 5-s delay condition). Beside each graph appear linear and negative exponential regression equations calculated with delay duration as the independent variable and mean response rate per minute as the dependent variable. As the third and fourth conditions showed no evidence of a delay gradient, regression equations are not provided for them. The equations show that the relationship between delay duration and response rate is best described by a linear model in the primary reinforcement condition; however, a negative exponential model best describes the relationship between the variables in the conditioned reinforcement condition.
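The comparison between a linear and a negative exponential description of a delay gradient can be sketched as follows. This is a minimal illustration and not the authors' fitting procedure: the exponential is fitted by the log-linear shortcut (regressing the logarithm of rate on delay), which is an assumption, and the rates are synthetic values generated from a known exponential rather than data from the study. All function names are hypothetical.

```python
import math


def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def fit_neg_exponential(xs, ys):
    """Fit y = A*exp(-k*x) by regressing ln(y) on x; returns (A, k).

    Requires every y > 0. The log-linear shortcut is an assumption and
    not necessarily the fitting method used in the study.
    """
    ln_a, slope = fit_linear(xs, [math.log(y) for y in ys])
    return math.exp(ln_a), -slope


def r_squared(ys, predicted):
    """Proportion of variance in ys accounted for by the predictions."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic illustration only (NOT data from the study): rates generated
# from a known exponential so that the fit can be checked.
delays = [0, 2, 5, 10]
rates = [2.0 * math.exp(-0.5 * d) for d in delays]

a, b = fit_linear(delays, rates)
A, k = fit_neg_exponential(delays, rates)
r2_linear = r_squared(rates, [a + b * d for d in delays])
r2_exp = r_squared(rates, [A * math.exp(-k * d) for d in delays])
```

Comparing the two coefficients of determination on the original scale is one simple way to decide which model describes a gradient better; with only four delay values, any such comparison should be read cautiously.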
Table 2 shows three different response acquisition criteria used to assess performance in the conditioned reinforcement condition. The table shows whether each experimental subject reached an arbitrarily chosen limit of 10 reinforcers within one session; the table also shows the number of sessions that each subject required to first earn 10 reinforcers in one session; lastly, the table shows the maximum number of reinforcers obtained by each subject. Delay duration increases from left to right. The bottom of the table shows the total number of subjects that reached the acquisition criterion for each delay value; the average number of sessions required to earn 10 reinforcers and the average maximum number of reinforcers for each delay are also shown. These criteria were selected because they have been used by scientists in other acquisition studies (see Pulido, Sosa, & Valadez, 2006, for a review) and because previous studies have shown that decisions regarding response acquisition phenomena change little when more stringent criteria are used (see Pulido, Paz, & Sosa, 2008).
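The three criteria can be expressed as a small helper that summarizes one subject's record. The sketch assumes that "earning 10 reinforcers" means obtaining at least 10 in a session; the function name and the example session counts are hypothetical and are not data from the study.

```python
def acquisition_summary(reinforcers_per_session, criterion=10):
    """Summarize one subject's acquisition performance.

    Returns (reached, first_session, maximum), where first_session is the
    1-based session in which the criterion was first met, or None.
    """
    first_session = next(
        (i for i, n in enumerate(reinforcers_per_session, start=1)
         if n >= criterion),
        None,
    )
    return (first_session is not None, first_session,
            max(reinforcers_per_session))


# Hypothetical session counts for illustration only (not study data).
example = [0, 3, 7, 12, 9, 15]
reached, first_session, maximum = acquisition_summary(example)
```

Group-level entries such as those at the bottom of Table 2 would then follow by counting the subjects for which the first value is true and averaging the second and third values within each delay group.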
In general, Table 2 shows that subjects exposed to the 0 or 2-s delay conditions had a higher probability of reaching the 10-reinforcer acquisition criterion than subjects exposed to the long delay conditions (16 out of 18 subjects in the short delay conditions attained the criterion; only 6 out of 17 subjects in the long delay conditions attained the criterion). A chi-square test comparing the acquisition frequencies between the long and short delays attained statistical significance (Chi(1)=7.68, p=.000). Averaging the number of sessions that subjects required to produce 10 reinforcers for the first time showed that the lower averages occur in the short delay conditions (2.14 for 0 delay and 3.89 for 2-s delay), and that the higher averages occur in the long delay conditions (5.75 for 5-s delay and 4.4 for 10-s delay). Averaging the maximum number of reinforcers earned in each delay condition showed that the higher averages occur in the short delay conditions (23.2 for 0 delay and 24.1 for 2-s delay), and that the lower averages occur in the long delay conditions (8.5 for 5-s delay and 16.5 for 10-s delay).
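A chi-square comparison of this kind can be sketched from the counts quoted in the text. The sketch assumes an uncorrected Pearson test on a 2 x 2 table (short versus long delay, criterion reached versus not reached). The text does not state whether a continuity correction was applied or exactly which counts entered the test, so the value returned here need not match the reported statistic. Function names are hypothetical.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def chi_square_p_value_df1(statistic):
    """Upper-tail p value of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(statistic / 2))


# Counts quoted in the text: 16 of 18 short-delay subjects reached the
# criterion, and 6 of 17 long-delay subjects did.
chi2 = pearson_chi_square_2x2(16, 18 - 16, 6, 17 - 6)
p_value = chi_square_p_value_df1(chi2)
```

The p value uses the identity between the chi-square distribution with one degree of freedom and the square of a standard normal variable, which is why the complementary error function from the standard library suffices.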
In general, the results of the study show that rats press a lever when this behavior produces stimuli correlated with food delivery; this behavior is more likely to occur under immediate reinforcement and under conditions of briefly delayed reinforcement; conversely, it is infrequent under long delay durations (5 or 10-s). Results also show that an increase in lever pressing may occur when this behavior is correlated with the activation of the empty food magazine and an illumination change (even in the absence of pairing between exteroceptive stimuli and food); however, this behavior is infrequent and does not seem to produce a delay gradient. Results from the present study also suggest that delay gradients produced under conditioned reinforcement conditions are steep relative to those produced by primary reinforcement delivery. Results also show great variability in response rate, both between and within experimental conditions.
Thus, the data produced by the present study suggest that the results published by Snycersky et al. (2005) are replicable; furthermore, they may occur even when stimulus-reinforcement pairing consists of the association between food and a blackout. The present study also extends previous results by showing that evidence of response acquisition with conditioned reinforcement may appear even when an FT schedule is used to program the delay interval, and after ten consecutive sessions of food-stimulus pairing according to an FT 60-s schedule. The results produced by the present study differ from those produced by Snycersky and colleagues (2005) in that the subjects in our study responded at extremely low rates with delay values of 5 or 10-s, whereas some subjects in the Snycersky et al. (2005) study responded consistently with delays of 15, 30 and 45-s (4, 4 and 3 subjects, respectively).
Thus, the data produced so far regarding response acquisition with delayed conditioned reinforcement suggest that the procedure may be used to increase the frequency of responding by naïve rats. The results also suggest that conditioned reinforcement value is degraded relative to primary reinforcement. This conclusion is supported by comparing the gradients produced by primary and conditioned reinforcement in this study; it is additionally supported by comparing the Snycersky et al. (2005) study with other studies in which primary reinforcement has been used to establish lever pressing (see, for instance, Ávila & Bruner, 1995; Bruner, Ávila, & Gallardo, 1996; Pulido, Paz, & Sosa, 2008).
Regarding the theoretical relevance of the present findings, several scientists have argued that seventy years of research using the methods of the experimental analysis of behavior have failed to produce compelling evidence for the notion that primary reinforcement may transfer reinforcing properties to other stimuli (Fantino & Romanovich, 2007; Squires, 1972; Staddon & Cerutti, 2003). This conclusion is partly based on the failure of the chained-tandem comparisons to produce comparably higher response rates during the signaled schedule (Royalty, Williams, & Fantino, 1987); it is also partly based on the relatively inconsistent results produced by studies that have assessed the effects of separating the signal from the response in delayed-signaled reinforcement studies (see Pulido & Martínez, 2010, for a review). Other reasons have also been argued (Davidson & Baum, 2006). To date, the results produced by Snycersky et al. (2005), and those produced in the present study, offer evidence that supports the idea that the conditioned reinforcement concept is empirically valid for behavior analysis. The studies offer no clues that may help explain why previous studies have failed to produce data that are in theoretical agreement with the concept; however, they do offer evidence that it should not yet be discarded.
Regarding a research agenda for response acquisition with delayed conditioned reinforcement, the present authors consider that the training operations that help bond the primary reinforcer with other exteroceptive stimuli are still poorly understood. The present study hypothesized that 10 sessions would be enough to enhance the effects produced by Snycersky et al. (2005), yet it produced scant evidence of response acquisition under comparatively shorter delays. Pulido, Paz, & Sosa (2008) have produced data suggesting that extensive exposure to response-independent reinforcement may inhibit response acquisition. This finding suggests that a study that systematically evaluates the effects of the number of sessions of response-independent food delivery on response acquisition with conditioned reinforcement could help determine the boundaries within which these pairings are helpful (or detrimental) for response acquisition. Additionally, an FT 60-s schedule was used in the training phase of the present study, whereas Snycersky et al. used a VT 60-s. The effects of the different types of schedules during the training phase have not yet been assessed. Similarly, the effects of reinforcement interval duration during the training phase have yet to be assessed. The importance of this last variable for response acquisition with delayed conditioned reinforcement should not be dismissed, as the stimulus presentation interval has proven to be an important variable for the development of conditioned stimuli (Prokasy & Whaley, 1963; Salafia, Mis, Terry, Bartosiak, & Daston, 1973). Future studies could help clarify these issues.
Avila, R., & Bruner, C. (1995). Response acquisition under long delays of signaled and unsignaled reinforcement. Revista Mexicana de Análisis de la Conducta, 21, 117-127.
Bersh, P. J. (1951). The influence of two variables in the establishment of a secondary reinforcer of operant responses. Journal of Experimental Psychology, 41, 62-73.
Bruner, C., Avila, R., & Gallardo, L. (1996). Acquisition with delayed reinforcement under combinations of response dependent and independent reinforcement. Revista Mexicana de Análisis de la Conducta, 22, 29-39.
Fantino, E., & Romanovich, P. (2007). The effect of conditioned reinforcement rate on choice: A review. Journal of the Experimental Analysis of Behavior, 87, 409-421, available via: http://dx.doi.org/10.1901%2Fjeab.2007.44-06
Gollub, L. R. (1977). Conditioned reinforcement: Schedule effects. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior (pp. 288-312). Englewood Cliffs, N.J.: Prentice-Hall.
Hendry, D. P. (1969). Introduction. In Hendry, D. P. (Ed.), Conditioned reinforcement (pp. 1-34). Illinois: Dorsey Press.
Jwaideh, A. R. (1973). Responding under chained and tandem fixed-ratio schedules. Journal of the Experimental Analysis of Behavior, 19, 259-267, available via: http://dx.doi.org/10.1901%2Fjeab.1973.19-259
Kelleher, R. T., & Fry, W. T. (1962). Stimulus functions in chained fixed-interval schedules. Journal of the Experimental Analysis of Behavior, 5, 167-173, available via: http://dx.doi.org/10.1901%2Fjeab.1962.5-167
Lattal, K. A. (1987). The effect of delay and intervening events on reinforcement value. In Commons, M. L., Mazur, J. E., Nevin, J. A. & Rachlin, H. (Eds.), Quantitative Analysis of Behavior (Vol. 5). New Jersey: Lawrence Erlbaum Associates Publisher.
Lattal, K. A., & Gleeson, S. (1990). Response acquisition with delayed reinforcement. Journal of Experimental Psychology: Animal Behavior Processes, 16, 27-39, available via: http://dx.doi.org/10.1037%2F%2F0097-7403.16.1.27
Lieberman, D. A., Davidson, F. H., & Thomas, G. V. (1985). Marking in pigeons: The role of memory in delayed reinforcement. Journal of Experimental Psychology: Animal Behavior Processes, 11, 611-624, available via: http://dx.doi.org/10.1037%2F%2F0097-7403.11.4.611
Malagodi, E. F., De Weese, J., & Johnston, J. M. (1973). Second order schedules: A comparison of chained, brief stimulus, and tandem procedures. Journal of the Experimental Analysis of Behavior, 20, 447-460.
Prokasy, W. F., & Whaley, F. L. (1963). Intertrial interval range shift in classical eyelid conditioning. Psychological Reports, 12, 55-58.
Pulido, M. A., & Martínez, G. (2010). Effects of response-signal temporal separation on behavior maintained under temporally defined schedules of delayed signaled reinforcement. The Psychological Record, 60, 115-136.
Pulido, M. A., Paz, M., & Sosa, R. (2008). The effects of behavioral history on response acquisition with delayed reinforcement: A parametric analysis. Revista Mexicana de Análisis de la Conducta, 34, 43-56.
Pulido, M. A., Sosa, R., & Valadez, L. (2006). Adquisición de la operante libre bajo condiciones de reforzamiento demorado: Una revisión. Acta Comportamentalia, 14, 5-21.
Royalty, P., Williams, B. A., & Fantino, E. (1987). Effects of delayed conditioned reinforcement in chain schedules. Journal of the Experimental Analysis of Behavior, 47, 41-56, available via: http://dx.doi.org/10.1901%2Fjeab.1987.47-41
Salafia, W. R., Mis, F. W., Terry, W. S., Bartosiak, R. S., & Daston, A. P. (1973). Conditioning of the nictitating membrane response of the rabbit (Oryctolagus cuniculus) as a function of the length and degree of variation of the intertrial interval. Animal Learning and Behavior, 8, 85-91.
Sizemore, O. J., & Lattal, K. A. (1977). Dependency, temporal contiguity and response independent reinforcement. Journal of the Experimental Analysis of Behavior, 27, 119-125, available via: http://dx.doi.org/10.1901%2Fjeab.1977.27-119
Skinner, B. F. (1938). The behavior of organisms. New York: Appleton-Century-Crofts.
Snycersky, S., Laraway, S., & Poling, A. (2005). Response acquisition with immediate and delayed conditioned reinforcement. Behavioural Processes, 68, 1-11, available via: http://dx.doi.org/10.1016%2Fj.beproc.2004.08.004
Squires, N. (1972). Preference for conjoint schedules of primary reinforcement and brief stimulus presentation. Unpublished doctoral dissertation. University of California, San Diego.
Staddon, J. E. R., & Cerutti, D. T. (2003). Operant conditioning. Annual Review of Psychology, 54, 115-144, available via: http://dx.doi.org/10.1146%2Fannurev.psych.54.101601.145124
Tombaugh, J. W., & Tombaugh, T. N. (1971). Effects on performance of placing a visual cue at different temporal locations within a constant delay interval. Journal of Experimental Psychology, 87, 220-224, available via: http://dx.doi.org/10.1037%2Fh0030583
Wallace, F., Osborne, S., & Fantino, E. (1982). Conditioned reinforcement in two-link chain schedules. Behaviour Analysis Letters, 2, 335-344.
Wike, E. L. (1966). Secondary reinforcement: Selected experiments. Oxford: Harper & Row.
Williams, B. A. (1976). The effects of unsignalled delayed reinforcement. Journal of the Experimental Analysis of Behavior, 26, 441-449.
Williams, B. A. (1994). Conditioned reinforcement: Experimental and theoretical issues. The Behavior Analyst, 17, 261-285.
*The authors would like to thank the Universidad Intercontinental and the APIEC-UIC for their support in conducting the present study. The authors would also like to thank Guillermo Martínez for his help in preparing some of the figures and Marco Antonio Pulido Benítez for his revision of the manuscript. The authors also acknowledge their debt to the anonymous reviewers for their helpful comments. This study was conducted as part of the bachelor's degree requirements of the first author, and was supervised by the second author.