Computación y Sistemas

On-line version ISSN 2007-9737; Print version ISSN 1405-5546

Comp. y Sist. vol. 10, no. 3, Ciudad de México, Jan./Mar. 2007

 

Doctoral thesis abstract

 

Setting Decision Process Optimization into Stochastic vs. Petri Nets Contexts

 


 

Graduate: Julio Clempner
Center for Computing Research (CIC), National Polytechnic Institute
Av. Juan de Dios Batiz s/n, Edificio CIC, Col. Nueva Industrial Vallejo, 07738 Mexico City, Mexico
Center for Applied Science and High Technology Research (CICATA), National Polytechnic Institute
Legaria 69 Col. Irrigación, 11500 Mexico City, Mexico

e-mail: julio@k-itech.com

Advisor: Jesús Medel
Center for Computing Research, National Polytechnic Institute
Av. Juan de Dios Batiz s/n, Edificio CIC, Col. Nueva Industrial Vallejo, 07738 Mexico City, Mexico

e-mail: jjmedelj@cic.ipn.mx

Co-Advisor: Alin Cârsteanu
Center for Research and Advanced Studies (Cinvestav), National Polytechnic Institute
Av. IPN 2508, C.P. 07360, Col. San Pedro Zacatenco, Mexico City, Mexico

e–mail: alin@math.cinvestav.mx

 

Graduated on: November 24, 2006

 

Abstract

In this work we introduce a new modeling paradigm for representing decision processes in shortest-path problems and games. Whereas previous work restricted attention to tracking the net with Bellman's equation as the utility function, this work uses a Lyapunov-like function. In this sense, we replace the traditional cost function with a trajectory-tracking function that is also an optimal cost-to-target function for tracking the net. This makes a significant difference in the conceptualization of the problem domain, allowing the Nash equilibrium point to be replaced by the Lyapunov equilibrium point in shortest-path game theory. Two formal theoretic approaches are employed to represent the problem domain: i) Markov decision processes and ii) place-transition Petri nets endowed with a Markov decision process, called Decision Process Petri nets (DPPN). The main contribution of this work is its ability to represent both the system-dynamic and trajectory-dynamic properties of a decision process. Within the system-dynamic framework we prove new notions of equilibrium and stability. Within the trajectory-dynamic framework, we optimize the trajectory function used for path planning via a Lyapunov-like function, obtaining new characterizations of final decision points (optimum points) and stability. We show that the system-dynamic and Lyapunov trajectory-dynamic properties of equilibrium, stability, and final decision points (optimum points) coincide under certain restrictions. Moreover, we generalize the problem to game theory and show that the Lyapunov equilibrium point coincides with the Nash equilibrium point under certain restrictions. As a consequence, all the equilibrium and stability properties are preserved in game theory under those restrictions. This is the most important contribution of this work. The potential of this approach lies in the simplicity of its formal proof of the existence of an equilibrium point. To the best of our knowledge, this approach is new in decision processes, game theory, and Petri nets.

Keywords: shortest-path problem, shortest-path game, stability, Lyapunov, Markov decision process, Petri nets.
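
As an illustrative aside (not taken from the thesis itself), the contrast drawn in the abstract can be written in standard notation. Assume a stochastic shortest-path Markov decision process with states s in a set S, admissible actions A(s), transition probabilities p(s' | s, a), one-step costs c(s, a), and a target state s*; all of this notation is assumed here, not the author's. Prior work evaluates trajectories through the Bellman optimality equation, while the thesis tracks the net with a Lyapunov-like trajectory function L:

    V(s) = \min_{a \in A(s)} \Bigl[ c(s,a) + \sum_{s' \in S} p(s' \mid s,a) \, V(s') \Bigr]    (Bellman optimality)

    L(s_{t+1}) \le L(s_t) \ \forall t, \qquad L(s) > 0 \ \text{for } s \ne s^*, \qquad L(s^*) = 0    (Lyapunov-like decrease)

Read this way, the decreasing trajectories of L play the role of an optimal cost-to-target function, which sketches why, under the restrictions stated in the abstract, the Lyapunov equilibrium point can coincide with the final decision (optimum) points and, in the game setting, with the Nash equilibrium point.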

 


 


 


All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.