
Computación y Sistemas

On-line version ISSN 2007-9737
Print version ISSN 1405-5546

Comp. y Sist. vol.9 n.4 Ciudad de México Apr./Jun. 2006

 

Artículos

 

A Supervised Discretization Method for Quantitative and Qualitative Ordered Variables

 

Un método de Discretización Supervisada para Variables Cuantitativas y Cualitativas Ordenadas

 

Francisco J. Ruiz¹, Cecilio Angulo¹ and Núria Agell²

¹ Knowledge Engineering Research Group. Universitat Politècnica de Catalunya
Av. Víctor Balaguer s/n. 08800 Vilanova i la Geltrú (Spain)

francisco.javier.ruiz@upc.edu, cecilio.angulo@upc.edu

² Department of Quantitative Methods Management. ESADE-Universitat Ramon Llull
Av. Pedralbes 62-65. 08034 Barcelona (Spain)

nuria.agell@esade.edu

 

Article received on November 15, 2005; accepted on January 01, 2006.

 

Abstract

This work presents a new technique for defining cut-points when discretizing a continuous attribute. The method serves as a preprocessing step for a regression problem, understood as a learning problem in which the output variable is either quantitative (continuous or discrete) or qualitative and defined over an ordinal scale. The proposed method emphasizes the concept of location to determine the discretization cut-points. For continuous outputs, it maximizes the difference between distributions, measured with interval distances. For qualitative outputs, it uses a qualitative distance defined over a structure of absolute orders of magnitude. The main characteristics of the method are illustrated through three examples: two with continuous outputs and one with a qualitative output.

Keywords: Supervised Discretization, Regression, Qualitative Reasoning, Intervalar distance.
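
The abstract does not give the formulas, so the following is only a minimal sketch of the general idea for a continuous output, not the authors' algorithm. It assumes that the output values on each side of a candidate cut-point are summarized by an interval with center equal to the mean and radius equal to the standard deviation, and that two intervals are compared with a Euclidean distance on (center, radius). The function names (`interval_summary`, `interval_distance`, `best_cut_point`) and the `min_size` parameter are hypothetical and chosen for illustration.

```python
import math
import statistics


def interval_summary(values):
    """Summarize a sample as (center, radius).

    Assumption for this sketch: center = mean, radius = population std.
    """
    return statistics.fmean(values), statistics.pstdev(values)


def interval_distance(a, b):
    """Euclidean distance between two intervals given as (center, radius).

    Assumption: a stand-in for the interval distance used in the paper.
    """
    return math.hypot(a[0] - b[0], a[1] - b[1])


def best_cut_point(xs, ys, min_size=2):
    """Return (cut, score) for the single cut on x that maximizes the
    distance between the output summaries of the two resulting bins."""
    pairs = sorted(zip(xs, ys))
    best_cut, best_score = None, -1.0
    for i in range(min_size, len(pairs) - min_size + 1):
        x_left, x_right = pairs[i - 1][0], pairs[i][0]
        if x_left == x_right:
            continue  # no boundary can separate equal attribute values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = interval_distance(interval_summary(left),
                                  interval_summary(right))
        if score > best_score:
            best_cut, best_score = (x_left + x_right) / 2, score
    return best_cut, best_score


# Toy data: the output jumps between x = 3 and x = 4.
cut, score = best_cut_point([1, 2, 3, 4, 5, 6],
                            [1.0, 1.1, 0.9, 5.0, 5.2, 4.8])
print(cut, score)
```

Applying the same search recursively inside each resulting bin would yield several cut-points; the stopping rule is left open here because the abstract does not specify one.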

 

Resumen

En este trabajo se presenta una nueva técnica para definir las fronteras en el proceso de discretización de una variable continua. Este método es usado como paso previo en un problema de regresión, considerado como un problema de aprendizaje en el cual la variable de salida puede ser cuantitativa (continua o discreta) o cualitativa definida sobre una escala ordinal. El método propuesto enfatiza el concepto de "localidad" para determinar las fronteras de la discretización. En el caso de salidas continuas, el método se basa en la maximización de la diferencia entre distribuciones usando distancias intervalares, y en el caso de salidas cualitativas, en una distancia definida sobre una estructura de órdenes de magnitud absolutos. Las principales características del método se ilustran con tres ejemplos, dos para salidas continuas y un último con salidas cualitativas.

Palabras Clave: Discretización Supervisada, Regresión, Razonamiento Cualitativo, Distancia Intervalar.

 

 

Acknowledgements

This work has been partially supported by the coordinated project MERITO (analysis and development of innovative soft-computing techniques integrating expert knowledge: an application to the measure of financial credit risk), funded by the Spanish Ministry of Science and Technology (TIC2002-04371-C02).

 

All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.