Computación y Sistemas

On-line version ISSN 2007-9737; print version ISSN 1405-5546

Comp. y Sist. vol. 9 no. 3, Ciudad de México, Jan./Mar. 2006

 

Articles

 

MONIL Language, an Alternative for Data Integration

 

El Lenguaje MONIL, una Alternativa para la Integración de Datos.

 

Mónica Larre¹, José Torres–Jiménez, Eduardo Morales², Juan Frausto–Solís and Sócrates Torres¹

 

¹ ITESM Campus Cuernavaca
Av. Paseo de la Reforma 182–A Col. Lomas de Cuernavaca
monica.larre@itesm.mx, juan.frausto@itesm.mx, socrates@itesm.mx

² Instituto Nacional de Astrofísica, Óptica y Electrónica
Luis Enrique Erro 1, Sta. Ma. Tonantzintla, 72840 Puebla
emorales@inaoep.mx

 

Article received on September 07, 2001; accepted on January 21, 2005

 

Abstract

Data integration is the process of retrieving, merging, and storing data that originate in heterogeneous data sources.

The main problem facing data integration is the structural and semantic heterogeneity of the participating data. A concern of the computer science research community is the development of semi–automatic tools that effectively assist the user in data integration processes.

This paper introduces a programming language called MONIL as an alternative for integrating data through the design, storage, and execution of programs. MONIL is based on meta–data, conversion functions, an integration meta–model, and a scheme of integration suggestions. It offers the user a dedicated work environment with built–in semi–automatic tools that support a three–stage integration process.

Keywords: data integration, integration language, databases, metadata.

 

Summary

Data integration is the process of extracting, merging, and storing data that come from heterogeneous data sources. The main problem facing data integration is the structural and semantic heterogeneity of the participating data.

A concern of the computer science research communities is the development of semi–automatic tools that effectively assist users in data integration processes. This article presents a programming language called MONIL as an alternative for integrating data through the design, storage, and execution of programs. MONIL is based on the use of metadata, conversion functions, an integration metamodel, and a scheme of integration suggestions. MONIL offers the user a dedicated work environment with built–in semi–automatic tools that support a three–stage integration process.

Keywords: data integration, integration language, databases, data warehouses, metadata.

 


 

All content of this journal, except where otherwise noted, is under a Creative Commons License.