
Computación y Sistemas

Print version ISSN 1405-5546

Comp. y Sist. vol.17 n.2 México Apr./Jun. 2013




Optimizing Selection of Assessment Solutions for Completing Information Extraction Results


Optimización de selección de soluciones de evaluación para completar los resultados de recuperación de información


Christina Feilmayr


Johannes Kepler University Linz, Institute of Application Oriented Knowledge Processing, Altenberger Straße 69, 4040 Linz, Austria


Article received on 14/12/2012
Accepted on 03/02/2013.



Incomplete information has serious consequences in information extraction: it increases costs and causes problems in downstream processing. This work aims to improve the completeness of extraction results by applying judiciously selected assessment methods to information extraction, following the principle of complementarity. Our recommendation model simplifies the selection of assessment methods that can overcome a specific incompleteness problem. The paper also characterizes information extraction and assessment methods, and presents a rule-based approach that estimates, for an assessment method under evaluation, its general processability, its profitability within the complementarity approach, and its performance.

Keywords: Information extraction, information quality, method selection, data and text mining.
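
As an illustration only, the three estimates named in the abstract (general processability, profitability, and performance) can be read as successive rule checks over a characterization of the problem and of each candidate method. The sketch below is a minimal reading under that assumption and is not the model described in the paper: the classes `Problem` and `Method`, their fields, the thresholds, and the method names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    """Hypothetical characterization of an incompleteness problem."""
    data_type: str        # e.g. "numeric" or "text" (assumed categories)
    missing_ratio: float  # assumed fraction of missing values, in [0, 1]


@dataclass(frozen=True)
class Method:
    """Hypothetical characterization of an assessment method."""
    name: str
    accepted_types: frozenset  # data types the method is assumed to handle
    max_missing_ratio: float   # assumed limit beyond which it does not pay off
    expected_score: float      # assumed performance estimate, in [0, 1]


def is_processable(method: Method, problem: Problem) -> bool:
    # Rule 1 (assumed): the method must accept the problem's data type.
    return problem.data_type in method.accepted_types


def is_profitable(method: Method, problem: Problem) -> bool:
    # Rule 2 (assumed): the method only pays off below its missing-value limit.
    return problem.missing_ratio <= method.max_missing_ratio


def recommend(methods, problem):
    # Keep methods that pass both rules, then rank by the assumed
    # performance estimate, best first.
    candidates = [
        m for m in methods
        if is_processable(m, problem) and is_profitable(m, problem)
    ]
    return sorted(candidates, key=lambda m: m.expected_score, reverse=True)


# Illustrative catalogue; every value here is invented for the example.
methods = [
    Method("classifier", frozenset({"text", "numeric"}), 0.3, 0.8),
    Method("regression", frozenset({"numeric"}), 0.5, 0.7),
    Method("clustering", frozenset({"text"}), 0.6, 0.6),
]

print([m.name for m in recommend(methods, Problem("numeric", 0.4))])
```

Filtering before ranking keeps the two rule checks independent of the performance estimate, so a method that cannot process the data is never ranked at all.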



La información incompleta causa graves consecuencias en la extracción de la misma: aumenta los costos y propicia problemas para el procesamiento en cadena. El objetivo de este trabajo es presentar la mejora en los resultados de extracción con el fin de completarlos con métodos de evaluación juiciosamente selectos basados en el principio de complementariedad. El modelo propuesto simplifica la selección de los métodos de evaluación, los cuales pueden resolver un problema específico de información incompleta. Este artículo se enfoca también en la caracterización de la extracción de información y los métodos de evaluación con un enfoque basado en reglas que permita validar la capacidad de procesamiento general, la rentabilidad en el enfoque de complementariedad y el rendimiento de los métodos de evaluación.

Palabras clave: Extracción de información, calidad de información, selección del método, minería de datos y textos.





This work is supported by an Austrian research grant (FIT-IT Semantic Systems Dissertation Fellowship Project) from BMVIT (project 829601).




All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.