Computación y Sistemas

Print version ISSN 1405-5546

Comp. y Sist. vol. 17 no. 2, Mexico, Apr./Jun. 2013

 

Articles

 

Optimizing Selection of Assessment Solutions for Completing Information Extraction Results

 

Optimización de selección de soluciones de evaluación para completar los resultados de recuperación de información

 

Christina Feilmayr

 

Johannes Kepler University Linz, Institute of Application Oriented Knowledge Processing, Altenberger Straße 69, 4040 Linz, Austria cfeilmayr@faw.jku.at

 

Article received on 14/12/2012
Accepted on 03/02/2013.

 

Abstract

Incomplete information has serious consequences in information extraction: it increases costs and causes problems in downstream processing. This work focuses on improving the completeness of extraction results by applying judiciously selected assessment methods to information extraction, based on the principle of complementarity. Our recommendation model simplifies the selection of assessment methods that can overcome a specific incompleteness problem. This paper also focuses on the characterization of information extraction and assessment methods, as well as on a rule-based approach that allows estimation of the general processability, the profitability in the complementarity approach, and the performance of an assessment method under evaluation.

Keywords: Information extraction, information quality, method selection, data and text mining.

 

Resumen

La información incompleta causa graves consecuencias en la extracción de la misma: aumenta los costos y propicia problemas para el procesamiento en cadena. El objetivo de este trabajo es presentar la mejora en los resultados de extracción con el fin de completarlos con métodos de evaluación juiciosamente selectos basados en el principio de complementariedad. El modelo propuesto simplifica la selección de los métodos de evaluación, los cuales pueden resolver un problema específico de información incompleta. Este artículo se enfoca también en la caracterización de la extracción de información y los métodos de evaluación con un enfoque basado en reglas que permita validar la capacidad de procesamiento general, la rentabilidad en el enfoque de complementariedad y el rendimiento de los métodos de evaluación.

Palabras clave: Extracción de información, calidad de información, selección del método, minería de datos y textos.

 

 

Acknowledgements

This work is supported by an Austrian research grant (FIT-IT Semantic Systems Dissertation Fellowship Project) from BMVIT (project 829601).

 

All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.