SciELO - Scientific Electronic Library Online


Computación y Sistemas

Print version ISSN 1405-5546

Comp. y Sist. vol.18 n.3 México Jul./Sep. 2014 

Regular articles


Entity Extraction in Biochemical Text using Multiobjective Optimization


Utpal Kumar Sikdar, Asif Ekbal, and Sriparna Saha


Department of Computer Science and Engineering, Indian Institute of Technology, Patna, India.


Article received on 18/01/2014.
Accepted on 01/02/2014.



In this paper we propose a feature selection and classifier ensemble approach for biochemical entity extraction, based on multiobjective modified differential evolution. The algorithm operates in two layers. The first layer determines an appropriate set of features for the task within the framework of a supervised statistical classifier, namely a Conditional Random Field (CRF). This produces a set of solutions, a subset of which is used to construct an ensemble in the second layer. The proposed approach is evaluated on entity extraction in chemical texts, which involves identifying IUPAC and IUPAC-like names and classifying them into predefined categories. Experiments on a benchmark dataset yield recall, precision and F-measure values of 86.15%, 91.29% and 88.64%, respectively.
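The two-layer idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the differential evolution search and the CRF training are omitted, and the candidate feature masks, their scores, the choice of recall and precision as the two objectives, the BIO-style labels, and the use of plain majority voting are all assumptions made for illustration. The sketch only shows how a set of non-dominated solutions could be picked from layer one and how their outputs could be combined in layer two, together with the standard balanced F-measure.

```python
from collections import Counter

def f_measure(recall, precision):
    """Balanced F-measure: harmonic mean of recall and precision."""
    return 2 * recall * precision / (recall + precision)

def dominates(a, b):
    """True if a is no worse than b on every objective and better on one (maximisation)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep the candidates whose objective vectors no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(o["objectives"], c["objectives"]) for o in candidates)]

def majority_vote(label_sequences):
    """Combine per-token labels from several classifiers by majority vote."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*label_sequences)]

# Hypothetical layer-one output: binary feature masks with assumed (recall, precision) scores.
candidates = [
    {"name": "A", "mask": [1, 0, 1, 1], "objectives": (0.86, 0.90)},
    {"name": "B", "mask": [1, 1, 0, 1], "objectives": (0.84, 0.92)},
    {"name": "C", "mask": [0, 1, 1, 0], "objectives": (0.83, 0.89)},  # dominated by A
]
front = pareto_front(candidates)

# Hypothetical layer-two input: token labels predicted by three classifiers.
votes = majority_vote([
    ["B-IUPAC", "I-IUPAC", "O"],
    ["B-IUPAC", "O", "O"],
    ["O", "I-IUPAC", "O"],
])

print([c["name"] for c in front])
print(votes)
print(f_measure(86.15, 91.29))
```

Applied to the reported recall and precision, `f_measure` gives a value close to the reported 88.64%; the small difference is presumably due to rounding of the published figures.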

Keywords: Multiobjective modified differential evolution (MODE), feature selection, ensemble learning, conditional random field (CRF), named entity (NE).






Creative Commons License: All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.