SciELO - Scientific Electronic Library Online

 


Polibits

Online version ISSN 1870-9044

Abstract

IWAKURA, Tomoya and OKAMOTO, Seishi. Improving Named Entity Extraction Accuracy using Unlabeled Data and Several Extractors. Polibits [online]. 2009, n.40, pp.29-38. ISSN 1870-9044.

This paper proposes feature augmentation methods that use unlabeled data and several Named Entity (NE) extractors. We apply NE extractors to unlabeled data to collect NE-related information about each word, which we call NE-related labels. These labels include the candidate NE class labels of each word and the NE class labels of co-occurring words. To collect the NE-related labels accurately, we consider methods that combine the outputs of several NE extractors. We then use the NE-related labels as additional features for creating new NE extractors. We apply the resulting NE extraction methods to the IREX Japanese NE extraction task. The experimental results show better accuracy than previous results obtained with NE extractors that use handcrafted resources.
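The label collection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `collect_ne_labels` and `augment_features`, the majority-vote agreement rule (`min_agree`), the sentence-level co-occurrence window, and the toy data are all assumptions made for illustration. The abstract does not specify how the outputs of the extractors are combined.

```python
from collections import Counter, defaultdict

def collect_ne_labels(tagged_runs, min_agree=2):
    """Collect NE-related labels from automatically tagged, unlabeled text.

    tagged_runs: one entry per extractor; each entry is a list of sentences,
    and each sentence is a list of (word, ne_label) pairs over the same tokens.
    A label is kept only if at least `min_agree` extractors agree on it
    (an assumed combination rule, not taken from the paper).
    """
    candidate = defaultdict(Counter)  # word -> counts of its own agreed labels
    cooccur = defaultdict(Counter)    # word -> counts of NE labels seen in the same sentence
    for versions in zip(*tagged_runs):
        words = [w for w, _ in versions[0]]
        agreed = []
        for i in range(len(words)):
            votes = Counter(version[i][1] for version in versions)
            label, count = votes.most_common(1)[0]
            agreed.append(label if count >= min_agree else None)
        for i, word in enumerate(words):
            if agreed[i] is not None:
                candidate[word][agreed[i]] += 1
            for j, other in enumerate(agreed):
                # "O" marks a non-entity token and is not counted as context.
                if j != i and other is not None and other != "O":
                    cooccur[word][other] += 1
    return dict(candidate), dict(cooccur)

def augment_features(word, candidate, cooccur, top_k=2):
    """Return the base word feature plus features built from NE-related labels."""
    feats = {"word=" + word}
    for label, _ in candidate.get(word, Counter()).most_common(top_k):
        feats.add("cand=" + label)
    for label, _ in cooccur.get(word, Counter()).most_common(top_k):
        feats.add("ctx=" + label)
    return sorted(feats)

# Toy outputs of two hypothetical extractors on the same two sentences.
run_a = [[("Tokyo", "LOCATION"), ("is", "O"), ("large", "O")],
         [("Sato", "PERSON"), ("visited", "O"), ("Tokyo", "LOCATION")]]
run_b = [[("Tokyo", "LOCATION"), ("is", "O"), ("large", "O")],
         [("Sato", "ORGANIZATION"), ("visited", "O"), ("Tokyo", "LOCATION")]]

candidate, cooccur = collect_ne_labels([run_a, run_b])
features = augment_features("visited", candidate, cooccur)
```

In this toy run the two extractors disagree on "Sato", so no candidate label is recorded for it, while "Tokyo" is recorded as LOCATION twice. The augmented feature lists would then be passed to whatever learner trains the new extractor.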

Keywords: Named entity recognition; unlabeled data; combination of extractors.


 

Creative Commons License: All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.