Improving Named Entity Extraction Accuracy using Unlabeled Data and Several Extractors

Iwakura, Tomoya; Okamoto, Seishi

Polibits

On-line version ISSN 1870-9044

Abstract

IWAKURA, Tomoya and OKAMOTO, Seishi. Improving Named Entity Extraction Accuracy using Unlabeled Data and Several Extractors. Polibits [online]. 2009, n.40, pp.29-38. ISSN 1870-9044.

This paper proposes feature augmentation methods that use unlabeled data and several Named Entity (NE) extractors. We use NE extractors to collect NE-related information for each word, which we call NE-related labels, from unlabeled data. The NE-related labels include the candidate NE class labels of each word and the NE class labels of co-occurring words. To collect these labels accurately from unlabeled data, we consider methods that use the outputs of several NE extractors. We then use the NE-related labels as additional features for creating new NE extractors. We apply our NE extraction methods using the NE-related labels to the IREX Japanese NE extraction task. The experimental results show better accuracy than previous results obtained with NE extractors that use handcrafted resources.
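
The label-collection idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the two toy extractors, the rule of keeping a label only when every extractor agrees, the co-occurrence window of one word, and all function and feature names are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical stand-ins for trained NE extractors: each maps a token
# list to one NE class label per token ("O" means "not an NE").
def extractor_a(tokens):
    lexicon = {"Tokyo": "LOCATION", "Fujitsu": "ORGANIZATION"}
    return [lexicon.get(t, "O") for t in tokens]

def extractor_b(tokens):
    # Deliberately noisy: it mislabels "visited" as a LOCATION.
    lexicon = {"Tokyo": "LOCATION", "Fujitsu": "ORGANIZATION",
               "visited": "LOCATION"}
    return [lexicon.get(t, "O") for t in tokens]

def collect_ne_related_labels(sentences, extractors, window=1):
    """Collect, for each word, its candidate NE classes and the NE classes
    of co-occurring words. A label is kept only when all extractors agree
    (this agreement rule is an assumption of the sketch)."""
    candidate = defaultdict(set)  # word -> candidate NE class labels
    cooccur = defaultdict(set)    # word -> NE class labels of nearby words
    for tokens in sentences:
        outputs = [extract(tokens) for extract in extractors]
        agreed = [labels[0] if len(set(labels)) == 1 else "O"
                  for labels in zip(*outputs)]
        for i, word in enumerate(tokens):
            if agreed[i] != "O":
                candidate[word].add(agreed[i])
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i and agreed[j] != "O":
                    cooccur[word].add(agreed[j])
    return candidate, cooccur

def augment_features(word, candidate, cooccur):
    """Return a base word feature plus the NE-related labels as extra features."""
    feats = ["word=" + word]
    feats += ["cand=" + c for c in sorted(candidate.get(word, ()))]
    feats += ["cooc=" + c for c in sorted(cooccur.get(word, ()))]
    return feats

# Toy "unlabeled data": tokenized sentences without gold NE annotations.
unlabeled = [["Fujitsu", "visited", "Tokyo"], ["Tokyo", "is", "large"]]
candidate, cooccur = collect_ne_related_labels(
    unlabeled, [extractor_a, extractor_b])
```

Here the noisy label that only one extractor assigns to "visited" is discarded, while labels both extractors agree on are kept. Calling `augment_features(word, candidate, cooccur)` then yields the feature list that a new extractor could be trained on alongside its usual features.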

Keywords: Named entity recognition; unlabeled data; combination of extractors.
