Computación y Sistemas
On-line version ISSN 2007-9737; print version ISSN 1405-5546
Abstract
INDIG, Balázs. Less is More, More or Less... Finding the Optimal Threshold for Lexicalization in Chunking. Comp. y Sist. [online]. 2017, vol.21, n.4, pp.637-646. ISSN 2007-9737. https://doi.org/10.13053/cys-21-4-2866.
Lexicalization of the input of sequential taggers has come a long way since it was introduced by Molina and Pla [4]. In this paper we thoroughly investigate the method introduced by Indig and Endrédy [2] to determine the best lexicalization level for chunking and to explore the behavior of different IOB representations. Both tasks are carried out on the CoNLL-2000 dataset. Our goal is to introduce a transformation method that adapts the parameters of the development set to the training set using their frequency distributions, from which other tasks such as POS tagging or NER could also benefit.
Keywords: Phrase chunking; IOB labels; multiple IOB representations; sequential tagging; CRF.
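
To illustrate what "multiple IOB representations" means in practice, the following is a minimal sketch that converts one sentence's chunk labels from the IOB2 scheme (every chunk starts with `B-`) to the IOBES scheme (single-token chunks get `S-`, chunk-final tokens get `E-`). The function name `iob2_to_iobes` and the example labels are assumptions made for illustration; this is not the paper's own code, and the paper may cover further representations.

```python
def iob2_to_iobes(tags):
    """Convert a list of IOB2 chunk tags to IOBES tags (illustrative sketch)."""
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        # The chunk continues only if the next tag is an I- tag of the same type.
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = nxt == "I-" + label
        if prefix == "B":
            # Chunk start: stays B- if more tokens follow, else single-token S-.
            out.append(("B-" if continues else "S-") + label)
        else:
            # Inside a chunk: stays I- if more tokens follow, else chunk-final E-.
            out.append(("I-" if continues else "E-") + label)
    return out


if __name__ == "__main__":
    # Hypothetical label sequence in CoNLL-style chunk notation.
    print(iob2_to_iobes(["B-NP", "I-NP", "B-VP", "O", "B-NP"]))
```

The two schemes mark the same chunk boundaries; they differ only in how much boundary information each label carries, which is why a tagger can behave differently depending on the representation it is trained on.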