SciELO - Scientific Electronic Library Online

 


Computación y Sistemas

On-line version ISSN 2007-9737; print version ISSN 1405-5546

Abstract

INDIG, Balázs. Less is More, More or Less... Finding the Optimal Threshold for Lexicalization in Chunking. Comp. y Sist. [online]. 2017, vol.21, n.4, pp.637-646. ISSN 2007-9737.  https://doi.org/10.13053/cys-21-4-2866.

Lexicalization of the input of sequential taggers has come a long way since it was introduced by Molina and Pla [4]. In this paper, we thoroughly investigate the method introduced by Indig and Endrédy [2] to find the best lexicalization level for chunking and to explore the behavior of different IOB representations. Both tasks are applied to the CoNLL-2000 dataset. Our goal is to introduce a transformation method that adapts the parameters of the development set to the training set using their frequency distributions, from which other tasks such as POS tagging or NER could also benefit.

Keywords: Phrase chunking; IOB labels; multiple IOB representations; sequential tagging; CRF.
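
To make the idea of a lexicalization threshold concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes that "lexicalizing" a token means attaching its lowercased word form to both its POS tag and its chunk label, and that this is done only for words whose training-set frequency reaches a chosen threshold. The function name `lexicalize`, the `+` separator, and the toy sentences are all hypothetical.

```python
from collections import Counter


def lexicalize(sentences, threshold):
    """Specialize tags of frequent words (illustrative assumption, not the paper's code).

    sentences: list of sentences, each a list of (word, pos, chunk_label) triples.
    threshold: minimum training-set frequency a word needs to be lexicalized.
    """
    # Count lowercased word forms over the whole training set.
    freq = Counter(word.lower() for sent in sentences for word, _, _ in sent)

    result = []
    for sent in sentences:
        new_sent = []
        for word, pos, label in sent:
            key = word.lower()
            if freq[key] >= threshold:
                # Frequent word: attach the word form to the POS tag and the label.
                new_sent.append((word, f"{key}+{pos}", f"{key}+{pos}+{label}"))
            else:
                # Rare word: keep the plain POS tag and IOB label.
                new_sent.append((word, pos, label))
        result.append(new_sent)
    return result


# Toy data in an IOB2-style labeling (B- opens a chunk, I- continues it).
train = [
    [("the", "DT", "B-NP"), ("cat", "NN", "I-NP")],
    [("the", "DT", "B-NP"), ("dog", "NN", "I-NP")],
]
lexicalized = lexicalize(train, threshold=2)
```

With `threshold=2`, only "the" is lexicalized in this toy data. Raising the threshold lexicalizes fewer words and keeps the label set smaller; lowering it does the opposite. That trade-off is the one the title alludes to.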
