SciELO - Scientific Electronic Library Online



Computación y Sistemas

Print version ISSN 1405-5546


GELBUKH, Alexander. Unsupervised Learning for Syntactic Disambiguation. Comp. y Sist. [online]. 2014, vol.18, n.2, pp.329-344. ISSN 1405-5546.

We present a methodological framework for syntactic disambiguation in natural language texts. The method takes an existing manually compiled, non-probabilistic, non-lexicalized grammar and turns it into a probabilistic lexicalized grammar by automatically learning a kind of subcategorization frame or selectional preference for every word observed in the training corpus. The dictionary of subcategorization frames or selectional preferences obtained in training can subsequently be used for syntactic disambiguation of new, unseen texts. The learning process is unsupervised and requires no manual markup. The learning algorithm proposed in this paper can take advantage of any existing disambiguation method, including linguistically motivated methods of filtering or weighting competing parse trees or syntactic relations, thus allowing linguistic knowledge to be integrated with unsupervised machine learning.

Keywords: Natural language processing; syntactic parsing; syntactic disambiguation; unsupervised machine learning.
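
The abstract does not spell out the learning algorithm, so the following is only a minimal sketch of one way such unsupervised learning could work, assuming an EM-style re-estimation loop. Each sentence is assumed to come with all candidate parses produced by the non-probabilistic grammar, and each parse is reduced to a list of (head word, relation) pairs standing in for subcategorization information. The function names `parse_score`, `learn_frames`, and `disambiguate`, the pair representation, and the smoothing scheme are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def parse_score(parse, weights):
    """Score a parse as the product of the weights of its (head, relation) pairs."""
    score = 1.0
    for pair in parse:
        score *= weights[pair]
    return score


def learn_frames(corpus, iterations=10, smoothing=0.1):
    """Estimate pair weights from ambiguous parses without manual markup.

    corpus: list of sentences; each sentence is a non-empty list of candidate
    parses; each parse is a list of (head_word, relation) pairs.
    Assumption: an EM-like loop, not necessarily the paper's algorithm.
    """
    weights = defaultdict(lambda: 1.0)  # start with all pairs equally likely
    for _ in range(iterations):
        counts = defaultdict(float)
        for candidates in corpus:
            # Distribute one unit of evidence over the competing parses
            # in proportion to their current scores.
            scores = [parse_score(p, weights) for p in candidates]
            total = sum(scores)
            for parse, score in zip(candidates, scores):
                share = score / total
                for pair in parse:
                    counts[pair] += share
        # Re-estimate weights as smoothed relative frequencies.
        denom = sum(counts.values()) + smoothing * len(counts)
        if denom == 0:
            denom = 1.0
        unseen = smoothing / denom
        weights = defaultdict(
            lambda unseen=unseen: unseen,
            {pair: (c + smoothing) / denom for pair, c in counts.items()},
        )
    return weights


def disambiguate(candidates, weights):
    """Pick the candidate parse with the highest score under the learned weights."""
    return max(candidates, key=lambda p: parse_score(p, weights))


# Toy corpus: one sentence with an ambiguous "with" attachment and two
# unambiguous sentences in which "with" can only attach to "eat".
corpus = [
    [
        [("eat", "obj"), ("eat", "with")],    # attach "with" to the verb
        [("eat", "obj"), ("pizza", "with")],  # attach "with" to the noun
    ],
    [[("eat", "with")]],
    [[("eat", "with")]],
]
weights = learn_frames(corpus)
best = disambiguate(corpus[0], weights)
```

In this toy setting the unambiguous sentences pull weight toward the verb attachment, so the learned dictionary prefers the first reading of the ambiguous sentence. Scoring by a plain product favors shorter parses, so the sketch assumes that competing parses of one sentence contain the same number of pairs.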



Creative Commons License: All contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.