N-gram Parsing for Jointly Training a Discriminative Constituency Parser

Çelebi, Arda; Özgür, Arzucan

Polibits

On-line version ISSN 1870-9044

Polibits n.47 México Jan./Jul. 2013

Arda Çelebi and Arzucan Özgür

Arda Çelebi and Arzucan Özgür are with the Department of Computer Engineering, Boğaziçi University, Bebek, 34342 Istanbul, Turkey (e-mail: arda.celebi@boun.edu.tr, arzucan.ozgur@boun.edu.tr).

Manuscript received on December 7, 2012.
Accepted for publication on January 11, 2013.

Abstract

Syntactic parsers are designed to detect the complete syntactic structure of grammatically correct sentences. In this paper, we introduce the concept of n-gram parsing, which corresponds to generating the constituency parse tree of n consecutive words in a sentence. We create a stand-alone n-gram parser derived from a baseline full discriminative constituency parser and analyze the characteristics of the generated n-gram trees for various values of n. Because the produced n-gram trees are generally smaller and less complex than full parse trees, n-gram parsers are likely to be more robust than full parsers. We therefore use n-gram parsing to boost the accuracy of a full discriminative constituency parser in a hierarchical joint learning setup. Our results show that the full parser jointly trained with an n-gram parser performs statistically significantly better than our baseline full parser on the English Penn Treebank test corpus.

Key words: Constituency parsing, n-gram parsing, discriminative learning, hierarchical joint learning.
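
As an illustration of what an n-gram tree might look like, the following minimal Python sketch prunes a full constituency tree down to the structure that covers n consecutive words. It is not the extraction procedure used in the paper, which the abstract does not specify; the nested-tuple tree representation, the function names `leaves`, `prune`, and `ngram_tree`, and the choice to descend through single-child nodes at the top are all assumptions made for this example.

```python
# Hypothetical representation: a leaf is a word (str); an internal node is
# a tuple (label, children), where children is a list of subtrees.

def leaves(tree):
    """Return the words under `tree`, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in leaves(child)]

def prune(tree, start, end, offset=0):
    """Keep only the parts of `tree` whose words lie in positions [start, end).

    Returns (pruned_subtree_or_None, number_of_words_under_tree).
    """
    if isinstance(tree, str):
        return (tree if start <= offset < end else None), 1
    label, children = tree
    kept, width = [], 0
    for child in children:
        sub, child_width = prune(child, start, end, offset + width)
        if sub is not None:
            kept.append(sub)
        width += child_width
    return ((label, kept) if kept else None), width

def ngram_tree(tree, start, n):
    """Sketch of an n-gram tree for the n words beginning at position `start`."""
    total = len(leaves(tree))
    if n < 1 or start < 0 or start + n > total:
        raise ValueError("span is outside the sentence")
    pruned, _ = prune(tree, start, start + n)
    # Assumption: skip top nodes left with a single internal child, so the
    # root becomes the lowest remaining node that branches over the span.
    # Note that this also skips genuine unary nodes at the top.
    while len(pruned[1]) == 1 and not isinstance(pruned[1][0], str):
        pruned = pruned[1][0]
    return pruned

# Full tree for "The dog chased the cat".
full = ("S", [
    ("NP", [("DT", ["The"]), ("NN", ["dog"])]),
    ("VP", [("VBD", ["chased"]),
            ("NP", [("DT", ["the"]), ("NN", ["cat"])])]),
])

trigram = ngram_tree(full, 2, 3)   # words: chased the cat
crossing = ngram_tree(full, 1, 3)  # words: dog chased the
```

In this sketch the span "chased the cat" coincides with a complete constituent, so its root is the VP node, whereas "dog chased the" crosses constituent boundaries and keeps the S root with partially filled NP children. How such incomplete constituents should be labeled or pruned is a design decision that this example does not attempt to settle.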

ACKNOWLEDGMENTS

We thank Brian Roark and Suzan Üskudarli for their invaluable feedback. This work was supported by the Boğaziçi University Research Fund 12A01P6.
