English-to-Japanese Cross-Language Question-Answering System using Weighted Adding with Multiple Answers

Murata, Masaki; Utiyama, Masao; Kanamaru, Toshiyuki; Isahara, Hitoshi

Polibits

On-line version ISSN 1870-9044

Polibits n.40 México Jul./Dec. 2009

Special section: Information Retrieval and Natural Language Processing

National Institute of Information and Communications Technology, 3-5 Hikaridai, Seika-cho, Soraku-gun, Kyoto 619-0289, Japan (e-mail: murata@nict.go.jp).

Manuscript received October 19, 2008.
Manuscript accepted for publication August 3, 2009.

Abstract

We describe a method that improves the performance of a question-answering system by using multiple documents as evidence, with decreasing weights. We also describe how it was used in cross-language question-answering (CLQA) tasks. The answer to a question can often be found in more than one document. In such cases, using multiple documents for prediction generates better answers than using a single document. Our method therefore draws on multiple documents by adding the scores of candidate answers extracted from the various documents. Because simply adding scores degrades the performance of question-answering systems, we add the scores with decreasing weights to reduce this negative effect. We used this method in the CLQA part of NTCIR-5. It was incorporated into a commercially available translation system that carries out cross-language question-answering tasks. Our method obtained relatively good CLQA results.

Key words: Machine translation, cross-language question-answering, decreased adding, multiple documents, NTCIR.
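
The abstract does not state the weighting formula, so the following is only a minimal sketch of adding scores with decreasing weights, not the authors' implementation. It assumes a geometric decay: for each candidate answer, the per-document scores are sorted from highest to lowest and the i-th score (counting from zero) is multiplied by k^i before summing. The function name `decreased_adding`, the parameter `k`, and the sample scores are all hypothetical.

```python
from collections import defaultdict


def decreased_adding(candidates, k=0.5):
    """Rank candidate answers by a decayed sum of their per-document scores.

    candidates: iterable of (answer, score) pairs, one pair per document
                in which the answer was extracted.
    k: decay factor between 0 and 1 (assumed form, not taken from the paper).
    """
    scores_by_answer = defaultdict(list)
    for answer, score in candidates:
        scores_by_answer[answer].append(score)

    combined = {}
    for answer, scores in scores_by_answer.items():
        # The best evidence gets full weight; each further document counts less.
        ordered = sorted(scores, reverse=True)
        combined[answer] = sum(s * k**i for i, s in enumerate(ordered))

    # Highest combined score first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up scores: "Tokyo" is supported by three documents, "Kyoto" by one.
    sample = [("Tokyo", 1.0), ("Kyoto", 1.1), ("Tokyo", 0.9), ("Tokyo", 0.8)]
    for answer, score in decreased_adding(sample, k=0.5):
        print(answer, round(score, 3))
```

Under this assumed form, k = 0 keeps only the best single-document score for each answer, and k = 1 reduces to simple adding, so intermediate values of k sit between the two strategies that the abstract contrasts.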
