
Polibits

Online version ISSN 1870-9044

Polibits no. 39, México, Jan./Jun. 2009

 

Articles

 

CLAU – A Service-Oriented System for Complex Language Alignment: Architectural Aspects

 

Claudiu Mihăilă¹, Corina Forăscu², and Sabin C. Buraga³

 

1 Faculty of Computer Science, Al.I. Cuza University of Iaşi, 16, General Berthelot, 700483, Iaşi, Romania (e-mail: claudiu.mihaila@info.uaic.ro).

2 Faculty of Computer Science, Al.I. Cuza University of Iaşi, 16, General Berthelot, 700483, Iaşi, Romania, and Romanian Academy Research Institute for Artificial Intelligence, Romania (e-mail: corinfor@info.uaic.ro).

3 Faculty of Computer Science, Al.I. Cuza University of Iaşi, 16, General Berthelot, 700483, Iaşi, Romania (e-mail: busaco@info.uaic.ro).

 

Manuscript received October 21, 2008.
Manuscript accepted for publication February 19, 2009.

 

Abstract

In recent years, parallel corpora have become an effective framework for studying how well linguistic phenomena and, more specifically, annotation schemata carry over when annotations are imported from one language into another. When the import is automatic, evaluation and correction are best performed by linguists using dedicated software. This paper proposes CLAU, a service-oriented interactive application that allows users to import, evaluate, correct, and share XML-based annotations in parallel texts. The design, general architecture, and implementation are discussed. Two use cases are also presented: temporal annotations in parallel texts, and how CLAU facilitates social Web interactions between language scientists.

Keywords: parallel text processing, cross-language studies, service-oriented architecture.

 


 


All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.