<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462016000400667</article-id>
<article-id pub-id-type="doi">10.13053/cys-20-4-2430</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[POS Tagging without a Tagger: Using Aligned Corpora for Transferring Knowledge to Under-Resourced Languages]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Turki Khemakhem]]></surname>
<given-names><![CDATA[Ines]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Jamoussi]]></surname>
<given-names><![CDATA[Salma]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Ben Hamadou]]></surname>
<given-names><![CDATA[Abdelmajid]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[University of Sfax]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Tunisia</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>12</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>12</month>
<year>2016</year>
</pub-date>
<volume>20</volume>
<numero>4</numero>
<fpage>667</fpage>
<lpage>679</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462016000400667&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462016000400667&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462016000400667&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Almost all languages lack sufficient resources and tools for developing Human Language Technologies (HLT); these technologies are mostly developed for the languages for which large resources and tools are available. In this paper, we address under-resourced languages, which can benefit from the resources and tools available for other languages to develop their own HLT. We take as an example part-of-speech (POS) tagging, one of the most fundamental Natural Language Processing (NLP) tasks. The task is important because it assigns to each word a tag that captures its morphological features, taking the word's context into account. The solution we propose uses an aligned parallel corpus as a bridge between a rich-resourced language and an under-resourced language. This kind of corpus is usually available. The rich-resourced side of the corpus is annotated first. These POS annotations are then used to predict the annotations on the under-resourced side through alignment training. This training step yields a matching table between the two languages, which is then used to annotate an input text. We evaluate the approach on one language pair: English as the rich-resourced language and Arabic as the under-resourced language. We used the IWSLT10 training corpus and the English TreeTagger [15]. Evaluated on a test corpus extracted from IWSLT08, the approach obtained an F-score of 89%. It can be extended to other NLP tasks.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[POS tagging]]></kwd>
<kwd lng="en"><![CDATA[alignment]]></kwd>
<kwd lng="en"><![CDATA[parallel corpus]]></kwd>
<kwd lng="en"><![CDATA[under-resourced languages]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Besacier]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Le]]></surname>
<given-names><![CDATA[V.-B.]]></given-names>
</name>
<name>
<surname><![CDATA[Boitet]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Berment]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
</person-group>
<source><![CDATA[ASR Translation for Under-resourced Languages]]></source>
<year>2006</year>
<conf-name><![CDATA[ IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP]]></conf-name>
<conf-loc> </conf-loc>
<page-range>1221-4</page-range></nlm-citation>
</ref>
<ref id="B2">
<nlm-citation citation-type="journal">
<article-title xml:lang=""><![CDATA[The mathematics of statistical machine translation: parameter estimation]]></article-title>
<person-group person-group-type="author">
<name>
<surname><![CDATA[Brown]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Della-Pietra]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Della-Pietra]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Mercer]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<source><![CDATA[Computational Linguistics]]></source>
<year>1993</year>
<volume>19</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>263-311</page-range></nlm-citation>
</ref>
<ref id="B3">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dien]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Kiem]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
</person-group>
<source><![CDATA[POS-Tagger for English-Vietnamese Bilingual Corpus]]></source>
<year>2003</year>
<conf-name><![CDATA[ HLT-NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B4">
<nlm-citation citation-type="journal">
<article-title xml:lang=""><![CDATA[Part of Speech Taggers for Morphologically Rich Indian Languages: A Survey]]></article-title>
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dinesh]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Gurpreet]]></surname>
<given-names><![CDATA[S. J.]]></given-names>
</name>
</person-group>
<source><![CDATA[International Journal of Computer Applications]]></source>
<year>2010</year>
<volume>6</volume>
<numero>5</numero>
<issue>5</issue>
</nlm-citation>
</ref>
<ref id="B5">
<nlm-citation citation-type="">
<article-title xml:lang=""><![CDATA[POS-Tagger for English-Vietnamese Bilingual Corpus]]></article-title>
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dinh]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Hoang]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<source><![CDATA[Workshop: Building and Using Parallel Texts: Data Driven Machine Translation and Beyond]]></source>
<year>2003</year>
<publisher-loc><![CDATA[Edmonton, CA. ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B6">
<nlm-citation citation-type="book">
<article-title xml:lang=""><![CDATA[Creating language resources for under-resourced languages: methodologies, and experiments with Arabic]]></article-title>
<person-group person-group-type="author">
<name>
<surname><![CDATA[El-Haj]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Language Resources and Evaluation]]></source>
<year>2014</year>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B7">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[El-Haj]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Kruschwitz]]></surname>
<given-names><![CDATA[U.]]></given-names>
</name>
<name>
<surname><![CDATA[Fox]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
</person-group>
<source><![CDATA[Using Mechanical Turk to Create a Corpus of Arabic Summaries]]></source>
<year>2010</year>
<conf-name><![CDATA[ International Conference on Language Resources and Evaluation]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B8">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Habash]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Rambow]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Roth]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<source><![CDATA[MADA+TOKAN: A toolkit for Arabic tokenization, diacritization, morphological disambiguation, POS tagging, stemming and lemmatization]]></source>
<year>2009</year>
<conf-name><![CDATA[ 2nd International Conference on Arabic Language Resources and Tools]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B9">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Jurafsky]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Martin]]></surname>
<given-names><![CDATA[J. H.]]></given-names>
</name>
</person-group>
<source><![CDATA[Speech and Language Processing: An Introduction to Natural Language Processing]]></source>
<year>2002</year>
<publisher-name><![CDATA[Pearson]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B10">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Koehn]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Hoang]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Birch]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Callison-Burch]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Federico]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Bertoldi]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Cowan]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Shen]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Moran]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Zens]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Dyer]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Bojar]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Constantin]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Herbst]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
</person-group>
<source><![CDATA[Moses: Open source toolkit for statistical machine translation]]></source>
<year>2007</year>
<conf-name><![CDATA[ ACL-2007 Demo and Poster Sessions]]></conf-name>
<conf-loc>Prague, Czech Republic </conf-loc>
<page-range>177-80</page-range></nlm-citation>
</ref>
<ref id="B11">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Maamouri]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Bies]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Jin]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Buckwalter]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Arabic treebank: Part 1 v 2.0]]></source>
<year>2003</year>
<publisher-loc><![CDATA[Philadelphia ]]></publisher-loc>
<publisher-name><![CDATA[Linguistic Data Consortium]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B12">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Manning]]></surname>
<given-names><![CDATA[C. D.]]></given-names>
</name>
</person-group>
<source><![CDATA[Part-of-speech tagging from 97% to 100%: Is it time for some linguistics?]]></source>
<year>2011</year>
<conf-name><![CDATA[ 12th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing'11]]></conf-name>
<conf-loc> </conf-loc>
<page-range>171-89</page-range></nlm-citation>
</ref>
<ref id="B13">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mourad]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Darwish]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<source><![CDATA[Subjectivity and sentiment analysis of Modern Standard Arabic and Arabic microblogs]]></source>
<year>2013</year>
<conf-name><![CDATA[ 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, Association for Computational Linguistics]]></conf-name>
<conf-loc>Atlanta, Georgia </conf-loc>
<page-range>55-64</page-range></nlm-citation>
</ref>
<ref id="B14">
<nlm-citation citation-type="journal">
<article-title xml:lang=""><![CDATA[A systematic comparison of various statistical alignment models]]></article-title>
<person-group person-group-type="author">
<name>
<surname><![CDATA[Och]]></surname>
<given-names><![CDATA[F. J.]]></given-names>
</name>
<name>
<surname><![CDATA[Ney]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
</person-group>
<source><![CDATA[Computational Linguistics]]></source>
<year>2003</year>
<volume>29</volume>
<numero>1</numero>
<issue>1</issue>
</nlm-citation>
</ref>
<ref id="B15">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Schmid]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
</person-group>
<source><![CDATA[Probabilistic Part-of-Speech Tagging Using Decision Trees]]></source>
<year>1994</year>
<conf-name><![CDATA[ International Conference on New Methods in Language Processing]]></conf-name>
<conf-loc>Manchester, UK </conf-loc>
<page-range>44-9</page-range></nlm-citation>
</ref>
<ref id="B16">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Tachbelie]]></surname>
<given-names><![CDATA[M. Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Solomon]]></surname>
<given-names><![CDATA[T. A.]]></given-names>
</name>
<name>
<surname><![CDATA[Besacier]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<source><![CDATA[Part-of-Speech Tagging for Under-Resourced and Morphologically Rich Languages: The Case of Amharic]]></source>
<year>2011</year>
<conf-name><![CDATA[ Conference on Human Language Technology for Development]]></conf-name>
<conf-loc>Alexandria, Egypt </conf-loc>
</nlm-citation>
</ref>
<ref id="B17">
<nlm-citation citation-type="">
<article-title xml:lang=""><![CDATA[Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging]]></article-title>
<person-group person-group-type="author">
<name>
<surname><![CDATA[Täckström]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Das]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Petrov]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[McDonald]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Nivre]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Transactions of the Association for Computational Linguistics]]></source>
<year>2012</year>
<page-range>1-12</page-range></nlm-citation>
</ref>
<ref id="B18">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Turki-Khemakhem]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
<name>
<surname><![CDATA[Jamoussi]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Ben Hamadou]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Arabic morpho-syntactic feature disambiguation in a translation context]]></source>
<year>2010</year>
<conf-name><![CDATA[ SSST-4 Fourth Workshop on Syntax and Structure in Statistical Translation]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
