<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462020000301353</article-id>
<article-id pub-id-type="doi">10.13053/cys-24-3-3775</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Highly Language-Independent Word Lemmatization Using a Machine-Learning Classifier]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Akhmetov]]></surname>
<given-names><![CDATA[Iskander]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
<xref ref-type="aff" rid="Aaf"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Pak]]></surname>
<given-names><![CDATA[Alexandr]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Ualiyeva]]></surname>
<given-names><![CDATA[Irina]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Gelbukh]]></surname>
<given-names><![CDATA[Alexander]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Institute of Information and Computational Technologies]]></institution>
<addr-line><![CDATA[Almaty]]></addr-line>
<country>Kazakhstan</country>
</aff>
<aff id="Af2">
<institution><![CDATA[Kazakh-British Technical University]]></institution>
<addr-line><![CDATA[Almaty]]></addr-line>
<country>Kazakhstan</country>
</aff>
<aff id="Af3">
<institution><![CDATA[Instituto Politécnico Nacional]]></institution>
<addr-line><![CDATA[Mexico City]]></addr-line>
<country>Mexico</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>09</month>
<year>2020</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>09</month>
<year>2020</year>
</pub-date>
<volume>24</volume>
<numero>3</numero>
<fpage>1353</fpage>
<lpage>1364</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462020000301353&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462020000301353&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462020000301353&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Lemmatization is the process of finding the base morphological form (lemma) of a word. It is an important step in many natural language processing, information retrieval, and information extraction tasks. We present an open-source, language-independent lemmatizer based on the Random Forest classification model, a supervised machine-learning algorithm whose decision trees are constructed according to the grammatical features of the language. The lemmatizer requires no manual hard-coding of rules, yet remains simple and interpretable. We work with twenty-five languages and compare the performance of our lemmatizer with that of the UDPipe lemmatizer on the twenty-two of them for which UDPipe has models. Our method shows good performance on languages from various language groups and is easily extensible to other languages. The source code of our lemmatizer is publicly available.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Lemmatization]]></kwd>
<kwd lng="en"><![CDATA[natural language processing]]></kwd>
<kwd lng="en"><![CDATA[text preprocessing]]></kwd>
<kwd lng="en"><![CDATA[Random Forest classifier]]></kwd>
<kwd lng="en"><![CDATA[Decision Tree classifier]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Akhmetov]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
<name>
<surname><![CDATA[Krassovitsky]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Ualiyeva]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
<name>
<surname><![CDATA[Mussabayev]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Gelbukh]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Lemmatization of Russian language by tree regression models]]></article-title>
<source><![CDATA[Research in Computing Science]]></source>
<year>2020</year>
</nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Altintas]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Cicekli]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
</person-group>
<source><![CDATA[A morphological analyser for Crimean Tatar]]></source>
<year>2001</year>
<conf-name><![CDATA[10th Turkish Symposium on Artificial Intelligence and Neural Networks (TAINN'2001)]]></conf-name>
<conf-loc> </conf-loc>
<page-range>180&#8211;189</page-range></nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Altman]]></surname>
<given-names><![CDATA[N. S.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[An introduction to kernel and nearest-neighbor non-parametric regression]]></article-title>
<source><![CDATA[The American Statistician]]></source>
<year>1992</year>
<volume>46</volume>
<page-range>175&#8211;185</page-range></nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Banko]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Moore]]></surname>
<given-names><![CDATA[R. C.]]></given-names>
</name>
</person-group>
<source><![CDATA[Part-of-speech tagging in context]]></source>
<year>2004</year>
<conf-name><![CDATA[ 20th International Conference on Computational Linguistics]]></conf-name>
<conf-loc> </conf-loc>
<page-range>556&#8211;561</page-range></nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Barnes]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Porphyry: Introduction]]></source>
<year>2003</year>
<publisher-loc><![CDATA[UK ]]></publisher-loc>
<publisher-name><![CDATA[Oxford University Press]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Breiman]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Random forests]]></article-title>
<source><![CDATA[Machine Learning]]></source>
<year>2001</year>
<volume>45</volume>
<page-range>5&#8211;32</page-range></nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Breiman]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Friedman]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Olshen]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Stone]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
</person-group>
<source><![CDATA[Classification and Regression Trees]]></source>
<year>1984</year>
<publisher-name><![CDATA[Chapman and Hall]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chakrabarty]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Chaturvedi]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Garain]]></surname>
<given-names><![CDATA[U.]]></given-names>
</name>
</person-group>
<source><![CDATA[CNN-based context sensitive lemmatization]]></source>
<year>2019</year>
<conf-name><![CDATA[ ACM India Joint International Conference on Data Science and Management of Data]]></conf-name>
<conf-loc> </conf-loc>
<page-range>334&#8211;337</page-range></nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Clausius]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<source><![CDATA[The mechanical theory of heat]]></source>
<year>1879</year>
<publisher-name><![CDATA[Macmillan]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Çöltekin]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
</person-group>
<source><![CDATA[A freely available morphological analyzer for Turkish]]></source>
<year>2010</year>
<conf-name><![CDATA[Seventh International Conference on Language Resources and Evaluation (LREC'10)]]></conf-name>
<conf-loc> </conf-loc>
<page-range>19&#8211;28</page-range></nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dave]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Balani]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Survey paper of different lemmatization approaches]]></article-title>
<source><![CDATA[International Journal of Research in Advent Technology, ICATEST 2015]]></source>
<year>2015</year>
<volume>8</volume>
<page-range>366&#8211;370</page-range></nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Diab]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Hacioglu]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Jurafsky]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<source><![CDATA[Automatic tagging of Arabic text: From raw text to base phrase chunks]]></source>
<year>2004</year>
<conf-name><![CDATA[ Proceedings of HLT-NAACL 2004: Short papers]]></conf-name>
<conf-loc> </conf-loc>
<page-range>149&#8211;152</page-range></nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gashkov]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Eltsova]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Lemmatization with reversed dictionary and fuzzy sets]]></source>
<year>2018</year>
<volume>55</volume>
<conf-name><![CDATA[ SHS Web of Conferences]]></conf-name>
<conf-loc> </conf-loc>
<page-range>04007</page-range><publisher-name><![CDATA[EDP Sciences]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Geurts]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Ernst]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Wehenkel]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Extremely randomized trees]]></article-title>
<source><![CDATA[Machine Learning]]></source>
<year>2006</year>
<volume>63</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>3&#8211;42</page-range></nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gini]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Measurement of inequality of incomes]]></article-title>
<source><![CDATA[The Economic Journal]]></source>
<year>1921</year>
<volume>31</volume>
<numero>121</numero>
<issue>121</issue>
<page-range>124&#8211;126</page-range></nlm-citation>
</ref>
<ref id="B16">
<label>16</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Habash]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Rambow]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
</person-group>
<source><![CDATA[Arabic tokenization, part-of-speech tagging and morphological disambiguation in one fell swoop]]></source>
<year>2005</year>
<conf-name><![CDATA[43rd Annual Meeting of the Association for Computational Linguistics (ACL'05)]]></conf-name>
<conf-loc> </conf-loc>
<page-range>573&#8211;580</page-range></nlm-citation>
</ref>
<ref id="B17">
<label>17</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hansen]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Mladenovi&#263;]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Brimberg]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Pérez]]></surname>
<given-names><![CDATA[J. A. M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Variable neighborhood search]]></article-title>
<source><![CDATA[Handbook of metaheuristics]]></source>
<year>2019</year>
<page-range>57&#8211;97</page-range><publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B18">
<label>18</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Harris]]></surname>
<given-names><![CDATA[Z. S.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Distributional structure]]></article-title>
<source><![CDATA[Word]]></source>
<year>1954</year>
<volume>10</volume>
<numero>2-3</numero>
<issue>2-3</issue>
<page-range>146&#8211;162</page-range></nlm-citation>
</ref>
<ref id="B19">
<label>19</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ingólfsdóttir]]></surname>
<given-names><![CDATA[S. L.]]></given-names>
</name>
<name>
<surname><![CDATA[Loftsson]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Daðason]]></surname>
<given-names><![CDATA[J. F.]]></given-names>
</name>
<name>
<surname><![CDATA[Bjarnadóttir]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<source><![CDATA[Nefnir: A high accuracy lemmatizer for Icelandic]]></source>
<year>2019</year>
<conf-name><![CDATA[ 22nd Nordic Conference on Computational Linguistics]]></conf-name>
<conf-loc> </conf-loc>
<page-range>310&#8211;315</page-range></nlm-citation>
</ref>
<ref id="B20">
<label>20</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[James]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Witten]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Hastie]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Tibshirani]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
</person-group>
<source><![CDATA[An introduction to statistical learning]]></source>
<year>2013</year>
<volume>112</volume>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B21">
<label>21</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Jones]]></surname>
<given-names><![CDATA[K. S.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A statistical interpretation of term specificity and its application in retrieval]]></article-title>
<source><![CDATA[Journal of Documentation]]></source>
<year>1972</year>
<volume>28</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>11&#8211;21</page-range></nlm-citation>
</ref>
<ref id="B22">
<label>22</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kanerva]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Ginter]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Salakoski]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Universal lemmatizer: A sequence to sequence model for lemmatizing universal dependencies treebanks]]></article-title>
<source><![CDATA[CoRR]]></source>
<year>2019</year>
<volume>abs/1902.00972</volume>
</nlm-citation>
</ref>
<ref id="B23">
<label>23</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[King]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<source><![CDATA[Practical Natural Language Processing for Low-Resource Languages]]></source>
<year>2015</year>
<publisher-name><![CDATA[University of Michigan]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B24">
<label>24</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kondratyuk]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Gaven&#269;iak]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Straka]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Haji&#269;]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[LemmaTag: Jointly tagging and lemmatizing for morphologically rich languages with BRNNs]]></source>
<year>2018</year>
<conf-name><![CDATA[ 2018 Conference on Empirical Methods in Natural Language Processing]]></conf-name>
<conf-loc> </conf-loc>
<page-range>4921&#8211;4928</page-range></nlm-citation>
</ref>
<ref id="B25">
<label>25</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ling]]></surname>
<given-names><![CDATA[C. X.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Learning the past tense of English verbs: The symbolic pattern associator vs. connectionist models]]></article-title>
<source><![CDATA[J. Artif. Intell. Res.]]></source>
<year>1994</year>
<volume>1</volume>
<page-range>209&#8211;229</page-range></nlm-citation>
</ref>
<ref id="B26">
<label>26</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Luhn]]></surname>
<given-names><![CDATA[H. P.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A statistical approach to mechanized encoding and searching of literary information]]></article-title>
<source><![CDATA[IBM Journal of Research and Development]]></source>
<year>1957</year>
<volume>1</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>309&#8211;317</page-range></nlm-citation>
</ref>
<ref id="B27">
<label>27</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Manning]]></surname>
<given-names><![CDATA[C. D.]]></given-names>
</name>
<name>
<surname><![CDATA[Surdeanu]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Bauer]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Finkel]]></surname>
<given-names><![CDATA[J. R.]]></given-names>
</name>
<name>
<surname><![CDATA[Bethard]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[McClosky]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<source><![CDATA[The Stanford CoreNLP natural language processing toolkit]]></source>
<year>2014</year>
<conf-name><![CDATA[ 52nd annual meeting of the association for computational linguistics: system demonstrations]]></conf-name>
<conf-loc> </conf-loc>
<page-range>55&#8211;60</page-range></nlm-citation>
</ref>
<ref id="B28">
<label>28</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[McCarthy]]></surname>
<given-names><![CDATA[A. D.]]></given-names>
</name>
<name>
<surname><![CDATA[Vylomova]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Wu]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Malaviya]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Wolf-Sonkin]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Nicolai]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Kirov]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Silfverberg]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Mielke]]></surname>
<given-names><![CDATA[S. J.]]></given-names>
</name>
<name>
<surname><![CDATA[Heinz]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Cotterell]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Hulden]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[The SIGMORPHON 2019 shared task: Morphological analysis in context and cross-lingual transfer for inflection]]></source>
<year>2019</year>
<conf-name><![CDATA[ 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology]]></conf-name>
<conf-loc> </conf-loc>
<page-range>229&#8211;244</page-range></nlm-citation>
</ref>
<ref id="B29">
<label>29</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[M&#283;chura]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Lemmatization-lists]]></source>
<year>2018</year>
</nlm-citation>
</ref>
<ref id="B30">
<label>30</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Mooney]]></surname>
<given-names><![CDATA[R. J.]]></given-names>
</name>
<name>
<surname><![CDATA[Califf]]></surname>
<given-names><![CDATA[M. E.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Induction of first-order decision lists: Results on learning the past tense of English verbs]]></article-title>
<source><![CDATA[J. Artif. Intell. Res.]]></source>
<year>1995</year>
<volume>3</volume>
<page-range>1&#8211;24</page-range></nlm-citation>
</ref>
<ref id="B31">
<label>31</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pedregosa]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Varoquaux]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Gramfort]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Michel]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Thirion]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Grisel]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Blondel]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Prettenhofer]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Weiss]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Dubourg]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Vanderplas]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Passos]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Cournapeau]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Brucher]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Perrot]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Duchesnay]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Scikit-learn: Machine learning in Python]]></article-title>
<source><![CDATA[Journal of Machine Learning Research]]></source>
<year>2011</year>
<volume>12</volume>
<page-range>2825&#8211;2830</page-range></nlm-citation>
</ref>
<ref id="B32">
<label>32</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Plisson]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Lavrac]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Mladeni&#263;]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<source><![CDATA[A rule based approach to word lemmatization]]></source>
<year>2004</year>
<conf-name><![CDATA[ Proceedings of IS04]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B33">
<label>33</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rutemiller]]></surname>
<given-names><![CDATA[H. C.]]></given-names>
</name>
<name>
<surname><![CDATA[Bowers]]></surname>
<given-names><![CDATA[D. A.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Estimation in a heteroscedastic regression model]]></article-title>
<source><![CDATA[Journal of the American Statistical Association]]></source>
<year>1968</year>
<volume>63</volume>
<numero>322</numero>
<issue>322</issue>
<page-range>552&#8211;557</page-range></nlm-citation>
</ref>
<ref id="B34">
<label>34</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sahlgren]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[The distributional hypothesis]]></article-title>
<source><![CDATA[Italian Journal of Linguistics]]></source>
<year>2008</year>
<volume>20</volume>
<page-range>33&#8211;53</page-range></nlm-citation>
</ref>
<ref id="B35">
<label>35</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Shannon]]></surname>
<given-names><![CDATA[C. E.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A mathematical theory of communication]]></article-title>
<source><![CDATA[The Bell System Technical Journal]]></source>
<year>1948</year>
<volume>27</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>379&#8211;423</page-range></nlm-citation>
</ref>
<ref id="B36">
<label>36</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Simon]]></surname>
<given-names><![CDATA[H. A.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Experiments in induction]]></article-title>
<source><![CDATA[The American Journal of Psychology]]></source>
<year>1967</year>
<volume>80</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>651&#8211;653</page-range></nlm-citation>
</ref>
<ref id="B37">
<label>37</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Stankovi&#263;]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Krstev]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Obradovi&#263;]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
<name>
<surname><![CDATA[Lazi&#263;]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Trtovac]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Rule-based automatic multi-word term extraction and lemmatization]]></source>
<year>2016</year>
<conf-name><![CDATA[ 10th International Conference on Language Resources and Evaluation, LREC 2016]]></conf-name>
<conf-loc> </conf-loc>
<page-range>507&#8211;514</page-range></nlm-citation>
</ref>
<ref id="B38">
<label>38</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Straka]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[UDPipe 2.0 prototype at CoNLL 2018 UD shared task]]></source>
<year>2018</year>
<conf-name><![CDATA[ CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies]]></conf-name>
<conf-loc>Brussels, Belgium </conf-loc>
<page-range>197&#8211;207</page-range></nlm-citation>
</ref>
<ref id="B39">
<label>39</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Müller]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Cotterell]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Fraser]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Schütze]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
</person-group>
<source><![CDATA[Joint lemmatization and morphological tagging with LEMMING]]></source>
<year>2015</year>
<conf-name><![CDATA[ Conference on Empirical Methods in Natural Language Processing]]></conf-name>
<conf-date>2015</conf-date>
<conf-loc> </conf-loc>
<page-range>2268&#8211;2274</page-range></nlm-citation>
</ref>
<ref id="B40">
<label>40</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Witten]]></surname>
<given-names><![CDATA[I. H.]]></given-names>
</name>
<name>
<surname><![CDATA[Frank]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
</person-group>
<source><![CDATA[Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations]]></source>
<year>1999</year>
<publisher-name><![CDATA[Morgan Kaufmann]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B41">
<label>41</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zaliznjak]]></surname>
<given-names><![CDATA[A. A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Open grammar dictionary of the Russian language]]></source>
<year>2014</year>
</nlm-citation>
</ref>
</ref-list>
</back>
</article>
