<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462024000401865</article-id>
<article-id pub-id-type="doi">10.13053/cys-28-4-5068</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[No Need to Get Wasteful: The Way to Train a Lightweight Competitive Spelling Checker Using (Concentrated) Synthetic Datasets]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Starchenko]]></surname>
<given-names><![CDATA[Vladimir]]></given-names>
</name>
<xref ref-type="aff" rid="Af1"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[Higher School of Economics University, School of Linguistics]]></institution>
<addr-line><![CDATA[Moscow]]></addr-line>
<country>Russia</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>12</month>
<year>2024</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>12</month>
<year>2024</year>
</pub-date>
<volume>28</volume>
<numero>4</numero>
<fpage>1865</fpage>
<lpage>1877</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462024000401865&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462024000401865&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462024000401865&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[This study focuses on spelling checking, which remains problematic for modern error correction systems. Based on the T5 architecture, we create a lightweight spelling check tool that can be used in combination with a large language model (LLM) and significantly improves the overall result of the error correction system. It also performs competitively with other recently developed spelling check tools, despite being considerably smaller. The model's high performance results from introducing two synthetic datasets: one with a high density of spelling errors and one with errors that are more difficult to correct.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Spelling errors]]></kwd>
<kwd lng="en"><![CDATA[spelling check]]></kwd>
<kwd lng="en"><![CDATA[grammatical error correction]]></kwd>
<kwd lng="en"><![CDATA[preprocessing]]></kwd>
<kwd lng="en"><![CDATA[synthetic datasets]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ahmed]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[De Luca]]></surname>
<given-names><![CDATA[E. W.]]></given-names>
</name>
<name>
<surname><![CDATA[Nürnberger]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Revised n-gram based automatic spelling correction tool to improve retrieval effectiveness]]></article-title>
<source><![CDATA[Polibits]]></source>
<year>2009</year>
<volume>1</volume>
<numero>40</numero>
<issue>40</issue>
<page-range>39-48</page-range></nlm-citation>
</ref>
<ref id="B2">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Bryant]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Yuan]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Qorib]]></surname>
<given-names><![CDATA[M. R.]]></given-names>
</name>
<name>
<surname><![CDATA[Cao]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Ng]]></surname>
<given-names><![CDATA[H. T.]]></given-names>
</name>
<name>
<surname><![CDATA[Briscoe]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Grammatical error correction: A survey of the state of the art]]></article-title>
<source><![CDATA[Computational Linguistics]]></source>
<year>2023</year>
<volume>49</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>643-701</page-range></nlm-citation>
</ref>
<ref id="B3">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Büyük]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Arslan]]></surname>
<given-names><![CDATA[L. M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Learning from mistakes: Improving spelling correction performance with automatic generation of realistic misspellings]]></article-title>
<source><![CDATA[Expert Systems]]></source>
<year>2021</year>
<volume>38</volume>
<numero>5</numero>
<issue>5</issue>
<page-range>e12692</page-range></nlm-citation>
</ref>
<ref id="B4">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chollampatt]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Ng]]></surname>
<given-names><![CDATA[H. T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Connecting the dots: Towards human-level grammatical error correction]]></source>
<year>2017</year>
<conf-name><![CDATA[ 12th Workshop on Innovative Use of NLP for Building Educational Applications]]></conf-name>
<conf-loc> </conf-loc>
<page-range>327-33</page-range></nlm-citation>
</ref>
<ref id="B5">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chollampatt]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Ng]]></surname>
<given-names><![CDATA[H. T.]]></given-names>
</name>
</person-group>
<source><![CDATA[A reassessment of reference-based grammatical error correction metrics]]></source>
<year>2018</year>
<conf-name><![CDATA[ 27th International Conference on Computational Linguistics]]></conf-name>
<conf-loc> </conf-loc>
<page-range>2730-41</page-range></nlm-citation>
</ref>
<ref id="B6">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chollampatt]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Ng]]></surname>
<given-names><![CDATA[H. T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Cross-sentence grammatical error correction]]></source>
<year>2019</year>
<conf-name><![CDATA[ 57th Annual Meeting of the Association for Computational Linguistics]]></conf-name>
<conf-loc> </conf-loc>
<page-range>435-45</page-range></nlm-citation>
</ref>
<ref id="B7">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ge]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Wei]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhou]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Fluency boost learning and inference for neural grammatical error correction]]></source>
<year>2018</year>
<volume>1</volume>
<conf-name><![CDATA[ 56th Annual Meeting of the Association for Computational Linguistics]]></conf-name>
<conf-loc>Melbourne, Australia </conf-loc>
<page-range>1055-65</page-range></nlm-citation>
</ref>
<ref id="B8">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ghosh]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Kristensson]]></surname>
<given-names><![CDATA[P. O.]]></given-names>
</name>
</person-group>
<source><![CDATA[Neural networks for text correction and completion in keyboard decoding]]></source>
<year>2017</year>
</nlm-citation>
</ref>
<ref id="B9">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gotou]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Nagata]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Mita]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Hanawa]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<source><![CDATA[Taking the correction difficulty into account in grammatical error correction evaluation]]></source>
<year>2020</year>
<conf-name><![CDATA[ 28th International Conference on Computational Linguistics]]></conf-name>
<conf-loc> </conf-loc>
<page-range>2085-95</page-range></nlm-citation>
</ref>
<ref id="B10">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Grundkiewicz]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Junczys-Dowmunt]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Gillian]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
</person-group>
<source><![CDATA[Human evaluation of grammatical error correction systems]]></source>
<year>2015</year>
<conf-name><![CDATA[ Conference on Empirical Methods in Natural Language Processing]]></conf-name>
<conf-date>2015</conf-date>
<conf-loc> </conf-loc>
<page-range>461-70</page-range></nlm-citation>
</ref>
<ref id="B11">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Guo]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Ainslie]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Uthus]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Ontanon]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Ni]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Sung]]></surname>
<given-names><![CDATA[Y. H.]]></given-names>
</name>
<name>
<surname><![CDATA[Yang]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
</person-group>
<source><![CDATA[LongT5: Efficient text-to-text transformer for long sequences]]></source>
<year>2022</year>
<conf-name><![CDATA[ Findings of the Association for Computational Linguistics: NAACL]]></conf-name>
<conf-date>2022</conf-date>
<conf-loc> </conf-loc>
<page-range>724-36</page-range></nlm-citation>
</ref>
<ref id="B12">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kalamkar]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Mudigere]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Mellempudi]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Das]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Banerjee]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Avancha]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Teja-Vooturi]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Jammalamadaka]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Huang]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Yuen]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Yang]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Park]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Heinecke]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Georganas]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Srinivasan]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Kundu]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Smelyanskiy]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Kaul]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Dubey]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<source><![CDATA[A study of BFLOAT16 for deep learning training]]></source>
<year>2019</year>
</nlm-citation>
</ref>
<ref id="B13">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Katsumata]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Komachi]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Stronger baselines for grammatical error correction using a pretrained encoder-decoder model]]></source>
<year>2020</year>
<conf-name><![CDATA[ 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing]]></conf-name>
<conf-loc> </conf-loc>
<page-range>827-32</page-range></nlm-citation>
</ref>
<ref id="B14">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kernighan]]></surname>
<given-names><![CDATA[M. D.]]></given-names>
</name>
<name>
<surname><![CDATA[Church]]></surname>
<given-names><![CDATA[K. W.]]></given-names>
</name>
<name>
<surname><![CDATA[Gale]]></surname>
<given-names><![CDATA[W. A.]]></given-names>
</name>
</person-group>
<source><![CDATA[A spelling correction program based on a noisy channel model]]></source>
<year>1990</year>
<volume>2</volume>
<conf-name><![CDATA[ 13th International Conference on Computational Linguistics]]></conf-name>
<conf-loc> </conf-loc>
<page-range>205-10</page-range></nlm-citation>
</ref>
<ref id="B15">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kondrak]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Sherif]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Evaluation of several phonetic similarity algorithms on the task of cognate identification]]></source>
<year>2006</year>
<conf-name><![CDATA[ Workshop on Linguistic Distances]]></conf-name>
<conf-loc> </conf-loc>
<page-range>43-50</page-range></nlm-citation>
</ref>
<ref id="B16">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Korre]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Pavlopoulos]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[ERRANT: Assessing and improving grammatical error type classification]]></source>
<year>2020</year>
<conf-name><![CDATA[ 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature]]></conf-name>
<conf-loc> </conf-loc>
<page-range>85-9</page-range></nlm-citation>
</ref>
<ref id="B17">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kudo]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Richardson]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing]]></source>
<year>2018</year>
<conf-name><![CDATA[ Conference on Empirical Methods in Natural Language Processing: System Demonstrations]]></conf-name>
<conf-date>2018</conf-date>
<conf-loc> </conf-loc>
<page-range>66-71</page-range></nlm-citation>
</ref>
<ref id="B18">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lewis]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Goyal]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Ghazvininejad]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Mohamed]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Levy]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Stoyanov]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
<name>
<surname><![CDATA[Zettlemoyer]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
</person-group>
<source><![CDATA[BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension]]></source>
<year>2020</year>
<conf-name><![CDATA[ 58th Annual Meeting of the Association for Computational Linguistics]]></conf-name>
<conf-loc> </conf-loc>
<page-range>7871-80</page-range></nlm-citation>
</ref>
<ref id="B19">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Sheng]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Wei]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<source><![CDATA[Spelling error correction using a nested RNN model and pseudo training data]]></source>
<year>2018</year>
</nlm-citation>
</ref>
<ref id="B20">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Martynov]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Baushenko]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Abramov]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Fenogenova]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Augmentation methods for spelling corruptions]]></source>
<year>2023</year>
<conf-name><![CDATA[ International Conference Dialogue]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B21">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Martynov]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Baushenko]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Kozlova]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Kolomeytseva]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Abramov]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Fenogenova]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[A methodology for generative spelling correction via natural spelling errors emulation across multiple domains and languages]]></source>
<year>2023</year>
<conf-name><![CDATA[ Findings of the Association for Computational Linguistics: EACL]]></conf-name>
<conf-date>2024</conf-date>
<conf-loc> </conf-loc>
<page-range>138-55</page-range></nlm-citation>
</ref>
<ref id="B22">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Napoles]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Sakaguchi]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Post]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Tetreault]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Ground truth for grammatical error correction metrics]]></source>
<year>2015</year>
<conf-name><![CDATA[ 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing]]></conf-name>
<conf-loc> </conf-loc>
<page-range>588-93</page-range></nlm-citation>
</ref>
<ref id="B23">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Napoles]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Sakaguchi]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Tetreault]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[JFLEG: A fluency corpus and benchmark for grammatical error correction]]></source>
<year>2017</year>
<volume>2</volume>
<conf-name><![CDATA[ 15th Conference of the European Chapter of the Association for Computational Linguistics]]></conf-name>
<conf-loc> </conf-loc>
<page-range>229-34</page-range></nlm-citation>
</ref>
<ref id="B24">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Näther]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[An in-depth comparison of 14 spelling correction tools on a common benchmark]]></source>
<year>2020</year>
<conf-name><![CDATA[ Twelfth Language Resources and Evaluation Conference]]></conf-name>
<conf-loc> </conf-loc>
<page-range>1849-57</page-range></nlm-citation>
</ref>
<ref id="B25">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ng]]></surname>
<given-names><![CDATA[H. T.]]></given-names>
</name>
<name>
<surname><![CDATA[Wu]]></surname>
<given-names><![CDATA[S. M.]]></given-names>
</name>
<name>
<surname><![CDATA[Briscoe]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Hadiwinoto]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Susanto]]></surname>
<given-names><![CDATA[R. H.]]></given-names>
</name>
<name>
<surname><![CDATA[Bryant]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
</person-group>
<source><![CDATA[The CoNLL-2014 shared task on grammatical error correction]]></source>
<year>2014</year>
<conf-name><![CDATA[ Eighteenth Conference on Computational Natural Language Learning: Shared Task]]></conf-name>
<conf-loc> </conf-loc>
<page-range>1-14</page-range></nlm-citation>
</ref>
<ref id="B26">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Qorib]]></surname>
<given-names><![CDATA[M. R.]]></given-names>
</name>
<name>
<surname><![CDATA[Ng]]></surname>
<given-names><![CDATA[H. T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Grammatical error correction: Are we there yet?]]></source>
<year>2022</year>
<conf-name><![CDATA[ 29th International Conference on Computational Linguistics]]></conf-name>
<conf-loc> </conf-loc>
<page-range>2794-800</page-range></nlm-citation>
</ref>
<ref id="B27">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Raffel]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Shazeer]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Roberts]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Narang]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Matena]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhou]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[P. J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Exploring the limits of transfer learning with a unified text-to-text transformer]]></article-title>
<source><![CDATA[Journal of Machine Learning Research]]></source>
<year>2020</year>
<volume>21</volume>
<numero>140</numero>
<issue>140</issue>
<page-range>1-67</page-range></nlm-citation>
</ref>
<ref id="B28">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rothe]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Mallinson]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Malmi]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
<name>
<surname><![CDATA[Krause]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Severyn]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[A simple recipe for multilingual grammatical error correction]]></source>
<year>2021</year>
<conf-name><![CDATA[ 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing]]></conf-name>
<conf-loc> </conf-loc>
<page-range>702-7</page-range></nlm-citation>
</ref>
<ref id="B29">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Rozovskaya]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Roth]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<source><![CDATA[Grammatical error correction: Machine translation and classifiers]]></source>
<year>2016</year>
<volume>1</volume>
<conf-name><![CDATA[ 54th Annual Meeting of the Association for Computational Linguistics]]></conf-name>
<conf-loc> </conf-loc>
<page-range>2205-15</page-range></nlm-citation>
</ref>
<ref id="B30">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sakaguchi]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Post]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[van-Durme]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<source><![CDATA[Grammatical error correction with neural reinforcement learning]]></source>
<year>2017</year>
<conf-name><![CDATA[ Eighth International Joint Conference on Natural Language Processing]]></conf-name>
<conf-loc> </conf-loc>
<page-range>366-72</page-range></nlm-citation>
</ref>
<ref id="B31">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sennrich]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Haddow]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Birch]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Neural machine translation of rare words with subword units]]></source>
<year>2016</year>
<conf-name><![CDATA[ 54th Annual Meeting of the Association for Computational Linguistics]]></conf-name>
<conf-loc> </conf-loc>
<page-range>1715-25</page-range></nlm-citation>
</ref>
<ref id="B32">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Starchenko]]></surname>
<given-names><![CDATA[V. M.]]></given-names>
</name>
<name>
<surname><![CDATA[Starchenko]]></surname>
<given-names><![CDATA[A. M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Here we go again: Modern GEC models need help with spelling]]></article-title>
<source><![CDATA[Proceedings of the Institute for System Programming of the RAS]]></source>
<year>2023</year>
<volume>35</volume>
<numero>5</numero>
<issue>5</issue>
<page-range>215-28</page-range></nlm-citation>
</ref>
<ref id="B33">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Stüker]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Fay]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Berkling]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<source><![CDATA[Towards context-dependent phonetic spelling error correction in children's freely composed text for diagnostic and pedagogical purposes]]></source>
<year>2011</year>
<conf-name><![CDATA[ INTERSPEECH]]></conf-name>
<conf-loc> </conf-loc>
<page-range>1601-4</page-range></nlm-citation>
</ref>
<ref id="B34">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Susanto]]></surname>
<given-names><![CDATA[R. H.]]></given-names>
</name>
<name>
<surname><![CDATA[Phandi]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Ng]]></surname>
<given-names><![CDATA[H. T.]]></given-names>
</name>
</person-group>
<source><![CDATA[System combination for grammatical error correction]]></source>
<year>2014</year>
<conf-name><![CDATA[ Conference on Empirical Methods in Natural Language Processing (EMNLP)]]></conf-name>
<conf-date>2014</conf-date>
<conf-loc> </conf-loc>
<page-range>951-62</page-range></nlm-citation>
</ref>
<ref id="B35">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Taghva]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Stofsky]]></surname>
<given-names><![CDATA[E.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[OCRSpell: an interactive spelling correction system for OCR errors in text]]></article-title>
<source><![CDATA[International Journal on Document Analysis and Recognition]]></source>
<year>2001</year>
<volume>3</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>125-37</page-range></nlm-citation>
</ref>
<ref id="B36">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Tay]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Dehghani]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Rao]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Fedus]]></surname>
<given-names><![CDATA[W.]]></given-names>
</name>
<name>
<surname><![CDATA[Abnar]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Chung]]></surname>
<given-names><![CDATA[H. W.]]></given-names>
</name>
<name>
<surname><![CDATA[Narang]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Yogatama]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Vaswani]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Metzler]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<source><![CDATA[Scale efficiently: Insights from pre-training and fine-tuning transformers]]></source>
<year>2021</year>
</nlm-citation>
</ref>
<ref id="B37">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[van Delden]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Bracewell]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Gomez]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
</person-group>
<source><![CDATA[Supervised and unsupervised automatic spelling correction algorithms]]></source>
<year>2004</year>
<conf-name><![CDATA[International Conference on Information Reuse and Integration]]></conf-name>
<conf-date>2004</conf-date>
<conf-loc> </conf-loc>
<page-range>530-5</page-range></nlm-citation>
</ref>
<ref id="B38">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Vilares]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Alonso]]></surname>
<given-names><![CDATA[M. A.]]></given-names>
</name>
<name>
<surname><![CDATA[Doval]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Vilares]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Studying the effect and treatment of misspelled queries in cross-language information retrieval]]></article-title>
<source><![CDATA[Information Processing & Management]]></source>
<year>2016</year>
<volume>52</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>646-57</page-range></nlm-citation>
</ref>
<ref id="B39">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Xue]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Barua]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Constant]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Al-Rfou]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Narang]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Kale]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Roberts]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Raffel]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[ByT5: Towards a token-free future with pre-trained byte-to-byte models]]></article-title>
<source><![CDATA[Transactions of the Association for Computational Linguistics]]></source>
<year>2022</year>
<volume>10</volume>
<page-range>291-306</page-range></nlm-citation>
</ref>
<ref id="B40">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Bao]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[SynGEC: Syntax-enhanced grammatical error correction with a tailored GEC-oriented parser]]></source>
<year>2022</year>
<conf-name><![CDATA[Conference on Empirical Methods in Natural Language Processing]]></conf-name>
<conf-date>2022</conf-date>
<conf-loc> </conf-loc>
<page-range>2518-31</page-range></nlm-citation>
</ref>
<ref id="B41">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zhou]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Liu]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhang]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Huang]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
</person-group>
<source><![CDATA[Improving Seq2Seq grammatical error correction via decoding interventions]]></source>
<year>2023</year>
<conf-name><![CDATA[Findings of the Association for Computational Linguistics: EMNLP]]></conf-name>
<conf-date>2023</conf-date>
<conf-loc> </conf-loc>
<page-range>7393-405</page-range></nlm-citation>
</ref>
</ref-list>
</back>
</article>
