<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462022000200835</article-id>
<article-id pub-id-type="doi">10.13053/cys-26-2-4254</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[Measuring the Quality of Low-Resourced Statistical Parametric Speech Synthesis Trained with Noise-Degraded Data Supported by the University of Costa Rica]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Coto-Jiménez]]></surname>
<given-names><![CDATA[Marvin]]></given-names>
</name>
<xref ref-type="aff" rid="Af1"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[University of Costa Rica]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
<country>Costa Rica</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>06</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>06</month>
<year>2022</year>
</pub-date>
<volume>26</volume>
<numero>2</numero>
<fpage>835</fpage>
<lpage>842</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462022000200835&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462022000200835&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462022000200835&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Abstract: After the successful implementation of speech synthesis in several languages, the study of robustness has become an important topic, since it increases the possibility of building voices from non-standard sources, e.g. historical recordings, children's speech, and data freely available on the Internet. In this work, we measure the influence of noise in the source speech on an HMM-based statistical parametric speech synthesis system, for the case of a low-resourced database. For this purpose, three types of additive noise were applied to the source speech data at five signal-to-noise ratio levels. Objective measures were used to assess the perceptual quality of the results and the propagation of the noise through all the stages of building the synthetic voice. The results show a severe drop in the quality of the artificial speech, even at low noise levels. This degradation seems to be independent of the noise type and scales less than proportionally with the noise level. These results are important for any practical implementation of speech synthesis from degraded data under similar conditions, and show that applying denoising processes becomes mandatory in order to preserve the possibility of building intelligible voices.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Noise]]></kwd>
<kwd lng="en"><![CDATA[robustness]]></kwd>
<kwd lng="en"><![CDATA[speech synthesis]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Tokuda]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Nankaku]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Toda]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Zen]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Yamagishi]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Oura]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Speech synthesis based on hidden Markov models]]></article-title>
<source><![CDATA[Proceedings of the IEEE]]></source>
<year>2013</year>
<page-range>1234-52</page-range></nlm-citation>
</ref>
<ref id="B2">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Masuko]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Tokuda]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Kobayashi]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Imai]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<source><![CDATA[Speech synthesis using HMMs with dynamic features]]></source>
<year>1996</year>
<volume>1</volume>
<conf-name><![CDATA[International Conference on Acoustics, Speech, and Signal Processing]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B3">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Tokuda]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Kobayashi]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Imai]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<source><![CDATA[Speech parameter generation from HMM using dynamic features]]></source>
<year>1995</year>
<volume>1</volume>
<conf-name><![CDATA[International Conference on Acoustics, Speech, and Signal Processing]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B4">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zen]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Nose]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Yamagishi]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[The HMM-based speech synthesis system (HTS)]]></source>
<year>2007</year>
<publisher-name><![CDATA[SSW]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B5">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gonzalvo]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Sanz]]></surname>
<given-names><![CDATA[I.I.]]></given-names>
</name>
<name>
<surname><![CDATA[Socoró-Carrié]]></surname>
<given-names><![CDATA[J.C.]]></given-names>
</name>
<name>
<surname><![CDATA[Alías]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
</person-group>
<source><![CDATA[HMM-based Spanish speech synthesis using CBR as F0 estimator]]></source>
<year>2007</year>
<conf-name><![CDATA[ITRW on NOLISP]]></conf-name>
<conf-loc> </conf-loc>
<page-range>788-93</page-range></nlm-citation>
</ref>
<ref id="B6">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Gonzalvo]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Taylor]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Monzo]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Sanz]]></surname>
<given-names><![CDATA[I.I.]]></given-names>
</name>
</person-group>
<source><![CDATA[High quality emotional HMM-based synthesis in Spanish]]></source>
<year>2009</year>
<conf-name><![CDATA[International Conference on Nonlinear Speech Processing]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B7">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Franco]]></surname>
<given-names><![CDATA[C.A.]]></given-names>
</name>
<name>
<surname><![CDATA[Herrera]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Escalante]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Speech synthesis in Mexican Spanish using voice parameterization]]></article-title>
<source><![CDATA[IIISCI]]></source>
<year>2017</year>
<volume>15</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>72-5</page-range></nlm-citation>
</ref>
<ref id="B8">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ekpenyong]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Urua]]></surname>
<given-names><![CDATA[E.A.]]></given-names>
</name>
<name>
<surname><![CDATA[Watts]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[King]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Yamagishi]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Statistical parametric speech synthesis for Ibibio]]></article-title>
<source><![CDATA[Speech Communication]]></source>
<year>2014</year>
<volume>56</volume>
<page-range>243-51</page-range></nlm-citation>
</ref>
<ref id="B9">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Zen]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Senior]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Schuster]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Statistical parametric speech synthesis using deep neural networks]]></source>
<year>2013</year>
<conf-name><![CDATA[International Conference on Acoustics, Speech and Signal Processing]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B10">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ning]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[He]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Wu]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[Xing]]></surname>
<given-names><![CDATA[Ch.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A review of deep learning based speech synthesis]]></article-title>
<source><![CDATA[Applied Sciences]]></source>
<year>2019</year>
<volume>9</volume>
<numero>19</numero>
<issue>19</issue>
<page-range>4050</page-range></nlm-citation>
</ref>
<ref id="B11">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hu]]></surname>
<given-names><![CDATA[Y.J.]]></given-names>
</name>
<name>
<surname><![CDATA[Ling]]></surname>
<given-names><![CDATA[Z.H.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[DBN-based spectral feature representation for statistical parametric speech synthesis]]></article-title>
<source><![CDATA[IEEE Signal Processing Letters]]></source>
<year>2016</year>
<volume>23</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>321-5</page-range></nlm-citation>
</ref>
<ref id="B12">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hu]]></surname>
<given-names><![CDATA[Y.J.]]></given-names>
</name>
<name>
<surname><![CDATA[Ling]]></surname>
<given-names><![CDATA[Z.H.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Extracting spectral features using deep autoencoders with binary distributed hidden units for statistical parametric speech synthesis]]></article-title>
<source><![CDATA[IEEE/ACM Transactions on Audio, Speech, and Language Processing]]></source>
<year>2018</year>
<volume>26</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>713-24</page-range></nlm-citation>
</ref>
<ref id="B13">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Suraj-Pandurang]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Laxman-Lahudkar]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Hidden-Markov-model based statistical parametric speech synthesis for Marathi with optimal number of hidden states]]></article-title>
<source><![CDATA[International Journal of Speech Technology]]></source>
<year>2019</year>
<volume>22</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>93-8</page-range></nlm-citation>
</ref>
<ref id="B14">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sefara]]></surname>
<given-names><![CDATA[T.J.]]></given-names>
</name>
<name>
<surname><![CDATA[Mokgonyane]]></surname>
<given-names><![CDATA[T.B.]]></given-names>
</name>
<name>
<surname><![CDATA[Manamela]]></surname>
<given-names><![CDATA[M.J.]]></given-names>
</name>
<name>
<surname><![CDATA[Modipa]]></surname>
<given-names><![CDATA[T.I.]]></given-names>
</name>
</person-group>
<source><![CDATA[HMM-based speech synthesis system incorporated with language identification for low-resourced languages]]></source>
<year>2019</year>
<conf-name><![CDATA[International Conference on Advances in Big Data, Computing and Data Communication Systems (ICABCD)]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B15">
<nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Yamagishi]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Ling]]></surname>
<given-names><![CDATA[Z.]]></given-names>
</name>
<name>
<surname><![CDATA[King]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
</person-group>
<source><![CDATA[Robustness of HMM-based speech synthesis]]></source>
<year>2008</year>
</nlm-citation>
</ref>
<ref id="B16">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Valentini-Botinhao]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[X.]]></given-names>
</name>
<name>
<surname><![CDATA[Takaki]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Yamagishi]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<source><![CDATA[Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System Using Deep Recurrent Neural Networks]]></source>
<year>2016</year>
<publisher-name><![CDATA[Interspeech]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B17">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Karhila]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Remes]]></surname>
<given-names><![CDATA[U.]]></given-names>
</name>
<name>
<surname><![CDATA[Kurimo]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Noise in HMM-based speech synthesis adaptation: Analysis, evaluation methods and experiments]]></article-title>
<source><![CDATA[IEEE Journal of Selected Topics in Signal Processing]]></source>
<year>2013</year>
<volume>8</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>285-95</page-range></nlm-citation>
</ref>
<ref id="B18">
<nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Baljekar]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<source><![CDATA[Speech synthesis from found data]]></source>
<year>2018</year>
<publisher-name><![CDATA[Carnegie Mellon University]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B19">
<nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Tokuda]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Yoshimura]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Masuko]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Kobayashi]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Kitamura]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Speech parameter generation algorithms for HMM-based speech synthesis]]></source>
<year>2000</year>
<volume>3</volume>
<conf-name><![CDATA[IEEE International Conference on Acoustics, Speech, and Signal Processing]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B20">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Toda]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Tokuda]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A speech parameter generation algorithm considering global variance for HMM-based speech synthesis]]></article-title>
<source><![CDATA[IEICE Transactions on Information and Systems]]></source>
<year>2007</year>
<volume>90</volume>
<numero>5</numero>
<issue>5</issue>
<page-range>816-24</page-range></nlm-citation>
</ref>
<ref id="B21">
<nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Maegaard]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Choukri]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Calzolari]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
<name>
<surname><![CDATA[Odijk]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[ELRA - European language resources association - background, recent developments and future perspectives]]></article-title>
<source><![CDATA[Language Resources and Evaluation]]></source>
<year>2005</year>
<volume>39</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>9-23</page-range></nlm-citation>
</ref>
</ref-list>
</back>
</article>
