<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462017000400569</article-id>
<article-id pub-id-type="doi">10.13053/cys-21-4-2849</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[EDGE2VEC: Edge Representations for Large-Scale Scalable Hierarchical Learning]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Sohrab]]></surname>
<given-names><![CDATA[Mohammad Golam]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Nakata]]></surname>
<given-names><![CDATA[Toru]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Miwa]]></surname>
<given-names><![CDATA[Makoto]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Sasaki]]></surname>
<given-names><![CDATA[Yutaka]]></given-names>
</name>
<xref ref-type="aff" rid="Aff"/>
</contrib>
</contrib-group>
<aff id="Af1">
<institution><![CDATA[National Institute of Advanced Industrial Science and Technology, Artificial Intelligence Research Center]]></institution>
<addr-line><![CDATA[Tokyo ]]></addr-line>
<country>Japan</country>
</aff>
<aff id="Af2">
<institution><![CDATA[Toyota Technological Institute]]></institution>
<addr-line><![CDATA[Nagoya ]]></addr-line>
<country>Japan</country>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>12</month>
<year>2017</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>12</month>
<year>2017</year>
</pub-date>
<volume>21</volume>
<numero>4</numero>
<fpage>569</fpage>
<lpage>579</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462017000400569&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462017000400569&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462017000400569&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Abstract: At the current frontier of Big Data, prediction tasks over the nodes and edges of a complex deep architecture need a careful representation of features, because information access systems must handle hundreds of thousands, or even millions, of labels and samples, especially in hierarchical extreme multi-label classification. We introduce edge2vec, an edge representation framework for learning discrete and continuous features of edges in a deep architecture. In edge2vec, we learn a mapping of edges associated with nodes, where random samples are augmented by statistical and semantic representations of words and documents. We argue that infusing semantic representations of features into edges by exploiting word2vec and para2vec is the key to learning richer representations for exploring target nodes or labels in the hierarchy. Moreover, we design and implement a balanced stochastic dual coordinate ascent (DCA)-based support vector machine to speed up training. We introduce global decision-based top-down walks, instead of random walks, to predict the most likely labels in the deep architecture. We evaluate the efficiency of edge2vec against existing state-of-the-art techniques on extreme multi-label hierarchical as well as flat classification tasks. The empirical results show that edge2vec is promising and computationally efficient in both learning and prediction. In the deep learning workbench, edge2vec represents a new direction for statistical and semantic representations of features in task-independent networks.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Hierarchical text classification]]></kwd>
<kwd lng="en"><![CDATA[multi-label learning]]></kwd>
<kwd lng="en"><![CDATA[indexing]]></kwd>
<kwd lng="en"><![CDATA[extreme classification]]></kwd>
<kwd lng="en"><![CDATA[tree-structured class hierarchy]]></kwd>
<kwd lng="en"><![CDATA[DAG-structured class hierarchy]]></kwd>
<kwd lng="en"><![CDATA[DG-structured class hierarchy]]></kwd>
</kwd-group>
</article-meta>
</front><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Chakrabarti]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Dom]]></surname>
<given-names><![CDATA[B.]]></given-names>
</name>
<name>
<surname><![CDATA[Agrawal]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Raghavan]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Scalable feature selection, classification and signature generation for organizing large text databases into hierarchical topic taxonomies]]></article-title>
<source><![CDATA[International Journal on Very Large Data Bases]]></source>
<year>1998</year>
<volume>7</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>163&#8211;178</page-range></nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Cortes]]></surname>
<given-names><![CDATA[C.]]></given-names>
</name>
<name>
<surname><![CDATA[Vapnik]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Support vector networks]]></article-title>
<source><![CDATA[Machine Learning]]></source>
<year>1995</year>
<volume>20</volume>
<page-range>273&#8211;297</page-range></nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Crammer]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
<name>
<surname><![CDATA[Dekel]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Keshet]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Shalev-Shwartz]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Singer]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Online passive-aggressive algorithms]]></article-title>
<source><![CDATA[Journal of Machine Learning Research]]></source>
<year>2006</year>
<volume>7</volume>
<page-range>551&#8211;585</page-range></nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Dumais]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Platt]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Heckerman]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
</person-group>
<source><![CDATA[Inductive learning algorithms and representations for text categorization]]></source>
<year>1998</year>
<conf-name><![CDATA[CIKM]]></conf-name>
<conf-loc> </conf-loc>
<page-range>148&#8211;155</page-range></nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Fattah]]></surname>
<given-names><![CDATA[M. A.]]></given-names>
</name>
<name>
<surname><![CDATA[Sohrab]]></surname>
<given-names><![CDATA[M. G.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Combined term weighting scheme using FFNN, GA, MR, sum, and average for text classification]]></article-title>
<source><![CDATA[International Journal of Scientific and Engineering Research]]></source>
<year>2016</year>
<volume>7</volume>
<numero>8</numero>
<issue>8</issue>
<page-range>2031&#8211;2040</page-range></nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Pennington]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Socher]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Manning]]></surname>
<given-names><![CDATA[C. D.]]></given-names>
</name>
</person-group>
<source><![CDATA[GloVe: Global vectors for word representation]]></source>
<year>2014</year>
<conf-name><![CDATA[EMNLP]]></conf-name>
<conf-loc> </conf-loc>
<page-range>1532&#8211;1543</page-range><publisher-loc><![CDATA[Qatar ]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Koller]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Sahami]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[Hierarchically classifying documents using very few words]]></source>
<year>1997</year>
<conf-name><![CDATA[ICML]]></conf-name>
<conf-loc> </conf-loc>
<page-range>170&#8211;178</page-range></nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lee]]></surname>
<given-names><![CDATA[D. H.]]></given-names>
</name>
</person-group>
<source><![CDATA[Multi-stage Rocchio classification for large-scale multi-labeled text data]]></source>
<year>2012</year>
<conf-name><![CDATA[Proceedings of the 2012 ECML/PKDD Discovery Challenge Workshop on Large-Scale Hierarchical Text Classification]]></conf-name>
<conf-loc>Bristol </conf-loc>
</nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Li]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
<name>
<surname><![CDATA[Long]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[The relaxed online maximum margin algorithm]]></article-title>
<source><![CDATA[Machine Learning]]></source>
<year>2002</year>
<volume>46</volume>
<page-range>361&#8211;387</page-range></nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[McCallum]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Rosenfeld]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Mitchell]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
<name>
<surname><![CDATA[Ng]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
</person-group>
<source><![CDATA[Improving text classification by shrinkage in a hierarchy of classes]]></source>
<year>1998</year>
<conf-name><![CDATA[ICML]]></conf-name>
<conf-loc> </conf-loc>
<page-range>359&#8211;367</page-range></nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Platt]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Fast training of support vector machines using sequential minimal optimization]]></article-title>
<source><![CDATA[Advances in Kernel Methods: Support Vector Learning]]></source>
<year>1998</year>
<publisher-name><![CDATA[MIT Press]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Le]]></surname>
<given-names><![CDATA[Q.]]></given-names>
</name>
<name>
<surname><![CDATA[Mikolov]]></surname>
<given-names><![CDATA[T.]]></given-names>
</name>
</person-group>
<source><![CDATA[Distributed representations of sentences and documents]]></source>
<year>2014</year>
<conf-name><![CDATA[ICML]]></conf-name>
<conf-loc> </conf-loc>
<page-range>1188&#8211;1196</page-range></nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ren]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
<name>
<surname><![CDATA[Sohrab]]></surname>
<given-names><![CDATA[M. G.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Class-indexing-based term weighting for automatic text classification]]></article-title>
<source><![CDATA[Information Sciences]]></source>
<year>2013</year>
<volume>236</volume>
<page-range>109&#8211;125</page-range></nlm-citation>
</ref>
<ref id="B14">
<label>14</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sohrab]]></surname>
<given-names><![CDATA[M. G.]]></given-names>
</name>
<name>
<surname><![CDATA[Miwa]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Sasaki]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Centroid-means-embedding: An approach to infusing word embeddings into features for text classification]]></article-title>
<source><![CDATA[Proceedings of the PAKDD, LNCS]]></source>
<year>2015</year>
<volume>9077</volume>
<page-range>289&#8211;300</page-range></nlm-citation>
</ref>
<ref id="B15">
<label>15</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sohrab]]></surname>
<given-names><![CDATA[M. G.]]></given-names>
</name>
<name>
<surname><![CDATA[Miwa]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Sasaki]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[IN-DEDUCTIVE and DAG-tree approaches for large-scale extreme multi-label hierarchical text classification]]></article-title>
<source><![CDATA[Journal of Polibits]]></source>
<year>2016</year>
<volume>54</volume>
<page-range>61&#8211;70</page-range></nlm-citation>
</ref>
<ref id="B16">
<label>16</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sohrab]]></surname>
<given-names><![CDATA[M. G.]]></given-names>
</name>
<name>
<surname><![CDATA[Miwa]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Sasaki]]></surname>
<given-names><![CDATA[Y.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Word embeddings in large-scale deep architecture learning]]></article-title>
<source><![CDATA[The Association for Natural Language Processing]]></source>
<year>2016</year>
<volume>9077</volume>
<page-range>625&#8211;628</page-range></nlm-citation>
</ref>
<ref id="B17">
<label>17</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Sohrab]]></surname>
<given-names><![CDATA[M. G.]]></given-names>
</name>
<name>
<surname><![CDATA[Ren]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
</person-group>
<source><![CDATA[The effectiveness of class-space-density in high and low-dimensional vector space for text classification]]></source>
<year>2012</year>
<conf-name><![CDATA[Proceedings of the IEEE International Conference of CCIS]]></conf-name>
<conf-loc>China </conf-loc>
<page-range>2034&#8211;2042</page-range></nlm-citation>
</ref>
<ref id="B18">
<label>18</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Tsoumakas]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Katakis]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[Multi-label classification: An overview]]></article-title>
<source><![CDATA[International Journal of Data Warehousing and Mining]]></source>
<year>2007</year>
<volume>3</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>1&#8211;13</page-range></nlm-citation>
</ref>
<ref id="B19">
<label>19</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Tsoumakas]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Katakis]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
<name>
<surname><![CDATA[Vlahavas]]></surname>
<given-names><![CDATA[I.]]></given-names>
</name>
</person-group>
<source><![CDATA[Random k-labelsets for multi-label classification]]></source>
<year>2010</year>
<conf-name><![CDATA[Proceedings of the Knowledge Discovery and Data Engineering]]></conf-name>
<conf-loc> </conf-loc>
</nlm-citation>
</ref>
<ref id="B20">
<label>20</label><nlm-citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Vapnik]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
</person-group>
<source><![CDATA[The Nature of Statistical Learning Theory]]></source>
<year>1995</year>
<publisher-name><![CDATA[Springer]]></publisher-name>
</nlm-citation>
</ref>
<ref id="B21">
<label>21</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[K. L.]]></given-names>
</name>
<name>
<surname><![CDATA[Zhao]]></surname>
<given-names><![CDATA[H.]]></given-names>
</name>
<name>
<surname><![CDATA[Lu]]></surname>
<given-names><![CDATA[B. L.]]></given-names>
</name>
</person-group>
<article-title xml:lang=""><![CDATA[A meta-top-down method for large-scale hierarchical classification]]></article-title>
<source><![CDATA[IEEE Transactions on Knowledge and Data Engineering]]></source>
<year>2014</year>
<volume>26</volume>
<page-range>500&#8211;513</page-range></nlm-citation>
</ref>
</ref-list>
</back>
</article>
