<?xml version="1.0" encoding="ISO-8859-1"?><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<front>
<journal-meta>
<journal-id>1405-5546</journal-id>
<journal-title><![CDATA[Computación y Sistemas]]></journal-title>
<abbrev-journal-title><![CDATA[Comp. y Sist.]]></abbrev-journal-title>
<issn>1405-5546</issn>
<publisher>
<publisher-name><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación]]></publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id>S1405-55462012000100003</article-id>
<title-group>
<article-title xml:lang="en"><![CDATA[A Reorder Buffer Design for High Performance Processors]]></article-title>
<article-title xml:lang="es"><![CDATA[Diseño de un búfer de reordenamiento para procesadores de alto desempeño]]></article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname><![CDATA[García Ordaz]]></surname>
<given-names><![CDATA[José R]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Ramírez Salinas]]></surname>
<given-names><![CDATA[Marco A]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Villa Vargas]]></surname>
<given-names><![CDATA[Luis A]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Molina Lozano]]></surname>
<given-names><![CDATA[Herón]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname><![CDATA[Peredo Macías]]></surname>
<given-names><![CDATA[Cuauhtémoc]]></given-names>
</name>
<xref ref-type="aff" rid="A01"/>
</contrib>
</contrib-group>
<aff id="A01">
<institution><![CDATA[Instituto Politécnico Nacional, Centro de Investigación en Computación, Microtechnology and Embedded System Laboratory]]></institution>
<addr-line><![CDATA[ ]]></addr-line>
</aff>
<pub-date pub-type="pub">
<day>00</day>
<month>03</month>
<year>2012</year>
</pub-date>
<pub-date pub-type="epub">
<day>00</day>
<month>03</month>
<year>2012</year>
</pub-date>
<volume>16</volume>
<numero>1</numero>
<fpage>15</fpage>
<lpage>25</lpage>
<copyright-statement/>
<copyright-year/>
<self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_arttext&amp;pid=S1405-55462012000100003&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_abstract&amp;pid=S1405-55462012000100003&amp;lng=en&amp;nrm=iso"></self-uri><self-uri xlink:href="http://www.scielo.org.mx/scielo.php?script=sci_pdf&amp;pid=S1405-55462012000100003&amp;lng=en&amp;nrm=iso"></self-uri><abstract abstract-type="short" xml:lang="en"><p><![CDATA[Modern reorder buffers (ROBs) were conceived to improve processor performance by allowing instructions to execute out of the original program order and to run ahead of the sequential instruction stream, exploiting the available instruction-level parallelism (ILP). The ROB is a functional structure of the processor execution engine that supports speculative execution, physical register recycling, and precise exception recovery. Traditionally, the ROB is treated as a monolithic circular buffer: instructions enter at the tail pointer after the decode stage and leave at the head pointer after the commit stage. The commit stage verifies that instructions have been dispatched, issued, and executed, and that they have not completed speculatively. This paper presents a distributed reorder buffer microarchitecture built from small structures placed near the functional blocks they interact with; all structures use the same tail and head pointer values for synchronization. The resulting reduction in area, and therefore in power and delay, makes this design suitable for both embedded and high-performance microprocessors.]]></p></abstract>
<abstract abstract-type="short" xml:lang="es"><p><![CDATA[El búfer de reordenamiento de instrucciones (ROB) fue conceptualizado para mejorar el desempeño de los procesadores al permitir ejecutar instrucciones fuera del orden original del programa y en avance al instante preciso de la ejecución secuencial, explotando el paralelismo que existe a nivel de las instrucciones ILP. El ROB es una estructura funcional de la máquina de ejecución de los procesadores para dar soporte a la ejecución especulativa, al reciclado de los registros físicos y a la recuperación precisa de excepciones. Tradicionalmente el ROB es considerado un búfer circular monolítico en donde las instrucciones entran en la dirección especificada por un apuntador de cola después de la etapa de decodificación y son terminadas en la dirección especificada por un apuntador de cabecera después de la etapa de finalización. El artículo presenta el diseño de un búfer de reordenamiento de instrucciones distribuido en pequeñas estructuras cercanas a los bloques funcionales con los cuales interactúan, usando los mismos valores de apuntadores de cola y cabecera por sincronía. La reducción de área y por consecuencia la reducción de consumo de energía y retardo hacen de este diseño apropiado para procesadores embebidos y procesadores de alto desempeño.]]></p></abstract>
<kwd-group>
<kwd lng="en"><![CDATA[Superscalar processors]]></kwd>
<kwd lng="en"><![CDATA[reorder-buffer]]></kwd>
<kwd lng="en"><![CDATA[instruction window]]></kwd>
<kwd lng="en"><![CDATA[low power consumption]]></kwd>
<kwd lng="es"><![CDATA[Procesadores súper escalares]]></kwd>
<kwd lng="es"><![CDATA[búfer de reordenamiento]]></kwd>
<kwd lng="es"><![CDATA[ventana de instrucciones]]></kwd>
<kwd lng="es"><![CDATA[consumo de baja potencia]]></kwd>
</kwd-group>
</article-meta>
</front><body><![CDATA[  	    <p align="justify"><font face="verdana" size="4">Art&iacute;culos</font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="center"><font face="verdana" size="4"><b>A Reorder Buffer Design for High Performance Processors</b></font></p>  	    <p align="center"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="center"><font face="verdana" size="3"><b>Dise&ntilde;o de un b&uacute;fer de reordenamiento para procesadores de alto desempe&ntilde;o</b></font></p>  	    <p align="center"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="center"><font face="verdana" size="2"><b>Jos&eacute; R. Garc&iacute;a Ordaz, Marco A. Ram&iacute;rez Salinas, Luis A. Villa Vargas, Her&oacute;n Molina Lozano, and Cuauht&eacute;moc Peredo Mac&iacute;as</b></font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="justify"><font face="verdana" size="2"><i>Microtechnology and Embedded System Laboratory, Centro de Investigaci&oacute;n en Computaci&oacute;n, Instituto Polit&eacute;cnico Nacional, Av. Juan de Dios B&aacute;tiz, s/n, Zacatenco, 07738, M&eacute;xico DF, Mexico. Correo:</i> <a href="mailto:jgarcia@cic.ipn.mx">jgarcia@cic.ipn.mx</a>, <a href="mailto:mars@cic.ipn.mx">mars@cic.ipn.mx</a>, <a href="mailto:lvilla@cic.ipn.mx">lvilla@cic.ipn.mx</a>, <a href="mailto:hmolina@cic.ipn.mx">hmolina@cic.ipn.mx</a>, <a href="mailto:cperedo@cic.ipn.mx">cperedo@cic.ipn.mx</a>.</font></p>  	    ]]></body>
<body><![CDATA[<p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="justify"><font face="verdana" size="2">Article received on 01/02/2010.    <br> 	Accepted on 15/04/2011.</font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="justify"><font face="verdana" size="2"><b>Abstract</b></font></p>  	    <p align="justify"><font face="verdana" size="2">Modern reorder buffers (ROBs) were conceived to improve processor performance by allowing instructions to execute out of the original program order and to run ahead of the sequential instruction stream, exploiting the available instruction&#45;level parallelism (ILP). The ROB is a functional structure of the processor execution engine that supports speculative execution, physical register recycling, and precise exception recovery. Traditionally, the ROB is treated as a monolithic circular buffer: instructions enter at the tail pointer after the decode stage and leave at the head pointer after the commit stage. The commit stage verifies that instructions have been dispatched, issued, and executed, and that they have not completed speculatively. This paper presents a distributed reorder buffer microarchitecture built from small structures placed near the functional blocks they interact with; all structures use the same tail and head pointer values for synchronization. 
The resulting reduction in area, and therefore in power and delay, makes this design suitable for both embedded and high&#45;performance microprocessors.</font></p>  	    <p align="justify"><font face="verdana" size="2"><b>Keywords:</b> Superscalar processors, reorder&#45;buffer, instruction window, low power consumption.</font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="justify"><font face="verdana" size="2"><b>Resumen</b></font></p>  	    <p align="justify"><font face="verdana" size="2">El b&uacute;fer de reordenamiento de instrucciones (ROB) fue conceptualizado para mejorar el desempe&ntilde;o de los procesadores al permitir ejecutar instrucciones fuera del orden original del programa y en avance al instante preciso de la ejecuci&oacute;n secuencial, explotando el paralelismo que existe a nivel de las instrucciones ILP. El ROB es una estructura funcional de la m&aacute;quina de ejecuci&oacute;n de los procesadores para dar soporte a la ejecuci&oacute;n especulativa, al reciclado de los registros f&iacute;sicos y a la recuperaci&oacute;n precisa de excepciones. Tradicionalmente el ROB es considerado un b&uacute;fer circular monol&iacute;tico en donde las instrucciones entran en la direcci&oacute;n especificada por un apuntador de cola despu&eacute;s de la etapa de decodificaci&oacute;n y son terminadas en la direcci&oacute;n especificada por un apuntador de cabecera despu&eacute;s de la etapa de finalizaci&oacute;n. El art&iacute;culo presenta el dise&ntilde;o de un b&uacute;fer de reordenamiento de instrucciones distribuido en peque&ntilde;as estructuras cercanas a los bloques funcionales con los cuales interact&uacute;an, usando los mismos valores de apuntadores de cola y cabecera por sincron&iacute;a. 
La reducci&oacute;n de &aacute;rea y por consecuencia la reducci&oacute;n de consumo de energ&iacute;a y retardo hacen de este dise&ntilde;o apropiado para procesadores embebidos y procesadores de alto desempe&ntilde;o.</font></p>  	    ]]></body>
<body><![CDATA[<p align="justify"><font face="verdana" size="2"><b>Palabras Clave:</b> Procesadores s&uacute;per escalares, b&uacute;fer de reordenamiento, ventana de instrucciones, consumo de baja potencia.</font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="justify"><font face="verdana" size="2"><a href="/pdf/cys/v16n1/v16n1a3.pdf" target="_blank">DESCARGAR ART&Iacute;CULO EN FORMATO PDF</a></font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="justify"><font face="verdana" size="2"><b>Acknowledgments</b></font></p>  	    <p align="justify"><font face="verdana" size="2">This work has been partially supported by grants under agreements SIP&#45;20101320 and SIP&#45;20101154 of the Graduate Studies and Research Department of the National Polytechnic Institute (IPN), Mexico, and by grants under agreements 124104 and 115976 of the National Council for Science and Technology (CONACyT), Mexico.</font></p>  	    <p align="justify"><font face="verdana" size="2">&nbsp;</font></p>  	    <p align="justify"><font face="verdana" size="2"><b>References</b></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2"><b>1. Burger, D. &amp; Austin, T.M. (1997).</b> The SimpleScalar Tool Set, Version 2.0. <i>ACM SIGARCH Computer Architecture News,</i> 25(3), 13&#45;25.<!-- end-ref --></font></p>  	    ]]></body>
<body><![CDATA[<!-- ref --><p align="justify"><font face="verdana" size="2"><b>2. Cristal, A., Ortega, D., Llosa, J., &amp; Valero, M. (2004).</b> Out&#45;of&#45;Order Commit Processors. <i>10th International Symposium on High Performance Computer Architecture (HPCA '04),</i> 48&#45;59.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2"><b>3. Edmondson, J.H., Rubinfeld, P., Preston, R., &amp; Rajagopalan, V. (1995).</b> Superscalar Instruction Execution in the 21164 Alpha Microprocessor. <i>IEEE Micro,</i> 15(2), 33&#45;43.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2"><b>4. Hinton, G., Sager, D., Upton, M., Boggs, D., Carmean, D., Kyker, A., &amp; Roussel, P. (2001).</b> The Microarchitecture of the Pentium 4 Processor. <i>Intel Technology Journal,</i> 5(1), 1&#45;13.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2"><b>5. Kessler, R.E., McLellan, E.J., &amp; Webb, D.A. (1999).</b> The Alpha 21264 Microprocessor Architecture. <i>IEEE Micro,</i> 19(2), 24&#45;36.<!-- end-ref --></font></p>  	    
<!-- ref --><p align="justify"><font face="verdana" size="2"><b>6. Kucuk, G., Ponomarev, D.V., Ergin, O., &amp; Ghose, K. (2004).</b> Complexity&#45;Effective Reorder Buffer Designs for Superscalar Processors. <i>IEEE Transactions on Computers,</i> 53(6), 653&#45;665.<!-- end-ref --></font></p>  	    ]]></body>
<body><![CDATA[<!-- ref --><p align="justify"><font face="verdana" size="2"><b>7. Leibholz, D. &amp; Razdan, R. (1997).</b> The Alpha 21264: A 500 MHz Out&#45;of&#45;Order Execution Microprocessor. <i>IEEE COMPCON 97,</i> San Jose, CA, USA, 28&#45;36.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2"><b>8. Lenell, J., Wallace, S., &amp; Bagherzadeh, N. (1992).</b> A 20 MHz CMOS Reorder Buffer for a Superscalar Microprocessor. <i>4th NASA Symposium on VLSI Design,</i> Moscow, Idaho, 2.3.1&#45;2.3.12.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2"><b>9. Mart&iacute;, S.P., Borr&aacute;s, J.S., Rodr&iacute;guez, P.L., Tena, R.U., &amp; Mar&iacute;n, J.D. (2009).</b> A Complexity&#45;Effective Out&#45;of&#45;Order Retirement Microarchitecture. <i>IEEE Transactions on Computers,</i> 58(12), 1626&#45;1639.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2"><b>10. Ramirez, M.A., Cristal, A., Veidenbaum, A.V., Villa, L., &amp; Valero, M. 
(2005).</b> A New Pointer&#45;Based Instruction Queue Design and Its Power&#45;Performance Evaluation. <i>2005 IEEE International Conference on Computer Design: VLSI in Computers and Processors,</i> San Jose, CA, USA, 647&#45;653.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2"><b>11. Veidenbaum, A.V., Ramirez, M.A., Cristal, A., &amp; Valero, M. (2008).</b> Pointer&#45;Based Instruction Queue Design for Out&#45;of&#45;Order Processors. US 2008/0082788A1.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2"><b>12. Wang, C.J. &amp; Emnett, F. (1993).</b> Implementing Precise Interruptions in Pipelined RISC Processors. <i>IEEE Micro,</i> 13(4), 36&#45;43.<!-- end-ref --></font></p>  	    <!-- ref --><p align="justify"><font face="verdana" size="2"><b>13. Yeager, K.C. (1996).</b> The MIPS R10000 Superscalar Microprocessor. <i>IEEE Micro,</i> 16(2), 28&#45;41.<!-- end-ref --></font></p>  	    
]]></body><back>
<ref-list>
<ref id="B1">
<label>1</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Burger]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Austin]]></surname>
<given-names><![CDATA[T.M.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[The SimpleScalar Tool Set, Version 2.0]]></article-title>
<source><![CDATA[ACM SIGARCH Computer Architecture News]]></source>
<year>1997</year>
<volume>25</volume>
<numero>3</numero>
<issue>3</issue>
<page-range>13-25</page-range></nlm-citation>
</ref>
<ref id="B2">
<label>2</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Cristal]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Ortega]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Llosa]]></surname>
<given-names><![CDATA[J]]></given-names>
</name>
<name>
<surname><![CDATA[Valero]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<source><![CDATA[Out-of-Order Commit Processors]]></source>
<year>2004</year>
<conf-name><![CDATA[10th International Symposium on High Performance Computer Architecture]]></conf-name>
<conf-loc> </conf-loc>
<page-range>48-59</page-range></nlm-citation>
</ref>
<ref id="B3">
<label>3</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Edmondson]]></surname>
<given-names><![CDATA[J.H.]]></given-names>
</name>
<name>
<surname><![CDATA[Rubinfeld]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
<name>
<surname><![CDATA[Preston]]></surname>
<given-names><![CDATA[R.]]></given-names>
</name>
<name>
<surname><![CDATA[Rajagopalan]]></surname>
<given-names><![CDATA[V.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Superscalar Instruction Execution in the 21164 Alpha Microprocessor]]></article-title>
<source><![CDATA[IEEE Micro]]></source>
<year>1995</year>
<volume>15</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>33-43</page-range></nlm-citation>
</ref>
<ref id="B4">
<label>4</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Hinton]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Sager]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Upton]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
<name>
<surname><![CDATA[Boggs]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Carmean]]></surname>
<given-names><![CDATA[D.]]></given-names>
</name>
<name>
<surname><![CDATA[Kyker]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Roussel]]></surname>
<given-names><![CDATA[P.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[The Microarchitecture of the Pentium 4 Processor]]></article-title>
<source><![CDATA[Intel Technology Journal]]></source>
<year>2001</year>
<volume>5</volume>
<numero>1</numero>
<issue>1</issue>
<page-range>1-13</page-range></nlm-citation>
</ref>
<ref id="B5">
<label>5</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kessler]]></surname>
<given-names><![CDATA[R.E.]]></given-names>
</name>
<name>
<surname><![CDATA[McLellan]]></surname>
<given-names><![CDATA[E.J.]]></given-names>
</name>
<name>
<surname><![CDATA[Webb]]></surname>
<given-names><![CDATA[D.A.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[The Alpha 21264 Microprocessor Architecture]]></article-title>
<source><![CDATA[IEEE Micro]]></source>
<year>1999</year>
<volume>19</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>24-36</page-range></nlm-citation>
</ref>
<ref id="B6">
<label>6</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Kucuk]]></surname>
<given-names><![CDATA[G.]]></given-names>
</name>
<name>
<surname><![CDATA[Ponomarev]]></surname>
<given-names><![CDATA[D.V.]]></given-names>
</name>
<name>
<surname><![CDATA[Ergin]]></surname>
<given-names><![CDATA[O.]]></given-names>
</name>
<name>
<surname><![CDATA[Ghose]]></surname>
<given-names><![CDATA[K.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Complexity-Effective Reorder Buffer Designs for Superscalar Processors]]></article-title>
<source><![CDATA[IEEE Transactions on Computers]]></source>
<year>2004</year>
<volume>53</volume>
<numero>6</numero>
<issue>6</issue>
<page-range>653-665</page-range></nlm-citation>
</ref>
<ref id="B7">
<label>7</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Leibholz]]></surname>
<given-names><![CDATA[D]]></given-names>
</name>
<name>
<surname><![CDATA[Razdan]]></surname>
<given-names><![CDATA[R]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[The Alpha 21264: A 500 MHz Out-of-Order Execution Microprocessor]]></article-title>
<source><![CDATA[IEEE COMPCON 97]]></source>
<year>1997</year>
<page-range>28-36</page-range><publisher-loc><![CDATA[San Jose, CA, USA]]></publisher-loc>
</nlm-citation>
</ref>
<ref id="B8">
<label>8</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Lenell]]></surname>
<given-names><![CDATA[J.]]></given-names>
</name>
<name>
<surname><![CDATA[Wallace]]></surname>
<given-names><![CDATA[S.]]></given-names>
</name>
<name>
<surname><![CDATA[Bagherzadeh]]></surname>
<given-names><![CDATA[N.]]></given-names>
</name>
</person-group>
<source><![CDATA[A 20 MHz CMOS Reorder Buffer for a Superscalar Microprocessor]]></source>
<year>1992</year>
<conf-name><![CDATA[4th NASA Symposium on VLSI Design]]></conf-name>
<conf-loc>Moscow, Idaho</conf-loc>
<page-range>2.3.1-2.3.12</page-range></nlm-citation>
</ref>
<ref id="B9">
<label>9</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Martí]]></surname>
<given-names><![CDATA[S.P.]]></given-names>
</name>
<name>
<surname><![CDATA[Borrás]]></surname>
<given-names><![CDATA[J.S.]]></given-names>
</name>
<name>
<surname><![CDATA[Rodríguez]]></surname>
<given-names><![CDATA[P.L.]]></given-names>
</name>
<name>
<surname><![CDATA[Tena]]></surname>
<given-names><![CDATA[R.U.]]></given-names>
</name>
<name>
<surname><![CDATA[Marín]]></surname>
<given-names><![CDATA[J.D.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[A Complexity-Effective Out-of-Order Retirement Microarchitecture]]></article-title>
<source><![CDATA[IEEE Transactions on Computers]]></source>
<year>2009</year>
<volume>58</volume>
<numero>12</numero>
<issue>12</issue>
<page-range>1626-1639</page-range></nlm-citation>
</ref>
<ref id="B10">
<label>10</label><nlm-citation citation-type="confpro">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Ramirez]]></surname>
<given-names><![CDATA[M.A.]]></given-names>
</name>
<name>
<surname><![CDATA[Cristal]]></surname>
<given-names><![CDATA[A.]]></given-names>
</name>
<name>
<surname><![CDATA[Veidenbaum]]></surname>
<given-names><![CDATA[A.V.]]></given-names>
</name>
<name>
<surname><![CDATA[Villa]]></surname>
<given-names><![CDATA[L.]]></given-names>
</name>
<name>
<surname><![CDATA[Valero]]></surname>
<given-names><![CDATA[M.]]></given-names>
</name>
</person-group>
<source><![CDATA[A New Pointer-Based Instruction Queue Design and Its Power-Performance Evaluation]]></source>
<year>2005</year>
<conf-name><![CDATA[IEEE International Conference on Computer Design: VLSI in Computers and Processors]]></conf-name>
<conf-date>2005</conf-date>
<conf-loc>San Jose, CA, USA</conf-loc>
<page-range>647-653</page-range></nlm-citation>
</ref>
<ref id="B11">
<label>11</label><nlm-citation citation-type="">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Veidenbaum]]></surname>
<given-names><![CDATA[A.V.]]></given-names>
</name>
<name>
<surname><![CDATA[Ramirez]]></surname>
<given-names><![CDATA[M.A.]]></given-names>
</name>
<name>
<surname><![CDATA[Cristal]]></surname>
<given-names><![CDATA[A]]></given-names>
</name>
<name>
<surname><![CDATA[Valero]]></surname>
<given-names><![CDATA[M]]></given-names>
</name>
</person-group>
<source><![CDATA[Pointer-Based Instruction Queue Design for Out-of-Order Processors]]></source>
<year>2008</year>
</nlm-citation>
</ref>
<ref id="B12">
<label>12</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Wang]]></surname>
<given-names><![CDATA[C.J.]]></given-names>
</name>
<name>
<surname><![CDATA[Emnett]]></surname>
<given-names><![CDATA[F.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[Implementing Precise Interruptions in Pipelined RISC Processors]]></article-title>
<source><![CDATA[IEEE Micro]]></source>
<year>1993</year>
<volume>13</volume>
<numero>4</numero>
<issue>4</issue>
<page-range>36-43</page-range></nlm-citation>
</ref>
<ref id="B13">
<label>13</label><nlm-citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname><![CDATA[Yeager]]></surname>
<given-names><![CDATA[K.C.]]></given-names>
</name>
</person-group>
<article-title xml:lang="en"><![CDATA[The MIPS R10000 Superscalar Microprocessor]]></article-title>
<source><![CDATA[IEEE Micro]]></source>
<year>1996</year>
<volume>16</volume>
<numero>2</numero>
<issue>2</issue>
<page-range>28-41</page-range></nlm-citation>
</ref>
</ref-list>
</back>
</article>
