Operand forwarding

{{Short description|CPU optimization technique to improve instruction-level parallelism}} {{Use American English|date = March 2019}}

'''Operand forwarding''' (or '''data forwarding''', '''register bypass''') is an optimization in pipelined CPUs that limits the performance loss from pipeline stalls caused by data hazards.<ref>{{cite web|url=http://www.csee.umbc.edu/~squire/cs411_l19.html |title=CMSC 411 Lecture 19, Pipelining Data Forwarding |publisher=University of Maryland Baltimore County Computer Science and Electrical Engineering Department |access-date=2020-01-22}}</ref><ref>{{Cite web |url=http://hpc.serc.iisc.ernet.in/~govind/hpc/L10-Pipeline.txt |title=High performance computing, Notes of class 11 |publisher=hpc.serc.iisc.ernet.in |date=September 2000 |access-date=2014-02-08 |url-status=dead |archive-url=https://web.archive.org/web/20131227033204/http://hpc.serc.iisc.ernet.in/~govind/hpc/L10-Pipeline.txt |archive-date=2013-12-27 }}</ref> A data hazard can stall the pipeline when the current operation has to wait for the result of an earlier operation that has not yet finished.

An instruction often requires a value computed by the immediately preceding instruction. Writing that result to the register file and then reading it back for the subsequent instruction may take a few clock cycles. To improve performance, the register file write and read are bypassed: the result of an instruction is forwarded directly to the execute stage of a subsequent instruction.

==Example==

 ADD A B C  #A=B+C
 SUB D C A  #D=C-A

If these two assembly pseudocode instructions run in a pipeline, after fetching and decoding the second instruction, the pipeline stalls, waiting until the result of the addition is written and read.

{| class="wikitable" align=center style="margin:0.46em 0.2em"
|+ Without operand forwarding
! 1 || 2 || 3 || 4 || style="width: 10em;" | 5 || style="width: 10em;" | 6 || 7 || 8
|-
| Fetch ADD || Decode ADD || Read Operands ADD || Execute ADD || Write result || || ||
|-
| || Fetch SUB || Decode SUB || ''stall'' || ''stall'' || Read Operands SUB || Execute SUB || Write result
|}

{| class="wikitable" align=center style="margin:0.46em 0.2em"
|+ With operand forwarding
! 1 || 2 || 3 || 4 || style="width: 10em;" | 5 || style="width: 10em;" | 6 || 7
|-
| Fetch ADD || Decode ADD || Read Operands ADD || Execute ADD || Write result || ||
|-
| || Fetch SUB || Decode SUB || ''stall'' || Read Operands SUB: use result from previous operation || Execute SUB || Write result
|}

In some cases, operand forwarding can eliminate all stalls from such read-after-write data hazards:<ref> Gurpur M. Prabhu. "Computer Architecture Tutorial". Sections [https://web.cs.iastate.edu/~prabhu/Tutorial/PIPELINE/forward.html "Forwarding"]. and [https://web.cs.iastate.edu/~prabhu/Tutorial/PIPELINE/dataHazClass.html "Data Hazard Classification"]. </ref><ref> Dr. Orion Lawlor. [https://www.cs.uaf.edu/2011/fall/cs441/lecture/09_20_pipelining.html "Pipelining, Pipeline Stalls, and Operand Forwarding"]. </ref><ref> Larry Snyder. [https://courses.cs.washington.edu/courses/cse378/09au/lectures/cse378au09-15.pdf "Pipeline Review"]. </ref>

{| class="wikitable" align=center style="margin:0.46em 0.2em"
|+ With operand forwarding (enhanced)
! 1 || 2 || 3 || 4 || style="width: 10em;" | 5 || style="width: 10em;" | 6
|-
| Fetch ADD || Decode ADD || Read Operands ADD || Execute ADD || Write result ||
|-
| || Fetch SUB || Decode SUB || Read Operands SUB: use result from previous operation || Execute SUB || Write result
|}
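
The three timing tables can be reproduced with a small model. The following Python sketch is illustrative only: the function name <code>sub_timing</code>, the mode labels, and the rule for when SUB may enter its Read Operands stage are assumptions chosen to match the tables, not a description of any particular CPU.

```python
STAGES = ["Fetch", "Decode", "Read Operands", "Execute", "Write result"]

def sub_timing(mode):
    """Return (cycle of each SUB stage, number of stall cycles) for one mode."""
    # ADD enters the pipeline in cycle 1 and never stalls.
    add = {stage: cycle for cycle, stage in enumerate(STAGES, start=1)}
    # Earliest cycle in which SUB may occupy Read Operands (assumed rule per mode).
    earliest_read = {
        "none": add["Write result"] + 1,    # wait for the register file write
        "forwarding": add["Write result"],  # result bypassed during write-back
        "enhanced": add["Execute"],         # result bypassed from the execute output
    }[mode]
    unstalled_read = 4  # SUB is fetched in cycle 2 and decoded in cycle 3
    read = max(unstalled_read, earliest_read)
    sub = {"Fetch": 2, "Decode": 3, "Read Operands": read,
           "Execute": read + 1, "Write result": read + 2}
    return sub, read - unstalled_read

for mode in ("none", "forwarding", "enhanced"):
    sub, stalls = sub_timing(mode)
    print(f"{mode}: {stalls} stall(s), SUB writes its result in cycle {sub['Write result']}")
```

Under these assumptions the model gives two, one, and zero stall cycles, with SUB writing its result in cycles 8, 7, and 6 respectively, as in the tables.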

==Technical realization==
The CPU control unit must implement logic to detect dependencies where operand forwarding makes sense. A multiplexer can then be used to select the proper register or flip-flop to read the operand from.
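
The detection and selection steps described above can be sketched in software. This Python model is a simplified illustration, not hardware: the names <code>PendingResult</code> and <code>select_operand</code> and the register values are hypothetical, and real designs implement the comparison with comparators and the selection with a multiplexer in front of the execute stage.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PendingResult:
    """A result computed by an earlier instruction but not yet written back."""
    dest: Optional[str]  # destination register name, None if nothing is written
    value: int

def select_operand(src, register_file, pending):
    """Model of the multiplexer: choose where the operand `src` is read from.

    `pending` is assumed to be ordered youngest first, so the most recent
    producer of a register wins.
    """
    for producer in pending:
        if producer.dest == src:       # dependency detected
            return producer.value, "forwarded"
    return register_file[src], "register file"

# Hypothetical register contents; A still holds a stale value.
register_file = {"A": 0, "B": 2, "C": 5, "D": 0}

# ADD A B C has executed, but A has not been written back yet.
pending = [PendingResult(dest="A", value=register_file["B"] + register_file["C"])]

# SUB D C A reads C from the register file and A from the bypass path.
c, c_source = select_operand("C", register_file, pending)
a, a_source = select_operand("A", register_file, pending)
d = c - a
print(c_source, a_source, d)  # register file forwarded -2
```

Without the bypass path, SUB would read the stale value of A from the register file unless the pipeline stalled until the write-back had completed.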

==See also==
*Feed forward (control)

==References==
{{reflist}}

==External links==
* [http://www.cs.uaf.edu/2010/fall/cs441/lecture/09_16_pipelining.html Introduction to Pipelining]

[[Category:Instruction processing]]