US20050278513A1 - Systems and methods of dynamic branch prediction in a microprocessor - Google Patents
Systems and methods of dynamic branch prediction in a microprocessor
- Publication number
- US20050278513A1 (application No. 11/132,423)
- Authority
- US
- United States
- Prior art keywords
- branch prediction
- branch
- microprocessor
- bpu
- instruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3861—Recovery, e.g. branch miss-prediction, exception handling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F5/00—Methods or arrangements for data conversion without changing the order or content of the data handled
- G06F5/01—Methods or arrangements for data conversion without changing the order or content of the data handled for shifting, e.g. justifying, scaling, normalising
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/3648—Software debugging using additional hardware
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/78—Architectures of general purpose stored program computers comprising a single central processing unit
- G06F15/7867—Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30032—Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30145—Instruction analysis, e.g. decoding, instruction word fields
- G06F9/30149—Instruction analysis, e.g. decoding, instruction word fields of variable length instructions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30181—Instruction operation extension or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/32—Address formation of the next instruction, e.g. by incrementing the instruction counter
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/32—Address formation of the next instruction, e.g. by incrementing the instruction counter
- G06F9/322—Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address
- G06F9/325—Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address for loops, e.g. loop detection or loop counter
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3802—Instruction prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3804—Instruction prefetching for branches, e.g. hedging, branch folding
- G06F9/3806—Instruction prefetching for branches, e.g. hedging, branch folding using address prediction, e.g. return stack, branch history buffer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3816—Instruction alignment, e.g. cache line crossing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3842—Speculative instruction execution
- G06F9/3844—Speculative instruction execution using dynamic branch prediction, e.g. using branch history tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3842—Speculative instruction execution
- G06F9/3846—Speculative instruction execution using static prediction, e.g. branch taken strategy
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3893—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator
- G06F9/3895—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros
- G06F9/3897—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator for complex operations, e.g. multidimensional or interleaved address generators, macros with adaptable data path
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- This invention relates generally to microprocessor architecture and more specifically to improved systems and methods for performing branch prediction in a multi-stage pipelined microprocessor.
- Multistage pipeline microprocessor architecture is known in the art.
- a typical microprocessor pipeline consists of several stages of instruction handling hardware, wherein each rising pulse of a clock signal propagates instructions one stage further in the pipeline.
- the clock speed dictates the number of clock signals and therefore pipeline propagations per second, the effective operational speed of the processor is dependent partially upon the rate that instructions and operands are transferred between memory and the processor.
- branch prediction has a sufficiently high success rate that the benefits associated with correct predictions outweigh the cost of occasional incorrect predictions—i.e., pipeline flush.
- branch prediction can achieve accuracy over ninety percent of the time.
- dynamic branch prediction records runtime program flow behavior in order to establish a history that can be used at the front of the pipeline to predict future non-sequential program flow.
- the look up table is referenced for the address of the next instruction which is then predictively injected into the pipeline.
- dynamic branch prediction significantly increases performance.
- this technique is initially ineffective, and can even reduce system performance until a sufficient number of instructions have been processed to fill the branch history tables. Because of the required “warm-up” period for this technique to become effective, runtime behavior of critical code could become unpredictable making it unacceptable for certain embedded applications.
- mistaken branch predictions result in a flush of the entire pipeline wasting clock cycles and retarding performance.
- Various embodiments of the invention may ameliorate or overcome one or more of the shortcomings of conventional branch prediction techniques through a hybrid branch prediction technique that takes advantage of features of both static and dynamic branch prediction.
- At least one exemplary embodiment of the invention may provide a method of performing branch prediction in a microprocessor having a multi-stage instruction pipeline.
- the method of performing branch prediction according to this embodiment comprises building a branch prediction history table of branch prediction data through static branch prediction prior to microprocessor deployment, storing the branch prediction data in a memory in the microprocessor, loading the branch prediction data into a branch prediction unit (BPU) of the microprocessor upon powering on, and performing dynamic branch prediction with the BPU based on the preloaded branch prediction data.
- At least one additional exemplary embodiment of the invention may provide a method of enhancing branch prediction performance of a multi-stage pipelined microprocessor employing dynamic branch prediction.
- the method of enhancing branch prediction performance comprises performing static branch prediction to build a branch prediction history table of branch prediction data prior to microprocessor deployment, storing the branch prediction history table in a memory in the microprocessor, loading the branch prediction history table into a branch prediction unit (BPU) of the microprocessor, and performing dynamic branch prediction with the BPU based on the preloaded branch prediction data.
- FIG. 1 is a block diagram illustrating a multistage instruction pipeline of a conventional microprocessor core
- FIG. 2 is a flow chart illustrating the steps of a method for performing dynamic branch prediction based on preloaded static branch prediction data in accordance with at least one exemplary embodiment of the invention.
- FIG. 3 is a block diagram illustrating the flow of data into and out of a branch prediction unit in accordance with at least one exemplary embodiment of the invention.
- FIG. 1 illustrates a typical microprocessor core 100 with a multistage instruction pipeline.
- the first stage of the microprocessor core 100 is the instruction fetch stage (FET) 110.
- instructions are retrieved or fetched from instruction RAM 170 based on their N-bit instruction address.
- a copy of the instruction, indexed by its address will be stored in the instruction cache 112.
- future calls to the same instruction may be retrieved from the instruction cache 112, rather than the relatively slower instruction RAM 170.
- the branch prediction unit 114 increases processing speed by predicting whether a branch to a non-sequential instruction will be taken based upon past instruction processing history.
- the BPU 114 contains a branch look-up or prediction table that stores the address of branch instructions and an indication as to whether the branch was taken. Thus, when a branch instruction is fetched, the look-up table is referenced to make a prediction as to the address of the next instruction. As discussed herein, whether or not the prediction is correct will not be known until a later stage of the pipeline. In the example shown in FIG. 1, it will not be known until the sixth stage of the pipeline.
- the next stage of the typical microprocessor core instruction pipeline is the instruction decode stage (DEC) 120, where the actual instruction is decoded into machine language for the processor to interpret. If the instruction involves a branch or a jump, the target address is generated.
- stage (REG) 130 any required operands are read from the register file.
- stage (EXEC) 140 the particular instruction is executed by the appropriate unit.
- Typical execute stage units include a floating point unit 143, a multiplier unit 144, an arithmetic unit 145, a shifter 146, a logical unit 147 and an adder unit 148.
- the result of the execute stage 140 is selected in the select stage (SEL) 150 and finally, this data is written back to the register file by the write back stage (WB) 160.
- the instruction pipeline increments with each clock cycle.
- the present invention discloses a hybrid branch prediction technique that combines the benefits of both dynamic and static branch prediction.
- the technique begins in step 200 and advances to step 205 where static branch prediction is performed offline before final deployment of the processor, but based on applications which will be executed by the microprocessor after deployment.
- this static branch prediction may be performed using the assistance of a compiler or simulator. For example, if the processor is to be deployed in a particular embedded application, such as an electronic device, the simulator can simulate various instructions for the discrete instruction set to be executed by the processor prior to the processor being deployed. By performing static branch prediction, a table of branch history can be fully populated with the actual addresses of the next instruction after a branch instruction is executed.
- Operation of the method then advances to step 220 where, during ordinary operation, dynamic branch prediction is performed based on the preloaded branch prediction data without requiring a warm-up period or without unstable results. Then, in step 225, after resolving each branch in the selection stage of the multistage processor pipeline, the branch prediction table in the BPU is updated with the results to improve accuracy of the prediction information as necessary. Operation of the method terminates in step 230.
- the “current” branch prediction table may be stored in non-volatile memory so that each time the processor is powered up, the most recent branch prediction data is loaded into the BPU.
- the pipeline must be flushed and the correct instruction address injected back at the fetch stage 310.
- the look-up table 316 is updated with the actual address of the next instruction so that it will be available for the next instance of that branch instruction.
Abstract
A hybrid branch prediction scheme for a multi-stage pipelined microprocessor that combines features of static and dynamic branch prediction to reduce complexity and enhance performance over conventional branch prediction techniques. Prior to microprocessor deployment, a branch prediction table is populated using static branch prediction techniques by executing instructions analogous to those to be executed during microprocessor deployment. The branch prediction table is stored, and then loaded into the BPU during deployment, for example, at the time of microprocessor power on. Dynamic branch prediction is then performed using the pre-loaded data, thereby enabling dynamic branch prediction without a required “warm-up” period. After resolving each branch in the selection stage of the microprocessor instruction pipeline, the BPU is updated with the address of the next instruction that resulted from that branch to enhance performance.
Description
- This application claims priority to provisional application No. 60/572,238 filed May 19, 2004, entitled “Microprocessor Architecture” hereby incorporated by reference in its entirety.
- This invention relates generally to microprocessor architecture and more specifically to improved systems and methods for performing branch prediction in a multi-stage pipelined microprocessor.
- Multistage pipeline microprocessor architecture is known in the art. A typical microprocessor pipeline consists of several stages of instruction handling hardware, wherein each rising pulse of a clock signal propagates instructions one stage further in the pipeline. Although the clock speed dictates the number of clock signals and therefore pipeline propagations per second, the effective operational speed of the processor is dependent partially upon the rate that instructions and operands are transferred between memory and the processor.
- One method of increasing processor performance is branch prediction. Branch prediction uses instruction history to predict whether a branch or non-sequential instruction will be taken. Branch or non-sequential instructions are processor instructions that require a jump to a non-sequential memory address if a condition is satisfied. When an instruction is retrieved or fetched, if the instruction is a conditional branch, the result of the conditional branch, that is, the address of the next instruction to be executed following the conditional branch, is speculatively predicted based on past branch history. This predictive or speculative result is injected into the pipeline by referencing a branch history table. Whether or not the prediction is correct will not be known until a later stage of the pipeline. However, if the prediction is correct, several clock cycles will be saved by not having to go back to get the next non-sequential instruction address.
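- The table lookup described above can be illustrated with a minimal software sketch. It is not the patented hardware: the class name `BranchHistoryTable`, its methods, and the example addresses are hypothetical, and the model assumes a table keyed by branch address that simply remembers the last observed next-instruction address.

```python
class BranchHistoryTable:
    """Toy model (assumption): maps a branch address to the last observed next address."""

    def __init__(self):
        self.table = {}

    def predict(self, branch_addr, fallthrough_addr):
        # With no recorded history, assume sequential flow (the fall-through address).
        return self.table.get(branch_addr, fallthrough_addr)

    def update(self, branch_addr, actual_next_addr):
        # Called once the branch is resolved later in the pipeline.
        self.table[branch_addr] = actual_next_addr


bht = BranchHistoryTable()
first_guess = bht.predict(0x100, 0x104)   # no history yet for the branch at 0x100
bht.update(0x100, 0x080)                  # branch resolved: it jumped to 0x080
second_guess = bht.predict(0x100, 0x104)  # history now points at 0x080
```

- In this sketch the first prediction falls back to the sequential address, while the second reuses the recorded target, which is the behavior the paragraph above attributes to a branch history table.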
- If the prediction is incorrect, the part of the pipeline behind the stage in which the prediction is determined to be incorrect must be flushed and the correct branch inserted back in the first stage. This may seem like a severe penalty in the event of an incorrect prediction, but it costs the same number of clock cycles as if no branch prediction were used. Moreover, in applications where small loops are repeated many times, such as applications typically implemented with embedded processors, branch prediction has a sufficiently high success rate that the benefits associated with correct predictions outweigh the cost of occasional incorrect predictions, i.e., pipeline flushes. In these types of embedded applications, branch prediction can achieve an accuracy of over ninety percent. Thus, the risk of predicting an incorrect branch resulting in a pipeline flush is outweighed by the benefit of saved clock cycles.
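- The trade-off above can be made concrete with a small back-of-the-envelope calculation. The function name and the 4-cycle flush penalty are assumptions chosen for illustration only; the source gives no penalty figure. The model treats "no branch prediction" as paying the penalty on every non-sequential branch, in line with the statement that a misprediction costs the same as having no prediction at all.

```python
def expected_flush_cycles_per_branch(accuracy, flush_penalty_cycles):
    """Average cycles lost per branch if a misprediction costs a full flush penalty."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return (1.0 - accuracy) * flush_penalty_cycles


# Hypothetical numbers: a 4-cycle penalty, compared at 0% and 90% accuracy.
no_prediction = expected_flush_cycles_per_branch(0.0, 4)
with_prediction = expected_flush_cycles_per_branch(0.9, 4)
```

- Under these assumed numbers, the average loss per branch drops from 4 cycles to roughly 0.4 cycles at ninety percent accuracy.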
- There are essentially two techniques for implementing branch prediction. The first, dynamic branch prediction, records runtime program flow behavior in order to establish a history that can be used at the front of the pipeline to predict future non-sequential program flow. When a branch instruction comes in, the look up table is referenced for the address of the next instruction which is then predictively injected into the pipeline. Once the look up table is populated with a sufficient amount of data, dynamic branch prediction significantly increases performance. However, this technique is initially ineffective, and can even reduce system performance until a sufficient number of instructions have been processed to fill the branch history tables. Because of the required “warm-up” period for this technique to become effective, runtime behavior of critical code could become unpredictable making it unacceptable for certain embedded applications. Moreover, as noted above, mistaken branch predictions result in a flush of the entire pipeline wasting clock cycles and retarding performance.
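- The "warm-up" effect can be sketched by replaying a branch trace against an initially empty table. This is a simplified model under stated assumptions: the function `count_mispredictions`, the trace format, and the loop addresses are hypothetical, and the table stores only the last observed next address per branch.

```python
def count_mispredictions(trace, table=None):
    """Replay (branch_addr, fallthrough_addr, actual_next_addr) tuples and count misses."""
    table = {} if table is None else dict(table)
    misses = 0
    for branch_addr, fallthrough_addr, actual_next in trace:
        # Predict from history, or assume sequential flow when no history exists.
        predicted = table.get(branch_addr, fallthrough_addr)
        if predicted != actual_next:
            misses += 1  # would trigger a pipeline flush in hardware
        table[branch_addr] = actual_next  # learn from the resolved branch
    return misses


# Hypothetical loop: the branch at 0x20 jumps back to 0x10 nine times, then exits to 0x24.
loop_trace = [(0x20, 0x24, 0x10)] * 9 + [(0x20, 0x24, 0x24)]
cold_misses = count_mispredictions(loop_trace)
warm_misses = count_mispredictions(loop_trace, table={0x20: 0x10})
```

- With an empty table, the first encounter of the loop branch is mispredicted in addition to the loop exit; with an entry already present, only the exit is missed. Every distinct branch in a program would add a similar cold miss, which is the start-up cost the paragraph describes.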
- The other primary branch prediction technique is static branch prediction. Static branch prediction uses profiling techniques to guide the compiler to generate special branch instructions. These special branch instructions typically include hints to guide the processor to perform speculative branch prediction earlier in the pipeline when not all information required for branch resolution is yet available. However, a disadvantage of static branch prediction techniques is that they typically complicate the processor pipeline design because speculative as well as actual branch resolution has to be performed in several pipeline stages. Complication of design translates to increased silicon footprint and higher cost. Static branch prediction techniques can yield accurate results but they cannot cope with variation of run-time conditions. Therefore, static branch prediction also suffers from limitations which reduce its appeal for critical embedded applications.
- Thus, it would be desirable to have a branch prediction technique that ameliorates and ideally eliminates one or more of the above-noted deficiencies of conventional branch prediction techniques. However, it should be appreciated that the description herein of various advantages and disadvantages associated with known apparatus, methods, and materials is not intended to limit the scope of the invention to their exclusion. Indeed, various embodiments of the invention may include one or more of the known apparatus, methods, and materials without suffering from their disadvantages.
- As background to the techniques discussed herein, the following references are incorporated herein by reference: U.S. Pat. No. 6,862,563 issued Mar. 1, 2005 entitled “Method And Apparatus For Managing The Configuration And Functionality Of A Semiconductor Design” (Hakewill et al.); U.S. Ser. No. 10/423,745 filed Apr. 25, 2003, entitled “Apparatus and Method for Managing Integrated Circuit Designs”; and U.S. Ser. No. 10/651,560 filed Aug. 29, 2003, entitled “Improved Computerized Extension Apparatus and Methods”, all assigned to the assignee of the present invention.
- Various embodiments of the invention may ameliorate or overcome one or more of the shortcomings of conventional branch prediction techniques through a hybrid branch prediction technique that takes advantage of features of both static and dynamic branch prediction.
- At least one exemplary embodiment of the invention may provide a method of performing branch prediction in a microprocessor having a multi-stage instruction pipeline. The method of performing branch prediction according to this embodiment comprises building a branch prediction history table of branch prediction data through static branch prediction prior to microprocessor deployment, storing the branch prediction data in a memory in the microprocessor, loading the branch prediction data into a branch prediction unit (BPU) of the microprocessor upon powering on, and performing dynamic branch prediction with the BPU based on the preloaded branch prediction data.
- At least one additional exemplary embodiment of the invention may provide a method of enhancing branch prediction performance of a multi-stage pipelined microprocessor employing dynamic branch prediction. The method of enhancing branch prediction performance according to this embodiment comprises performing static branch prediction to build a branch prediction history table of branch prediction data prior to microprocessor deployment, storing the branch prediction history table in a memory in the microprocessor, loading the branch prediction history table into a branch prediction unit (BPU) of the microprocessor, and performing dynamic branch prediction with the BPU based on the preloaded branch prediction data.
- Yet an additional exemplary embodiment of the invention may provide an embedded microprocessor architecture. The embedded microprocessor architecture according to this embodiment comprises a multi-stage instruction pipeline, and a BPU adapted to perform dynamic branch prediction, wherein the BPU is preloaded with a branch history table created through static branch prediction, and subsequently updated to contain the actual address of the next instruction that resulted from that branch during dynamic branch prediction.
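- The embodiments above share one flow: build a table offline, store it, load it into the BPU at power-on, then predict and update dynamically. The following is a minimal sketch of that flow, not the claimed implementation. The names `build_static_table` and `HybridBPU`, the trace format, and the choice to keep the most frequent next address per branch during profiling are all assumptions made for illustration.

```python
from collections import Counter


def build_static_table(profile_trace):
    """Offline step (assumed policy): keep the most frequent next address per branch."""
    counts = {}
    for branch_addr, actual_next in profile_trace:
        counts.setdefault(branch_addr, Counter())[actual_next] += 1
    return {addr: c.most_common(1)[0][0] for addr, c in counts.items()}


class HybridBPU:
    """Toy BPU that starts from a preloaded table and keeps learning at run time."""

    def __init__(self, stored_table):
        # Models loading the stored table into the BPU at power-on.
        self.table = dict(stored_table)

    def predict(self, branch_addr, fallthrough_addr):
        return self.table.get(branch_addr, fallthrough_addr)

    def update(self, branch_addr, actual_next_addr):
        # Models the update performed after the branch is resolved.
        self.table[branch_addr] = actual_next_addr


# Hypothetical profiling run: the branch at 0x20 loops to 0x10 nine times, exits once.
profile = [(0x20, 0x10)] * 9 + [(0x20, 0x24)]
stored = build_static_table(profile)

bpu = HybridBPU(stored)
first_prediction = bpu.predict(0x20, 0x24)  # already useful on the first encounter
bpu.update(0x20, 0x24)                      # run-time behavior differs: table adapts
adapted_prediction = bpu.predict(0x20, 0x24)
```

- In this sketch the very first prediction after power-on already reflects the profiled behavior, and the later update shows the table still adapting to run-time conditions, which is the combination of static and dynamic features the embodiments describe.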
- Other aspects and advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.
- FIG. 1 is a block diagram illustrating a multistage instruction pipeline of a conventional microprocessor core;
- FIG. 2 is a flow chart illustrating the steps of a method for performing dynamic branch prediction based on preloaded static branch prediction data in accordance with at least one exemplary embodiment of the invention; and
- FIG. 3 is a block diagram illustrating the flow of data into and out of a branch prediction unit in accordance with at least one exemplary embodiment of the invention.
- The following description is intended to convey a thorough understanding of the invention by providing specific embodiments and details involving various aspects of a new and useful microprocessor architecture. It is understood, however, that the invention is not limited to these specific embodiments and details, which are exemplary only. It further is understood that one possessing ordinary skill in the art, in light of known systems and methods, would appreciate the use of the invention for its intended purposes and benefits in any number of alternative embodiments, depending upon specific design and other needs.
- FIG. 1 illustrates a typical microprocessor core 100 with a multistage instruction pipeline. The first stage of the microprocessor core 100 is the instruction fetch stage (FET) 110. In the instruction fetch stage 110, instructions are retrieved or fetched from instruction RAM 170 based on their N-bit instruction address. During instruction fetches, a copy of the instruction, indexed by its address, will be stored in the instruction cache 112. As a result, future calls to the same instruction may be retrieved from the instruction cache 112, rather than the relatively slower instruction RAM 170.
- Another typical component of the
fetch stage 110 of a multi-stage pipelined microprocessor is the branch prediction unit (BPU) 114. The branch prediction unit 114 increases processing speed by predicting whether a branch to a non-sequential instruction will be taken based upon past instruction processing history. The BPU 114 contains a branch look-up or prediction table that stores the address of branch instructions and an indication as to whether the branch was taken. Thus, when a branch instruction is fetched, the look-up table is referenced to make a prediction as to the address of the next instruction. As discussed herein, whether or not the prediction is correct will not be known until a later stage of the pipeline. In the example shown in FIG. 1, it will not be known until the sixth stage of the pipeline.
- With continued reference to
FIG. 1, the next stage of the typical microprocessor core instruction pipeline is the instruction decode stage (DEC) 120, where the actual instruction is decoded into machine language for the processor to interpret. If the instruction involves a branch or a jump, the target address is generated. Next, in stage (REG) 130, any required operands are read from the register file. Then, in stage (EXEC) 140, the particular instruction is executed by the appropriate unit. Typical execute stage units include a floating point unit 143, a multiplier unit 144, an arithmetic unit 145, a shifter 146, a logical unit 147 and an adder unit 148. The result of the execute stage 140 is selected in the select stage (SEL) 150 and finally, this data is written back to the register file by the write back stage (WB) 160. The instruction pipeline increments with each clock cycle.
- Referring now to
FIG. 2, a flow chart illustrating the steps of a method for performing dynamic branch prediction based on preloaded static branch prediction data in accordance with at least one exemplary embodiment of this invention is illustrated. As discussed above, dynamic branch prediction is a technique often employed to increase pipeline performance when software instructions lead to a non-sequential program flow. The problem arises because instructions are sequentially fed into the pipeline, but are not executed until later stages of the pipeline. Thus, the decision as to whether a non-sequential program flow (hereinafter also referred to as a branch) is to be taken or not, is not resolved until the end of the pipeline, but the related decision of which address to use to fetch the next instruction is required at the front of the pipeline. In the absence of branch prediction, the fetch stage would then have to fetch the next instruction after the branch is resolved leaving all stages of the pipeline between the resolution stage and the fetch stage unused. This is an undesired hindrance to performance. As a result, the choice as to which instruction to fetch next is made speculatively or predictively based on historical performance. A branch history table is used in the branch prediction unit (BPU) which indexes non-sequential instructions by their addresses in association with the next instruction taken. After resolving a branch in the select stage of the pipeline, the BPU is updated with the address of the next instruction that resulted from that branch.
- To alleviate the limitations of both dynamic and static branch prediction techniques, the present invention discloses a hybrid branch prediction technique that combines the benefits of both dynamic and static branch prediction. With continued reference to
FIG. 2, the technique begins in step 200 and advances to step 205 where static branch prediction is performed offline before final deployment of the processor, but based on applications which will be executed by the microprocessor after deployment. In various exemplary embodiments, this static branch prediction may be performed with the assistance of a compiler or simulator. For example, if the processor is to be deployed in a particular embedded application, such as an electronic device, the simulator can simulate various instructions for the discrete instruction set to be executed by the processor prior to the processor being deployed. By performing static branch prediction, a table of branch history can be fully populated with the actual addresses of the next instruction after a branch instruction is executed.
- After developing a table of branch prediction data during static branch prediction, operation of the method continues to step 210 where the branch prediction table is stored in memory. In various exemplary embodiments, this step will involve storing the branch prediction table in a non-volatile memory that will be available for future use by the processor. Then, in
step 215, when the processor is deployed in the desired embedded application, the static branch prediction data is preloaded into the branch history table in the BPU. In various exemplary embodiments, the branch prediction data is preloaded at power-up of the microprocessor, such as, for example, at power-up of the particular product containing the processor.
- Operation of the method then advances to step 220 where, during ordinary operation, dynamic branch prediction is performed based on the preloaded branch prediction data without requiring a warm-up period and without unstable results. Then, in
step 225, after resolving each branch in the select stage of the multistage processor pipeline, the branch prediction table in the BPU is updated with the results to improve accuracy of the prediction information as necessary. Operation of the method terminates in step 230. It should be appreciated that in various exemplary embodiments, each time the processor is powered down, the “current” branch prediction table may be stored in non-volatile memory so that each time the processor is powered up, the most recent branch prediction data is loaded into the BPU.
- Referring now to
FIG. 3, a block diagram illustrating the flow of data into and out of a branch prediction unit 314 in accordance with at least one exemplary embodiment of the invention is illustrated. In the fetch stage 310 of the instruction pipeline, the BPU 314 maintains a branch prediction look-up table 316 that stores the address of the next instruction indexed by the address of the branch instruction. Thus, when the branch instruction enters the pipeline, the look-up table 316 is referenced by the instruction's address. The address of the next instruction is taken from the table 316 and injected in the pipeline directly following the branch instruction. Therefore, if the branch is taken then the next instruction address is available at the next clock signal. If the branch is not taken, the pipeline must be flushed and the correct instruction address injected back at the fetch stage 310. In the event that a pipeline flush is required, the look-up table 316 is updated with the actual address of the next instruction so that it will be available for the next instance of that branch instruction.
- While the foregoing description includes many details and specificities, it is to be understood that these have been included for purposes of explanation only. The embodiments of the present invention are not to be limited in scope by the specific embodiments described herein. For example, although many of the embodiments disclosed herein have been described with reference to branch prediction in embedded RISC-type microprocessors, the principles herein are equally applicable to branch prediction in microprocessors in general. Indeed, various modifications of the embodiments of the present inventions, in addition to those described herein, will be apparent to those of ordinary skill in the art from the foregoing description and accompanying drawings. Thus, such modifications are intended to fall within the scope of the following appended claims.
Further, although the embodiments of the present inventions have been described herein in the context of a particular implementation in a particular environment for a particular purpose, those of ordinary skill in the art will recognize that its usefulness is not limited thereto and that the embodiments of the present inventions can be beneficially implemented in any number of environments for any number of purposes. Accordingly, the claims set forth below should be construed in view of the full breadth and spirit of the embodiments of the present inventions as disclosed herein.
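As a rough illustration of the look-up table behavior described for FIG. 3 and the warm-up benefit described for FIG. 2, the following Python sketch replays a small branch trace against a cold table and against a preloaded one. Everything here is an assumption made for illustration: the function name `run`, the 4-byte instruction size, the workload, and the flush penalty of 4 cycles (the source gives no cycle count). A plain dictionary stands in for look-up table 316.

```python
FLUSH_PENALTY_CYCLES = 4  # assumed cost of refilling the pipeline after a flush


def run(trace, table, instr_bytes=4):
    """Replay (branch_addr, actual_next_addr) pairs against a look-up table.

    Returns (mispredictions, penalty_cycles). When a prediction is wrong,
    the table entry is overwritten with the actual next address so that it
    is available for the next instance of that branch.
    """
    mispredictions = 0
    for branch_addr, actual_next in trace:
        # No entry: fall back to the sequential next instruction.
        predicted = table.get(branch_addr, branch_addr + instr_bytes)
        if predicted != actual_next:
            mispredictions += 1
            table[branch_addr] = actual_next
    return mispredictions, mispredictions * FLUSH_PENALTY_CYCLES


# Hypothetical workload: a loop branch at 0x100 that jumps back to 0x080
# three times and then exits, followed by a branch at 0x140 that falls through.
trace = [(0x100, 0x080)] * 3 + [(0x100, 0x104), (0x140, 0x144)]

cold = run(trace, {})               # empty table at power-on
warm = run(trace, {0x100: 0x080})   # table preloaded from static prediction
```

Under these assumptions the cold table mispredicts twice (the first loop iteration and the loop exit), while the preloaded table mispredicts only at the loop exit, so the preload removes exactly the warm-up misprediction.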
Claims (22)
1. A method of performing branch prediction in a microprocessor having a multistage instruction pipeline, the method comprising:
building a branch prediction history table of branch prediction data through static branch prediction prior to microprocessor deployment;
storing the branch prediction data in a memory;
loading the branch prediction data into a branch prediction unit (BPU) of the microprocessor upon power on; and
performing dynamic branch prediction with the BPU based on the preloaded branch prediction data.
2. The method according to claim 1, further comprising updating the branch prediction data in the BPU if, during instruction processing, prediction data changes.
3. The method according to claim 2 wherein updating comprises after resolving a branch in a select stage of the instruction pipeline, updating the BPU with the address of a next instruction that resulted from that branch.
4. The method according to claim 1, wherein building a branch prediction history table comprises simulating instructions that will be executed by the processor during deployment and populating a table of branch history with information indicating whether conditional branches were taken or not.
5. The method according to claim 4, wherein building comprises using at least one of a simulator and a compiler to generate branch history.
6. The method according to claim 1, wherein performing dynamic branch prediction with the branch prediction unit based on the preloaded branch prediction data comprises parsing a branch history table in the BPU that indexes non-sequential instructions by their addresses in association with the next instruction taken.
7. The method according to claim 1, wherein the microprocessor is an embedded microprocessor.
8. The method according to claim 1, further comprising after performing dynamic branch prediction, storing branch history data in the branch prediction unit in a non-volatile memory for preload upon subsequent microprocessor use.
9. In a multistage pipeline microprocessor employing dynamic branch prediction, the method of enhancing branch prediction performance comprising:
performing static branch prediction to build a branch prediction history table of branch prediction data prior to microprocessor deployment;
storing the branch prediction history table in a memory;
loading the branch prediction history table into a branch prediction unit (BPU) of the microprocessor; and
performing dynamic branch prediction with the BPU based on the preloaded branch prediction data.
10. The method according to claim 9, wherein static branch prediction is performed prior to microprocessor deployment.
11. The method according to claim 9, wherein loading the branch prediction table is performed subsequent to microprocessor power on.
12. The method according to claim 9, further comprising updating the branch prediction data in the BPU if, during instruction processing, prediction data changes.
13. The method according to claim 12, wherein the microprocessor includes an instruction pipeline having a select stage, and updating comprises after resolving a branch in the select stage, updating the BPU with the address of the next instruction resulting from that branch.
14. The method according to claim 9, wherein building a branch prediction history table comprises simulating instructions that will be executed by the processor during deployment and populating a table of branch history with information indicating whether conditional branches were taken or not.
15. The method according to claim 14, wherein building comprises using at least one of a simulator and a compiler to generate branch history.
16. The method according to claim 9, wherein performing dynamic branch prediction with the branch prediction unit based on the preloaded branch prediction data comprises parsing a branch history table in the BPU that indexes non-sequential instructions by their addresses in association with the next instruction taken.
17. The method according to claim 9, wherein the microprocessor is an embedded microprocessor.
18. The method according to claim 9, further comprising after performing dynamic branch prediction, storing branch history data in the branch prediction unit in a non-volatile memory for preload upon subsequent microprocessor use.
19. An embedded microprocessor comprising:
a multistage instruction pipeline; and
a BPU adapted to perform dynamic branch prediction, wherein the BPU is preloaded with branch history table created through static branch prediction, and subsequently updated to contain the actual address of the next instruction that resulted from that branch during dynamic branch prediction.
20. The microprocessor according to claim 19, wherein the branch history table contains data generated prior to microprocessor deployment and the BPU is preloaded at power on of the microprocessor.
21. The microprocessor according to claim 19, wherein after resolving a branch in a select stage of the instruction pipeline, the BPU is updated to contain the address of the next instruction that resulted from that branch.
22. The microprocessor according to claim 19, wherein the BPU is preloaded with a branch history table created through static branch prediction during a simulation processing that simulated instructions that will be executed by the microprocessor during deployment and wherein the BPU comprises a branch history table that indexes non-sequential instructions by their addresses in association with the next instruction taken.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/132,423 US20050278513A1 (en) | 2004-05-19 | 2005-05-19 | Systems and methods of dynamic branch prediction in a microprocessor |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US57223804P | 2004-05-19 | 2004-05-19 | |
US11/132,423 US20050278513A1 (en) | 2004-05-19 | 2005-05-19 | Systems and methods of dynamic branch prediction in a microprocessor |
Publications (1)
Publication Number | Publication Date |
---|---|
US20050278513A1 (en) | 2005-12-15 |
Family
ID=35429033
Family Applications (7)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/132,428 Abandoned US20050278517A1 (en) | 2004-05-19 | 2005-05-19 | Systems and methods for performing branch prediction in a variable length instruction set microprocessor |
US11/132,424 Active 2031-02-12 US8719837B2 (en) | 2004-05-19 | 2005-05-19 | Microprocessor architecture having extendible logic |
US11/132,447 Abandoned US20050278505A1 (en) | 2004-05-19 | 2005-05-19 | Microprocessor architecture including zero impact predictive data pre-fetch mechanism for pipeline data memory |
US11/132,423 Abandoned US20050278513A1 (en) | 2004-05-19 | 2005-05-19 | Systems and methods of dynamic branch prediction in a microprocessor |
US11/132,448 Abandoned US20050289323A1 (en) | 2004-05-19 | 2005-05-19 | Barrel shifter for a microprocessor |
US11/132,432 Abandoned US20050273559A1 (en) | 2004-05-19 | 2005-05-19 | Microprocessor architecture including unified cache debug unit |
US14/222,194 Active US9003422B2 (en) | 2004-05-19 | 2014-03-21 | Microprocessor architecture having extendible logic |
Country Status (5)
Country | Link |
---|---|
US (7) | US20050278517A1 (en) |
CN (1) | CN101002169A (en) |
GB (1) | GB2428842A (en) |
TW (1) | TW200602974A (en) |
WO (1) | WO2005114441A2 (en) |
Citations (77)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5155843A (en) * | 1990-06-29 | 1992-10-13 | Digital Equipment Corporation | Error transition mode for multi-processor system |
US5327536A (en) * | 1990-05-22 | 1994-07-05 | Nec Corporation | Microprocessor having branch prediction function |
US5339365A (en) * | 1989-08-31 | 1994-08-16 | Canon Kabushiki Kaisha | Image processing apparatus capable of using fuzzy logic |
US5422739A (en) * | 1991-07-08 | 1995-06-06 | Canon Kabushiki Kaisha | Color expressing method, color image reading apparatus and color image processing apparatus |
US5454117A (en) * | 1993-08-25 | 1995-09-26 | Nexgen, Inc. | Configurable branch prediction for a processor performing speculative execution |
US5577217A (en) * | 1993-05-14 | 1996-11-19 | Intel Corporation | Method and apparatus for a branch target buffer with shared branch pattern tables for associated branch predictions |
US5655122A (en) * | 1995-04-05 | 1997-08-05 | Sequent Computer Systems, Inc. | Optimizing compiler with static prediction of branch probability, branch frequency and function frequency |
US5659752A (en) * | 1995-06-30 | 1997-08-19 | International Business Machines Corporation | System and method for improving branch prediction in compiled program code |
US5692168A (en) * | 1994-10-18 | 1997-11-25 | Cyrix Corporation | Prefetch buffer using flow control bit to identify changes of flow within the code stream |
US5752014A (en) * | 1996-04-29 | 1998-05-12 | International Business Machines Corporation | Automatic selection of branch prediction methodology for subsequent branch instruction based on outcome of previous branch prediction |
US5761723A (en) * | 1994-02-04 | 1998-06-02 | Motorola, Inc. | Data processor with branch prediction and method of operation |
US5778423A (en) * | 1990-06-29 | 1998-07-07 | Digital Equipment Corporation | Prefetch instruction for improving performance in reduced instruction set processor |
US5995248A (en) * | 1996-03-22 | 1999-11-30 | Minolta Co., Ltd. | Image forming device and method having MTF correction |
US6076158A (en) * | 1990-06-29 | 2000-06-13 | Digital Equipment Corporation | Branch prediction in high-performance processor |
US6151672A (en) * | 1998-02-23 | 2000-11-21 | Hewlett-Packard Company | Methods and apparatus for reducing interference in a branch history table of a microprocessor |
US6189091B1 (en) * | 1998-12-02 | 2001-02-13 | Ip First, L.L.C. | Apparatus and method for speculatively updating global history and restoring same on branch misprediction detection |
US6253287B1 (en) * | 1998-09-09 | 2001-06-26 | Advanced Micro Devices, Inc. | Using three-dimensional storage to make variable-length instructions appear uniform in two dimensions |
US20010021974A1 (en) * | 2000-02-01 | 2001-09-13 | Samsung Electronics Co., Ltd. | Branch predictor suitable for multi-processing microprocessor |
US20010032309A1 (en) * | 1999-03-18 | 2001-10-18 | Henry G. Glenn | Static branch prediction mechanism for conditional branch instructions |
US20010040686A1 (en) * | 1998-06-26 | 2001-11-15 | Heidi M. Schoolcraft | Streamlined tetrahedral interpolation |
US20010044892A1 (en) * | 1997-09-10 | 2001-11-22 | Shinichi Yamaura | Method and system for high performance implementation of microprocessors |
US20010056531A1 (en) * | 1998-03-19 | 2001-12-27 | Mcfarling Scott | Branch predictor with serially connected predictor stages for improving branch prediction accuracy |
US6339822B1 (en) * | 1998-10-02 | 2002-01-15 | Advanced Micro Devices, Inc. | Using padded instructions in a block-oriented cache |
US20020066006A1 (en) * | 2000-11-29 | 2002-05-30 | Lsi Logic Corporation | Simple branch prediction and misprediction recovery method |
US20020069351A1 (en) * | 2000-12-05 | 2002-06-06 | Shyh-An Chi | Memory data access structure and method suitable for use in a processor |
US20020073301A1 (en) * | 2000-12-07 | 2002-06-13 | International Business Machines Corporation | Hardware for use with compiler generated branch information |
US20020078332A1 (en) * | 2000-12-19 | 2002-06-20 | Seznec Andre C. | Conflict free parallel read access to a bank interleaved branch predictor in a processor |
US20020083312A1 (en) * | 2000-12-27 | 2002-06-27 | Balaram Sinharoy | Branch Prediction apparatus and process for restoring replaced branch history for use in future branch predictions for an executing program |
US20020087851A1 (en) * | 2000-12-28 | 2002-07-04 | Matsushita Electric Industrial Co., Ltd. | Microprocessor and an instruction converter |
US20020087852A1 (en) * | 2000-12-28 | 2002-07-04 | Jourdan Stephan J. | Method and apparatus for predicting branches using a meta predictor |
US6427206B1 (en) * | 1999-05-03 | 2002-07-30 | Intel Corporation | Optimized branch predictions for strongly predicted compiler branches |
US20020138236A1 (en) * | 2001-03-21 | 2002-09-26 | Akihiro Takamura | Processor having execution result prediction function for instruction |
US20020157000A1 (en) * | 2001-03-01 | 2002-10-24 | International Business Machines Corporation | Software hint to improve the branch target prediction accuracy |
US6477683B1 (en) * | 1999-02-05 | 2002-11-05 | Tensilica, Inc. | Automated processor generation system for designing a configurable processor and method for the same |
US20020188833A1 (en) * | 2001-05-04 | 2002-12-12 | Ip First Llc | Dual call/return stack branch prediction system |
US20020194464A1 (en) * | 2001-05-04 | 2002-12-19 | Ip First Llc | Speculative branch target address cache with selective override by seconday predictor based on branch instruction type |
US20020194462A1 (en) * | 2001-05-04 | 2002-12-19 | Ip First Llc | Apparatus and method for selecting one of multiple target addresses stored in a speculative branch target address cache per instruction cache line |
US20020194463A1 (en) * | 2001-05-04 | 2002-12-19 | Ip First Llc | Speculative hybrid branch direction predictor |
US20020194461A1 (en) * | 2001-05-04 | 2002-12-19 | Ip First Llc | Speculative branch target address cache |
US20020199092A1 (en) * | 1999-11-05 | 2002-12-26 | Ip-First Llc | Split history tables for branch prediction |
US20030023838A1 (en) * | 2001-07-27 | 2003-01-30 | Karim Faraydon O. | Novel fetch branch architecture for reducing branch penalty without branch prediction |
US6560754B1 (en) * | 1999-05-13 | 2003-05-06 | Arc International Plc | Method and apparatus for jump control in a pipelined processor |
US20030204705A1 (en) * | 2002-04-30 | 2003-10-30 | Oldfield William H. | Prediction of branch instructions in a data processing apparatus |
US6647491B2 (en) * | 1999-02-18 | 2003-11-11 | Hewlett-Packard Development Company, L.P. | Hardware/software system for profiling instructions and selecting a trace using branch history information for branch predictions |
US6681295B1 (en) * | 2000-08-31 | 2004-01-20 | Hewlett-Packard Development Company, L.P. | Fast lane prefetching |
US20040015683A1 (en) * | 2002-07-18 | 2004-01-22 | International Business Machines Corporation | Two dimensional branch history table prefetching mechanism |
US20040049660A1 (en) * | 2002-09-06 | 2004-03-11 | Mips Technologies, Inc. | Method and apparatus for clearing hazards using jump instructions |
US20040068643A1 (en) * | 1997-08-01 | 2004-04-08 | Dowling Eric M. | Method and apparatus for high performance branching in pipelined microsystems |
US6763452B1 (en) * | 1999-01-28 | 2004-07-13 | Ati International Srl | Modifying program execution based on profiling |
US20040139281A1 (en) * | 2003-01-14 | 2004-07-15 | Ip-First, Llc. | Apparatus and method for efficiently updating branch target address cache |
US20040172524A1 (en) * | 2001-06-29 | 2004-09-02 | Jan Hoogerbrugge | Method, apparatus and compiler for predicting indirect branch target addresses |
US20040186985A1 (en) * | 2003-03-21 | 2004-09-23 | Analog Devices, Inc. | Method and apparatus for branch prediction based on branch targets |
US20040193855A1 (en) * | 2003-03-31 | 2004-09-30 | Nicolas Kacevas | System and method for branch prediction access |
US20040193843A1 (en) * | 2003-03-31 | 2004-09-30 | Eran Altshuler | System and method for early branch prediction |
US20040225870A1 (en) * | 2003-05-07 | 2004-11-11 | Srinivasan Srikanth T. | Method and apparatus for reducing wrong path execution in a speculative multi-threaded processor |
US20040225872A1 (en) * | 2002-06-04 | 2004-11-11 | International Business Machines Corporation | Hybrid branch prediction using a global selection counter and a prediction method comparison table |
US20040225871A1 (en) * | 1999-10-01 | 2004-11-11 | Naohiko Irie | Branch control memory |
US20040230782A1 (en) * | 2003-05-12 | 2004-11-18 | International Business Machines Corporation | Method and system for processing loop branch instructions |
US20040255104A1 (en) * | 2003-06-12 | 2004-12-16 | Intel Corporation | Method and apparatus for recycling candidate branch outcomes after a wrong-path execution in a superscalar processor |
US20040268102A1 (en) * | 2003-06-30 | 2004-12-30 | Combs Jonathan D. | Mechanism to remove stale branch predictions at a microprocessor |
US20050027974A1 (en) * | 2003-07-31 | 2005-02-03 | Oded Lempel | Method and system for conserving resources in an instruction pipeline |
US20050050309A1 (en) * | 2003-08-29 | 2005-03-03 | Renesas Technology Corp. | Data processor |
US20050066305A1 (en) * | 2003-09-22 | 2005-03-24 | Lisanke Robert John | Method and machine for efficient simulation of digital hardware within a software development environment |
US20050076193A1 (en) * | 2003-09-08 | 2005-04-07 | Ip-First, Llc. | Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence |
US20050091479A1 (en) * | 2003-10-24 | 2005-04-28 | Sung-Woo Chung | Branch predictor, system and method of branch prediction |
US20050125634A1 (en) * | 2002-10-04 | 2005-06-09 | Fujitsu Limited | Processor and instruction control method |
US20050125632A1 (en) * | 2003-12-03 | 2005-06-09 | Advanced Micro Devices, Inc. | Transitioning from instruction cache to trace cache on label boundaries |
US20050125613A1 (en) * | 2003-12-03 | 2005-06-09 | Sangwook Kim | Reconfigurable trace cache |
US20050154867A1 (en) * | 2004-01-14 | 2005-07-14 | International Business Machines Corporation | Autonomic method and apparatus for counting branch instructions to improve branch predictions |
US20050172277A1 (en) * | 2004-02-04 | 2005-08-04 | Saurabh Chheda | Energy-focused compiler-assisted branch prediction |
US6948052B2 (en) * | 1991-07-08 | 2005-09-20 | Seiko Epson Corporation | High-performance, superscalar-based computer system with out-of-order instruction execution |
US20050216703A1 (en) * | 2004-03-26 | 2005-09-29 | International Business Machines Corporation | Apparatus and method for decreasing the latency between an instruction cache and a pipeline processor |
US20050216713A1 (en) * | 2004-03-25 | 2005-09-29 | International Business Machines Corporation | Instruction text controlled selectively stated branches for prediction via a branch target buffer |
US20050223202A1 (en) * | 2004-03-31 | 2005-10-06 | Intel Corporation | Branch prediction in a pipelined processor |
US20060015706A1 (en) * | 2004-06-30 | 2006-01-19 | Chunrong Lai | TLB correlated branch predictor and method for use thereof |
US20060036836A1 (en) * | 1998-12-31 | 2006-02-16 | Metaflow Technologies, Inc. | Block-based branch target buffer |
US20060041868A1 (en) * | 2004-08-23 | 2006-02-23 | Cheng-Yen Huang | Method for verifying branch prediction mechanism and accessible recording medium for storing program thereof |
Family Cites Families (145)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4342082A (en) | 1977-01-13 | 1982-07-27 | International Business Machines Corp. | Program instruction mechanism for shortened recursive handling of interruptions |
US4216539A (en) | 1978-05-05 | 1980-08-05 | Zehntel, Inc. | In-circuit digital tester |
US4400773A (en) | 1980-12-31 | 1983-08-23 | International Business Machines Corp. | Independent handling of I/O interrupt requests and associated status information transfers |
US4594659A (en) * | 1982-10-13 | 1986-06-10 | Honeywell Information Systems Inc. | Method and apparatus for prefetching instructions for a central execution pipeline unit |
JPS63225822A (en) | 1986-08-11 | 1988-09-20 | Toshiba Corp | Barrel shifter |
US4905178A (en) | 1986-09-19 | 1990-02-27 | Performance Semiconductor Corporation | Fast shifter method and structure |
JPS6398729A (en) | 1986-10-15 | 1988-04-30 | Fujitsu Ltd | Barrel shifter |
US4914622A (en) | 1987-04-17 | 1990-04-03 | Advanced Micro Devices, Inc. | Array-organized bit map with a barrel shifter |
DE3889812T2 (en) | 1987-08-28 | 1994-12-15 | Nec Corp | Data processor with a test structure for multi-position shifters. |
KR970005453B1 (en) * | 1987-12-25 | 1997-04-16 | 가부시기가이샤 히다찌세이사꾸쇼 | Data processing apparatus for high speed processing |
US4926323A (en) | 1988-03-03 | 1990-05-15 | Advanced Micro Devices, Inc. | Streamlined instruction processor |
JPH01263820A (en) | 1988-04-15 | 1989-10-20 | Hitachi Ltd | Microprocessor |
EP0344347B1 (en) | 1988-06-02 | 1993-12-29 | Deutsche ITT Industries GmbH | Digital signal processing unit |
GB2229832B (en) | 1989-03-30 | 1993-04-07 | Intel Corp | Byte swap instruction for memory format conversion within a microprocessor |
JPH03185530A (en) | 1989-12-14 | 1991-08-13 | Mitsubishi Electric Corp | Data processor |
DE69030648T2 (en) * | 1990-01-02 | 1997-11-13 | Motorola Inc | Method for sequential prefetching of 1-word, 2-word or 3-word instructions |
JPH03248226A (en) | 1990-02-26 | 1991-11-06 | Nec Corp | Microprocessor |
JP2556612B2 (en) | 1990-08-29 | 1996-11-20 | 日本電気アイシーマイコンシステム株式会社 | Barrel shifter circuit |
US5636363A (en) | 1991-06-14 | 1997-06-03 | Integrated Device Technology, Inc. | Hardware control structure and method for off-chip monitoring entries of an on-chip cache |
US5493687A (en) | 1991-07-08 | 1996-02-20 | Seiko Epson Corporation | RISC microprocessor architecture implementing multiple typed register sets |
US5450586A (en) * | 1991-08-14 | 1995-09-12 | Hewlett-Packard Company | System for analyzing and debugging embedded software through dynamic and interactive use of code markers |
CA2073516A1 (en) | 1991-11-27 | 1993-05-28 | Peter Michael Kogge | Dynamic multi-mode parallel processor array architecture computer system |
US5423011A (en) * | 1992-06-11 | 1995-06-06 | International Business Machines Corporation | Apparatus for initializing branch prediction information |
US5485625A (en) | 1992-06-29 | 1996-01-16 | Ford Motor Company | Method and apparatus for monitoring external events during a microprocessor's sleep mode |
US5274770A (en) | 1992-07-29 | 1993-12-28 | Tritech Microelectronics International Pte Ltd. | Flexible register-based I/O microcontroller with single cycle instruction execution |
US5294928A (en) | 1992-08-31 | 1994-03-15 | Microchip Technology Incorporated | A/D converter with zero power mode |
US5333119A (en) | 1992-09-30 | 1994-07-26 | Regents Of The University Of Minnesota | Digital signal processor with delayed-evaluation array multipliers and low-power memory addressing |
US5542074A (en) | 1992-10-22 | 1996-07-30 | Maspar Computer Corporation | Parallel processor system with highly flexible local control capability, including selective inversion of instruction signal and control of bit shift amount |
US5696958A (en) | 1993-01-11 | 1997-12-09 | Silicon Graphics, Inc. | Method and apparatus for reducing delays following the execution of a branch instruction in an instruction pipeline |
GB2275119B (en) * | 1993-02-03 | 1997-05-14 | Motorola Inc | A cached processor |
JPH06332693A (en) | 1993-05-27 | 1994-12-02 | Hitachi Ltd | Issuing system of suspending instruction with time-out function |
US5584031A (en) | 1993-11-09 | 1996-12-10 | Motorola Inc. | System and method for executing a low power delay instruction |
JP2801135B2 (en) | 1993-11-26 | 1998-09-21 | 富士通株式会社 | Instruction reading method and instruction reading device for pipeline processor |
US5509129A (en) | 1993-11-30 | 1996-04-16 | Guttag; Karl M. | Long instruction word controlling plural independent processor operations |
US5590350A (en) | 1993-11-30 | 1996-12-31 | Texas Instruments Incorporated | Three input arithmetic logic unit with mask generator |
US6116768A (en) | 1993-11-30 | 2000-09-12 | Texas Instruments Incorporated | Three input arithmetic logic unit with barrel rotator |
US5590351A (en) | 1994-01-21 | 1996-12-31 | Advanced Micro Devices, Inc. | Superscalar execution unit for sequential instruction pointer updates and segment limit checks |
JPH07253922A (en) * | 1994-03-14 | 1995-10-03 | Texas Instr Japan Ltd | Address generating circuit |
US5530825A (en) | 1994-04-15 | 1996-06-25 | Motorola, Inc. | Data processor with branch target address cache and method of operation |
US5517436A (en) | 1994-06-07 | 1996-05-14 | Andreas; David C. | Digital signal processor for audio applications |
US5809293A (en) * | 1994-07-29 | 1998-09-15 | International Business Machines Corporation | System and method for program execution tracing within an integrated processor |
US5566357A (en) | 1994-10-06 | 1996-10-15 | Qualcomm Incorporated | Power reduction in a cellular radiotelephone |
JPH08202469A (en) | 1995-01-30 | 1996-08-09 | Fujitsu Ltd | Microcontroller unit equipped with universal asychronous transmitting and receiving circuit |
US5600674A (en) | 1995-03-02 | 1997-02-04 | Motorola Inc. | Method and apparatus of an enhanced digital signal processor |
US5835753A (en) | 1995-04-12 | 1998-11-10 | Advanced Micro Devices, Inc. | Microprocessor with dynamically extendable pipeline stages and a classifying circuit |
US5920711A (en) * | 1995-06-02 | 1999-07-06 | Synopsys, Inc. | System for frame-based protocol, graphical capture, synthesis, analysis, and simulation |
US5768602A (en) | 1995-08-04 | 1998-06-16 | Apple Computer, Inc. | Sleep mode controller for power management |
US5842004A (en) | 1995-08-04 | 1998-11-24 | Sun Microsystems, Inc. | Method and apparatus for decompression of compressed geometric three-dimensional graphics data |
US6292879B1 (en) * | 1995-10-25 | 2001-09-18 | Anthony S. Fong | Method and apparatus to specify access control list and cache enabling and cache coherency requirement enabling on individual operands of an instruction of a computer |
US5727211A (en) * | 1995-11-09 | 1998-03-10 | Chromatic Research, Inc. | System and method for fast context switching between tasks |
US5778438A (en) | 1995-12-06 | 1998-07-07 | Intel Corporation | Method and apparatus for maintaining cache coherency in a computer system with a highly pipelined bus and multiple conflicting snoop requests |
US5774709A (en) | 1995-12-06 | 1998-06-30 | Lsi Logic Corporation | Enhanced branch delay slot handling with single exception program counter |
US5996071A (en) | 1995-12-15 | 1999-11-30 | Via-Cyrix, Inc. | Detecting self-modifying code in a pipelined processor with branch processing by comparing latched store address to subsequent target address |
JP3663710B2 (en) * | 1996-01-17 | 2005-06-22 | ヤマハ株式会社 | Program generation method and processor interrupt control method |
US5896305A (en) | 1996-02-08 | 1999-04-20 | Texas Instruments Incorporated | Shifter circuit for an arithmetic logic unit in a microprocessor |
US5784636A (en) | 1996-05-28 | 1998-07-21 | National Semiconductor Corporation | Reconfigurable computer architecture for use in signal processing applications |
US20010025337A1 (en) | 1996-06-10 | 2001-09-27 | Frank Worrell | Microprocessor including a mode detector for setting compression mode |
US5826079A (en) | 1996-07-05 | 1998-10-20 | Ncr Corporation | Method for improving the execution efficiency of frequently communicating processes utilizing affinity process scheduling by identifying and assigning the frequently communicating processes to the same processor |
US5964884A (en) | 1996-09-30 | 1999-10-12 | Advanced Micro Devices, Inc. | Self-timed pulse control circuit |
US5805876A (en) * | 1996-09-30 | 1998-09-08 | International Business Machines Corporation | Method and system for reducing average branch resolution time and effective misprediction penalty in a processor |
US5848264A (en) * | 1996-10-25 | 1998-12-08 | S3 Incorporated | Debug and video queue for multi-processor chip |
US6058142A (en) | 1996-11-29 | 2000-05-02 | Sony Corporation | Image processing apparatus |
US5909572A (en) | 1996-12-02 | 1999-06-01 | Compaq Computer Corp. | System and method for conditionally moving an operand from a source register to a destination register |
US6061521A (en) | 1996-12-02 | 2000-05-09 | Compaq Computer Corp. | Computer having multimedia operations executable as two distinct sets of operations within a single instruction cycle |
EP0855645A3 (en) * | 1996-12-31 | 2000-05-24 | Texas Instruments Incorporated | System and method for speculative execution of instructions with data prefetch |
KR100236533B1 (en) | 1997-01-16 | 2000-01-15 | 윤종용 | Digital signal processor |
EP0855718A1 (en) | 1997-01-28 | 1998-07-29 | Hewlett-Packard Company | Memory low power mode control |
US6154857A (en) * | 1997-04-08 | 2000-11-28 | Advanced Micro Devices, Inc. | Microprocessor-based device incorporating a cache for capturing software performance profiling data |
US6185732B1 (en) | 1997-04-08 | 2001-02-06 | Advanced Micro Devices, Inc. | Software debug port for a microprocessor |
US6584525B1 (en) | 1998-11-19 | 2003-06-24 | Edwin E. Klingman | Adaptation of standard microprocessor architectures via an interface to a configurable subsystem |
US6021500A (en) | 1997-05-07 | 2000-02-01 | Intel Corporation | Processor with sleep and deep sleep modes |
US5950120A (en) | 1997-06-17 | 1999-09-07 | Lsi Logic Corporation | Apparatus and method for shutdown of wireless communications mobile station with multiple clocks |
US5931950A (en) | 1997-06-17 | 1999-08-03 | Pc-Tel, Inc. | Wake-up-on-ring power conservation for host signal processing communication system |
US5808876A (en) * | 1997-06-20 | 1998-09-15 | International Business Machines Corporation | Multi-function power distribution system |
US6035374A (en) | 1997-06-25 | 2000-03-07 | Sun Microsystems, Inc. | Method of executing coded instructions in a multiprocessor having shared execution resources including active, nap, and sleep states in accordance with cache miss latency |
US6088786A (en) | 1997-06-27 | 2000-07-11 | Sun Microsystems, Inc. | Method and system for coupling a stack based processor to register based functional unit |
US5878264A (en) | 1997-07-17 | 1999-03-02 | Sun Microsystems, Inc. | Power sequence controller with wakeup logic for enabling a wakeup interrupt handler procedure |
US6760833B1 (en) | 1997-08-01 | 2004-07-06 | Micron Technology, Inc. | Split embedded DRAM processor |
US6026478A (en) | 1997-08-01 | 2000-02-15 | Micron Technology, Inc. | Split embedded DRAM processor |
US6226738B1 (en) | 1997-08-01 | 2001-05-01 | Micron Technology, Inc. | Split embedded DRAM processor |
JPH11143571A (en) | 1997-11-05 | 1999-05-28 | Mitsubishi Electric Corp | Data processor |
US5978909A (en) * | 1997-11-26 | 1999-11-02 | Intel Corporation | System for speculative branch target prediction having a dynamic prediction history buffer and a static prediction history buffer |
US6044458A (en) * | 1997-12-12 | 2000-03-28 | Motorola, Inc. | System for monitoring program flow utilizing fixwords stored sequentially to opcodes |
US6014743A (en) | 1998-02-05 | 2000-01-11 | Integrated Device Technology, Inc. | Apparatus and method for recording a floating point error pointer in zero cycles |
US6289417B1 (en) | 1998-05-18 | 2001-09-11 | Arm Limited | Operand supply to an execution unit |
US6308279B1 (en) | 1998-05-22 | 2001-10-23 | Intel Corporation | Method and apparatus for power mode transition in a multi-thread processor |
JPH11353225A (en) | 1998-05-26 | 1999-12-24 | Internatl Business Mach Corp <Ibm> | Memory that processor addressing gray code system in sequential execution style accesses and method for storing code and data in memory |
US20020053015A1 (en) | 1998-07-14 | 2002-05-02 | Sony Corporation And Sony Electronics Inc. | Digital signal processor particularly suited for decoding digital audio |
US6327651B1 (en) | 1998-09-08 | 2001-12-04 | International Business Machines Corporation | Wide shifting in the vector permute unit |
US6240521B1 (en) | 1998-09-10 | 2001-05-29 | International Business Machines Corp. | Sleep mode transition between processors sharing an instruction set and an address space |
US6347379B1 (en) | 1998-09-25 | 2002-02-12 | Intel Corporation | Reducing power consumption of an electronic device |
US6862563B1 (en) | 1998-10-14 | 2005-03-01 | Arc International | Method and apparatus for managing the configuration and functionality of a semiconductor design |
US6671743B1 (en) * | 1998-11-13 | 2003-12-30 | Creative Technology, Ltd. | Method and system for exposing proprietary APIs in a privileged device driver to an application |
EP1351154A2 (en) * | 1998-11-20 | 2003-10-08 | Altera Corporation | Reconfigurable programmable logic device computer system |
US6341348B1 (en) * | 1998-12-03 | 2002-01-22 | Sun Microsystems, Inc. | Software branch prediction filtering for a microprocessor |
US6438700B1 (en) | 1999-05-18 | 2002-08-20 | Koninklijke Philips Electronics N.V. | System and method to reduce power consumption in advanced RISC machine (ARM) based systems |
US6622240B1 (en) * | 1999-06-18 | 2003-09-16 | Intrinsity, Inc. | Method and apparatus for pre-branch instruction |
JP2001034504A (en) | 1999-07-19 | 2001-02-09 | Mitsubishi Electric Corp | Source level debugger |
US6571333B1 (en) | 1999-11-05 | 2003-05-27 | Intel Corporation | Initializing a memory controller by executing software in second memory to wakeup a system |
US6609194B1 (en) | 1999-11-12 | 2003-08-19 | Ip-First, Llc | Apparatus for performing branch target address calculation based on branch type |
US6909744B2 (en) | 1999-12-09 | 2005-06-21 | Redrock Semiconductor, Inc. | Processor architecture for compression and decompression of video and images |
US6412038B1 (en) | 2000-02-14 | 2002-06-25 | Intel Corporation | Integral modular cache for a processor |
JP2001282548A (en) | 2000-03-29 | 2001-10-12 | Matsushita Electric Ind Co Ltd | Communication equipment and communication method |
US6519696B1 (en) | 2000-03-30 | 2003-02-11 | I.P. First, Llc | Paired register exchange using renaming register map |
US6718460B1 (en) * | 2000-09-05 | 2004-04-06 | Sun Microsystems, Inc. | Mechanism for error handling in a computer system |
US20030070013A1 (en) | 2000-10-27 | 2003-04-10 | Daniel Hansson | Method and apparatus for reducing power consumption in a digital processor |
US6963554B1 (en) * | 2000-12-27 | 2005-11-08 | National Semiconductor Corporation | Microwire dynamic sequencer pipeline stall |
US6925634B2 (en) | 2001-01-24 | 2005-08-02 | Texas Instruments Incorporated | Method for maintaining cache coherency in software in a shared memory system |
US7039901B2 (en) * | 2001-01-24 | 2006-05-02 | Texas Instruments Incorporated | Software shared memory bus |
EP1381957A2 (en) | 2001-03-02 | 2004-01-21 | Atsana Semiconductor Corp. | Data processing apparatus and system and method for controlling memory access |
US7010558B2 (en) | 2001-04-19 | 2006-03-07 | Arc International | Data processor with enhanced instruction execution and method |
GB0112275D0 (en) | 2001-05-21 | 2001-07-11 | Micron Technology Inc | Method and circuit for normalization of floating point significands in a simd array mpp |
GB0112269D0 (en) | 2001-05-21 | 2001-07-11 | Micron Technology Inc | Method and circuit for alignment of floating point significands in a simd array mpp |
US6823444B1 (en) * | 2001-07-03 | 2004-11-23 | Ip-First, Llc | Apparatus and method for selectively accessing disparate instruction buffer stages based on branch target address cache hit and instruction stage wrap |
US7162619B2 (en) | 2001-07-03 | 2007-01-09 | Ip-First, Llc | Apparatus and method for densely packing a branch instruction predicted by a branch target address cache and associated target instructions into a byte-wide instruction buffer |
US7191445B2 (en) * | 2001-08-31 | 2007-03-13 | Texas Instruments Incorporated | Method using embedded real-time analysis components with corresponding real-time operating system software objects |
US6751331B2 (en) | 2001-10-11 | 2004-06-15 | United Global Sourcing Incorporated | Communication headset |
JP2003131902A (en) * | 2001-10-24 | 2003-05-09 | Toshiba Corp | Software debugger, system-level debugger, debug method and debug program |
US7051239B2 (en) * | 2001-12-28 | 2006-05-23 | Hewlett-Packard Development Company, L.P. | Method and apparatus for efficiently implementing trace and/or logic analysis mechanisms on a processor chip |
WO2003065165A2 (en) | 2002-01-31 | 2003-08-07 | Arc International | Configurable data processor with multi-length instruction set architecture |
US7168067B2 (en) | 2002-02-08 | 2007-01-23 | Agere Systems Inc. | Multiprocessor system with cache-based software breakpoints |
US7529912B2 (en) | 2002-02-12 | 2009-05-05 | Via Technologies, Inc. | Apparatus and method for instruction-level specification of floating point format |
US7181596B2 (en) | 2002-02-12 | 2007-02-20 | Ip-First, Llc | Apparatus and method for extending a microprocessor instruction set |
US7328328B2 (en) | 2002-02-19 | 2008-02-05 | Ip-First, Llc | Non-temporal memory reference control mechanism |
US7315921B2 (en) | 2002-02-19 | 2008-01-01 | Ip-First, Llc | Apparatus and method for selective memory attribute control |
US7546446B2 (en) | 2002-03-08 | 2009-06-09 | Ip-First, Llc | Selective interrupt suppression |
US7395412B2 (en) | 2002-03-08 | 2008-07-01 | Ip-First, Llc | Apparatus and method for extending data modes in a microprocessor |
US7185180B2 (en) | 2002-04-02 | 2007-02-27 | Ip-First, Llc | Apparatus and method for selective control of condition code write back |
US7302551B2 (en) | 2002-04-02 | 2007-11-27 | Ip-First, Llc | Suppression of store checking |
US7155598B2 (en) | 2002-04-02 | 2006-12-26 | Ip-First, Llc | Apparatus and method for conditional instruction execution |
US7380103B2 (en) | 2002-04-02 | 2008-05-27 | Ip-First, Llc | Apparatus and method for selective control of results write back |
US7373483B2 (en) | 2002-04-02 | 2008-05-13 | Ip-First, Llc | Mechanism for extending the number of registers in a microprocessor |
US7380109B2 (en) | 2002-04-15 | 2008-05-27 | Ip-First, Llc | Apparatus and method for providing extended address modes in an existing instruction set for a microprocessor |
KR100450753B1 (en) | 2002-05-17 | 2004-10-01 | 한국전자통신연구원 | Programmable variable length decoder including interface of CPU processor |
US6718504B1 (en) | 2002-06-05 | 2004-04-06 | Arc International | Method and apparatus for implementing a data processor adapted for turbo decoding |
US6968444B1 (en) | 2002-11-04 | 2005-11-22 | Advanced Micro Devices, Inc. | Microprocessor employing a fixed position dispatch unit |
US6774832B1 (en) * | 2003-03-25 | 2004-08-10 | Raytheon Company | Multi-bit output DDS with real time delta sigma modulation look up from memory |
US7590829B2 (en) | 2003-03-31 | 2009-09-15 | Stretch, Inc. | Extension adapter |
US7668897B2 (en) | 2003-06-16 | 2010-02-23 | Arm Limited | Result partitioning within SIMD data processing systems |
US7373642B2 (en) | 2003-07-29 | 2008-05-13 | Stretch, Inc. | Defining instruction extensions in a standard programming language |
US7133950B2 (en) | 2003-08-19 | 2006-11-07 | Sun Microsystems, Inc. | Request arbitration in multi-core processor |
US7363544B2 (en) * | 2003-10-30 | 2008-04-22 | International Business Machines Corporation | Program debug method and apparatus |
US7401328B2 (en) * | 2003-12-18 | 2008-07-15 | Lsi Corporation | Software-implemented grouping techniques for use in a superscalar data processing system |
US7613911B2 (en) * | 2004-03-12 | 2009-11-03 | Arm Limited | Prefetching exception vectors by early lookup exception vectors within a cache memory |
US20050278517A1 (en) * | 2004-05-19 | 2005-12-15 | Kar-Lik Wong | Systems and methods for performing branch prediction in a variable length instruction set microprocessor |
Date | Country | Application number | Publication | Status |
---|---|---|---|---|
2005-05-19 | US | US11/132,428 | US20050278517A1 | Abandoned (not active) |
2005-05-19 | US | US11/132,424 | US8719837B2 | Active |
2005-05-19 | TW | TW094116302A | TW200602974A | Unknown |
2005-05-19 | US | US11/132,447 | US20050278505A1 | Abandoned (not active) |
2005-05-19 | WO | PCT/US2005/017586 | WO2005114441A2 | Application Filing (active) |
2005-05-19 | US | US11/132,423 | US20050278513A1 | Abandoned (not active) |
2005-05-19 | GB | GB0622477A | GB2428842A | Withdrawn (not active) |
2005-05-19 | US | US11/132,448 | US20050289323A1 | Abandoned (not active) |
2005-05-19 | CN | CNA2005800215322A | CN101002169A | Pending (active) |
2005-05-19 | US | US11/132,432 | US20050273559A1 | Abandoned (not active) |
2014-03-21 | US | US14/222,194 | US9003422B2 | Active |
Patent Citations (84)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5339365A (en) * | 1989-08-31 | 1994-08-16 | Canon Kabushiki Kaisha | Image processing apparatus capable of using fuzzy logic |
US5327536A (en) * | 1990-05-22 | 1994-07-05 | Nec Corporation | Microprocessor having branch prediction function |
US5155843A (en) * | 1990-06-29 | 1992-10-13 | Digital Equipment Corporation | Error transition mode for multi-processor system |
US6076158A (en) * | 1990-06-29 | 2000-06-13 | Digital Equipment Corporation | Branch prediction in high-performance processor |
US5778423A (en) * | 1990-06-29 | 1998-07-07 | Digital Equipment Corporation | Prefetch instruction for improving performance in reduced instruction set processor |
US6948052B2 (en) * | 1991-07-08 | 2005-09-20 | Seiko Epson Corporation | High-performance, superscalar-based computer system with out-of-order instruction execution |
US5422739A (en) * | 1991-07-08 | 1995-06-06 | Canon Kabushiki Kaisha | Color expressing method, color image reading apparatus and color image processing apparatus |
US5504592A (en) * | 1991-07-08 | 1996-04-02 | Canon Kabushiki Kaisha | Color expressing method, color image reading apparatus and color image processing apparatus |
US5577217A (en) * | 1993-05-14 | 1996-11-19 | Intel Corporation | Method and apparatus for a branch target buffer with shared branch pattern tables for associated branch predictions |
US5454117A (en) * | 1993-08-25 | 1995-09-26 | Nexgen, Inc. | Configurable branch prediction for a processor performing speculative execution |
US5761723A (en) * | 1994-02-04 | 1998-06-02 | Motorola, Inc. | Data processor with branch prediction and method of operation |
US5692168A (en) * | 1994-10-18 | 1997-11-25 | Cyrix Corporation | Prefetch buffer using flow control bit to identify changes of flow within the code stream |
US5655122A (en) * | 1995-04-05 | 1997-08-05 | Sequent Computer Systems, Inc. | Optimizing compiler with static prediction of branch probability, branch frequency and function frequency |
US5659752A (en) * | 1995-06-30 | 1997-08-19 | International Business Machines Corporation | System and method for improving branch prediction in compiled program code |
US5995248A (en) * | 1996-03-22 | 1999-11-30 | Minolta Co., Ltd. | Image forming device and method having MTF correction |
US5752014A (en) * | 1996-04-29 | 1998-05-12 | International Business Machines Corporation | Automatic selection of branch prediction methodology for subsequent branch instruction based on outcome of previous branch prediction |
US20040068643A1 (en) * | 1997-08-01 | 2004-04-08 | Dowling Eric M. | Method and apparatus for high performance branching in pipelined microsystems |
US20010044892A1 (en) * | 1997-09-10 | 2001-11-22 | Shinichi Yamaura | Method and system for high performance implementation of microprocessors |
US6151672A (en) * | 1998-02-23 | 2000-11-21 | Hewlett-Packard Company | Methods and apparatus for reducing interference in a branch history table of a microprocessor |
US6353882B1 (en) * | 1998-02-23 | 2002-03-05 | Hewlett-Packard Company | Reducing branch prediction interference of opposite well behaved branches sharing history entry by static prediction correctness based updating |
US20010056531A1 (en) * | 1998-03-19 | 2001-12-27 | Mcfarling Scott | Branch predictor with serially connected predictor stages for improving branch prediction accuracy |
US20010040686A1 (en) * | 1998-06-26 | 2001-11-15 | Heidi M. Schoolcraft | Streamlined tetrahedral interpolation |
US6253287B1 (en) * | 1998-09-09 | 2001-06-26 | Advanced Micro Devices, Inc. | Using three-dimensional storage to make variable-length instructions appear uniform in two dimensions |
US6339822B1 (en) * | 1998-10-02 | 2002-01-15 | Advanced Micro Devices, Inc. | Using padded instructions in a block-oriented cache |
US6526502B1 (en) * | 1998-12-02 | 2003-02-25 | Ip-First Llc | Apparatus and method for speculatively updating global branch history with branch prediction prior to resolution of branch outcome |
US6189091B1 (en) * | 1998-12-02 | 2001-02-13 | Ip First, L.L.C. | Apparatus and method for speculatively updating global history and restoring same on branch misprediction detection |
US20060036836A1 (en) * | 1998-12-31 | 2006-02-16 | Metaflow Technologies, Inc. | Block-based branch target buffer |
US6763452B1 (en) * | 1999-01-28 | 2004-07-13 | Ati International Srl | Modifying program execution based on profiling |
US6477683B1 (en) * | 1999-02-05 | 2002-11-05 | Tensilica, Inc. | Automated processor generation system for designing a configurable processor and method for the same |
US6647491B2 (en) * | 1999-02-18 | 2003-11-11 | Hewlett-Packard Development Company, L.P. | Hardware/software system for profiling instructions and selecting a trace using branch history information for branch predictions |
US6571331B2 (en) * | 1999-03-18 | 2003-05-27 | Ip-First, Llc | Static branch prediction mechanism for conditional branch instructions |
US20010032309A1 (en) * | 1999-03-18 | 2001-10-18 | Henry G. Glenn | Static branch prediction mechanism for conditional branch instructions |
US6499101B1 (en) * | 1999-03-18 | 2002-12-24 | I.P. First L.L.C. | Static branch prediction mechanism for conditional branch instructions |
US6427206B1 (en) * | 1999-05-03 | 2002-07-30 | Intel Corporation | Optimized branch predictions for strongly predicted compiler branches |
US6560754B1 (en) * | 1999-05-13 | 2003-05-06 | Arc International Plc | Method and apparatus for jump control in a pipelined processor |
US20040225871A1 (en) * | 1999-10-01 | 2004-11-11 | Naohiko Irie | Branch control memory |
US20020199092A1 (en) * | 1999-11-05 | 2002-12-26 | Ip-First Llc | Split history tables for branch prediction |
US20010021974A1 (en) * | 2000-02-01 | 2001-09-13 | Samsung Electronics Co., Ltd. | Branch predictor suitable for multi-processing microprocessor |
US6681295B1 (en) * | 2000-08-31 | 2004-01-20 | Hewlett-Packard Development Company, L.P. | Fast lane prefetching |
US20020066006A1 (en) * | 2000-11-29 | 2002-05-30 | Lsi Logic Corporation | Simple branch prediction and misprediction recovery method |
US20020069351A1 (en) * | 2000-12-05 | 2002-06-06 | Shyh-An Chi | Memory data access structure and method suitable for use in a processor |
US20020073301A1 (en) * | 2000-12-07 | 2002-06-13 | International Business Machines Corporation | Hardware for use with compiler generated branch information |
US20020078332A1 (en) * | 2000-12-19 | 2002-06-20 | Seznec Andre C. | Conflict free parallel read access to a bank interleaved branch predictor in a processor |
US20020083312A1 (en) * | 2000-12-27 | 2002-06-27 | Balaram Sinharoy | Branch Prediction apparatus and process for restoring replaced branch history for use in future branch predictions for an executing program |
US20020087851A1 (en) * | 2000-12-28 | 2002-07-04 | Matsushita Electric Industrial Co., Ltd. | Microprocessor and an instruction converter |
US20020087852A1 (en) * | 2000-12-28 | 2002-07-04 | Jourdan Stephan J. | Method and apparatus for predicting branches using a meta predictor |
US20020157000A1 (en) * | 2001-03-01 | 2002-10-24 | International Business Machines Corporation | Software hint to improve the branch target prediction accuracy |
US20020138236A1 (en) * | 2001-03-21 | 2002-09-26 | Akihiro Takamura | Processor having execution result prediction function for instruction |
US20020194461A1 (en) * | 2001-05-04 | 2002-12-19 | Ip First Llc | Speculative branch target address cache |
US6886093B2 (en) * | 2001-05-04 | 2005-04-26 | Ip-First, Llc | Speculative hybrid branch direction predictor |
US20020188833A1 (en) * | 2001-05-04 | 2002-12-12 | Ip First Llc | Dual call/return stack branch prediction system |
US20050132175A1 (en) * | 2001-05-04 | 2005-06-16 | Ip-First, Llc. | Speculative hybrid branch direction predictor |
US20020194463A1 (en) * | 2001-05-04 | 2002-12-19 | Ip First Llc | Speculative hybrid branch direction predictor |
US20020194462A1 (en) * | 2001-05-04 | 2002-12-19 | Ip First Llc | Apparatus and method for selecting one of multiple target addresses stored in a speculative branch target address cache per instruction cache line |
US20020194464A1 (en) * | 2001-05-04 | 2002-12-19 | Ip First Llc | Speculative branch target address cache with selective override by seconday predictor based on branch instruction type |
US20040172524A1 (en) * | 2001-06-29 | 2004-09-02 | Jan Hoogerbrugge | Method, apparatus and compiler for predicting indirect branch target addresses |
US20030023838A1 (en) * | 2001-07-27 | 2003-01-30 | Karim Faraydon O. | Novel fetch branch architecture for reducing branch penalty without branch prediction |
US20030204705A1 (en) * | 2002-04-30 | 2003-10-30 | Oldfield William H. | Prediction of branch instructions in a data processing apparatus |
US20040225872A1 (en) * | 2002-06-04 | 2004-11-11 | International Business Machines Corporation | Hybrid branch prediction using a global selection counter and a prediction method comparison table |
US20040015683A1 (en) * | 2002-07-18 | 2004-01-22 | International Business Machines Corporation | Two dimensional branch history table prefetching mechanism |
US20040049660A1 (en) * | 2002-09-06 | 2004-03-11 | Mips Technologies, Inc. | Method and apparatus for clearing hazards using jump instructions |
US20050125634A1 (en) * | 2002-10-04 | 2005-06-09 | Fujitsu Limited | Processor and instruction control method |
US20040139281A1 (en) * | 2003-01-14 | 2004-07-15 | Ip-First, Llc. | Apparatus and method for efficiently updating branch target address cache |
US20040186985A1 (en) * | 2003-03-21 | 2004-09-23 | Analog Devices, Inc. | Method and apparatus for branch prediction based on branch targets |
US20040193843A1 (en) * | 2003-03-31 | 2004-09-30 | Eran Altshuler | System and method for early branch prediction |
US20040193855A1 (en) * | 2003-03-31 | 2004-09-30 | Nicolas Kacevas | System and method for branch prediction access |
US20040225870A1 (en) * | 2003-05-07 | 2004-11-11 | Srinivasan Srikanth T. | Method and apparatus for reducing wrong path execution in a speculative multi-threaded processor |
US20040230782A1 (en) * | 2003-05-12 | 2004-11-18 | International Business Machines Corporation | Method and system for processing loop branch instructions |
US20040255104A1 (en) * | 2003-06-12 | 2004-12-16 | Intel Corporation | Method and apparatus for recycling candidate branch outcomes after a wrong-path execution in a superscalar processor |
US20040268102A1 (en) * | 2003-06-30 | 2004-12-30 | Combs Jonathan D. | Mechanism to remove stale branch predictions at a microprocessor |
US20050027974A1 (en) * | 2003-07-31 | 2005-02-03 | Oded Lempel | Method and system for conserving resources in an instruction pipeline |
US20050050309A1 (en) * | 2003-08-29 | 2005-03-03 | Renesas Technology Corp. | Data processor |
US20050076193A1 (en) * | 2003-09-08 | 2005-04-07 | Ip-First, Llc. | Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence |
US20050066305A1 (en) * | 2003-09-22 | 2005-03-24 | Lisanke Robert John | Method and machine for efficient simulation of digital hardware within a software development environment |
US20050091479A1 (en) * | 2003-10-24 | 2005-04-28 | Sung-Woo Chung | Branch predictor, system and method of branch prediction |
US20050125632A1 (en) * | 2003-12-03 | 2005-06-09 | Advanced Micro Devices, Inc. | Transitioning from instruction cache to trace cache on label boundaries |
US20050125613A1 (en) * | 2003-12-03 | 2005-06-09 | Sangwook Kim | Reconfigurable trace cache |
US20050154867A1 (en) * | 2004-01-14 | 2005-07-14 | International Business Machines Corporation | Autonomic method and apparatus for counting branch instructions to improve branch predictions |
US20050172277A1 (en) * | 2004-02-04 | 2005-08-04 | Saurabh Chheda | Energy-focused compiler-assisted branch prediction |
US20050216713A1 (en) * | 2004-03-25 | 2005-09-29 | International Business Machines Corporation | Instruction text controlled selectively stated branches for prediction via a branch target buffer |
US20050216703A1 (en) * | 2004-03-26 | 2005-09-29 | International Business Machines Corporation | Apparatus and method for decreasing the latency between an instruction cache and a pipeline processor |
US20050223202A1 (en) * | 2004-03-31 | 2005-10-06 | Intel Corporation | Branch prediction in a pipelined processor |
US20060015706A1 (en) * | 2004-06-30 | 2006-01-19 | Chunrong Lai | TLB correlated branch predictor and method for use thereof |
US20060041868A1 (en) * | 2004-08-23 | 2006-02-23 | Cheng-Yen Huang | Method for verifying branch prediction mechanism and accessible recording medium for storing program thereof |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050278505A1 (en) * | 2004-05-19 | 2005-12-15 | Lim Seow C | Microprocessor architecture including zero impact predictive data pre-fetch mechanism for pipeline data memory |
US20050289323A1 (en) * | 2004-05-19 | 2005-12-29 | Kar-Lik Wong | Barrel shifter for a microprocessor |
US9003422B2 (en) | 2004-05-19 | 2015-04-07 | Synopsys, Inc. | Microprocessor architecture having extendible logic |
US8719837B2 (en) | 2004-05-19 | 2014-05-06 | Synopsys, Inc. | Microprocessor architecture having extendible logic |
US7971042B2 (en) | 2005-09-28 | 2011-06-28 | Synopsys, Inc. | Microprocessor system and method for instruction-initiated recording and execution of instruction sequences in a dynamically decoupleable extended instruction pipeline |
US20070260862A1 (en) * | 2006-05-03 | 2007-11-08 | Mcfarling Scott | Providing storage in a memory hierarchy for prediction information |
US7779241B1 (en) * | 2007-04-10 | 2010-08-17 | Dunn David A | History based pipelined branch prediction |
US8473727B2 (en) | 2007-04-10 | 2013-06-25 | David A. Dunn | History based pipelined branch prediction |
US9519480B2 (en) * | 2008-02-11 | 2016-12-13 | International Business Machines Corporation | Branch target preloading using a multiplexer and hash circuit to reduce incorrect branch predictions |
US20090204798A1 (en) * | 2008-02-11 | 2009-08-13 | Alexander Gregory W | Simplified Implementation of Branch Target Preloading |
US8131982B2 (en) * | 2008-06-13 | 2012-03-06 | International Business Machines Corporation | Branch prediction instructions having mask values involving unloading and loading branch history data |
US10338923B2 (en) | 2009-05-05 | 2019-07-02 | International Business Machines Corporation | Branch prediction path wrong guess instruction |
US20100287358A1 (en) * | 2009-05-05 | 2010-11-11 | International Business Machines Corporation | Branch Prediction Path Instruction |
US20110225401A1 (en) * | 2010-03-11 | 2011-09-15 | International Business Machines Corporation | Prefetching branch prediction mechanisms |
US8521999B2 (en) | 2010-03-11 | 2013-08-27 | International Business Machines Corporation | Executing touchBHT instruction to pre-fetch information to prediction mechanism for branch with taken history |
US9851973B2 (en) * | 2012-03-30 | 2017-12-26 | Intel Corporation | Dynamic branch hints using branches-to-nowhere conditional branch |
US20140229721A1 (en) * | 2012-03-30 | 2014-08-14 | Andrew T. Forsyth | Dynamic branch hints using branches-to-nowhere conditional branch |
US9135012B2 (en) | 2012-06-14 | 2015-09-15 | International Business Machines Corporation | Instruction filtering |
US9152424B2 (en) | 2012-06-14 | 2015-10-06 | International Business Machines Corporation | Mitigating instruction prediction latency with independently filtered presence predictors |
US9152425B2 (en) * | 2012-06-14 | 2015-10-06 | International Business Machines Corporation | Mitigating instruction prediction latency with independently filtered presence predictors |
US9135013B2 (en) | 2012-06-14 | 2015-09-15 | International Business Machines Corporation | Instruction filtering |
US20140101418A1 (en) * | 2012-06-14 | 2014-04-10 | International Business Machines Corporation | Mitigating instruction prediction latency with independently filtered presence predictors |
US10902348B2 (en) | 2017-05-19 | 2021-01-26 | International Business Machines Corporation | Computerized branch predictions and decisions |
US10372459B2 (en) | 2017-09-21 | 2019-08-06 | Qualcomm Incorporated | Training and utilization of neural branch predictor |
US11163577B2 (en) | 2018-11-26 | 2021-11-02 | International Business Machines Corporation | Selectively supporting static branch prediction settings only in association with processor-designated types of instructions |
Also Published As
Publication number | Publication date |
---|---|
GB2428842A (en) | 2007-02-07 |
US20050273559A1 (en) | 2005-12-08 |
US20050289323A1 (en) | 2005-12-29 |
US20140208087A1 (en) | 2014-07-24 |
US9003422B2 (en) | 2015-04-07 |
TW200602974A (en) | 2006-01-16 |
WO2005114441A2 (en) | 2005-12-01 |
US20050278505A1 (en) | 2005-12-15 |
US20050289321A1 (en) | 2005-12-29 |
US20050278517A1 (en) | 2005-12-15 |
US8719837B2 (en) | 2014-05-06 |
GB0622477D0 (en) | 2006-12-20 |
WO2005114441A3 (en) | 2007-01-18 |
CN101002169A (en) | 2007-07-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20050278513A1 (en) | Systems and methods of dynamic branch prediction in a microprocessor | |
US6988190B1 (en) | Method of an address trace cache storing loop control information to conserve trace cache area | |
US7437537B2 (en) | Methods and apparatus for predicting unaligned memory access | |
US7117347B2 (en) | Processor including fallback branch prediction mechanism for far jump and far call instructions | |
JP5198879B2 (en) | Suppress branch history register updates by branching at the end of the loop | |
US6189091B1 (en) | Apparatus and method for speculatively updating global history and restoring same on branch misprediction detection | |
US10209992B2 (en) | System and method for branch prediction using two branch history tables and presetting a global branch history register | |
JP2008530714A5 (en) | ||
JP2008532142A5 (en) | ||
JP5579694B2 (en) | Method and apparatus for managing a return stack | |
US10664280B2 (en) | Fetch ahead branch target buffer | |
US7107438B2 (en) | Pipelined microprocessor, apparatus, and method for performing early correction of conditional branch instruction mispredictions | |
US7185182B2 (en) | Pipelined microprocessor, apparatus, and method for generating early instruction results | |
US8285976B2 (en) | Method and apparatus for predicting branches using a meta predictor | |
JP2006216040A (en) | Method and apparatus for dynamic prediction by software | |
US7065636B2 (en) | Hardware loops and pipeline system using advanced generation of loop parameters | |
US7730289B2 (en) | Method for preloading data in a CPU pipeline | |
US7472264B2 (en) | Predicting a jump target based on a program counter and state information for a process | |
US7100024B2 (en) | Pipelined microprocessor, apparatus, and method for generating early status flags | |
US7890739B2 (en) | Method and apparatus for recovering from branch misprediction | |
US20050071830A1 (en) | Method and system for processing a sequence of instructions | |
GB2454816A (en) | Method for executing a load instruction in a pipeline processor, putting the data in the target address into a buffer then loading the requested data. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment | Owner name: ARC INTERNATIONAL, UNITED KINGDOM. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: ARISTODEMOU, ARIS; FUHLER, RICH; WONG, KAR-LIK; REEL/FRAME: 016932/0953; SIGNING DATES FROM 20050825 TO 20050826 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |