Patent classifications
G06F9/30141
IDENTIFYING AND TRACKING FREQUENTLY ACCESSED REGISTERS IN A PROCESSOR
Embodiments include methods, computing systems and computer program products for identifying and tracking frequently accessed registers in a processor of a computing system. Aspects include: creating a list of top accessed registers of certain registers in processor, each register having a corresponding register usage counter, initializing each register usage counter, starting a register usage monitoring mode, examining each register usage counter, and updating list of top accessed registers, stopping register usage monitoring mode, and updating a register file partition assignment when the list of top accessed registers is identified. Once the list of top accessed registers is identified, stopping the programs and bring its threads of execution to quiescent, moving registers between register file partitions until all registers on the list of top accessed registers are in the fully-ported register file partition, and resuming executions of the program and its threads.
Apparatus and method for implementing power saving techniques when processing floating point values
An apparatus and method are described for reducing power when reading and writing graphics data. For example, one embodiment of an apparatus comprises: a graphics processor unit (GPU) to process graphics data including floating point data; a set of registers, at least one of the registers of the set partitioned to store the floating point data; and encode/decode logic to reduce a number of binary 1 values being read from the at least one register by causing a specified set of bit positions within the floating point data to be read out as 0s rather than 1s.
RISC PROCESSOR HAVING SPECIALIZED DATAPATH FOR SPECIALIZED REGISTERS
A data path block circuit is disclosed. The data path block circuit includes a data path circuit having logic circuits, each configured to perform a data path operation to generate a result based on first and second operands. The data path block circuit also includes a first operand multiplexer, having inputs, each connected to one of a first register file, including a quantity of read and write ports, and a second register file, including a different quantity of read and write ports. The data path block circuit also includes a second operand multiplexer, having inputs, each connected to one of the first register file and the second register file. At least one of the first and second operand multiplexers includes a data input connected to the first register file. At least one of the first and second operand multiplexers includes a data input connected to the second register file.
Reconfigurable circuit having rows of a matrix of registers connected to corresponding ports and a semiconductor integrated circuit
A reconfigurable circuit includes a plurality of processing elements and an input/output data interface unit, and the reconfigurable circuit is configured to control connections of the plurality of processing elements for each context. The input/output data interface unit is configured to hold operation input data which is input to the plurality of processing elements and operation output data which is output from the plurality of processing elements. The input/output data interface unit includes a plurality of ports, and a plurality of registers. The registers are configured to be connected to the plurality of ports, and to include m (m being an integer of 2 or more) number of banks in a depth direction.
Resilient register file circuit for dynamic variation tolerance and method of operating the same
The disclosed system and method detect and correct register file read path errors that may occur as a result of reducing or eliminating supply voltage guardbands and/or frequency guardbands for a CPU, thereby increasing overall energy efficiency of the system.
COMPUTER SYSTEM USING A PLURALITY OF SINGLE INSTRUCTION MULTIPLE DATA (SIMD) ENGINES FOR EFFICIENT MATRIX OPERATIONS
A computer system including a plurality of SIMD engines and a corresponding plurality of output register sets. Operand A register file stores one or more Operand A values, each including a plurality of operand words. Operand B register file stores one or more Operand B values, each including a plurality of operand words. Operand A distribution circuit receives an Operand A value from the Operand A register file, and selectively routes one or more of the operand words of the received Operand A value to create a plurality of input Operand A values, which are selectively routed to the SIMD engines. Operand B distribution circuit receives one or more Operand B values from the Operand B register file, and selectively routes one or more of the operand words of the Operand B value(s) to create a plurality of input Operand B values, which are selectively routed to the SIMD engines.
SUPER-THREAD PROCESSOR
The disclosed inventions include a processor apparatus and method that enable a general purpose processor to achieve twice the operating frequency of typical processor implementations with a modest increase in area and a modest increase in energy per operation. The invention relies upon exploiting multiple independent streams of execution. Low area and low energy memory arrays used for register files operate a modest frequency. Instructions can be issued at a rate higher than this frequency by including logic that guarantees the spacing between instructions from the same thread are spaced wider than the time to access the register file. The result of the invention is the ability to overlap long latency structures, which allows using lower energy structures, thereby reducing energy per operation.
Arithmetic processing device and control method implemented by arithmetic processing device
An arithmetic processing device includes: a decoding circuit configured to decode a command; a command execution circuit configured to execute the command decoded by the decoding circuit; a register circuit configured to include a plurality of registers for holding data used by the command execution circuit; an identification information holding circuit configured to store identification information for identifying a register for writing a specific value when the command is a register writing command; a setting circuit configured to hold the specific value; and an operation control circuit configured to execute inhibiting processing when the command is a register reading command, the inhibiting processing including inhibiting an access of the register by the register reading command and selecting the specific value held in the setting circuit.
Memory fragments for supporting code block execution by using virtual cores instantiated by partitionable engines
A global front end scheduler to schedule instruction sequences to a plurality of virtual cores implemented via a plurality of partitionable engines. The global front end scheduler includes a thread allocation array to store a set of allocation thread pointers to point to a set of buckets in a bucket buffer in which execution blocks for respective threads are placed, a bucket buffer to provide a matrix of buckets, the bucket buffer including storage for the execution blocks, and a bucket retirement array to store a set of retirement thread pointers that track a next execution block to retire for a thread.
Microprocessor having self-resetting register scoreboard
A microprocessor using a counter in a scoreboard is introduced to handle data dependency. The microprocessor includes a register file having a plurality of registers mapped to entries of the scoreboard. Each entry of the scoreboard has a counter that tracks the data dependency of each of the registers. The counter decrements for every clock cycle until the counter resets itself when it counts down to 0. With the implementation of the counter in the scoreboard, the instruction pipeline may be managed according to the number of clock cycles of a previous issued instruction takes to access the register which is recorded in the counter of the scoreboard.