Patent classifications
G06F9/30141
Super-thread processor
The disclosed inventions include a processor apparatus and method that enable a general purpose processor to achieve twice the operating frequency of typical processor implementations with a modest increase in area and a modest increase in energy per operation. The invention relies upon exploiting multiple independent streams of execution. Low area and low energy memory arrays used for register files operate at a modest frequency. Instructions can be issued at a rate higher than this frequency by including logic that guarantees that instructions from the same thread are spaced wider than the time to access the register file. The result of the invention is the ability to overlap long latency structures, which allows using lower energy structures, thereby reducing energy per operation.
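The issue-spacing rule in this abstract can be illustrated with a minimal sketch. It is not the patented logic: the function name `issue_schedule`, the assumption that every thread always has an instruction ready, and the reading of "spaced wider than" as "at least `rf_latency` issue cycles apart" are all illustrative assumptions.

```python
def issue_schedule(threads, rf_latency, cycles):
    """Return a list of (cycle, thread_id) issue slots.

    threads: thread ids, each assumed to always have an instruction ready.
    rf_latency: register-file access time in issue cycles (hypothetical unit).
    cycles: number of issue cycles to simulate.
    """
    last_issue = {t: None for t in threads}
    schedule = []
    for cycle in range(cycles):
        for t in threads:
            prev = last_issue[t]
            # A thread may issue only if its previous instruction was issued
            # at least rf_latency cycles ago (assumed reading of the abstract).
            if prev is None or cycle - prev >= rf_latency:
                last_issue[t] = cycle
                schedule.append((cycle, t))
                break  # at most one instruction issues per cycle
    return schedule


# Two threads keep every issue slot busy; one thread alone leaves gaps.
two_threads = issue_schedule([0, 1], rf_latency=2, cycles=4)
one_thread = issue_schedule([0], rf_latency=2, cycles=4)
```

With two threads and a two-cycle register file, every cycle issues an instruction, while a single thread can only use every other slot. This is the sense in which several independent streams let the issue rate exceed the register-file frequency.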
MULTI-BIT REGISTER, CHIP, AND COMPUTING APPARATUS
A multi-bit register (200), a chip, and a computing apparatus, the multi-bit register (200) including: a plurality of register units (210-1, 210-2, . . . , 210-N), each of which is configured to store a bit of data, and the plurality of register units (210-1, 210-2, . . . , 210-N) being connected in parallel to each other; a clock buffer configured to provide a clock signal for the plurality of register units (210-1, 210-2, . . . , 210-N), wherein the plurality of register units (210-1, 210-2, . . . , 210-N) is arranged into an array of register units, and the clock buffer is arranged at an intervening position of the array of register units (210-1, 210-2, . . . , 210-N).
Arithmetic processing device
An arithmetic processing device includes: a decoder configured to write an immediate value to a register in a case where an instruction to be executed is an instruction not involving data reading from the register; and a processor configured to read data from the register and write a computing result based on the read data to the register in a case where an instruction to be executed by the decoder is an instruction involving data reading from the register.
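The split described in this abstract, where the decoder handles instructions that read no register and the processor handles the rest, might be sketched as follows. The instruction encoding (a dict with `op`, `dst`, `imm`, `src1`, `src2`) and the opcodes `li` and `add` are hypothetical and chosen only for illustration.

```python
def execute(instr, regs):
    """Run one instruction and report which unit wrote the register.

    instr: dict with hypothetical fields 'op', 'dst', and either 'imm'
           or 'src1'/'src2'.
    regs: list of integers modelling the register file.
    """
    if instr["op"] == "li":
        # No register read is needed, so the decoder writes the immediate
        # value to the destination register itself.
        regs[instr["dst"]] = instr["imm"]
        return "decoder"
    if instr["op"] == "add":
        # Register reads are needed, so the processor reads the operands
        # and writes back the computed result.
        regs[instr["dst"]] = regs[instr["src1"]] + regs[instr["src2"]]
        return "processor"
    raise ValueError(f"unknown op: {instr['op']}")


regs = [0, 0, 0, 0]
unit_a = execute({"op": "li", "dst": 1, "imm": 5}, regs)
unit_b = execute({"op": "li", "dst": 2, "imm": 7}, regs)
unit_c = execute({"op": "add", "dst": 3, "src1": 1, "src2": 2}, regs)
```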
Systems And Methods For Processor Circuits
A processor circuit includes a first front-end circuit for scheduling first instructions for a first program and a second front-end circuit for scheduling second instructions for a second program. A back-end processing circuit processes first operations in the first instructions and second operations in the second instructions. A multi-program scheduler circuit causes the first front-end circuit to schedule processing of the first operations on the back-end processing circuit and causes the second front-end circuit to schedule processing of the second operations on the back-end processing circuit. A processor generator system includes a processor designer that creates specifications for a processor using workloads for a program, a processor generator that generates a first processor instance using the specifications, a processor optimizer that generates a second processor instance using the workloads, and a co-designer that modifies the program using the second processor instance.
Vector multiply-add instruction
An apparatus comprises processing circuitry, a number of vector registers and a number of scalar registers. An instruction decoder is provided which supports decoding of a vector multiply-add instruction specifying at least one vector register and at least one scalar register. In response to the vector multiply-add instruction, the decoder controls the processing circuitry to perform a vector multiply-add operation in which each lane of processing generates a respective result data element corresponding to a sum or difference of a product value and an addend value, with the product value comprising the product of a respective data element of a first vector value and a multiplier value. In each lane of processing at least one of the multiplier value and the addend value is specified as a portion of a scalar value stored in a scalar register.
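As a rough illustration of the per-lane arithmetic, the sketch below takes the multiplier from a scalar and the addend from a second vector, which is one of the combinations the abstract allows. The function name `vmla_scalar`, the use of Python lists for vectors, and the choice of "addend minus product" for the difference form are assumptions made for this example.

```python
def vmla_scalar(first_vec, addend_vec, scalar, subtract=False):
    """Per-lane multiply-add with a scalar multiplier.

    Each lane computes addend + first * scalar, or addend - first * scalar
    when subtract is True (the direction of the difference is an assumption).
    """
    if len(first_vec) != len(addend_vec):
        raise ValueError("vectors must have the same number of lanes")
    result = []
    for first, addend in zip(first_vec, addend_vec):
        product = first * scalar  # product of a vector element and the scalar
        result.append(addend - product if subtract else addend + product)
    return result


summed = vmla_scalar([1, 2, 3], [10, 20, 30], 2)
differenced = vmla_scalar([1, 2, 3], [10, 20, 30], 2, subtract=True)
```

Taking one operand from a scalar register means the same value is shared by every lane without first being copied into a vector register.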
System and handling of register data in processors
A method, processor and/or system for processing data is disclosed that in an aspect includes providing a physical register file with one or more register file entries for storing data; identifying each physical register file entry with a row identifier to identify the entry row in the physical register file; enabling one or more columns within a target entry row of the physical register file; and revising data in the columns enabled within the target entry row of the physical register file. In an aspect, each physical register file entry is partitioned into a plurality of columns having a bit width and a column mask preferably is used to enable the one or more columns within the target row of the physical register file, and data is revised in only the columns enabled by the column mask.
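The column-masked update described above might look like the following minimal sketch. The register file is modelled as a list of rows, each row a list of column values; the function name `write_columns` and this data layout are illustrative assumptions, not the claimed hardware.

```python
def write_columns(regfile, row_id, new_values, column_mask):
    """Revise only the enabled columns of one register file entry.

    regfile: list of rows, each a list of column values (hypothetical model).
    row_id: row identifier selecting the target entry.
    new_values: candidate data for every column of the row.
    column_mask: one boolean per column; True enables that column for writing.
    """
    entry = regfile[row_id]
    if not (len(entry) == len(new_values) == len(column_mask)):
        raise ValueError("row, data, and mask must have the same width")
    for col, enabled in enumerate(column_mask):
        if enabled:
            entry[col] = new_values[col]  # masked-off columns keep old data


regfile = [[0x11, 0x22, 0x33, 0x44], [0xAA, 0xBB, 0xCC, 0xDD]]
write_columns(regfile, 1, [0x01, 0x02, 0x03, 0x04], [True, False, False, True])
```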
Clearing Register Data
A processing unit having a register file includes: a plurality of registers each having a write enable input configured to receive a write enable signal and a write data input connected to a write data path of the processing unit and configured to write data values from the write data path for storage in the register when the write enable signal is asserted; write circuitry configured in a normal mode of operation to assert the write enable signal of a respective one of the registers to cause operational data values to be written to that register from the write data path; and data cleansing circuitry configured to control a data cleansing mode in which the write enable signals of all registers in the register file are simultaneously asserted to cause cleansing data values to be simultaneously written to all registers from the write data path.
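The difference between the normal mode and the data cleansing mode can be sketched with a small model. The class name `RegisterFile`, its method names, and the default cleansing value of zero are hypothetical; the only behaviour taken from the abstract is that normal writes assert one write enable while cleansing asserts all of them at once.

```python
class RegisterFile:
    """Toy register file in which all registers share one write data path."""

    def __init__(self, count):
        self.regs = [0] * count

    def _clock(self, write_data, write_enables):
        # Every register sees the same write data; only registers whose
        # write enable is asserted capture it.
        for i, enabled in enumerate(write_enables):
            if enabled:
                self.regs[i] = write_data

    def write(self, index, value):
        # Normal mode: assert the write enable of a single register.
        enables = [i == index for i in range(len(self.regs))]
        self._clock(value, enables)

    def cleanse(self, value=0):
        # Cleansing mode: assert all write enables simultaneously so one
        # value on the data path overwrites every register.
        self._clock(value, [True] * len(self.regs))


rf = RegisterFile(4)
rf.write(2, 0xDEAD)
before = list(rf.regs)
rf.cleanse()
after = list(rf.regs)
```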
System, apparatus and method for segmenting a memory array
In one embodiment, a graphics processor includes a register file having a plurality of storage segments to store information and output a plurality of segment outputs via a plurality of segmented bitlines, and a static logic circuit to receive the plurality of segment outputs from the plurality of storage segments and to output read data based on the plurality of segment outputs. The register file may output the read data with a same amount of power without regard to a logic state of the read data. Other embodiments are described and claimed.
HIERARCHICAL GENERAL REGISTER FILE (GRF) FOR EXECUTION BLOCK
In an example, an apparatus comprises a plurality of execution units, and a first general register file (GRF) communicatively coupled to the plurality of execution units, wherein the first GRF is shared by the plurality of execution units. Other embodiments are also disclosed and claimed.
COMMANDS TO SELECT A PORT DESCRIPTOR OF A SPECIFIC VERSION
A port descriptor version of a port descriptor to be obtained is selected. An indication of the port descriptor version is provided in a command that precedes another command used to obtain the port descriptor. The other command uses the port descriptor version to obtain the port descriptor. The port descriptor is obtained, and the port descriptor includes information relating to a port to be used in communication within the computing environment.