Patent classifications
G06F9/30141
Commands to select a port descriptor of a specific version
A port descriptor version of a port descriptor to be obtained is selected. An indication of the port descriptor version is provided in a command to be preceded before another command used to obtain the port descriptor. The other command uses the port descriptor version to obtain the port descriptor. The port descriptor is obtained, and the port descriptor includes information relating to a port to be used in communication within the computing environment.
Data processing system, data transfer device, and context switching method
A processing section executes processes concerning a plurality of applications in a time division manner. A Context Switching Direct Memory Access (CSDMA) engine detects a switching timing of an application to be executed in the processing section. When detecting the switching timing, the CSDMA engine saves a context of an application that is being executed in the processing section 46, to a main memory from the processing section, and installs a context of an application to be subsequently executed in the processing section, from the main memory to the processing section, not through a process by software managing the plurality of applications.
Apparatus and method for processing an instruction matrix specifying parallel and dependent operations
An execution unit to execute instructions using a time-lag sliced architecture (TLSA). The execution unit includes a first computation unit and a second computation unit, where each of the first computation unit and the second computation unit includes a plurality of logic slices arranged in order, where each of the plurality of logic slices except a lattermost logic slice is coupled to an immediately following logic slice to provide an output of that logic slice to the immediately following logic slice, where the immediately following logic slice is to execute with a time lag with respect to its immediately previous logic slice. Further, each of the plurality of logic slices of the second computation unit is coupled to a corresponding logic slice of the first computation unit to receive an output of the corresponding logic slice of the first computation unit.
Implementing write ports in register-file array cell
An approach is provided in which a system writes a set of data into a register file entry that includes a first memory array and a second memory array. The register file entry also includes a set of first write ports corresponding to the first memory array and a set of second write ports corresponding to the second memory array. The system configures a selection bit based on determining that a selected one of the set of first write ports is utilized to store the set of data in the first memory array. In turn, the system reads the set of data out of the first memory array based on the configured selection bit.
Microprocessor with pipeline control for executing of instruction at a preset future time
In the disclosure, the microprocessor resolves the conflicts in decode stage and schedules the instruction to be executed at a future time. The instruction is issued to an execution queue until the scheduled time in the future when it is dispatched to a functional unit for execution. The disclosure uses a counter for the functional unit to track when the resource is available in the future to accept the next instruction. The disclosure also tracks the future N cycles when the register file read and write ports are scheduled to read and write operand data.
Vector compare and store instruction that stores index values to memory
The present disclosure is directed to methods to generate a packed result array using parallel vector processing, of an input array and a comparison operation. In one aspect, an additive scan operation can be used to generate memory offsets for each successful comparison operation of the input array and to generate a count of the number of data elements satisfying the comparison operation. In another aspect, the input array can be segmented to allow more efficient processing using the vector registers. In another aspect, a vector processing system is disclosed that is operable to receive a data array, a comparison operation, and threshold criteria, and output a packed array, at a specified memory address, comprising of the data elements satisfying the comparison operation.
Data processing system having distrubuted registers
A processing system includes a system interconnect, a processor coupled to communicate with other components in the processing system through the system interconnect, distributed general purpose registers (GPRs) in the processing system wherein a first subset of the distributed GPRs is located in the processor and a second subset of the distributed GPRs is located in the processing system and external to the processor, and a first set of conductors directly connected between the processor and the second subsets of the distributed GPRs. An instruction execution pipeline in the processor accesses any register in the first and second subsets of the distributed GPRs as part of the processor's GPRs during instruction execution in the processor, in which the second subset of the distributed GPRs is accessed through the first conductor.
Systems and methods for executing a programmable finite state machine that accelerates fetchless computations and operations of an array of processing cores of an integrated circuit
Systems and methods for fetchless acceleration of convolutional loops on an integrated circuit include identifying, by a compiler, finite state machine (FSM) initialization parameters based on computational requirements of a computational loop; initializing a programmable FSM based on the FSM initialization parameters, wherein the FSM initialization parameters include a loop iteration parameter identifying a number of computation cycles of the computational loop; executing the programmable FSM to enable fetchless computations by: generating a plurality of computational loop control signals including a distinct computation loop control signal for each of the number of computation cycles of the computational loop based on the loop iteration parameter; and controlling an execution of a plurality of computation cycles of a computational circuit performing the computational loop based on transmitting the plurality of computational loop control signals until the number of computation cycles of the computation loop are completed.
Commands to select a port descriptor of a specific version
A port descriptor version of a port descriptor to be obtained is selected. An indication of the port descriptor version is provided in a command to be preceded before another command used to obtain the port descriptor. The other command uses the port descriptor version to obtain the port descriptor. The port descriptor is obtained, and the port descriptor includes information relating to a port to be used in communication within the computing environment.
DATA PROCESSING SYSTEM HAVING DISTRUBUTED REGISTERS
A processing system includes a system interconnect, a processor coupled to communicate with other components in the processing system through the system interconnect, distributed general purpose registers (GPRs) in the processing system wherein a first subset of the distributed GPRs is located in the processor and a second subset of the distributed GPRs is located in the processing system and external to the processor, and a first set of conductors directly connected between the processor and the second subsets of the distributed GPRs. An instruction execution pipeline in the processor accesses any register in the first and second subsets of the distributed GPRs as part of the processor's GPRs during instruction execution in the processor, in which the second subset of the distributed GPRs is accessed through the first conductor.