Patent classifications
G06F9/318
Inserting null vectors into a stream of vectors
Software instructions are executed on a processor within a computer system to configure a steaming engine with stream parameters to define a multidimensional array. The stream parameters define a size for each dimension of the multidimensional array, a null vector count (N), and a selected dimension. Data is fetched from a memory coupled to the streaming engine responsive to the stream parameters. A stream of vectors is formed for the multidimensional array responsive to the stream parameters from the data fetched from memory. N null stream vectors are inserted into the stream of vectors for the selected dimension without fetching respective null data from the memory.
Latent modification instruction for substituting functionality of instructions during transactional execution
An instruction stream includes a transactional code region. The transactional code region includes a latent modification instruction (LMI), a next sequential instruction (NSI) following the LMI, and a set of target instructions following the NSI in program order. Each target instruction has an associated function, and the LMI at least partially specifies a substitute function for the associated function. A processor executes the LMI, the NSI, and at least one of the target instructions, employing the substitute function at least partially specified by the LMI. The LMI, the NSI, and the target instructions may be executed by the processor in sequential program order or out of order.
Arithmetic logic unit with normal and accelerated performance modes using differing numbers of computational circuits
A processor includes a front end including circuitry to decode a first instruction to set a performance register for an execution unit and a second instruction, and an allocator including circuitry to assign the second instruction to the execution unit to execute the second instruction. The execution unit includes circuitry to select between a normal computation and an accelerated computation based on a mode field of the performance register, perform the selected computation, and select between a normal result associated with the normal computation and an accelerated result associated with the accelerated computation based on the mode field.
Method for forming constant extensions in the same execute packet in a VLIW processor
In a very long instruction word (VLIW) central processing unit instructions are grouped into execute packets that execute in parallel. A constant may be specified or extended by bits in a constant extension instruction in the same execute packet. If an instruction includes an indication of constant extension, the decoder employs bits of a constant extension instruction to extend the constant of an immediate field. Two or more constant extension slots are permitted in each execute packet, each extending constants for a different predetermined subset of functional unit instructions. In an alternative embodiment, more than one functional unit may have constants extended from the same constant extension instruction employing the same extended bits. A long extended constant may be formed using the extension bits of two constant extension instructions.
System for the management of out-of-order traffic in an interconnect network and corresponding method and integrated circuit
A system to manage out-of-order traffic in an interconnect network has initiators that provide requests through the interconnect network to memory resource targets and provide responses back through the interconnect network. The system includes components upstream the interconnect network to perform response re-ordering, which include memory to store responses from the interconnect network and a memory map controller to store the responses on a set of logical circular buffers. Each logical circular buffer corresponds to an initiator. The memory map controller computes an offset address for each buffer and stores an offset address of a given request received on a request path. The controller computes an absolute write memory address where responses are written in the memory, the response corresponding to the given request based on the given request offset address. The memory map controller also performs an order-controlled parallel read of the logical circular buffers and routes the data read from the memory to the corresponding initiator.
Preventing duplicate execution by sharing a result between different processing lanes assigned micro-operations that generate the same result
A data processing apparatus has control circuitry for detecting whether a first micro-operation to be processed by a first processing lane would give the same result as a second micro-operation processed by a second processing lane. If they would give the same result, then the first micro-operation is prevented from being processed by the first processing lane and the result of the second micro-operation is output as the result of the first micro-operation. This avoids duplication of processing, to save energy for example.
Allocating a debug instruction set based on the current operating state in a multi-instruction-set data processing apparatus
A data processing apparatus is provided comprising data processing circuitry and debug circuitry. The debug circuitry controls operation of the processing circuitry when operating in a debug mode. The data processing circuitry determines upon entry into a debug mode a current operating state of the data processing apparatus. The data processing circuitry allocates one of a plurality of instruction sets to be used as a debug instruction set depending upon the determined current operating state.
Hierarchical synthesis of computer machine instructions
Aspects disclosed herein relate to aggregating functionality of computer machine instructions to generate additional computer machine instructions and including the additional computer machine instructions in an instruction set architecture (ISA). An exemplary method includes selecting at least first and second computer machine instructions from an instruction set, aggregating functionality of the first and second computer machine instructions to generate a third computer machine instruction, and adding the third computer machine instruction to the instruction set.
Data processing apparatus with instruction encodings to enable near and far memory access modes
Apparatus comprises a processor configured for operation under a sequence of instructions from an instruction set, wherein said processor comprises: means for conditionally inhibiting at least one type of trap, interrupt or exception (TIE) event, wherein, when operating under a sequence of instructions, said inhibition means is inaccessible by said instructions to inhibit the or each type of TIE event, without interrupting said sequence. A data processing apparatus includes a processor adapted to operate under control of program code comprising instructions selected from an instruction set, the apparatus comprising: a predefined memory space providing a predefined addressable memory for storing program code and data, a larger memory space providing a larger addressable memory, means for accessing program code and data within the predefined memory space, and means for controlling the access means so as to enable the access means to access program code located within the larger memory space.