Patent classifications
G06F9/381
Memory loopback systems and methods
One embodiment of the present disclosure describes a memory system that may include one or more memory devices that may store data. The memory devices may receive command signals to access the stored data as a loopback signal. The memory devices may operate in a normal operational mode, a loopback operational mode, a retrieval operational mode, a non-inverting pass-through operational sub-mode, and an inverting pass-through operational sub-mode. The operational modes facilitate the transmission of the loopback signal for the purpose of monitoring of memory device operations. A selective inversion technique, which uses the operational modes, may protect the loopback signal integrity during transmission.
Streaming engine with early exit from loop levels supporting early exit loops and irregular loops
A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces address of data elements. A steam head register stores data elements next to be supplied to functional units for use as operands. Upon a stream break instruction specifying one of the nested loops, the stream engine ends a current iteration of the loop. If the specified loop was not the outermost loop, the streaming engine begins an iteration of a next outer loop. If the specified loop was the outermost nested loop, the streaming engine ends the stream. The streaming engine places a vector of data elements in order in lanes within a stream head register. A stream break instruction is operable upon a vector break.
METHOD AND APPARATUS FOR VECTOR PERMUTATION
A method is provided that includes performing, by a processor in response to a vector permutation instruction, permutation of values stored in lanes of a vector to generate a permuted vector, wherein the permutation is responsive to a control storage location storing permute control input for each lane of the permuted vector, wherein the permute control input corresponding to each lane of the permuted vector indicates a value to be stored in the lane of the permuted vector, wherein the permute control input for at least one lane of the permuted vector indicates a value of a selected lane of the vector is to be stored in the at least one lane, and storing the permuted vector in a storage location indicated by an operand of the vector permutation instruction.
Micro-operation cache using predictive allocation
According to one general aspect, an apparatus may include an instruction fetch unit circuit configured to retrieve instructions from a memory. The apparatus may include an instruction decode unit configured to convert instructions into one or more micro-operations that are provided to an execution unit circuit. The apparatus may also include a micro-operation cache configured to store micro-operations. The apparatus may further include a branch prediction circuit configured to: determine when a kernel of instructions is repeating, store at least a portion of the kernel within the micro-operation cache, and provide the stored portion of the kernel to the execution unit circuit without the further aid of the instruction decode unit circuit.
Information processing program, information processing device, and information processing method
Provided are an information processing program, an information processing device, and an information processing method that enable application processing and data transmission in a non-blocking manner to increase a communication speed. A server device includes buffering means configured to accumulate events, socket writing means configured to process the events, and flag management means configured to exclusively set a flag. The socket writing means includes socket write request means and callback processing means. The flag management means exclusively sets the flag at a timing before the event processing requested by the socket write request means starts, and releases the flag at a timing after the processing by the callback processing means ends. The socket write request means receives a call, and in a case where the flag is set, the events accumulated by the buffering means are processed.
METHOD AND APPARATUS FOR PERMUTING STREAMED DATA ELEMENTS
A method is provided that includes receiving, in a permute network, a plurality of data elements for a vector instruction from a streaming engine, and mapping, by the permute network, the plurality of data elements to vector locations for execution of the vector instruction by a vector functional unit in a vector data path of a processor.
Tracking streaming engine vector predicates to control processor execution
In a method of operating a computer system, an instruction loop is executed by a processor in which each iteration of the instruction loop accesses a current data vector and an associated current vector predicate. The instruction loop is repeated when the current vector predicate indicates the current data vector contains at least one valid data element and the instruction loop is exited when the current vector predicate indicates the current data vector contains no valid data elements.
Instruction block allocation
Apparatus and methods are disclosed for throttling processor operation in block-based processor architectures. In one example of the disclosed technology, a block-based instruction set architecture processor includes a plurality of processing cores configured to fetch and execute a sequence of instruction blocks. Each of the processing cores includes function resources for performing operations specified by the instruction blocks. The processor further includes a core scheduler configured to allocate functional resources for performing the operations. The functional resources are allocated for executing the instruction blocks based, at least in part, on a performance metric. The performance metric can be generated dynamically or statically based on branch prediction accuracy, energy usage tolerance, and other suitable metrics.
Delivering immediate values by using program counter (PC)-relative load instructions to fetch literal data in processor-based devices
Delivering immediate values by using program counter (PC)-relative load instructions to fetch literal data in processor-based devices is disclosed. In this regard, a processing element (PE) of a processor-based device provides an execution pipeline circuit that comprises an instruction processing portion and a data access portion. Using a literal data access logic circuit, the PE detects a PC-relative load instruction within a fetch window that includes multiple fetched instructions. The PE determines that the PC-relative load instruction can be serviced using literal data that is available to the instruction processing portion of the execution pipeline circuit (e.g., located within the fetch window containing the PC-relative load instruction, or stored in a literal pool buffer), The PE then retrieves the literal data within the instruction processing portion of the execution pipeline circuit, and executes the PC-relative load instruction using the literal data.
METHOD AND APPARATUS FOR VECTOR SORTING
A method for sorting of a vector in a processor is provided that includes performing, by the processor in response to a vector sort instruction, sorting of values stored in lanes of the vector to generate a sorted vector, wherein the values are sorted in an order indicated by the vector sort instruction, and storing the sorted vector in a storage location.