Patent classifications
G06F9/312
Methods and systems for using state vector data in a state machine engine
A state machine engine includes a state vector system. The state vector system includes an input buffer configured to receive state vector data from a restore buffer and to provide state vector data to a state machine lattice. The state vector system also includes an output buffer configured to receive state vector data from the state machine lattice and to provide state vector data to a save buffer.
Processors, methods, systems, and instructions to selectively fence only persistent storage of given data relative to subsequent stores
A processor of an aspect includes a decode unit to decode a persistent store fence instruction. The processor also includes a memory subsystem module coupled with the decode unit. The memory subsystem module, in response to the persistent store fence instruction, is to ensure that a given data corresponding to the persistent store fence instruction is stored persistently in a persistent storage before data of all subsequent store instructions is stored persistently in the persistent storage. The subsequent store instructions occur after the persistent store fence instruction in original program order. Other processors, methods, systems, and articles of manufacture are also disclosed.
Data transfer device, data receiving system and data receiving method
The data transfer device receives file data through a network and writes the file data to a storage device on a block basis, in which the communication processor receives a receiving instruction for the file data from an external host CPU and receives the file data via one or more packets from the network according to the receiving instruction, the receiving buffer stores data received in the communication processor therein upon receiving each packet, and the command issuance controller acquires, from the external host CPU, map information indicating a position to write the file data in a storage area of the storage device, specifies data in the receiving buffer for writing according to a data storing status of the receiving buffer, issues a write command for writing a specified data to the storage device and sends it to the storage device.
System and method for store fusion
Described herein is a system and method for store fusion that fuses small store operations into fewer, larger store operations. The system detects that a pair of adjacent operations are consecutive store operations, where the adjacent micro-operations refers to micro-operations flowing through adjacent dispatch slots and the consecutive store micro-operations refers to both of the adjacent micro-operations being store micro-operations. The consecutive store operations are then reviewed to determine if the data sizes are the same and if the store operation addresses are consecutive. The two store operations are then fused together to form one store operation with twice the data size and one store data HI operation.
Multi-multidimensional computer architecture for big data applications
A data processing apparatus is provided comprising a front-end interface electronically coupled to a main processor. The front-end interface is configured to receive data stored in a repository, in particular an external storage and/or a network, determine whether the data is a single-access data or a multiple-access data by analyzing an access parameter designating the data, route the multiple-access data for processing by the main processor, and route the single-access data for pre-processing by the front-end interface and routing results of the pre-processing to the main processor.
Hazard detection of out-of-order execution of load and store instructions in processors without using real addresses
Technical solutions are described for hazard detection of out-of-order execution of load and store instructions without using real addresses in a processing unit. An example includes an out-of-order load-store unit (LSU) for transferring data between memory and registers. The LSU detects a store-hit-load (SHL) in an out-of-order execution of instructions based only on effective addresses by: determining an effective address associated with a store instruction; determining whether a load instruction entry using said effective address is present in a load reorder queue; and indicating that a SHL has been detected based at least in part on determining that load instruction entry using said effective address is present in the load reorder queue. The LSU, in response to detecting the SHL, flushes instructions starting from a load instruction corresponding to the load instruction entry.
Method and apparatus for detecting memory conflicts using distinguished memory addresses
A method and apparatus for detecting potential memory conflicts in a parallel computing environment by executing two parallel program threads. The parallel program threads include special operands that are used by a processing core to identify memory addresses that have the potential for conflict. These memory addresses are combined into a composite access record for each thread. The composite access records are compared to each other in order to detect a potential memory conflict.
Executing load-store operations without address translation hardware per load-store unit port
Technical solutions are described for out-of-order (OoO) execution of one or more instructions by a processing unit includes receiving, by a load-store unit (LSU) of the processing unit, an OoO window of instructions including a plurality of instructions to be executed OoO, and issuing, by the LSU, instructions from the OoO window. The issuing includes selecting an instruction from the OoO window, the instruction using an effective address. Further, in response to the instruction being a load instruction, it is determined whether the effective address is present in an effective address directory (EAD). In response to the effective address being present in the EAD, the load instruction is issued using the effective address. Further, in response to the instruction being a store instruction, a real address mapped to the effective address is determined from an effective-real translation (ERT) table, and the store instruction is issued using the real address.
Processor and method for tracking progress of gathering/scattering data element pairs in different cache memory banks
Methods and apparatus are disclosed for accessing multiple data cache lines for scatter/gather operations. Embodiment of apparatus may comprise address generation logic to generate an address from an index of a set of indices for each of a set of corresponding mask elements having a first value. Line or bank match ordering logic matches addresses in the same cache line or different banks, and orders an access sequence to permit a group of addresses in multiple cache lines and different banks. Address selection logic directs the group of addresses to corresponding different banks in a cache to access data elements in multiple cache lines corresponding to the group of addresses in a single access cycle. A disassembly/reassembly buffer orders the data elements according to their respective bank/register positions, and a gather/scatter finite state machine changes the values of corresponding mask elements from the first value to a second value.
Processors, methods, systems, and instructions to load multiple data elements to destination storage locations other than packed data registers
A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode an instruction. The instruction is to indicate a packed data register of the plurality of packed data registers that is to store a source packed memory address information. The source packed memory address information is to include a plurality of memory address information data elements. An execution unit is coupled with the decode unit and the plurality of packed data registers, the execution unit, in response to the instruction, is to load a plurality of data elements from a plurality of memory addresses that are each to correspond to a different one of the plurality of memory address information data elements, and store the plurality of loaded data elements in a destination storage location. The destination storage location does not include a register of the plurality of packed data registers.