Patent classifications
G06F9/321
METHOD AND SYSTEM FOR OPTIMIZING DATA TRANSFER FROM ONE MEMORY TO ANOTHER MEMORY
A method and system for moving data from a source memory to a destination memory by a processor is disclosed herein. The destination memory stores a sequence of instructions and the sequence of instructions comprises one or more load instructions and one or more store instructions. The processor initially moves the one or more store instructions from the destination memory to the source memory. The processor then executes the one or more load instructions from the destination memory. On executing the one or more load instructions, the data is loaded from the source memory to at least one register in the processor. The processor further initiates execution of the one or more store instructions stored in the source memory. On executing the one or more store instructions from the source memory, the processor stores the data from the at least one register to the destination memory.
Techniques for efficiently transferring data to a processor
A technique for block data transfer is disclosed that reduces data transfer and memory access overheads and significantly reduces multiprocessor activity and energy consumption. Threads executing on a multiprocessor needing data stored in global memory can request and store the needed data in on-chip shared memory, which can be accessed by the threads multiple times. The data can be loaded from global memory and stored in shared memory using an instruction which directs the data into the shared memory without storing the data in registers and/or cache memory of the multiprocessor during the data transfer.
Mergeable counter system and method
A system includes a first counter configured to increment or decrement in response to a triggering event. The first counter is sized to overflow. The system also includes a second counter configured to increment or decrement in response to a triggering event. The first counter and the second counter are merged to form a third counter in response to detecting an overflow triggering event for the first counter. A merge bit indicative of whether the first counter and the second counter are merged changes value in response to merging the first counter and the second counter.
Dropped write detection and correction
Implementations for dropped write detection and correction are described. An example method includes receiving a write command comprising data and associated metadata; increasing a value of a monotonic counter; generating updated metadata by adding the counter value to the metadata; atomically writing (a) the data and a first instance of the updated metadata to a first storage device, and (b) a second instance of the updated metadata to a second storage device; receiving a read request for the data; reading the first instance of the updated metadata from the first storage device; reading the second instance of the updated metadata from a second storage device; comparing the instances of metadata and the counter values within each instance of metadata; determining whether the first counter value matches the second counter value; and determining whether a dropped write has occurred based on whether the first counter values matches the second counter value.
Data encryption based on immutable pointers
Technologies disclosed herein provide cryptographic computing. An example processor includes a core to execute an instruction, where the core includes a register to store a pointer to a memory location and a tag associated with the pointer. The tag indicates whether the pointer is at least partially immutable. The core also includes circuitry to access the pointer and the tag associated with the pointer, determine whether the tag indicates that the pointer is at least partially immutable. The circuitry is further, based on a determination that the tag indicates the pointer is at least partially immutable, to obtain a memory address of the memory location based on the pointer, use the memory address to access encrypted data at the memory location, and decrypt the encrypted data based on a key and a tweak, where the tweak including one or more bits based, at least in part, on the pointer.
Interrupt handling method, computer system, and non-transitory storage medium that resumes waiting threads in response to interrupt signals from I/O devices
The present invention provides an interrupt handling system for handling interrupts in a computer system is provided. The interrupt handling system captures and processes the interrupts in a user space of the computer system. The present invention also provides for an interrupt registration method that facilitates interrupt handling in the user space during porting of user applications from one platform to another.
Way predictor and enable logic for instruction tightly-coupled memory and instruction cache
Disclosed herein are systems and method for instruction tightly-coupled memory (iTIM) and instruction cache (iCache) access prediction. A processor may use a predictor to enable access to the iTIM or the iCache and a particular way (a memory structure) based on a location state and program counter value. The predictor may determine whether to stay in an enabled memory structure, move to and enable a different memory structure, or move to and enable both memory structures. Stay and move predictions may be based on whether a memory structure boundary crossing has occurred due to sequential instruction processing, branch or jump instruction processing, branch resolution, and cache miss processing. The program counter and a location state indicator may use feedback and be updated each instruction-fetch cycle to determine which memory structure(s) needs to be enabled for the next instruction fetch.
Method and system for hard ware-assisted pre-execution
One aspect provides a system for hardware-assisted pre-execution. During operation, the system determines a pre-execution code region comprising one or more instructions. The system increments a global counter upon initiating the one or more instructions. The system issues a first instruction, which involves setting, in a first entry for the first instruction in a data structure, a first prefetch region identifier with a current value of the global counter. Responsive to a head pointer of the data structure reaching the first entry, the system: determines, based on a non-zero value for the first prefetch region identifier, that the first entry is not available to be allocated; and advances the head pointer to a next entry in the data structure, which renders a load associated with the first entry as a non-blocking load. The system resets the global counter upon completing the one or more instructions.
LIGHTWEIGHT ENCRYPTION
Briefly, an encryption/decryption algorithm providing for consistent encryption entropy and encryption/decryption performance that is independent of the type of input data.
Generation and use of memory access instruction order encodings
Apparatus and methods are disclosed for controlling execution of memory access instructions in a block-based processor architecture using a hardware structure that indicates a relative ordering of memory access instruction in an instruction block. In one example of the disclosed technology, a method of executing an instruction block having a plurality of memory load and/or memory store instructions includes selecting a next memory load or memory store instruction to execute based on dependencies encoded within the block, and on a store vector that stores data indicating which memory load and memory store instructions in the instruction block have executed. The store vector can be masked using a store mask. The store mask can be generated when decoding the instruction block, or copied from an instruction block header. Based on the encoded dependencies and the masked store vector, the next instruction can issue when its dependencies are available.