G06F2212/1048

Process-based multi-key total memory encryption

Systems, methods, and circuitries are disclosed for a per-process memory encryption system. At least one translation lookaside buffer (TLB) is configured to encode key identifiers for keys in one or more bits of either the virtual memory address or the physical address. The process state memory configured to store a first process key table for a first process that maps key identifiers to unique keys and a second process key table that maps the key identifiers to different unique keys. The active process key table memory configured to store an active key table. In response to a request for data corresponding to a virtual memory address, the at least one TLB is configured to provide a key identifier for the data to the active process key table to cause the active process key table to return the unique key mapped to the key identifier.

METHODS FOR EXTENDING A PROOF-OF-SPACE-TIME BLOCKCHAIN
20230044605 · 2023-02-09 ·

A method for extending a blockchain comprises, at a space server: allocating an amount of drive storage for generating proofs-of-space; or accessing a first challenge based on a prior block of the blockchain, the prior block comprising a first proof-of-space and a first proof-of-time; in response to accessing the first challenge, generating a second proof-of-space based on the first challenge and the amount of drive storage, the second proof-of-space indicating allocation of the amount of drive storage; accessing a second proof-of-time based on the prior block and indicating a first time delay elapsed after extension of the blockchain with the prior block; generating a new block comprising the second proof-of-space and the second proof-of-time; and broadcasting the new block over a distributed network.

PROCESS-BASED MULTI-KEY TOTAL MEMORY ENCRYPTION

Systems, methods, and circuitries are disclosed for a per-process memory encryption system. At least one translation lookaside buffer (TLB) is configured to encode key identifiers for keys in one or more bits of either the virtual memory address or the physical address. The process state memory configured to store a first process key table for a first process that maps key identifiers to unique keys and a second process key table that maps the key identifiers to different unique keys. The active process key table memory configured to store an active key table. In response to a request for data corresponding to a virtual memory address, the at least one TLB is configured to provide a key identifier for the data to the active process key table to cause the active process key table to return the unique key mapped to the key identifier.

CACHE ARCHITECTURES FOR MEMORY DEVICES
20230100015 · 2023-03-30 ·

Methods, systems, and devices for cache architectures for memory devices are described. For example, a memory device may include a main array having a first set of memory cells, a cache having a second set of memory cells, and a cache delay register configured to store an indication of cache addresses associated with recently performed access operations. In some examples, the cache delay register may be operated as a first-in-first-out (FIFO) register of cache addresses, where a cache address associated with a performed access operation may be added to the beginning of the FIFO register, and a cache address at the end of the FIFO register may be purged. Information associated with access operations on the main array may be maintained in the cache, and accessed directly (e.g., without another accessing of the main array), at least as long as the cache address is present in the cache delay register.

SHARED PREFETCH INSTRUCTION AND SUPPORT

Techniques for shared data prefetch are described. An exemplary instruction for shared data prefetch includes at least one field for an opcode, at least one field for a source operand to provide a memory address at least a byte of data, wherein the opcode is to indicate that circuitry is to fetch of a line of data from memory at the provided address that contains the byte specified with the source operand and store that byte in at least a cache local to a requester, wherein the byte of data is to be stored in a shared state.

GARBAGE COLLECTION OF REDUNDANT PARTITIONS
20230096276 · 2023-03-30 ·

A method, system, and computer program product for garbage collection of redundant partitions in distributed data management systems are provided. The method stores data across a set of nodes with the data being stored using one or more partitions and the data and the one or more partitions are replicated across the set of nodes. A first partition is determined to be stale at a first node of the set of nodes. The first partition is marked for deletion locally at the first node. A set of deletion votes are determined for the first partition with each node being associated with a deletion vote. The method determines a deletion decision for the first partition on the first node based on the set of deletion votes.

Systems and methods for implementing coherent memory in a multiprocessor system

Data units are stored in private caches in nodes of a multiprocessor system, each node containing at least one processor (CPU), at least one cache private to the node and at least one cache location buffer (CLB) private to the node. In each CLB location information values are stored, each location information value indicating a location associated with a respective data unit, wherein each location information value stored in a given CLB indicates the location to be either a location within the private cache disposed in the same node as the given CLB, to be a location in one of the other nodes, or to be a location in a main memory. Coherence of values of the data units is maintained using a cache coherence protocol. The location information values stored in the CLBs are updated by the cache coherence protocol in accordance with movements of their respective data units.

MAINTAINING AN ACTIVE TRACK DATA STRUCTURE TO DETERMINE ACTIVE TRACKS IN CACHE TO PROCESS

Provided are a computer program product for managing tracks in a storage in a cache. An active track data structure indicates tracks in the cache that have an active status. An active bit in a cache control block for a track is set to indicate active for the track indicated as active in the active track data structure. In response to processing the cache control block, a determination is made, from the cache control block for the track, whether the track is active or inactive to determine processing for the cache control block.

SYSTEM AND METHOD FOR IMPLEMENTING A NETWORK-INTERFACE-BASED ALLREDUCE OPERATION

An apparatus is provided that includes a network interface to transmit and receive data packets over a network; a memory including one or more buffers; an arithmetic logic unit to perform arithmetic operations for organizing and combining the data packets; and a circuitry to receive, via the network interface, data packets from the network; aggregate, via the arithmetic logic unit, the received data packets in the one or more buffers at a network rate; and transmit, via the network interface, the aggregated data packets to one or more compute nodes in the network, thereby optimizing latency incurred in combining the received data packets and transmitting the aggregated data packets, and hence accelerating a bulk data allreduce operation. One embodiment provides a system and method for performing the allreduce operation. During operation, the system performs the allreduce operation by pacing network operations for enhancing performance of the allreduce operation.

COMPUTING ARCHITECTURE
20220350745 · 2022-11-03 ·

Computing architecture comprises an off-chip memory, an on-chip cache unit, a prefetching unit, a global scheduler, a transmitting unit, a pre-recombination network, a post-recombination network, a main computing array, a write-back cache unit, a data dependence controller and an auxiliary computing array. The architecture reads data tiles into an on-chip cache in a prefetching mode, and performs computing according to the data tiles; in the computing process of the tiles, a tile exchange network is adopted to recombine a data structure, and a data dependence module is arranged to process a data dependence relationship possibly existing between different tiles. According to the computing architecture, the data utilization rate can be increased, the data processing flexibility is improved, and therefore Cache Miss is reduced, and the memory bandwidth pressure is reduced.