Patent classifications
G06F9/30047
SHARING INSTRUCTION CACHE LINES BETWEEN MULITPLE THREADS
Aspects are provided for sharing instruction cache footprint between multiple threads using instruction cache set/way pointers and a tracking table. The tracking table is built up over time for shared pages, even when the instruction cache has no access to real addresses or translation information. A set/way pointer to an instruction cache line is derived from the system memory address associated with a first thread's instruction fetch. The set/way pointer is stored as a surrogate for the system memory address in both an instruction cache directory (IDIR) and a tracking table. Another set/way pointer to an instruction cache line is derived from the system memory address associated with a second thread's instruction fetch. A match is detected between the set/way pointer and the other set/way pointer. The instruction cache directory is updated to indicate that the instruction cache line is shared between multiple threads.
Refresh-hiding memory system staggered refresh
A computer-implemented method includes refreshing a set of memory channels in a memory system substantially simultaneously, each memory channel refreshing a rank that is distinct from each of the other ranks being refreshed. Further, the method includes marking a memory channel from the set of memory channels as being unavailable for the rank being refreshed in the memory channel. In one or more examples, the method further includes blocking a fetch command to the memory channel for the rank being refreshed in the memory channel.
Arbitration scheme for coherent and non-coherent memory requests
A processor in a system is responsive to a coherent memory request buffer having a plurality of entries to store coherent memory requests from a client module and a non-coherent memory request buffer having a plurality of entries to store non-coherent memory requests from the client module. The client module buffers coherent and non-coherent memory requests and releases the memory requests based on one or more conditions of the processor or one of its caches. The memory requests are released to a central data fabric and into the system based on a first watermark associated with the coherent memory buffer and a second watermark associated with the non-coherent memory buffer.
Global snapshot backups of a distributed name space
Embodiments for enabling snapshot backups in a global name space of a cluster network, by representing the name space of cluster network in an MTree, storing data files organized in a B+ Tree format on one or more data nodes, storing name specific information of the data files in a B+ Tree format in a meta node, wherein a B+ Tree of the meta node accesses each corresponding B+ Tree in each of the one or more data nodes. The process takes snapshot backups of individual MTree limbs, and links the limbs of each snapshot into groups based on a cluster identifier and snapshot identifier.
Sharing instruction cache footprint between multiple threads
Aspects are provided for sharing instruction cache footprint between multiple threads. A set/way pointer to an instruction cache line is derived from a system memory address associated with an instruction fetch from a memory page. It is determined that the instruction cache line is shareable between a first thread and a second thread. An alias table entry is created indicating that other instruction cache lines associated with the memory page are also shareable between threads. Another instruction fetch is received from another thread requesting an instruction from another system memory address associated with the memory page. A further set/way pointer to another instruction cache line is derived from the other system memory address. It is determined that the other instruction cache line is shareable based on the alias table entry.
Sharing instruction cache lines between multiple threads
Aspects are provided for sharing instruction cache footprint between multiple threads using instruction cache set/way pointers and a tracking table. The tracking table is built up over time for shared pages, even when the instruction cache has no access to real addresses or translation information. A set/way pointer to an instruction cache line is derived from the system memory address associated with a first thread's instruction fetch. The set/way pointer is stored as a surrogate for the system memory address in both an instruction cache directory (IDIR) and a tracking table. Another set/way pointer to an instruction cache line is derived from the system memory address associated with a second thread's instruction fetch. A match is detected between the set/way pointer and the other set/way pointer. The instruction cache directory is updated to indicate that the instruction cache line is shared between multiple threads.
STREAMING ENGINE WITH FLEXIBLE STREAMING ENGINE TEMPLATE SUPPORTING DIFFERING NUMBER OF NESTED LOOPS WITH CORRESPONDING LOOP COUNTS AND LOOP OFFSETS
A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces address of data elements for the nested loops. A steam head register stores data elements next to be supplied to functional units for use as operands. A stream template specifies loop count and loop dimension for each nested loop. A format definition field in the stream template specifies the number of loops and the stream template bits devoted to the loop counts and loop dimensions. This permits the same bits of the stream template to be interpreted differently enabling trade off between the number of loops supported and the size of the loop counts and loop dimensions.
CONTROLLER WITH CACHING AND NON-CACHING MODES
An apparatus includes a CPU core, a first cache subsystem coupled to the CPU core, and a second memory coupled to the cache subsystem. The first cache subsystem includes a configuration register, a first memory, and a controller. The controller is configured to: receive a request directed to an address in the second memory and, in response to the configuration register having a first value, operate in a non-caching mode. In the non-caching mode, the controller is configured to provide the request to the second memory without caching data returned by the request in the first memory. In response to the configuration register having a second value, the controller is configured to operate in a caching mode. In the caching mode the controller is configured to provide the request to the second memory and cache data returned by the request in the first memory.
AUTOMATED DATABASE CACHE RESIZING
A test query of a database is performed in response to determining that a performance associated with a user database query of the database does not satisfy a first performance threshold. In response to a determination that the performance of the test query satisfies a second performance threshold, a database buffer cache of the database is resized. Resizing the database buffer cache includes: determining a metric based at least in part on a storage size of the database and an index side of the database, and resizing the database buffer cache of the database based on the metric and a size of the database buffer cache.
Spatial and temporal merging of remote atomic operations
Disclosed embodiments relate to spatial and temporal merging of remote atomic operations. In one example, a system includes an RAO instruction queue stored in a memory and having entries grouped by destination cache line, each entry to enqueue an RAO instruction including an opcode, a destination identifier, and source data, optimization circuitry to receive an incoming RAO instruction, scan the RAO instruction queue to detect a matching enqueued RAO instruction identifying a same destination cache line as the incoming RAO instruction, the optimization circuitry further to, responsive to no matching enqueued RAO instruction being detected, enqueue the incoming RAO instruction; and, responsive to a matching enqueued RAO instruction being detected, determine whether the incoming and matching RAO instructions have a same opcode to non-overlapping cache line elements, and, if so, spatially combine the incoming and matching RAO instructions by enqueuing both RAO instructions in a same group of cache line queue entries at different offsets.