G06F12/0862

Streaming engine with early and late address and loop count registers to track architectural state

A streaming engine employed in a digital data processor specifies a fixed read-only data stream defined by plural nested loops. An address generator produces addresses of data elements. A stream head register stores data elements next to be supplied to functional units for use as operands. The streaming engine stores an early address of next-to-be-fetched data elements and a late address of a data element in the stream head register for each of the nested loops. The streaming engine stores an early loop count of next-to-be-fetched data elements and a late loop count of a data element in the stream head register for each of the nested loops.
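The early and late state described above can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the patented hardware: the names `stream_addresses` and `StreamModel`, the FIFO standing in for the stream head register, and the `depth` parameter are all hypothetical, and the model tracks one combined (loop counts, address) pair rather than a separate address per loop level.

```python
from collections import deque
from itertools import product


def stream_addresses(base, elem_size, counts, strides):
    """Yield (loop_counts, address) pairs; index 0 is the innermost loop."""
    # product() varies its last iterable fastest, so feed it the loops
    # outermost-first and flip each index tuple back to innermost-first.
    for idx in product(*(range(c) for c in reversed(counts))):
        idx = idx[::-1]
        offset = sum(i * s for i, s in zip(idx, strides))
        yield idx, base + offset * elem_size


class StreamModel:
    """Hypothetical model: 'early' is the next element to fetch,
    'late' is the element currently at the head of the stream."""

    def __init__(self, base, elem_size, counts, strides, depth=2):
        self._gen = stream_addresses(base, elem_size, counts, strides)
        self._fifo = deque()          # fetched but not yet consumed
        self.depth = depth            # assumed number of elements in flight
        self.early = next(self._gen, None)
        self._fill()

    def _fill(self):
        # Fetch ahead until the FIFO is full or the stream is exhausted.
        while self.early is not None and len(self._fifo) < self.depth:
            self._fifo.append(self.early)
            self.early = next(self._gen, None)

    @property
    def late(self):
        return self._fifo[0] if self._fifo else None

    def consume(self):
        """Hand the head element to a functional unit and refill."""
        item = self._fifo.popleft()
        self._fill()
        return item
```

Keeping both positions means the consumer-visible (late) state can be saved and restored independently of how far ahead the fetch side has run, which is the motivation the title suggests.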

MEMORY SYSTEM FOR ACCELERATING GRAPH NEURAL NETWORK PROCESSING
20230026824 · 2023-01-26

A memory system for accelerating graph neural network processing can include an on-host chip memory to cache data needed for processing a current root node. The system can also include a volatile memory interfaced between the host and a non-volatile memory. The volatile memory can be configured to save one or more sets of next root nodes, neighbor nodes and corresponding attributes. The non-volatile memory can have sufficient capacity to store the entire graph data. The non-volatile memory can also be configured to pre-arrange the sets of next root nodes, neighbor nodes and corresponding attributes for storage in the volatile memory.

DATA DEDUPLICATION LATENCY REDUCTION

Aspects of the present disclosure relate to reducing the latency of data deduplication. In embodiments, an input/output (IO) workload received by a storage array is monitored. Further, at least one IO write operation in the IO workload is identified. A space-efficient probabilistic data structure is used to determine if a director board is associated with the IO write. Additionally, the IO write operation is processed based on the determination.
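A Bloom filter is one common space-efficient probabilistic data structure, and the sketch below assumes that choice; the abstract does not name a specific structure. The class `BloomFilter`, the function `candidate_boards`, and the idea of keeping one filter of write fingerprints per director board are hypothetical illustrations.

```python
import hashlib


class BloomFilter:
    """Bit-array set membership with false positives but no false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key: bytes) -> bool:
        return all((self.bits >> p) & 1 for p in self._positions(key))


def candidate_boards(filters, fingerprint: bytes):
    """Return the boards whose filter reports a possible match.

    `filters` maps a (hypothetical) board name to its BloomFilter.
    """
    return [board for board, bf in filters.items()
            if bf.might_contain(fingerprint)]
```

Because a negative answer is definitive, a write whose fingerprint misses every filter can skip the slower exact lookup, which is where the latency saving would come from in this reading.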

Bottom-up Pre-emptive Cache Update in a Multi-level Redundant Cache System
20230022351 · 2023-01-26

Embodiments provide cache updates in a hierarchical multi-node system through a service component between a lower level component and a next higher level component. The service component maintains a ledger storing an incrementing entry number that indicates the present state of the datasets in a cache of the lower level component. The service component receives a data request to the lower level component from the higher level component, including an appended last entry number accessed by the higher level component, and determines whether the appended last entry number matches a current entry number in the ledger for any requested dataset; no match indicates that at least some data in the higher level component cache is stale. In that case, the service component sends updated data information for the stale data to the higher level component, while the higher level component invalidates its cache entries and updates the appended last entry number to match the current entry number in the ledger.
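The entry-number comparison can be sketched in a few lines. This is an illustrative model only: `LedgerService`, its methods, and the dictionary-shaped reply are hypothetical names, and a per-dataset counter is assumed where the abstract speaks more generally of an incrementing entry number.

```python
class LedgerService:
    """Hypothetical service component sitting between two cache levels."""

    def __init__(self):
        self.entry = {}  # dataset -> current entry number in the ledger
        self.data = {}   # dataset -> latest value held by the lower level

    def write(self, dataset, value):
        # Every update to the lower level cache bumps the entry number.
        self.entry[dataset] = self.entry.get(dataset, 0) + 1
        self.data[dataset] = value

    def request(self, dataset, last_seen):
        """Handle a request carrying the requester's last seen entry number."""
        current = self.entry.get(dataset, 0)
        if last_seen == current:
            return {"stale": False, "entry": current}
        # Mismatch: the higher level cache is stale, so return fresh data
        # along with the entry number it should record.
        return {"stale": True, "entry": current,
                "value": self.data.get(dataset)}
```

A higher level component would keep `reply["entry"]` and append it to its next request, so a later request with no intervening write is answered without resending data.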

DATA MANAGEMENT METHOD AND COMPUTER-READABLE RECORDING MEDIUM STORING DATA MANAGEMENT PROGRAM
20230229597 · 2023-07-20

A data management method causes a computer to execute processing including: creating, when a predetermined data processing program performs data processing, based on an access frequency to a data store, high-frequency state item list information obtained by listing high-frequency state items of which the access frequency is high; determining, when state information that includes a value of the high-frequency state item is written to the data store, whether or not the state information corresponds to the high-frequency state item with reference to the high-frequency state item list information; and grouping and writing pieces of the state information of a plurality of the high-frequency state items.
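The three steps (build the list, check each write against it, group the matching writes) can be sketched as follows. The names `high_frequency_items` and `GroupingWriter`, the count `threshold`, the `group_size` flush trigger, and the list used as a stand-in data store are all assumptions made for illustration; the abstract does not specify how "high" frequency is decided or when a group is written.

```python
from collections import Counter


def high_frequency_items(access_log, threshold):
    """List the state items accessed at least `threshold` times."""
    counts = Counter(access_log)
    return {item for item, n in counts.items() if n >= threshold}


class GroupingWriter:
    """Buffer writes to hot items and emit them as one grouped write."""

    def __init__(self, store, hot_items, group_size=2):
        self.store = store            # stand-in data store: a list of writes
        self.hot_items = hot_items
        self.group_size = group_size  # assumed flush trigger
        self.pending = {}

    def write(self, key, value):
        if key in self.hot_items:
            self.pending[key] = value
            if len(self.pending) >= self.group_size:
                self.flush()
        else:
            # Items not on the list are written individually.
            self.store.append({key: value})

    def flush(self):
        if self.pending:
            self.store.append(dict(self.pending))
            self.pending.clear()
```

Each element appended to `store` represents one write operation, so grouping shows up as fewer, larger entries for the hot items.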

Memory access communications through message passing interface implemented in memory systems

A memory system has a plurality of memory components and a controller operatively coupled to the plurality of memory components to: store data in the memory components; communicate with a host system via a bus; service the data to the host system via communications over the bus; communicate with a processing device that is separate from the host system using a message passing interface over the bus; and provide data access to the processing device through communications made using the message passing interface over the bus.

METHOD AND APPARATUS TO SORT A VECTOR FOR A BITONIC SORTING ALGORITHM
20230229448 · 2023-07-20

A method is provided that includes performing, by a processor in response to a vector sort instruction, sorting of values stored in lanes of a vector to generate a sorted vector, wherein the values in a first portion of the lanes are sorted in a first order indicated by the vector sort instruction and the values in a second portion of the lanes are sorted in a second order indicated by the vector sort instruction; and storing the sorted vector in a storage location.
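The effect of such an instruction can be modeled in software. The function `vector_sort` and its parameters are hypothetical; the sketch assumes the two portions are a contiguous lower and upper part of the lanes, split at an index, which the abstract does not state.

```python
def vector_sort(lanes, split, first_desc=False, second_desc=True):
    """Sort lanes[:split] and lanes[split:] independently.

    With the defaults, the first portion ascends and the second descends,
    which yields a bitonic sequence ready for a bitonic merge step.
    """
    first = sorted(lanes[:split], reverse=first_desc)
    second = sorted(lanes[split:], reverse=second_desc)
    return first + second


# Usage: build one bitonic sequence from eight unsorted lanes.
bitonic = vector_sort([5, 1, 4, 2, 8, 3, 7, 6], split=4)
# bitonic == [1, 2, 4, 5, 8, 7, 6, 3]
```

Letting one instruction pick both orders avoids a separate reverse step between the sorting and merging phases of a bitonic sort.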

PRODUCER PREFETCH FILTER

Indirect prefetch circuitry initiates a producer prefetch requesting return of producer data having a producer address and at least one consumer prefetch to request prefetching of consumer data having a consumer address derived from the producer data. A producer prefetch filter table stores producer filter entries indicative of previous producer addresses of previous producer prefetches. Initiation of a requested producer prefetch for producer data having a requested producer address is suppressed when a lookup of the producer prefetch filter table determines that the requested producer address hits against a producer filter entry of the table. The lookup of the producer prefetch filter table for the requested producer address depends on a subset of bits of the requested producer address including at least one bit which distinguishes different chunks of data within a same cache line.
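The filter lookup can be sketched as a small table keyed on selected address bits. Everything concrete here is an assumption for illustration: the class name `ProducerPrefetchFilter`, 64-byte cache lines split into 8-byte chunks, a 12-bit key, and FIFO replacement. The one property taken from the text is that the key includes at least one bit below the cache-line boundary, so two chunks in the same line are filtered separately.

```python
from collections import OrderedDict


class ProducerPrefetchFilter:
    """Suppress producer prefetches whose address key was seen recently."""

    def __init__(self, chunk_bits=3, key_bits=12, capacity=16):
        self.chunk_bits = chunk_bits  # 8-byte chunks: drop address bits [2:0]
        self.key_bits = key_bits      # subset of address bits used as the key
        self.capacity = capacity
        self.entries = OrderedDict()  # insertion order gives FIFO eviction

    def _key(self, addr):
        # With 64-byte lines, bits [5:3] select the chunk within the line,
        # so the key distinguishes chunks that share a cache line.
        return (addr >> self.chunk_bits) & ((1 << self.key_bits) - 1)

    def should_issue(self, addr):
        """Return True and record the address if the prefetch may proceed."""
        key = self._key(addr)
        if key in self.entries:
            return False  # hit in the filter: suppress the prefetch
        self.entries[key] = None
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry
        return True
```

Because only a subset of bits is compared, distant addresses can alias to the same key and be suppressed unnecessarily; in this sketch that trade-off is controlled by `key_bits`.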
