Patent classifications
G06F13/1615
Achieving high bandwidth on ordered direct memory access write stream into a processor cache
Embodiments include methods, systems and computer program products method for maintaining ordered memory access with parallel access data streams associated with a distributed shared memory system. The computer-implemented method includes performing, by a first cache, a key check, the key check being associated with a first ordered data store. A first memory node signals that the first memory node is ready to begin pipelining of a second ordered data store into the first memory node to an input/output (I/O) controller. A second cache returns a key response to the first cache indicating that the pipelining of the second ordered data store can proceed. The first memory node sends a ready signal indicating that the first memory node is ready to continue pipelining of the second ordered data store into the first memory node to the I/O controller, wherein the ready signal is triggered by receipt of the key response.
INTERCONNECT AND METHOD OF HANDLING SUPPLEMENTARY DATA IN AN INTERCONNECT
An interconnect, and method of handling supplementary data in an interconnect, are provided. The interconnect has routing circuitry providing a plurality of paths, and routing control circuitry to use the plurality of paths to establish routes through the interconnect between source devices and destination devices coupled to the interconnect, to enable system data to be routed through the interconnect between the source devices and the destination devices. The system data relates to functional operation of a system comprising the interconnect, the source devices and the destination devices. At least a subset of the paths are redundant paths whose use by the routing control circuitry provides the system data with resilience to faults when routing the system data through the interconnect. The routing control circuitry is responsive to supplementary data which is unnecessary to ensure the functional operation of the system, to establish a supplementary data route through the interconnect to a supplementary data receiving circuit, such that the supplementary data route employs at least one of the redundant paths that is not required to provide resilience for the system data at a time the at least one of the redundant paths is used for the supplementary data route. This provides an efficient mechanism for transporting supplementary data, whilst ensuring non-intrusive behaviour.
Self-addressing memory
Techniques are disclosed relating to self-addressing memory. In one embodiment, an apparatus includes a memory and addressing circuitry coupled to or comprised in the memory. In this embodiment, the addressing circuitry is configured to receive memory access requests corresponding to a specified sequence of memory accesses. In this embodiment, the memory access requests do not include address information. In this embodiment, the addressing circuitry is further configured to assign addresses to the memory access requests for the specified sequence of memory accesses. In some embodiments, the apparatus is configured to perform the memory access requests using the assigned addresses.
TECHNOLOGIES FOR PROVIDING I/O CHANNEL ABSTRACTION FOR ACCELERATOR DEVICE KERNELS
Technologies for providing I/O channel abstraction for accelerator device kernels include an accelerator device comprising circuitry to obtain availability data indicative of an availability of one or more accelerator device kernels in a system, including one or more physical communication paths to each accelerator device kernel. The circuitry is also configured to receive a request to establish a logical communication path between a kernel of the present accelerator device and another accelerator device kernel and establish, in response to the request and as a function of the obtained availability data, the logical communication path between the kernel of the present accelerator device and the other accelerator device kernel.
MEMORY DEVICE INTERFACE COMMUNICATING WITH SET OF DATA BURSTS CORRESPONDING TO MEMORY DIES VIA DEDICATED PORTIONS FOR COMMAND PROCESSING
A first command associated with a first memory die is communicated via a first portion of an interface of the memory sub-system. A second command associated with a second memory die is communicated via the first portion of the interface to a second memory die. A data burst corresponding to the first memory die is caused to be communicated via a second portion of the interface, where the second command is communicated via the first portion of the interface concurrently with the data burst communicated via the second portion of the interface.
HOST-BASED AND CLIENT-BASED COMMAND SCHEDULING IN LARGE BANDWIDTH MEMORY SYSTEMS
A high-bandwidth memory (HBM) system includes an HBM device and a logic circuit. The logic circuit receives a first command from the host device and converts the received first command to a processing-in-memory (PIM) command that is sent to the HBM device through the second interface. A time between when the first command is received from the host device and when the HBM system is ready to receive another command from the host device is deterministic. The logic circuit further receives a fourth command and a fifth command from the host device. The fifth command requests time-estimate information relating to a time between when the fifth command is received and when the HBM system is ready to receive another command from the host device. The time-estimate information includes a deterministic period of time and an estimated period of time for a non-deterministic period of time.
QUASI-SYNCHRONOUS PROTOCOL FOR LARGE BANDWIDTH MEMORY SYSTEMS
A high-bandwidth memory (HBM) system includes an HBM device and a logic circuit. The logic circuit includes a first interface coupled to a host device and a second interface coupled to the HBM device. The logic circuit receives a first command from the host device through the first interface and converts the received first command to a first processing-in-memory (PIM) command that is sent to the HBM device through the second interface. The first PIM command has a deterministic latency for completion. The logic circuit further receives a second command from the host device through the first interface and converting the received second command to a second PIM command that is sent to the HBM device through the second interface. The second PIM command has a non-deterministic latency for completion.
HIGH-BANDWIDTH, LOW-LATENCY, ISOCHORONOUS FABRIC FOR GRAPHICS ACCELERATOR
Techniques are provided for low-latency, high bandwidth graphics accelerator die and memory system. In an example, a graphics accelerator die can include a plurality of memory blocks for storing graphic information, a display engine configured to request and receive the graphic information from the plurality of memory blocks for transfer to a display, a graphics engine configured to generate and transfer the graphic information to the plurality of memory blocks, and a high-bandwidth, low-latency isochronous fabric configured to arbitrate the transfer and reception of the graphic information.
ACHIEVING HIGH BANDWIDTH ON ORDERED DIRECT MEMORY ACCESS WRITE STREAM INTO A PROCESSOR CACHE
Embodiments include methods, systems and computer program products method for maintaining ordered memory access with parallel access data streams associated with a distributed shared memory system. The computer-implemented method includes performing, by a first cache, a key check, the key check being associated with a first ordered data store. A first memory node signals that the first memory node is ready to begin pipelining of a second ordered data store into the first memory node to an input/output (I/O) controller. A second cache returns a key response to the first cache indicating that the pipelining of the second ordered data store can proceed. The first memory node sends a ready signal indicating that the first memory node is ready to continue pipelining of the second ordered data store into the first memory node to the I/O controller, wherein the ready signal is triggered by receipt of the key response.
Memory system
A memory system includes: a plurality of first memory devices directly or indirectly coupled to one another, each first memory device including a first memory and a first memory controller suitable for controlling the first memory to store data; a second memory device including a second memory and a second memory controller suitable for controlling the second memory to store data; and a multi-processor including a plurality of processors, each processor executing an operating system (OS) and an application to access a data storage memory through the first and second memory devices.