G06F13/161

Scalable System on a Chip

An integrated circuit (IC) including a plurality of processor cores, a plurality of graphics processing units, a plurality of peripheral circuits, and a plurality of memory controllers is configured to support scaling of the system using a unified memory architecture. For example, the IC may include an interconnect fabric configured to provide communication between the one or more memory controller circuits and the processor cores, graphics processing units, and peripheral devices; and an off-chip interconnect coupled to the interconnect fabric and configured to couple the interconnect fabric to a corresponding interconnect fabric on another instance of the integrated circuit, wherein the interconnect fabric and the off-chip interconnect provide an interface that transparently connects the one or more memory controller circuits, the processor cores, graphics processing units, and peripheral devices in either a single instance of the integrated circuit or two or more instances of the integrated circuit.

KERNEL MAPPING TO NODES IN COMPUTE FABRIC
20230059948 · 2023-02-23 ·

A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. Compute kernels can be parsed into directed graphs and mapped to particular node or tile resources for execution. In an example, a branch-and-bound search algorithm can be used to perform the mapping. The algorithm can use a cost function to evaluate the resources based on capability, occupancy, or power consumption of the various node or tile resources.

Packet processing system, method and device utilizing a port client chain
11586562 · 2023-02-21 · ·

A packet processing system having each of a plurality of hierarchical clients and a packet memory arbiter serially communicatively coupled together via a plurality of primary interfaces thereby forming a unidirectional client chain. This chain is then able to be utilized by all of the hierarchical clients to write the packet data to or read the packet data from the packet memory.

TECHNOLOGIES FOR SWITCHING NETWORK TRAFFIC IN A DATA CENTER

Technologies for switching network traffic include a network switch. The network switch includes one or more processors and communication circuitry coupled to the one or more processors. The communication circuity is capable of switching network traffic of multiple link layer protocols. Additionally, the network switch includes one or more memory devices storing instructions that, when executed, cause the network switch to receive, with the communication circuitry through an optical connection, network traffic to be forwarded, and determine a link layer protocol of the received network traffic. The instructions additionally cause the network switch to forward the network traffic as a function of the determined link layer protocol. Other embodiments are also described and claimed.

NON-POSTED WRITE TRANSACTIONS FOR A COMPUTER BUS

Systems and devices can include a controller and a command queue to buffer incoming write requests into the device. The controller can receive, from a client across a link, a non-posted write request (e.g., a deferred memory write (DMWr) request) in a transaction layer packet (TLP) to the command queue; determine that the command queue can accept the DMWr request; identify, from the TLP, a successful completion (SC) message that indicates that the DMWr request was accepted into the command queue; and transmit, to the client across the link, the SC message that indicates that the DMWr request was accepted into the command queue. The controller can receive a second DMWr request in a second TLP; determine that the command queue is full; and transmit a memory request retry status (MRS) message to be transmitted to the client in response to the command queue being full.

Memory systems and methods including training, data organizing, and/or shadowing
11487433 · 2022-11-01 · ·

Described embodiments include memory systems that may shadow certain data stored in a first memory device (e.g. NAND flash device) onto a second memory device (e.g. DRAM device). Memory systems may train and/or re-organize stored data to facilitate the selection of data to be shadowed. Initial responses to memory commands may be serviced from the first memory device, which may have a lower latency than the second memory device. The remaining data may be serviced from the second memory device. A controller may begin to access the remaining data while the initial response is being provided from the first memory device, which may reduce the apparent latency associated with the second memory device.

SYSTEM, DEVICE, AND METHOD FOR MEMORY INTERFACE INCLUDING RECONFIGURABLE CHANNEL
20230092562 · 2023-03-23 ·

A method of communicating with a memory device through a plurality of sub-channels and a control sub-channel includes; setting a first mode or a second mode. In the first mode, writing or reading first data corresponding to a command synchronized to the control sub-channel through the plurality of sub-channels, and in the second mode, independently writing or reading second data and third data respectively corresponding to different commands synchronized to the control sub-channel through the plurality of sub-channels.

BANDWIDTH ALLOCATION FOR STORAGE SYSTEM COMMANDS IN PEER-TO-PEER ENVIRONMENT

Technology is disclosed for allocating PCIe bus bandwidth to storage commands in a peer-to-peer environment. A non-volatile storage system has a peer-to-peer connection with a host system and a target device, such as a GPU. A memory controller in the storage system monitors latency of PCIe transactions that are performed over a PCIe bus in order to transfer data for NVMe commands. The PCIe transactions may involve direct memory access (DMA) of memory in the host system or target device. There could be a significant difference in transaction latency depending on what memory is being accessed and/or what communication link is used to access the memory. The memory controller allocates bandwidth on a PCIe bus to the NVMe commands based on the latencies of the PCIe transactions. In an aspect, the memory controller groups the PCIe addresses based on the latencies of the PCIe transactions.

Memory system with region-specific memory access scheduling

An integrated circuit device includes a memory controller coupleable to a memory. The memory controller to schedule memory accesses to regions of the memory based on memory timing parameters specific to the regions. A method includes receiving a memory access request at a memory device. The method further includes accessing, from a timing data store of the memory device, data representing a memory timing parameter specific to a region of the memory cell circuitry targeted by the memory access request. The method also includes scheduling, at the memory controller, the memory access request based on the data.

LOW LATENCY RETIMER AND LOW LATENCY CONTROL METHOD

A low latency retimer and a low latency control method are provided; a physical layer module is provided on each of two opposite sides of the retimer; each physical layer module includes at least one set of signal transceiver units including a signal receiving unit and a signal transmitting unit; the signal receiving unit performs a serial-to-parallel conversion on a first high-speed serial signal to generate a parallel signal, and sends the parallel signal to the signal transmitting unit; the signal transmitting unit performs a parallel-to-serial conversion on the parallel signal, to convert the parallel signal to obtain a second high-speed serial signal, and outputs the second high-speed serial signal. Data paths of the retimer form a loopback structure, and the signal transmitting unit and the signal receiving unit are physically adjacent to each other, which solves the problem of signal transmission delay, and avoids high power consumption.