G06F12/1081

Tensor data distribution using grid direct-memory access (DMA) controller

In one embodiment, a method for tensor data distribution using a direct-memory access agent includes generating, by a first controller, source addresses indicating locations in a source memory where portions of a source tensor are stored. A second controller may generate destination addresses indicating locations in a destination memory where portions of a destination tensor are to be stored. The direct-memory access agent receives a source address generated by the first controller and a destination address generated by the second controller and determines a burst size. The direct-memory access agent may issue a read request comprising the source address and the burst size to read tensor data from the source memory and may store the tensor data into an alignment buffer. The direct-memory access agent then issues a write request comprising the destination address and the burst size to write data from the alignment buffer into the destination memory.

Tensor data distribution using grid direct-memory access (DMA) controller

In one embodiment, a method for tensor data distribution using a direct-memory access agent includes generating, by a first controller, source addresses indicating locations in a source memory where portions of a source tensor are stored. A second controller may generate destination addresses indicating locations in a destination memory where portions of a destination tensor are to be stored. The direct-memory access agent receives a source address generated by the first controller and a destination address generated by the second controller and determines a burst size. The direct-memory access agent may issue a read request comprising the source address and the burst size to read tensor data from the source memory and may store the tensor data into an alignment buffer. The direct-memory access agent then issues a write request comprising the destination address and the burst size to write data from the alignment buffer into the destination memory.

Same-machine remote direct memory operations (RDMOS)

Techniques are described for offloading remote direct memory operations (RDMOs) to “execution candidates”. The execution candidates may be any hardware capable of performing the offloaded operation. Thus, the execution candidates may be network interface controllers, specialized co-processors, FPGAs, etc. The execution candidates may be on a machine that is remote from the processor that is offloading the operation, or may be on the same machine as the processor that is offloading the operation. Details for certain specific RDMOs, which are particularly useful in online transaction processing (OLTP) and hybrid transactional/analytical (HTAP) workloads, are provided.

Same-machine remote direct memory operations (RDMOS)

Techniques are described for offloading remote direct memory operations (RDMOs) to “execution candidates”. The execution candidates may be any hardware capable of performing the offloaded operation. Thus, the execution candidates may be network interface controllers, specialized co-processors, FPGAs, etc. The execution candidates may be on a machine that is remote from the processor that is offloading the operation, or may be on the same machine as the processor that is offloading the operation. Details for certain specific RDMOs, which are particularly useful in online transaction processing (OLTP) and hybrid transactional/analytical (HTAP) workloads, are provided.

LOW LATENCY EFFICIENT SHARING OF RESOURCES IN MULTI-SERVER ECOSYSTEMS
20180011807 · 2018-01-11 ·

A method is provided in one example embodiment and includes receiving by a network element a request from a network device connected to the network element to update a shared resource maintained by the network element; subsequent to the receipt, identifying a Base Address Register Resource Table (“BRT”) element assigned to a Peripheral Component Interconnect (“PCI”) adapter of the network element associated with the network device, wherein the BRT points to the shared resource; changing an attribute of the identified BRT from read-only to read/write to enable the identified BRT to be written by the network device; and notifying the network device that the attribute of the identified BRT has been changed, thereby enabling the network device to update the shared resource via a Base Address Register (“BAR”) comprising the identified BRT.

RESTRICTED ADDRESS TRANSLATION TO PROTECT AGAINST DEVICE-TLB VULNERABILITIES

An apparatus includes an extended capability register and an input/output (I/O) memory management circuitry. The I/O memory management circuitry is to receive, from an I/O device, an address translation request referencing a guest virtual address associated with a guest virtual address space of a virtual machine. The I/O memory management circuitry may translate the guest virtual address to a guest physical address associated with a guest physical address space of the virtual machine, and, responsive to determining that a value stored by the extended capability register indicates a restrict-translation-request-response (RTRR) mode, transmit, to the I/O device, a translation response having the guest physical address.

Mitigating read disturb effects in memory devices

A die read counter and a block read counter are maintained for a specified block of a memory device. An estimated number of read events associated with the specified block is determined based on a value of the block read counter and a value of the die read counter. Responsive to determining that the estimated number of read events satisfies a criterion, a media management operation of one or more pages associated with the specified block is performed.

Mitigating read disturb effects in memory devices

A die read counter and a block read counter are maintained for a specified block of a memory device. An estimated number of read events associated with the specified block is determined based on a value of the block read counter and a value of the die read counter. Responsive to determining that the estimated number of read events satisfies a criterion, a media management operation of one or more pages associated with the specified block is performed.

Method of dynamically configuring FPGA and network security device

Provided are a method of dynamically configuring a FPGA and a network security device. The network security device includes a CPU and at least one FPGA coupled with the CPU. The CPU generates a configuration entry for a target FPGA in response to a user instruction. The configuration entry includes a classification number and a configuration content for the target FPGA. The CPU sends the configuration entry to each FPGA coupled with the CPU, Each FPGA obtains its own classification number, compares its own classification number with the classification number in the configuration entry, and stores the configuration content when the own classification number the same with the classification number in the configuration entry.

Method of dynamically configuring FPGA and network security device

Provided are a method of dynamically configuring a FPGA and a network security device. The network security device includes a CPU and at least one FPGA coupled with the CPU. The CPU generates a configuration entry for a target FPGA in response to a user instruction. The configuration entry includes a classification number and a configuration content for the target FPGA. The CPU sends the configuration entry to each FPGA coupled with the CPU, Each FPGA obtains its own classification number, compares its own classification number with the classification number in the configuration entry, and stores the configuration content when the own classification number the same with the classification number in the configuration entry.