G06F13/28

USING A VECTOR PROCESSOR TO CONFIGURE A DIRECT MEMORY ACCESS SYSTEM FOR FEATURE TRACKING OPERATIONS IN A SYSTEM ON A CHIP

In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector; automatic store predication functionality; a SIMD data path organization that allows for inter-lane sharing; transposed load/store with stride parameter functionality; load with permute and zero insertion functionality; hardware, logic, and memory layout functionality to allow for two-point and two-by-two-point lookups; and per-memory-bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region-based data movement operations.

MEMORY DEVICE FOR WAFER-ON-WAFER FORMED MEMORY AND LOGIC

A memory device includes an array of memory cells configured on a die or chip and coupled to sense lines and access lines of the die or chip, and a respective sense amplifier configured on the die or chip and coupled to each of the sense lines. Each of a plurality of subsets of the sense lines is coupled to a respective local input/output (I/O) line on the die or chip for communication of data on the die or chip, and to a respective transceiver associated with the respective local I/O line, the respective transceiver configured to enable communication of the data to one or more devices off the die or chip.

Quasi-volatile system-level memory

A high-capacity system memory may be built from quasi-volatile (QV) memory circuits, logic circuits, and static random-access memory (SRAM) circuits. Using the SRAM circuits as buffers or cache for the QV memory circuits, the system memory may achieve the access latency performance of the SRAM circuits and may be used as code memory. The system memory is also capable of direct memory access (DMA) operations and includes an arithmetic logic unit for performing computational memory tasks. The system memory may include one or more embedded processors. In addition, the system memory may be configured for multi-channel memory accesses by multiple host processors over multiple host ports. The system memory may be provided in the dual-in-line memory module (DIMM) format.

Techniques for managing context information for a storage device
11579789 · 2023-02-14

Disclosed herein are techniques for managing context information for data stored within a non-volatile memory of a computing device. According to some embodiments, the method can include (1) loading, into a volatile memory of the computing device, the context information from the non-volatile memory, where the context information is separated into a plurality of silos, (2) writing transactions into a log stored within the non-volatile memory, and (3) each time a condition is satisfied: (i) identifying a next silo of the plurality of silos to be written into the non-volatile memory, (ii) updating the next silo to reflect the transactions that apply to the next silo, and (iii) writing the next silo into the non-volatile memory. In turn, when an inadvertent shutdown of the computing device occurs, the silos of which the context information is comprised can be sequentially accessed and restored in an efficient manner.
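
The steps enumerated in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the class name `SiloedContext`, the key-to-silo mapping (`key % num_silos`), the flush condition (every `flush_every` transactions), the per-silo `applied_seq` marker, and the replay-based `recover` method are all hypothetical choices made for illustration. Non-volatile memory is modeled with ordinary Python objects, and log truncation is not modeled.

```python
import copy


class SiloedContext:
    """Toy model of context information split into round-robin flushed silos."""

    def __init__(self, num_silos, flush_every):
        self.num_silos = num_silos
        self.flush_every = flush_every  # assumed condition: every N transactions
        # Simulated non-volatile memory: persisted silos and a transaction log.
        self.nv_silos = [{"applied_seq": 0, "entries": {}} for _ in range(num_silos)]
        self.nv_log = []  # entries are (sequence number, key, value)
        # Step (1): load the context information into volatile memory.
        self.silos = copy.deepcopy(self.nv_silos)
        self.next_silo = 0
        self.seq = 0

    def silo_of(self, key):
        # Assumed mapping from an integer key to a silo index.
        return key % self.num_silos

    def write(self, key, value):
        # Step (2): record the transaction in the non-volatile log.
        self.seq += 1
        self.nv_log.append((self.seq, key, value))
        # Step (3): each time the condition is satisfied, flush one silo.
        if self.seq % self.flush_every == 0:
            self.flush_next_silo()

    def _apply_log(self, silo, index):
        # Apply logged transactions that belong to this silo and are newer
        # than what the silo already reflects.
        for seq, key, value in self.nv_log:
            if seq > silo["applied_seq"] and self.silo_of(key) == index:
                silo["entries"][key] = value
        if self.nv_log:
            silo["applied_seq"] = self.nv_log[-1][0]

    def flush_next_silo(self):
        i = self.next_silo                               # (i) identify next silo
        self._apply_log(self.silos[i], i)                # (ii) update it
        self.nv_silos[i] = copy.deepcopy(self.silos[i])  # (iii) persist it
        self.next_silo = (i + 1) % self.num_silos

    def recover(self):
        # After an unexpected shutdown the volatile copy is lost: reload each
        # persisted silo in turn and replay only the newer log entries.
        self.silos = copy.deepcopy(self.nv_silos)
        for i, silo in enumerate(self.silos):
            self._apply_log(silo, i)
        self.seq = self.nv_log[-1][0] if self.nv_log else 0


ctx = SiloedContext(num_silos=3, flush_every=2)
for key, value in [(0, "a"), (1, "b"), (3, "c"), (4, "d"), (0, "e")]:
    ctx.write(key, value)
ctx.recover()
```

Because only one silo is written per flush, each persisted silo may lag the log by a different amount; the per-silo `applied_seq` marker is one possible way to bound how much of the log must be replayed for each silo during recovery.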

Enabling use of non-volatile media—express (NVME) over a network

Enabling a protocol for efficiently and reliably using the NVME protocol over a network, referred to as NVME over Network, or NVMEoN, may include an NVMEoN exchange layer for handling exchanges between initiating and target nodes on a network, a burst transmission protocol that provides guaranteed delivery without duplicate retransmission, and an exchange status block approach to manage state information about exchanges.

Write ordering in SSDs

Disclosed are systems and methods by which a storage device may process and return I/O commands to a host in the order in which the host provided the commands, thereby reducing host overhead, including but not limited to the following: receiving a first I/O command and a second I/O command, each being assigned a sequence tag; issuing the first I/O command and the second I/O command to one or more storage channels based on their respective sequence tags; collecting a command completion notice of the first I/O command or the second I/O command when the respective command has been completed; and issuing a command completion notification to a host based on the sequence tag of the completed first I/O command or second I/O command.
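
One way to picture in-order completion reporting is a small reorder buffer keyed by sequence tag. The sketch below is illustrative only and is not taken from the patent: the class `OrderedCompleter`, its method names, and the use of a min-heap to hold early completions are assumptions. Storage channels are not modeled; `complete` is called in whatever order the channels happen to finish.

```python
import heapq


class OrderedCompleter:
    """Report command completions to the host in sequence-tag order."""

    def __init__(self):
        self.next_tag = 0        # next sequence tag to assign
        self.next_to_report = 0  # lowest tag not yet reported to the host
        self.done = []           # min-heap of completed tags awaiting report

    def submit(self, command):
        # Assign a monotonically increasing sequence tag on receipt.
        tag = self.next_tag
        self.next_tag += 1
        return tag

    def complete(self, tag):
        # Collect the completion notice, then release every completion that
        # is now contiguous with what the host has already been told about.
        heapq.heappush(self.done, tag)
        reported = []
        while self.done and self.done[0] == self.next_to_report:
            reported.append(heapq.heappop(self.done))
            self.next_to_report += 1
        return reported


completer = OrderedCompleter()
t0 = completer.submit("write A")
t1 = completer.submit("write B")
t2 = completer.submit("write C")

first = completer.complete(t1)   # finished early, held back
second = completer.complete(t2)  # also held back
third = completer.complete(t0)   # unblocks all three, in order
```

In this model, completions that arrive ahead of an earlier tag are simply held until the gap closes, so the host never has to reorder notifications itself.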