Patent classifications
G06F13/28
SECURE MEMORY ISOLATION FOR SECURE ENDPOINTS
A single input/output (I/O) controller for both secure partitionable endpoints (PEs) and non-secure PEs is enabled in a trusted execution environment (TEE) where secure memory portions are isolated from non-secure PEs. Security attributes for certain endpoints indicate secure memory access privilege of owning entities of the certain endpoints. A security monitor has exclusive access to the address translation control tables (TCE) stored in secure memory associated with a secure endpoint. When owning entity reassignment occurs, the endpoints are reinitialized to support a change in ownership from an outgoing owning entity having secure memory access and an incoming owning entity not having secure memory access.
SECURE MEMORY ISOLATION FOR SECURE ENDPOINTS
A single input/output (I/O) controller for both secure partitionable endpoints (PEs) and non-secure PEs is enabled in a trusted execution environment (TEE) where secure memory portions are isolated from non-secure PEs. Security attributes for certain endpoints indicate secure memory access privilege of owning entities of the certain endpoints. A security monitor has exclusive access to the address translation control tables (TCE) stored in secure memory associated with a secure endpoint. When owning entity reassignment occurs, the endpoints are reinitialized to support a change in ownership from an outgoing owning entity having secure memory access and an incoming owning entity not having secure memory access.
BUILT-IN SELF-TEST FOR A PROGRAMMABLE VISION ACCELERATOR OF A SYSTEM ON A CHIP
In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
BUILT-IN SELF-TEST FOR A PROGRAMMABLE VISION ACCELERATOR OF A SYSTEM ON A CHIP
In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
PERFORMING LOAD AND STORE OPERATIONS OF 2D ARRAYS IN A SINGLE CYCLE IN A SYSTEM ON A CHIP
In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
PERFORMING LOAD AND STORE OPERATIONS OF 2D ARRAYS IN A SINGLE CYCLE IN A SYSTEM ON A CHIP
In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
Built-in self-test for a programmable vision accelerator of a system on a chip
In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
Built-in self-test for a programmable vision accelerator of a system on a chip
In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
Operation method
Aspects of data modification for neural networks are described herein. The aspects may include a connection value generator configured to receive one or more groups of input data and one or more weight values and generate one or more connection values based on the one or more weight values. The aspects may further include a pruning module configured to modify the one or more groups of input data and the one or more weight values based on the connection values. Further still, the aspects may include a computing unit configured to update the one or more weight values and/or calculate one or more input gradients.
Operation method
Aspects of data modification for neural networks are described herein. The aspects may include a connection value generator configured to receive one or more groups of input data and one or more weight values and generate one or more connection values based on the one or more weight values. The aspects may further include a pruning module configured to modify the one or more groups of input data and the one or more weight values based on the connection values. Further still, the aspects may include a computing unit configured to update the one or more weight values and/or calculate one or more input gradients.