G06N3/10

HARDWARE ACCELERATED ANOMALY DETECTION IN A SYSTEM ON A CHIP

In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.

NEURAL NETWORK LOOP DETECTION
20230051050 · 2023-02-16

Apparatuses, systems, and techniques to detect loops in neural network graphs. In at least one embodiment, one or more loops are detected within one or more graphs corresponding to one or more neural networks.
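
As an illustration of what detecting loops in a network graph can involve, the following is a minimal sketch that assumes the graph is given as an adjacency mapping from each node to its successors. It uses a standard depth-first search with three-state marking, which is a generic technique and not necessarily the method of the embodiment; the function name `find_back_edge_loops` and the node names are hypothetical.

```python
from typing import Dict, List


def find_back_edge_loops(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Return one loop (as a list of nodes) for every back edge found by DFS.

    `graph` maps a node to the nodes it feeds. This is a generic
    cycle-detection sketch, not the patented technique.
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    nodes = set(graph) | {s for succs in graph.values() for s in succs}
    state = {n: UNVISITED for n in nodes}
    path: List[str] = []          # nodes on the current DFS path
    loops: List[List[str]] = []

    def visit(node: str) -> None:
        state[node] = IN_PROGRESS
        path.append(node)
        for succ in graph.get(node, []):
            if state[succ] == IN_PROGRESS:
                # Back edge: the loop is the path segment starting at succ.
                loops.append(path[path.index(succ):])
            elif state[succ] == UNVISITED:
                visit(succ)
        path.pop()
        state[node] = DONE

    for n in sorted(nodes):       # sorted for a deterministic visit order
        if state[n] == UNVISITED:
            visit(n)
    return loops


# Hypothetical recurrent graph: "gate" feeds back into "cell".
recurrent = {"input": ["cell"], "cell": ["gate"], "gate": ["cell", "output"], "output": []}
feed_forward = {"input": ["dense"], "dense": ["output"], "output": []}
print(find_back_edge_loops(recurrent))
print(find_back_edge_loops(feed_forward))
```

The recursive form keeps the sketch short; a very deep graph would call for an explicit stack instead.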

Accelerated deep learning

Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency, such as accuracy of learning, accuracy of prediction, speed of learning, performance of learning, and energy efficiency of learning. An array of processing elements performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Each compute element has processing resources and memory resources. Each router enables communication via wavelets with at least nearest neighbors in a 2D mesh. Stochastic gradient descent, mini-batch gradient descent, and continuous propagation gradient descent are techniques usable to train weights of a neural network modeled by the processing elements. Reverse checkpoint is usable to reduce memory usage during the training.
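
Of the training techniques named above, mini-batch gradient descent is the simplest to show in software. The sketch below fits a one-variable linear model and is only meant to illustrate the weight-update rule; it says nothing about how the processing-element array or the wavelets carry out the computation. The function name, the learning rate, the batch size, and the toy data are all assumptions.

```python
import random
from typing import List, Tuple


def minibatch_sgd(xs: List[float], ys: List[float], batch_size: int = 8,
                  learning_rate: float = 0.5, epochs: int = 200,
                  seed: int = 0) -> Tuple[float, float]:
    """Fit y = w * x + b by mini-batch gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)                      # new batch composition each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = grad_b = 0.0
            for i in batch:
                err = (w * xs[i] + b) - ys[i]     # prediction error for one sample
                grad_w += err * xs[i]
                grad_b += err
            # Step against the gradient averaged over the mini-batch.
            w -= learning_rate * grad_w / len(batch)
            b -= learning_rate * grad_b / len(batch)
    return w, b


# Toy, noise-free data generated from y = 2x + 1 (an assumption for the demo).
xs = [i / 63 for i in range(64)]
ys = [2 * x + 1 for x in xs]
w, b = minibatch_sgd(xs, ys)
print(round(w, 3), round(b, 3))
```

Setting `batch_size=1` gives plain stochastic gradient descent, and setting it to the dataset size gives full-batch descent.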

Manufacturing automation using acoustic separation neural network

A system controls an operation of a machine that includes a plurality of actuators assisting one or multiple tools to perform one or multiple tasks. In response to receiving an acoustic mixture of signals generated by the tool performing a task and by the plurality of actuators actuating the tool, the system submits the acoustic mixture into a neural network trained to separate the signal generated by the tool performing the task from the signals generated by the actuators actuating the tool. The system then analyzes the extracted signal to produce a state of performance of the task and executes a control action selected according to that state.

Scheduling artificial intelligence model partitions based on reversed computation graph

Techniques are disclosed for scheduling artificial intelligence model partitions for execution in an information processing system. For example, a method comprises the following steps. An intermediate representation of an artificial intelligence model is obtained. A reversed computation graph corresponding to a computation graph generated based on the intermediate representation is obtained. Nodes in the reversed computation graph represent functions related to the artificial intelligence model, and one or more directed edges in the reversed computation graph represent one or more dependencies between the functions. The reversed computation graph is partitioned into sequential partitions, such that the partitions are executed sequentially and functions corresponding to nodes in each partition are executed in parallel.
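
The steps in this abstract can be sketched as follows, assuming the forward computation graph is an adjacency mapping from each function to the functions that consume its output. Reversing the edges makes each node list its dependencies, and the partitions are then built level by level so that every node lands after all of its dependencies. The function names and the example graph are hypothetical, and the level-by-level grouping is one plausible reading of the partitioning step, not necessarily the claimed one.

```python
from typing import Dict, List


def reverse_graph(graph: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Flip every edge: producer -> consumer becomes consumer -> producer."""
    nodes = set(graph) | {c for consumers in graph.values() for c in consumers}
    reversed_graph: Dict[str, List[str]] = {n: [] for n in sorted(nodes)}
    for producer, consumers in graph.items():
        for consumer in consumers:
            reversed_graph[consumer].append(producer)
    return reversed_graph


def partition_sequentially(reversed_graph: Dict[str, List[str]]) -> List[List[str]]:
    """Group nodes into ordered partitions.

    Partitions run one after another; nodes inside one partition have no
    dependencies on each other and could run in parallel.
    """
    remaining = {n: set(deps) for n, deps in reversed_graph.items()}
    scheduled: set = set()
    partitions: List[List[str]] = []
    while remaining:
        ready = sorted(n for n, deps in remaining.items() if deps <= scheduled)
        if not ready:
            raise ValueError("graph contains a cycle; no valid schedule exists")
        partitions.append(ready)
        scheduled.update(ready)
        for n in ready:
            del remaining[n]
    return partitions


# Hypothetical model: two convolutions read the same input and are then summed.
forward = {
    "load": ["conv_a", "conv_b"],
    "conv_a": ["add"],
    "conv_b": ["add"],
    "add": ["softmax"],
    "softmax": [],
}
print(partition_sequentially(reverse_graph(forward)))
```

In this example `conv_a` and `conv_b` fall into the same partition because neither depends on the other.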

Systems for introducing memristor random telegraph noise in Hopfield neural networks

Systems are provided for implementing a hardware accelerator. The hardware accelerator emulates a stochastic neural network and includes a first memristor crossbar array and a second memristor crossbar array. The first memristor crossbar array can be programmed to calculate node values of the neural network. The node values can be calculated in accordance with rules to reduce an energy function associated with the neural network. The second memristor crossbar array is coupled to the first memristor crossbar array and programmed to introduce noise signals into the neural network. The noise signals can be introduced such that the energy function associated with the neural network converges towards a global minimum and modifies the calculated node values.
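
A software analogue may help make the roles of the two arrays concrete. The sketch below is a plain Hopfield network in Python: the weighted sum stands in for the first crossbar array, and an added noise term stands in for the second. Gaussian noise with a linearly decaying scale is used here purely as a stand-in; it is not a model of memristor random telegraph noise, and the function names, the Hebbian weight rule, and the schedule are all assumptions.

```python
import random
from typing import List


def hebbian_weights(patterns: List[List[int]]) -> List[List[float]]:
    """Build symmetric weights with zero diagonal from +1/-1 patterns."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w


def energy(weights: List[List[float]], state: List[int]) -> float:
    """Hopfield energy E = -1/2 * sum_ij w_ij * s_i * s_j."""
    n = len(state)
    return -0.5 * sum(weights[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def noisy_sweep(weights: List[List[float]], state: List[int],
                noise_scale: float, rng: random.Random) -> List[int]:
    """Update every node once, in random order, with noise added to its input."""
    state = list(state)
    n = len(state)
    for i in rng.sample(range(n), n):
        field = sum(weights[i][j] * state[j] for j in range(n))
        if noise_scale > 0:
            field += rng.gauss(0.0, noise_scale)   # stand-in for hardware noise
        state[i] = 1 if field >= 0 else -1
    return state


def anneal(weights: List[List[float]], state: List[int], sweeps: int = 20,
           start_noise: float = 1.0, seed: int = 0) -> List[int]:
    """Run sweeps while the noise scale decays linearly to zero."""
    rng = random.Random(seed)
    for k in range(sweeps):
        noise = start_noise * (1 - (k + 1) / sweeps)   # final sweep is noise-free
        state = noisy_sweep(weights, state, noise, rng)
    return state


pattern = [1, -1, 1, -1, 1, -1, 1, -1]
weights = hebbian_weights([pattern])
corrupted = [-1] + pattern[1:]                     # flip the first node
final = anneal(weights, corrupted)
print(energy(weights, corrupted), energy(weights, final))
```

With zero noise each update can only lower or keep the energy, so the network can settle in a local minimum; the injected noise lets it leave such states early in the run before the schedule removes it.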
