Patent classifications
G06F15/78
HARDWARE ACCELERATED ANOMALY DETECTION IN A SYSTEM ON A CHIP
In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include a min/max collector, automatic store predication functionality, a SIMD data path organization that allows for inter-lane sharing, a transposed load/store with stride parameter functionality, a load with permute and zero insertion functionality, hardware, logic, and memory layout functionality to allow for two point and two by two point lookups, and per memory bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic region based data movement operations.
APPARATUS INCLUDING RECONFIGURABLE INTERFACE AND METHODS OF MANUFACTURING THE SAME
An apparatus including reconfigurable interface circuits and associated systems and methods are disclosed herein. A reconfigurable interface circuit may include an output buffer and an input buffer coupled to a connector for respectively generating and receiving signals. The reconfigurable interface circuit may include a control circuit configured to control operation of the input and output buffers along with additional circuits to selectively implement one or more from a set of selectable communication settings.
DATA INPUT/OUTPUT OPERATIONS DURING LOOP EXECUTION IN A RECONFIGURABLE COMPUTE FABRIC
Various examples are directed to systems and methods in which a first flow controller of a first synchronous flow may receive an instruction to execute a first loop using the first synchronous flow. The first flow controller may determine a first iteration index for a first iteration of the first loop. The first flow controller may send, to a first compute element of the first synchronous flow, a first synchronous message to initiate a first synchronous flow thread for executing the first iteration of the first loop. The first synchronous message may comprise the iteration index. The first compute element may execute an input/output operation at a first location of a first compute element memory indicated by the first iteration index.
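The message flow in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented design: the class names (`SyncMessage`, `ComputeElement`, `FlowController`), the eight-slot memory, and the choice to pass the value alongside the message are all hypothetical. It shows a flow controller deriving an iteration index for each loop iteration and a compute element using that index to select the memory location for its input/output operation.

```python
from dataclasses import dataclass, field


@dataclass
class SyncMessage:
    # The synchronous message carries the iteration index (per the abstract).
    iteration_index: int


@dataclass
class ComputeElement:
    # Hypothetical 8-slot compute element memory.
    memory: list = field(default_factory=lambda: [0] * 8)

    def on_message(self, msg: SyncMessage, value: int) -> None:
        # The I/O operation targets the location indicated by the index.
        self.memory[msg.iteration_index] = value


@dataclass
class FlowController:
    element: ComputeElement

    def run_loop(self, inputs: list) -> None:
        # One synchronous message (and thread) per loop iteration.
        for index, value in enumerate(inputs):
            self.element.on_message(SyncMessage(index), value)


element = ComputeElement()
FlowController(element).run_loop([10, 20, 30])
```

Because each iteration writes to its own index, iterations in this model do not overwrite one another's data.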
Devices for time division multiplexing of state machine engine signals
A device includes a plurality of blocks. Each block of the plurality of blocks includes a plurality of rows. Each row of the plurality of rows includes a plurality of configurable elements and a routing line, whereby each configurable element of the plurality of configurable elements includes a data analysis element comprising a plurality of memory cells, wherein the data analysis element is configured to analyze at least a portion of a data stream and to output a result of the analysis. Each configurable element of the plurality of configurable elements also includes a multiplexer configured to transmit the result to the routing line.
Control barrier network for reconfigurable data processors
A processing system comprises a control bus and a plurality of logic units. The control bus is configurable by configuration data to form signal routes in a control barrier network coupled to processing units in an array of processing units. The plurality of logic units has inputs and outputs connected to the control bus and to the array of processing units. A logic unit in the plurality of logic units is operatively coupled to a processing unit in the array of processing units and is configurable by the configuration data to consume source tokens and a status signal from the processing unit on the inputs and to produce barrier tokens and an enable signal on the outputs based on the source tokens and the status signal on the inputs.
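The behavior of a single logic unit can be sketched in software. The abstract says only that barrier tokens and an enable signal are produced "based on" the source tokens and the status signal. The rule used here (fire once every expected source token has arrived and the status signal is asserted) is an assumption chosen for illustration, and `BarrierLogicUnit` is a hypothetical name.

```python
class BarrierLogicUnit:
    """Toy model of one logic unit in a control barrier network."""

    def __init__(self, expected_sources):
        # Which source tokens to wait for would come from configuration data.
        self.expected = set(expected_sources)
        self.seen = set()

    def step(self, source_tokens, status):
        """Consume inputs for one cycle; return (barrier_token, enable)."""
        self.seen |= set(source_tokens) & self.expected
        # Assumed rule: fire when all sources have arrived and status is set.
        if status and self.seen == self.expected:
            self.seen.clear()  # tokens are consumed when the barrier fires
            return True, True
        return False, False


unit = BarrierLogicUnit({"a", "b"})
results = [
    unit.step({"a"}, True),   # still waiting for "b"
    unit.step({"b"}, False),  # all tokens present, status not asserted
    unit.step(set(), True),   # barrier fires
    unit.step(set(), True),   # tokens were consumed, nothing to fire
]
```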
Scalable network-on-chip for high-bandwidth memory
Described herein are memory controllers for integrated circuits that implement network-on-chip (NoC) to provide access to memory to couple processing cores of the integrated circuit to a memory device. The NoC may be dedicated to service the memory controller and may include one or more routers to facilitate management of the access to the memory controller.
Method and system for processing neural network
The present disclosure provides a neural network processing system that comprises a multi-core processing module composed of a plurality of core processing modules and for executing vector multiplication and addition operations in a neural network operation, an on-chip storage medium, an on-chip address index module, and an ALU module for executing a non-linear operation not completable by the multi-core processing module according to input data acquired from the multi-core processing module or the on-chip storage medium, wherein the plurality of core processing modules share an on-chip storage medium and an ALU module, or the plurality of core processing modules have an independent on-chip storage medium and an ALU module. The present disclosure improves the operating speed of the neural network processing system, making it faster and more efficient.
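The division of labor described here, with cores handling vector multiply-add and an ALU handling the non-linear step, can be sketched as follows. This is a software analogy under stated assumptions: the function names are hypothetical, the sigmoid stands in for whichever non-linear operation the ALU performs, and the mapping of one weight row to one core is chosen only for illustration.

```python
import math


def core_multiply_add(weights_row, inputs):
    # What one core processing module computes: a vector multiply-add.
    return sum(w * x for w, x in zip(weights_row, inputs))


def alu_nonlinear(value):
    # Non-linear operation left to the ALU module (sigmoid is an assumption).
    return 1.0 / (1.0 + math.exp(-value))


def layer(weights, inputs):
    # Each weight row is handled by a separate (simulated) core.
    partial_sums = [core_multiply_add(row, inputs) for row in weights]
    # A shared ALU then applies the non-linear step to every core's result.
    return [alu_nonlinear(v) for v in partial_sums]


outputs = layer([[0.0, 0.0], [1.0, 1.0]], [1.0, 2.0])
```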
System having a hybrid threading processor, a hybrid threading fabric having configurable computing elements, and a hybrid interconnection network
Representative apparatus, method, and system embodiments are disclosed for configurable computing. In a representative embodiment, a system includes an interconnection network, a processor, a host interface, and a configurable circuit cluster. The configurable circuit cluster may include a plurality of configurable circuits arranged in an array; an asynchronous packet network and a synchronous network coupled to each configurable circuit of the array; and a memory interface circuit and a dispatch interface circuit coupled to the asynchronous packet network and to the interconnection network. Each configurable circuit includes instruction or configuration memories for selection of a current data path configuration, a master synchronous network input, and a data path configuration for a next configurable circuit.
Systems and methods for controlling access to secure debugging and profiling features of a computer system
The present disclosure describes systems and methods for controlling access to secure debugging and profiling features of a computer system. Some illustrative embodiments include a system that includes a processor, and a memory coupled to the processor (the memory used to store information and an attribute associated with the stored information). At least one bit of the attribute determines a security level, selected from a plurality of security levels, of the stored information associated with the attribute. Asserting at least one other bit of the attribute enables exportation of the stored information from the computer system if the security level of the stored information is higher than at least one other security level of the plurality of security levels.
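The attribute check in the last two sentences can be sketched with bit masks. The bit layout is an assumption made for illustration: two low bits for the security level, one bit for export enable, and a threshold of level 1. The abstract does not specify any of these. The function models only the stated condition, namely that asserting the export bit enables exportation when the stored information's security level is above another level.

```python
# Hypothetical attribute layout (not specified in the abstract).
LEVEL_MASK = 0b011       # bits 0-1: security level
EXPORT_BIT = 0b100       # bit 2: export enable
THRESHOLD_LEVEL = 1      # assumed reference security level


def security_level(attr: int) -> int:
    return attr & LEVEL_MASK


def export_enabled(attr: int, threshold: int = THRESHOLD_LEVEL) -> bool:
    # Export is enabled by the bit only when the level exceeds the threshold.
    return bool(attr & EXPORT_BIT) and security_level(attr) > threshold


checks = [
    export_enabled(0b110),  # level 2, export bit set
    export_enabled(0b010),  # level 2, export bit clear
    export_enabled(0b101),  # level 1, export bit set, not above threshold
]
```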