G06F2015/768

SECURED DEPLOYMENT OF MACHINE LEARNING MODELS

A system includes a programmable logic device including a communication interface configured to receive an encrypted deep learning model and a first key in a bitstream. In an embodiment, the programmable logic device includes a storage block configured to store the first key. The programmable logic device also includes a decryption block configured to decrypt the deep learning model using the first key. A method includes receiving, at a programmable logic device, the encrypted deep learning model and a first key in a bitstream. The method also includes decrypting, at the programmable logic device, the deep learning model using the first key. The method also includes implementing the deep learning model on the programmable logic device.
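The receive, store, decrypt, implement flow in this abstract can be sketched in software. This is a minimal illustration only: the abstract does not name a cipher, so `keystream_xor` is a toy stand-in (XOR against a SHA-256 counter keystream), and the class, field, and key names are hypothetical.

```python
import hashlib


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher (assumption, not from the abstract).

    XORs the data with a SHA-256 counter-mode keystream, so applying it
    twice with the same key returns the original bytes.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class ProgrammableLogicDevice:
    """Hypothetical software model of the device in the abstract."""

    def __init__(self) -> None:
        self.key_storage = None   # models the storage block
        self.model = None         # models the implemented model

    def receive_bitstream(self, bitstream: dict) -> None:
        # Communication interface: the bitstream carries both items.
        self.key_storage = bitstream["first_key"]
        # Decryption block: recover the model with the stored key.
        self.model = keystream_xor(bitstream["encrypted_model"],
                                   self.key_storage)


key = b"hypothetical-first-key"
plain_model = b"layer weights: 0.1, 0.2, 0.3"
bitstream = {
    "encrypted_model": keystream_xor(plain_model, key),
    "first_key": key,
}
device = ProgrammableLogicDevice()
device.receive_bitstream(bitstream)
```

A real deployment would use a vetted cipher and protected key storage; the sketch only shows the order of operations.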

Compute nodes within reconfigurable computing clusters

Reconfigurable computing clusters, compute nodes within reconfigurable computing clusters, and methods of operating a reconfigurable computing cluster are disclosed. A reconfigurable computing cluster includes an optical circuit switch, and a plurality of computing assets, each of the plurality of computing assets connected to the optical circuit switch by two or more bidirectional fiber optic communications paths.

PIPELINE INCLUDING SEPARATE HARDWARE DATA PATHS FOR DIFFERENT INSTRUCTION TYPES

A processing element is implemented in a stage of a pipeline and configured to execute an instruction. A first array of multiplexers is to provide information associated with the instruction to the processing element in response to the instruction being in a first set of instructions. A second array of multiplexers is to provide information associated with the instruction to the processing element in response to the instruction being in a second set of instructions. A control unit is to gate at least one of power or a clock signal provided to the first array of multiplexers in response to the instruction being in the second set.
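The control decision described here can be modeled as a small lookup. This is a behavioral sketch, not hardware: the opcode names and the contents of the two instruction sets are hypothetical, since the abstract does not say which instructions belong to which set.

```python
# Hypothetical instruction sets; the abstract does not enumerate them.
FIRST_SET = {"vadd", "vmul"}
SECOND_SET = {"mov", "nop"}


def select_data_path(opcode: str) -> dict:
    """Pick the multiplexer array that feeds the processing element.

    When the instruction is in the second set, the first array is not
    needed, so the control unit gates its clock (or power).
    """
    if opcode in FIRST_SET:
        return {"active_array": "first", "first_array_gated": False}
    if opcode in SECOND_SET:
        return {"active_array": "second", "first_array_gated": True}
    raise ValueError(f"unknown opcode: {opcode}")
```

For example, `select_data_path("mov")` selects the second array and reports the first array as gated.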

Reconfigurable Computing Appliance
20200285603 · 2020-09-10

A reconfigurable computing appliance includes a number of computing tiles. Each computing tile includes a reconfigurable processing element and a network fabric interface device configured to communicate over a network fabric. The reconfigurable processing element operates on data received from an I/O input interface and/or data received via the network fabric interface device.

FPGA-BASED GRAPH DATA PROCESSING METHOD AND SYSTEM THEREOF
20200242072 · 2020-07-30

An FPGA-based graph data processing method executes graph traversals on a graph having the characteristics of a small-world network, using a first processor that is a CPU and a second processor that is an FPGA in communicative connection with the first processor. The first processor sends the graph data to be traversed to the second processor and, after the second processor has completed the graph traversals by executing level traversals, obtains the result data from the second processor for output. The second processor comprises a sparsity processing module and a density processing module. The sparsity processing module operates in the beginning stage and/or the ending stage of the graph traversals, and the density processing module, which has a higher degree of parallelism than the sparsity processing module, operates in the intermediate stage.
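The staged traversal can be illustrated with a level-synchronous breadth-first search that changes strategy with frontier size. This is a software sketch under assumptions not stated in the abstract: the switch is made with a hypothetical `dense_threshold` on the fraction of vertices in the frontier, the sparse mode expands frontier vertices top-down, the dense mode scans unvisited vertices bottom-up, and the graph is undirected.

```python
def hybrid_bfs(adj: dict, source, dense_threshold: float = 0.25):
    """Level traversal that alternates between sparse and dense modes.

    adj maps each vertex to a list of neighbours (undirected graph).
    Returns (level of each reached vertex, mode used at each level).
    """
    level = {source: 0}
    frontier = {source}
    modes = []
    depth = 0
    n = len(adj)
    while frontier:
        depth += 1
        next_frontier = set()
        if len(frontier) / n < dense_threshold:
            # Sparse mode: small frontier, expand each frontier vertex.
            modes.append("sparse")
            for u in frontier:
                for v in adj[u]:
                    if v not in level:
                        next_frontier.add(v)
        else:
            # Dense mode: large frontier, let every unvisited vertex
            # look for a parent; each check is independent of the others.
            modes.append("dense")
            for v in adj:
                if v not in level and any(u in frontier for u in adj[v]):
                    next_frontier.add(v)
        for v in next_frontier:
            level[v] = depth
        frontier = next_frontier
    return level, modes


# Small example graph: a hub (0) with three branches.
graph = {
    0: [1, 2, 3], 1: [0, 4], 2: [0, 5], 3: [0, 6],
    4: [1], 5: [2], 6: [3, 7], 7: [6],
}
levels, modes = hybrid_bfs(graph, 0)
```

On this graph the frontier starts small, grows in the middle levels, and shrinks again at the end, so the sparse mode runs at the first and last levels and the dense mode in between, matching the stage ordering in the abstract.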

COMPUTE NODES WITHIN RECONFIGURABLE COMPUTING CLUSTERS

Reconfigurable computing clusters, compute nodes within reconfigurable computing clusters, and methods of operating a reconfigurable computing cluster are disclosed. A reconfigurable computing cluster includes an optical circuit switch, and a plurality of computing assets, each of the plurality of computing assets connected to the optical circuit switch by two or more bidirectional fiber optic communications paths.

Reconfigurable Circuit Architecture
20200226095 · 2020-07-16

A method of reconfiguration and a reconfigurable circuit architecture are disclosed. The architecture comprises a configurable volatile storage circuit and non-volatile memory circuit elements. The non-volatile memory circuit elements store multiple bit states for reconfiguration; the bit states are read from the non-volatile memory circuit elements and written into the configurable volatile storage circuit for reconfiguration. The non-volatile memory circuit elements and the configurable volatile storage circuit are provided on a common die.

Pipeline including separate hardware data paths for different instruction types

A processing element is implemented in a stage of a pipeline and configured to execute an instruction. A first array of multiplexers is to provide information associated with the instruction to the processing element in response to the instruction being in a first set of instructions. A second array of multiplexers is to provide information associated with the instruction to the processing element in response to the instruction being in a second set of instructions. A control unit is to gate at least one of power or a clock signal provided to the first array of multiplexers in response to the instruction being in the second set.

Constrained metric optimization of a system on chip

A method is provided that includes receiving a first configuration of a device validated against a design constraint. A configuration includes stimuli controls and stimuli parameters used as inputs in a device model. The method includes determining a quality of the first configuration based on an estimation of an output parameter including a desired behavior of the device, simulating the device in the first configuration when the quality of the first configuration exceeds a selected threshold, and requesting a second configuration of the device when the quality of the first configuration is below the selected threshold. The method also includes obtaining a regression based on multiple high-quality configurations to determine, for the device, a distribution of output parameter values, and comparing that distribution with a baseline of a random regression to adjust a machine learning engine according to a target range of output parameter values.
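The threshold-gated part of this method can be sketched as a simple filter: estimate a quality score cheaply, simulate only the configurations that clear the threshold, and otherwise move on to the next configuration. Everything concrete here is hypothetical, including the `clock_ratio` parameter, the scoring formula, and the stand-in simulator; the regression and machine-learning adjustment steps are not modeled.

```python
def estimate_quality(config: dict) -> float:
    """Hypothetical cheap estimator: best score at clock_ratio == 0.5."""
    return 1.0 - abs(config["clock_ratio"] - 0.5) * 2


def simulate(config: dict) -> float:
    """Hypothetical stand-in for an expensive device simulation."""
    return 100.0 * config["clock_ratio"]


def explore(configs, threshold: float):
    """Simulate only configurations whose estimated quality exceeds
    the threshold; low-quality ones are skipped, which plays the role
    of requesting another configuration."""
    results = []
    for config in configs:
        if estimate_quality(config) > threshold:
            results.append((config, simulate(config)))
    return results


candidates = [{"clock_ratio": 0.5}, {"clock_ratio": 0.9}, {"clock_ratio": 0.4}]
simulated = explore(candidates, threshold=0.7)
```

Here the second candidate scores well below 0.7 and is never simulated, which is the point of the gate: simulation time is spent only on promising configurations.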

ISSUING INSTRUCTIONS ON A VECTOR PROCESSOR
20240028557 · 2024-01-25

The present disclosure relates to a mechanism for issuing instructions in a processor (e.g., a vector processor) implemented as an overlay on programmable hardware (e.g., a field programmable gate array (FPGA) device). Implementations described herein include features for optimizing resource availability on programmable hardware units and enabling superscalar execution when coupled with a temporal single-instruction, multiple-data (SIMD) paradigm. Systems described herein involve an issue component of a processor controller (e.g., a vector processor controller) that enables fast and efficient instruction issue while verifying that structural and data hazards between instructions have been resolved.
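The hazard checks performed at issue can be sketched as follows. This is an illustrative model, not the disclosed design: it assumes in-order issue, one instruction per functional unit per cycle, and checks only structural, read-after-write, and write-after-write hazards. The instruction fields, unit names, and register names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    unit: str     # functional unit the instruction needs
    dest: str     # register written
    srcs: tuple   # registers read


def issue_cycle(window, busy_units=(), pending_writes=()):
    """Return the instructions that can issue together this cycle."""
    issued = []
    units = set(busy_units)
    writes = set(pending_writes)
    for instr in window:
        structural = instr.unit in units                # unit already taken
        raw = any(src in writes for src in instr.srcs)  # operand not ready
        waw = instr.dest in writes                      # write still pending
        if structural or raw or waw:
            break  # in-order issue: stop at the first stalled instruction
        issued.append(instr)
        units.add(instr.unit)
        writes.add(instr.dest)
    return issued


window = [
    Instr("vadd", "alu", "v2", ("v0", "v1")),
    Instr("vld", "mem", "v3", ("a0",)),
    Instr("vmul", "mul", "v4", ("v2", "v3")),
]
issued_now = issue_cycle(window)
```

In this example `vadd` and `vld` use different units and have no dependency, so both issue in the same cycle; `vmul` reads `v2` and `v3`, which are still being written, so it waits.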