G06F30/33

Deep learning inference efficiency technology with early exit and speculative execution

Systems, apparatuses and methods may provide for technology that processes an inference workload in a first subset of layers of a neural network that prevents or inhibits data dependent branch operations, conducts an exit determination as to whether an output of the first subset of layers satisfies one or more exit criteria, and selectively bypasses processing of the output in a second subset of layers of the neural network based on the exit determination. The technology may also speculatively initiate the processing of the output in the second subset of layers while the exit determination is pending. Additionally, when the inference workloads include a plurality of batches, the technology may mask one or more of the plurality of batches from processing in the second subset of layers.
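
The control flow in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `first_layers`, `second_layers`, `exit_criterion`, the 0.9 threshold, and the thread-pool form of speculation are all hypothetical stand-ins chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Vector = List[float]


def first_layers(x: Vector) -> Vector:
    # Hypothetical stand-in for the first subset of layers (branch-free scaling).
    return [2.0 * v for v in x]


def second_layers(x: Vector) -> Vector:
    # Hypothetical stand-in for the second subset of layers.
    return [v + 1.0 for v in x]


def exit_criterion(x: Vector, threshold: float = 0.9) -> bool:
    # Assumed exit criterion: the largest activation reaches the threshold.
    return max(x) >= threshold


def infer_batch(batch: List[Vector], threshold: float = 0.9) -> Tuple[List[Vector], List[bool]]:
    """Run the first subset on every item, then mask exited items from the second subset."""
    hidden = [first_layers(x) for x in batch]
    exit_mask = [exit_criterion(h, threshold) for h in hidden]
    # Masked items keep the early output; only the rest reach the second subset.
    outputs = [h if done else second_layers(h) for h, done in zip(hidden, exit_mask)]
    return outputs, exit_mask


def infer_speculative(x: Vector, pool: ThreadPoolExecutor, threshold: float = 0.9) -> Tuple[Vector, bool]:
    """Start the second subset before the exit determination is known."""
    hidden = first_layers(x)
    future = pool.submit(second_layers, hidden)  # speculative start
    if exit_criterion(hidden, threshold):
        future.cancel()  # best effort; the speculative result is discarded either way
        return hidden, True
    return future.result(), False
```

With these stand-ins, `infer_batch([[0.5, 0.25], [0.125, 0.25]])` exits early on the first item and sends only the second item through `second_layers`.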

System and method for fast and accurate netlist to RTL reverse engineering

Embodiments herein provide for reverse engineering of integrated circuits (ICs) for design verification. In example embodiments, an apparatus receives a gate-level netlist for an integrated circuit (IC), generates a list of equivalence classes related to signals included in the gate-level netlist, determines control signals of the gate-level netlist based at least in part on the list of equivalence classes, determines a logic flow of a finite state transducer (FST) based at least in part on the control signals, and generates register transfer level (RTL) source code for the IC based on the FST.

Simulation of quantum circuits
11556686 · 2023-01-17

Methods, systems and apparatus for simulating quantum circuits including multiple quantum logic gates. In one aspect, a method includes the actions of representing the multiple quantum logic gates as functions of one or more classical Boolean variables that define an undirected graphical model with each classical Boolean variable representing a vertex in the model and each function of respective classical Boolean variables representing a clique between vertices corresponding to the respective classical Boolean variables; representing the probability of obtaining a particular output bit string from the quantum circuit as a first sum of products of the functions; and calculating the probability of obtaining the particular output bit string from the quantum circuit by directly evaluating the sum of products of the functions. The calculated partition function is used to (i) calibrate, (ii) validate, or (iii) benchmark quantum computing hardware implementing a quantum circuit.
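
The sum-of-products idea can be illustrated on a single qubit. The sketch below is a simplified assumption-laden example, not the method claimed in the patent: it treats each gate as a function of two Boolean variables (its input bit and its output bit), so the variables form a chain and each gate is an edge between neighbouring variables. The names `amplitude` and `probability` and the gate tables `H` and `Z` are introduced here for illustration only.

```python
import itertools
import math

S = 1 / math.sqrt(2)
# Each gate is a function gate[out_bit][in_bit] of two Boolean variables.
H = [[S, S], [S, -S]]   # Hadamard
Z = [[1, 0], [0, -1]]   # Pauli-Z


def amplitude(gates, in_bit: int, out_bit: int) -> complex:
    """Sum over all assignments of the intermediate Boolean variables
    of the product of the gate functions (gates applied left to right).
    Requires at least one gate."""
    total = 0j
    for mids in itertools.product((0, 1), repeat=len(gates) - 1):
        bits = (in_bit, *mids, out_bit)
        term = 1 + 0j
        for k, gate in enumerate(gates):
            term *= gate[bits[k + 1]][bits[k]]
        total += term
    return total


def probability(gates, in_bit: int, out_bit: int) -> float:
    # Probability of the output bit is the squared magnitude of the amplitude.
    return abs(amplitude(gates, in_bit, out_bit)) ** 2
```

For example, `probability([H, Z, H], 0, 1)` evaluates to 1 up to floating-point rounding, since H·Z·H acts as a bit flip. The cost of this direct evaluation grows exponentially with the number of intermediate variables, which is why the structure of the graphical model matters for larger circuits.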

DEVELOPMENT SYSTEM AND METHOD OF OFFLINE SOFTWARE-IN-THE-LOOP SIMULATION

A development system and a method of an offline software-in-the-loop simulation are disclosed. A common firmware architecture generates a chip control program. The common firmware architecture has an application layer and a hardware abstraction layer. The application layer has a configuration header file and a product program. A processing program required by a peripheral module is added to the hardware abstraction layer during compiling. The chip control program is provided to a controller chip or a circuit simulation software to be executed to control the product-related circuit through controlling the peripheral module.

IC, monitoring system and monitoring method thereof

An IC is provided. The IC includes an input pin, a controller, a timer, a first memory, a processor, at least one output pin, an output module coupled to the output pin, and a direct memory access (DMA) device coupled between the output module and the first memory. The controller is configured to provide a first control signal in response to a command from the input pin. The timer is configured to periodically provide a trigger signal according to the first control signal. The processor is configured to store first data in the first memory. The DMA device is configured to obtain the first data from the first memory in response to the trigger signal, and transmit the first data to the output module. The output module is configured to provide the first data to the output pin according to a transmission rate.
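
The data path described above can be mimicked with a small behavioural model. This is only a sketch under assumptions: the class names, the one-word memory, and the tick-based timer are hypothetical, and the output module's transmission rate is not modelled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OutputModule:
    # Values driven onto the output pin, in order.
    pin: List[int] = field(default_factory=list)

    def send(self, data: int) -> None:
        self.pin.append(data)


@dataclass
class MonitoredIC:
    period: int                                              # timer period in ticks
    memory: List[int] = field(default_factory=lambda: [0])   # "first memory", one word
    output: OutputModule = field(default_factory=OutputModule)
    enabled: bool = False
    ticks: int = 0

    def command(self, start: bool) -> None:
        # Controller: a command from the input pin becomes the control signal
        # that starts or stops the timer.
        self.enabled = start
        self.ticks = 0

    def cpu_store(self, value: int) -> None:
        # Processor stores the data to be monitored in the first memory.
        self.memory[0] = value

    def tick(self) -> None:
        if not self.enabled:
            return
        self.ticks += 1
        if self.ticks % self.period == 0:
            # Timer trigger: the DMA device copies memory to the output
            # module without involving the processor.
            self.output.send(self.memory[0])
```

In this model, calling `command(True)`, storing a value with `cpu_store`, and advancing `period` ticks places that value on `output.pin`.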

DEEP LEARNING INFERENCE EFFICIENCY TECHNOLOGY WITH EARLY EXIT AND SPECULATIVE EXECUTION
20230215158 · 2023-07-06

Systems, apparatuses and methods may provide for technology that processes an inference workload in a first subset of layers of a neural network that prevents or inhibits data dependent branch operations, conducts an exit determination as to whether an output of the first subset of layers satisfies one or more exit criteria, and selectively bypasses processing of the output in a second subset of layers of the neural network based on the exit determination. The technology may also speculatively initiate the processing of the output in the second subset of layers while the exit determination is pending. Additionally, when the inference workloads include a plurality of batches, the technology may mask one or more of the plurality of batches from processing in the second subset of layers.
