Patent classifications
H03K19/17748
Data flow graph optimization techniques for RTL loops with conditional-exit statements
A computer-implemented method includes compiling a Register Transfer Level (RTL) code to form a data flow graph (DFG). The computer-implemented method includes identifying a chain of multiplexers in the DFG, wherein the chain of multiplexers includes exit multiplexers associated with a loop exit path and non-exit multiplexers. The computer-implemented method also includes traversing a topological order of the DFG in reverse. The computer-implemented method also includes computing fanin-cones for each two consecutive exit multiplexers. The computer-implemented method includes generating a truth table responsive to valid fanin-cones and back propagating select conditions for the each two consecutive exit multiplexers. The computer-implemented method includes eliminating an exit multiplexer from the each two consecutive exit multiplexers based on the truth table. The computer-implemented method further includes transforming the DFG to a new DFG based on the truth table.
Data flow graph optimization techniques for RTL loops with conditional-exit statements
A computer-implemented method includes compiling a Register Transfer Level (RTL) code to form a data flow graph (DFG). The computer-implemented method includes identifying a chain of multiplexers in the DFG, wherein the chain of multiplexers includes exit multiplexers associated with a loop exit path and non-exit multiplexers. The computer-implemented method also includes traversing a topological order of the DFG in reverse. The computer-implemented method also includes computing fanin-cones for each two consecutive exit multiplexers. The computer-implemented method includes generating a truth table responsive to valid fanin-cones and back propagating select conditions for the each two consecutive exit multiplexers. The computer-implemented method includes eliminating an exit multiplexer from the each two consecutive exit multiplexers based on the truth table. The computer-implemented method further includes transforming the DFG to a new DFG based on the truth table.
Adaptive integrated programmable device platform
An integrated circuit (IC) includes a first interface configured for operation with a plurality of tenants implemented concurrently in the integrated circuit, wherein the plurality of tenants communicate with a host data processing system using the first interface. The IC includes a second interface configured for operation with the plurality of tenants, wherein the plurality of tenants communicate with one or more network nodes via a network using the second interface. The IC can include a programmable logic circuitry configured for operation with the plurality of tenants, wherein the programmable logic circuitry implements one or more hardware accelerated functions for the plurality of tenants and routes data between the first interface and the second interface. The first interface, the second interface, and the programmable logic circuitry are configured to provide isolation among the plurality of tenants.
HYBRID ARCHITECTURE FOR SIGNAL PROCESSING AND SIGNAL PROCESSING ACCELERATOR
Systems and methods for configuring a SPA are disclosed. The SPA comprises a plurality of input ports, a plurality of data memory units, signal processing circuitry, and an enable block including at least two counters. Each counter determines an amount of unprocessed data that is stored in a respective one of the plurality of data memory units, and the enable block is configured to disable the signal processing circuitry until a predetermined amount of data is received over the input ports.
HYBRID ARCHITECTURE FOR SIGNAL PROCESSING AND SIGNAL PROCESSING ACCELERATOR
Systems and methods for configuring a SPA are disclosed. The SPA comprises a plurality of input ports, a plurality of data memory units, signal processing circuitry, and an enable block including at least two counters. Each counter determines an amount of unprocessed data that is stored in a respective one of the plurality of data memory units, and the enable block is configured to disable the signal processing circuitry until a predetermined amount of data is received over the input ports.
Logic integrated circuit and semiconductor device
A logic integrated circuit includes: a three-terminal resistance change switch including a first resistance change switch and a second resistance change switch connected in series; a reading circuit which reads first data based on a resistance state of the first resistance change switch and second data based on a resistance state of the second resistance change switch; and a first error detection circuit which compares the first data with the second data and issue an output based on a result of the comparison.
Logic integrated circuit and semiconductor device
A logic integrated circuit includes: a three-terminal resistance change switch including a first resistance change switch and a second resistance change switch connected in series; a reading circuit which reads first data based on a resistance state of the first resistance change switch and second data based on a resistance state of the second resistance change switch; and a first error detection circuit which compares the first data with the second data and issue an output based on a result of the comparison.
Techniques for computing dot products with memory devices
Sparse representation of information performs powerful feature extraction on high-dimensional data and is of interest for applications in signal processing, machine vision, object recognition, and neurobiology. Sparse coding is a mechanism by which biological neural systems can efficiently process complex sensory data while consuming very little power. Sparse coding algorithms in a bio-inspired approach can be implemented in a crossbar array of memristors (resistive memory devices). This network enables efficient implementation of pattern matching and lateral neuron inhibition, allowing input data to be sparsely encoded using neuron activities and stored dictionary elements. The reconstructed input can be obtained by performing a backward pass through the same crossbar matrix using the neuron activity vector as input. Different dictionary sets can be trained and stored in the same system, depending on the nature of the input signals. Using the sparse coding algorithm, natural image processing is performed based on a learned dictionary.
Techniques for computing dot products with memory devices
Sparse representation of information performs powerful feature extraction on high-dimensional data and is of interest for applications in signal processing, machine vision, object recognition, and neurobiology. Sparse coding is a mechanism by which biological neural systems can efficiently process complex sensory data while consuming very little power. Sparse coding algorithms in a bio-inspired approach can be implemented in a crossbar array of memristors (resistive memory devices). This network enables efficient implementation of pattern matching and lateral neuron inhibition, allowing input data to be sparsely encoded using neuron activities and stored dictionary elements. The reconstructed input can be obtained by performing a backward pass through the same crossbar matrix using the neuron activity vector as input. Different dictionary sets can be trained and stored in the same system, depending on the nature of the input signals. Using the sparse coding algorithm, natural image processing is performed based on a learned dictionary.
Systems and Methods for Loading Weights into a Tensor Processing Block
The present disclosure describes a digital signal processing (DSP) block that includes a plurality of columns of weight registers and a plurality of inputs configured to receive a first plurality of values and a second plurality of values. The first plurality of values is stored in the plurality of columns of weight registers after being received. In a first mode of operation, the first and second pluralities of values are received via a first portion of the plurality of inputs. In a second mode of operation, the first plurality of values is received via a second portion of the plurality of inputs, and the second plurality of values is received via the first portion of the plurality of inputs. Additionally, the DSP block includes a plurality of multipliers configured to simultaneously multiply each value of the first plurality of values by each value of the second plurality of values.