Patent classifications
G06F30/331
Integrated Sensor Device with Deep Learning Accelerator and Random Access Memory
Systems, devices, and methods related to a deep learning accelerator and memory are described. For example, an integrated sensor device may be configured to execute instructions with matrix operands and configured with: a sensor to generate measurements of stimuli; random access memory to store instructions executable by the deep learning accelerator and store matrices of an artificial neural network; a host interface connectable to a host system; and a controller to store the measurements generated by the sensor into the random access memory as an input to the artificial neural network. After the deep learning accelerator generates in the random access memory an output of the artificial neural network by executing the instructions to process the input, the controller may communicate the output to a host system through the host interface.
Processing data in memory using an FPGA
Processing data in memory using a field programmable gate array by reading a first portion of a data set to a burst block having a first data format, transforming a sub-portion of the first portion to an element block having a second data format, processing the sub-portion to yield a first results set, transforming the first results set to the first data format of the burst block, and writing the first results set to the burst block.
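The read, transform, process, transform back, and write sequence can be illustrated in software. The following is a minimal sketch only, not the patented FPGA design: the 64-byte burst size, the 32-bit little-endian element format, the increment used as the "processing" step, and the function name process_burst are all assumptions made for illustration.

```python
import struct

BURST_BYTES = 64  # assumed burst block size (first data format: raw bytes)
ELEM_SIZE = 4     # assumed element size (second data format: 32-bit unsigned ints)


def process_burst(memory: bytes, offset: int, start_elem: int, count: int) -> bytearray:
    """Hypothetical helper that mirrors the steps named in the abstract."""
    # 1. Read a first portion of the data set into a burst block.
    burst = bytearray(memory[offset:offset + BURST_BYTES])

    # 2. Transform a sub-portion of the burst block into an element block.
    lo = start_elem * ELEM_SIZE
    hi = lo + count * ELEM_SIZE
    elements = [value for (value,) in struct.iter_unpack("<I", bytes(burst[lo:hi]))]

    # 3. Process the sub-portion (placeholder operation: add 1, wrapping at 32 bits).
    results = [(value + 1) & 0xFFFFFFFF for value in elements]

    # 4. Transform the results back to the burst format and write them to the burst block.
    burst[lo:hi] = struct.pack(f"<{count}I", *results)
    return burst


memory = struct.pack("<16I", *range(16))
updated = process_burst(memory, offset=0, start_elem=2, count=3)
print(list(struct.unpack("<16I", bytes(updated)))[:6])
```

Only the selected sub-portion is unpacked and repacked; the rest of the burst block passes through untouched, which is the point of separating the burst format from the element format.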
Hardware deprocessing using voltage imaging for hardware assurance
Embodiments of the present disclosure provide methods, apparatus, systems, computing devices, and computing entities for setting deprocessing parameters used in conducting hardware deprocessing on hardware. In accordance with one embodiment, a method is provided that includes: receiving sample images using different E-beam voltages, wherein each image is captured from a backside of the hardware using a different E-beam voltage; generating thickness-based contour maps, wherein each map is generated for an image and includes contour lines indicating locations having a same thickness of remaining material; generating estimated E-beam penetration depths, wherein each depth is generated for an image and is based at least in part on the E-beam voltage used to capture the image; generating an estimated thickness measurement of the remaining material based at least in part on the contour maps and the penetration depths; and setting the deprocessing parameters based at least in part on the estimated thickness measurement.
Optimization processing unit having subunits that are programmably and partially connected
Techniques usable in optimization processing are described. A system includes an optimization processing unit (OPU). The OPU includes stochastic computing units and at least one programmable interconnect. Each of the stochastic computing units includes nodes and multiplication unit(s) configured to interconnect at least a portion of the nodes. The programmable interconnect(s) are configured to provide weights for and to selectably couple a portion of the stochastic computing units.
Method, emulator, and storage media for debugging logic system design
A method for debugging a logic system design including a target module to be debugged. The method includes receiving a first gate-level netlist associated with the logic system design and a second gate-level netlist associated with the target module that are generated based on a description of the logic system design, obtaining runtime information of an input signal of the target module by running the first gate-level netlist, and obtaining runtime information of the target module by running the second gate-level netlist based on the runtime information of the input signal of the target module.
Systems and methods for intelligent graph-based buffer sizing for a mixed-signal integrated circuit
A system and method for minimizing a total physical size of data buffers for executing an artificial neural network (ANN) on an integrated circuit includes implementing a buffer-sizing simulation based on sourcing a task graph of the ANN, wherein: (i) the task graph includes a plurality of distinct data buffers, wherein each of the plurality of distinct data buffers is assigned to at least one write operation and at least one read operation; (ii) the buffer-sizing simulation, when executed, computes an estimated physical size for each of a plurality of distinct data buffers for implementing the artificial neural network on a mixed-signal integrated circuit; and (iii) configuring the buffer-sizing simulation includes setting simulation parameters that include buffer-size minimization parameters and buffer data throughput optimization parameters; and generating an estimate of a physical size for each of the plurality of distinct data buffers based on the implementation of the buffer-sizing simulation.
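One simple way to picture a buffer-sizing simulation is to replay the write and read operations attached to each buffer and record the peak occupancy. The sketch below does only that. It is a swapped-in illustration, not the method of the abstract: the event format, the function name estimate_buffer_sizes, and the use of peak occupancy as the size estimate are assumptions, and the minimization and throughput parameters mentioned above are not modeled.

```python
from collections import defaultdict


def estimate_buffer_sizes(events):
    """Replay (buffer, op, amount) events in order and return peak occupancy per buffer.

    op is "write" (data enters the buffer) or "read" (data leaves the buffer).
    """
    level = defaultdict(int)  # current occupancy of each buffer
    peak = defaultdict(int)   # largest occupancy seen so far
    for name, op, amount in events:
        if op == "write":
            level[name] += amount
            peak[name] = max(peak[name], level[name])
        elif op == "read":
            if amount > level[name]:
                raise ValueError(f"read of {amount} from {name!r} exceeds its contents")
            level[name] -= amount
        else:
            raise ValueError(f"unknown operation {op!r}")
    return dict(peak)


# Hypothetical schedule for a two-buffer task graph.
schedule = [
    ("act0", "write", 4),
    ("act0", "write", 4),
    ("act0", "read", 6),
    ("act1", "write", 3),
    ("act0", "write", 4),
    ("act1", "read", 3),
]
print(estimate_buffer_sizes(schedule))
```

Reordering the schedule so that reads happen earlier lowers the reported peaks, which is the kind of trade-off between buffer size and throughput that a fuller simulation would explore.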
Systems and methods for accelerating the computation of the exponential function
Aspects of embodiments of the present disclosure relate to a field programmable gate array (FPGA) configured to implement an exponential function data path including: an input scaling stage including constant shifters and integer adders to scale a mantissa portion of an input floating-point value by approximately log₂ e to compute a scaled mantissa value, where e is Euler's number; and an exponential stage including barrel shifters and an exponential lookup table to: extract an integer portion and a fractional portion from the scaled mantissa value based on the exponent portion of the input floating-point value; apply a bias shift to the integer portion to compute a result exponent portion of a result floating-point value; look up a result mantissa portion of the result floating-point value in the exponential lookup table based on the fractional portion; and combine the result exponent portion and the result mantissa portion to generate the result floating-point value.
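The data path rests on the identity e^x = 2^(x · log₂ e): once the input is scaled by log₂ e, the integer part of the result becomes the output exponent and the fractional part indexes a table of 2^f values. The following is a minimal floating-point sketch of that idea, not the bit-level FPGA design; the 8-bit table size, the function name approx_exp, and the use of ordinary multiplication in place of constant shifters and integer adders are assumptions.

```python
import math

LOG2_E = math.log2(math.e)
TABLE_BITS = 8                      # assumed lookup table size: 2**8 entries
TABLE_SIZE = 1 << TABLE_BITS
# Entry i holds 2**(i / TABLE_SIZE), a mantissa-like value in [1, 2).
EXP2_TABLE = [2.0 ** (i / TABLE_SIZE) for i in range(TABLE_SIZE)]


def approx_exp(x: float) -> float:
    """Approximate e**x as 2**n * EXP2_TABLE[k], where n + frac = x * log2(e)."""
    scaled = x * LOG2_E                    # input scaling stage
    n = math.floor(scaled)                 # integer portion becomes the result exponent
    frac = scaled - n                      # fractional portion, nominally in [0, 1)
    # Truncate the fraction to a table index; clamp in case rounding made frac == 1.0.
    k = min(int(frac * TABLE_SIZE), TABLE_SIZE - 1)
    return math.ldexp(EXP2_TABLE[k], n)    # combine result mantissa and exponent


for value in (0.0, 1.0, -2.5, 10.0):
    print(value, approx_exp(value), math.exp(value))
```

Because the fraction is truncated to 8 bits, the relative error of this sketch is bounded by roughly 2^(1/256) − 1, or about 0.3%; a larger table or interpolation between entries would tighten it.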