Patent classifications
G06F2117/08
METHOD FOR CO-DESIGN OF HARDWARE AND NEURAL NETWORK ARCHITECTURES USING COARSE-TO-FINE SEARCH, TWO-PHASED BLOCK DISTILLATION AND NEURAL HARDWARE PREDICTOR
Methods, systems, and apparatus for combined or separate implementation of coarse-to-fine neural architecture search (NAS), two-phase block NAS, variable hardware prediction, and differential hardware design are provided and described. A variable predictor is trained, as described herein. Then, a controller or policy may be used to iteratively modify a neural network architecture along dimensions formed by neural network architecture parameters. The modification is applied to blocks (e.g., subnetworks) within the neural network architecture. In each iteration, the remainder of the neural network architecture parameters are modified and learned with a differential NAS method. The training process is performed with two-phase block NAS and incorporates a variable hardware predictor to predict power, performance, and area (PPA) parameters. The hardware parameters may be learned as well using the variable hardware predictor.
SYSTEMS AND METHODS FOR MODELING AND SIMULATING AN IOT SYSTEM
Methods and systems to model, simulate and continuously analyze global non-functional properties, such as profitability, availability, security and performance, of complex Internet of Things (IoT) systems. This modeling enables the collaborative design, interoperability, documentation, simulation, testing, deployment, operations, analysis and optimization of connected services and IoT infrastructures. Various embodiments of the present invention may be characterized as a tool for modeling an IoT system and controlling the evolution of this system. The present invention enables a customer or any entity to describe and simulate an IoT system in different scenarios and, in turn, derive various estimates for what the customer will have to invest. This is of great benefit to entities since building out and implementing a complex IoT system is likely an expensive, time-consuming and resource-consuming endeavor.
Method for configuring a co-simulation for a total system
A method and system (and/or a total simulation) have at least first and second sub-systems. An interconnection network is determined, which couples and determines the first and the second sub-systems at a coupling. First sub-system information of the first sub-system and second sub-system information of the second sub-system are determined. An execution sequence is selected, which determines the order, relative to each other, in which first and second parameter outputs are determined. Furthermore, extrapolation methods are determined, by which first and second parameter inputs are determinable during a macro step size (e.g. between the coupling times). The macro step size prescribes the coupling times at which an exchange of corresponding first and second input parameters and of the first and the second output parameters between the sub-systems is performed. The coupling of the sub-systems is configured based on the interconnection network, the first sub-system information and the second sub-system information, the execution sequence, the extrapolation methods, and the macro step size, and the co-simulation is performed.
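The roles of the execution sequence, the extrapolation method, and the macro step size can be illustrated with a minimal sketch. It is not the claimed method: it assumes two scalar first-order sub-systems, a fixed sequential execution order, zero-order-hold extrapolation (each input is held constant between coupling times), and explicit Euler micro steps. The names `SubSystem`, `advance`, and `cosimulate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SubSystem:
    """Hypothetical scalar sub-system whose output equals its state."""
    state: float
    deriv: Callable[[float, float], float]  # f(state, input) -> d(state)/dt

    def advance(self, u: float, macro_step: float, micro_steps: int = 10) -> float:
        # Zero-order hold: the input u stays constant over the whole macro step.
        h = macro_step / micro_steps
        for _ in range(micro_steps):
            self.state += h * self.deriv(self.state, u)  # explicit Euler micro step
        return self.state


def cosimulate(s1: SubSystem, s2: SubSystem, macro_step: float,
               n_steps: int) -> List[Tuple[float, float]]:
    """Exchange outputs only at coupling times that are macro_step apart."""
    y1, y2 = s1.state, s2.state
    history = [(y1, y2)]
    for _ in range(n_steps):
        # Execution sequence: s1 runs first on the held y2,
        # then s2 runs on the freshly computed y1.
        y1 = s1.advance(y2, macro_step)
        y2 = s2.advance(y1, macro_step)
        history.append((y1, y2))
    return history


# Interconnection: the output of each sub-system is the input of the other.
a = SubSystem(state=1.0, deriv=lambda x, u: -x + u)
b = SubSystem(state=0.0, deriv=lambda x, u: -x + u)
trace = cosimulate(a, b, macro_step=0.1, n_steps=100)
```

In this toy coupling the two outputs converge toward a common value. Swapping the execution order or shrinking the macro step changes the trajectory, which is why those choices are part of the configuration.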
Signal flow-based computer program with direct feedthrough loops
A method for controlling the course of a signal flow-based computer program having interconnected software components and at least one DF loop. The following method steps are performed: a) identifying the at least one DF loop and the DF components, each DF component instantaneously imaging at least one DF input signal present at at least one component input onto at least one output signal present at at least one component output, b) determining the maximum possible change of the values of the DF input signals for each unit of time from at least one property of the respective DF input signal, c) activating a delay element in front of the component input where a DF input signal is present whose value has the smallest maximum possible change, and d) running the computer program in accordance with the connection of the software components ascertained in steps a) to c).
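Steps a) to c) can be sketched as a small graph routine. This is an illustrative sketch under stated assumptions, not the patented procedure: every component is treated as a DF component, each connection carries one signal with a known maximum change per unit of time, and loops are broken one at a time. The helper names `find_cycle` and `place_delays` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Edge = Tuple[str, str]  # (source component, destination component)


def find_cycle(edges: List[Edge]) -> Optional[List[Edge]]:
    """Return the edges of one directed cycle, or None if the graph is acyclic."""
    graph: Dict[str, List[str]] = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    state = {node: 0 for node in graph}  # 0 = unvisited, 1 = on path, 2 = done
    path: List[str] = []

    def visit(node: str) -> Optional[List[Edge]]:
        state[node] = 1
        path.append(node)
        for nxt in graph[node]:
            if state[nxt] == 1:  # back edge closes a loop
                nodes = path[path.index(nxt):] + [nxt]
                return list(zip(nodes, nodes[1:]))
            if state[nxt] == 0:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = 2
        return None

    for node in graph:
        if state[node] == 0:
            found = visit(node)
            if found:
                return found
    return None


def place_delays(max_change: Dict[Edge, float]) -> List[Edge]:
    """Break each loop at the input whose signal has the smallest maximum change."""
    remaining = list(max_change)
    delayed: List[Edge] = []
    while (cycle := find_cycle(remaining)) is not None:
        slowest = min(cycle, key=lambda edge: max_change[edge])
        delayed.append(slowest)       # a delay element goes in front of this input
        remaining.remove(slowest)     # the delayed edge no longer feeds through
    return delayed


# Hypothetical program: loop A -> B -> C -> A, plus an edge C -> D outside the loop.
rates = {("A", "B"): 5.0, ("B", "C"): 0.5, ("C", "A"): 2.0, ("C", "D"): 0.1}
delays = place_delays(rates)
```

Delaying the slowest-changing signal keeps the error introduced by the one-step delay small, since that signal differs least between consecutive steps.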
Adaptable dynamic region for hardware acceleration
Creating an adaptable dynamic region for hardware acceleration can include receiving a first kernel for inclusion in a circuit design for an integrated circuit of an accelerator platform. The circuit design includes a dynamic design corresponding to a dynamic region of programmable circuitry in the integrated circuit that couples to a static region of the programmable circuitry. The first kernel can be included within the dynamic design. A global resource used by the first kernel can be determined. An interconnect architecture for the dynamic design can be constructed based on the global resource used by the first kernel.
METHOD AND SYSTEM FOR BUILDING HARDWARE IMAGES FROM HETEROGENEOUS DESIGNS FOR ELETRONIC SYSTEMS
Automatically generating a hardware image based on programming model types includes determining, by a design tool, types of programming models used in specifications of blocks of a circuit design, in response to a user control input to generate a hardware image to configure a programmable integrated circuit (IC). The design tool can generate a model-type compiler script for each of the types of programming models. Each compiler script initiates compilation of blocks having specifications based on one of the types of programming models into an accelerator representation. The design tool can generate a build script configured to execute the compiler scripts and link the accelerator representations into linked accelerator representations. Execution of the build script builds a hardware image from the linked accelerator representations for configuring the programmable IC to implement a circuit according to the circuit design.
Support system and computer readable medium
A software studying unit (122) calculates software processing time of each of a plurality of functions in a target source program. A data-flow graph generation unit (121) generates an inter-function data-flow graph of the plurality of functions based on the target source program. A hardware studying unit (130) calculates hardware processing time of each function and a circuit scale of each function by a high-level synthesis for the target source program. An implementation combination selection unit (140) selects, based on the software processing time of each function, the hardware processing time of each function, the circuit scale of each function, and the inter-function data-flow graph, an implementation combination of one or more functions to be implemented by software and one or more functions to be implemented by hardware.
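The selection step can be illustrated with a deliberately simplified sketch. It assumes a total circuit-scale budget and takes the overall processing time to be the plain sum of the per-function times; it ignores the inter-function data-flow graph that the described unit also takes into account. The function name `select_partition`, the field names, and the sample figures are all hypothetical.

```python
from itertools import product
from typing import Dict, Optional, Tuple

FuncInfo = Dict[str, float]  # keys: "sw_time", "hw_time", "area"


def select_partition(funcs: Dict[str, FuncInfo],
                     area_budget: float) -> Optional[Tuple[float, Dict[str, str]]]:
    """Exhaustively pick 'sw' or 'hw' per function to minimize summed time."""
    names = list(funcs)
    best: Optional[Tuple[float, Dict[str, str]]] = None
    for choice in product(("sw", "hw"), repeat=len(names)):
        # Circuit scale is only consumed by functions placed in hardware.
        area = sum(funcs[n]["area"] for n, c in zip(names, choice) if c == "hw")
        if area > area_budget:
            continue
        total = sum(funcs[n]["hw_time" if c == "hw" else "sw_time"]
                    for n, c in zip(names, choice))
        if best is None or total < best[0]:
            best = (total, dict(zip(names, choice)))
    return best


# Hypothetical measurements for three functions.
measurements = {
    "f": {"sw_time": 10.0, "hw_time": 2.0, "area": 50.0},
    "g": {"sw_time": 8.0, "hw_time": 1.0, "area": 70.0},
    "h": {"sw_time": 3.0, "hw_time": 2.0, "area": 20.0},
}
best_time, assignment = select_partition(measurements, area_budget=100.0)
```

Exhaustive search is only practical for a handful of functions, because the number of combinations doubles with each function added. A realistic selector would also weigh transfer costs along the data-flow graph.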
METHODS AND APPARATUS FOR PROFILE-GUIDED OPTIMIZATION OF INTEGRATED CIRCUITS
Methods and apparatus for performing profile-guided optimization of integrated circuit hardware are provided. Circuit design tools may receive a source code and compile the source code to generate a hardware description. The hardware description may include profiling blocks configured to measure useful information required for optimization. The hardware description may then be simulated to gather profiling data. The circuit design tools may then analyze the gathered profiling data to identify additional opportunities for hardware optimization. The source code may then be modified based on the analysis of the profiling data to produce a smaller and faster hardware that is better suited to the application.
Method to segregate logic and memory into separate dies for thermal management in a multi-dimensional packaging
A packaging technology to improve performance of an AI processing system resulting in an ultra-high bandwidth system. An IC package is provided which comprises: a substrate; a first die on the substrate, and a second die stacked over the first die. The first die can be a first logic die (e.g., a compute chip, CPU, GPU, etc.) while the second die can be a compute chiplet comprising ferroelectric or paraelectric logic. Both dies can include ferroelectric or paraelectric logic. The ferroelectric/paraelectric logic may include AND gates, OR gates, complex gates, majority, minority, and/or threshold gates, sequential logic, etc. The IC package can be in a 3D or 2.5D configuration that implements a logic-on-logic stacking configuration. The 3D or 2.5D packaging configurations have chips or chiplets designed to have time distributed or spatially distributed processing. The logic of chips or chiplets is segregated so that one chip in a 3D or 2.5D stacking arrangement is hot at a time.
3D model validation and optimization system and method thereof
A network system can optimize 3D models for 3D printing. A smoothing operation can be performed for a 3D model that comprises a plurality of voxels by identifying exterior voxels of the 3D model. For a first exterior voxel of the 3D model, an exterior surface orientation can be determined and a smoothing operation can be performed based on the determined exterior surface orientation. The smoothing operation can include performing a triangulation operation based on the determined exterior surface orientation of the first exterior voxel. Furthermore, in response to determining that a dimension of a set of voxels is below a threshold limit, one or more voxels can be added to the set of voxels to satisfy the threshold limit.
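Identifying exterior voxels and their surface orientation can be sketched as follows. The sketch makes assumptions the abstract does not state: the model is a set of integer grid coordinates, a voxel is exterior when at least one of its six face neighbors is empty, and the orientation is represented as the list of exposed face directions. The triangulation and thickening steps are not shown, and the name `exterior_voxels` is hypothetical.

```python
from typing import Dict, List, Set, Tuple

Voxel = Tuple[int, int, int]

# The six face-adjacent directions of a cubic voxel.
FACE_DIRECTIONS: List[Voxel] = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def exterior_voxels(voxels: Set[Voxel]) -> Dict[Voxel, List[Voxel]]:
    """Map each exterior voxel to the face directions that are exposed."""
    result: Dict[Voxel, List[Voxel]] = {}
    for (x, y, z) in voxels:
        exposed = [
            (dx, dy, dz)
            for (dx, dy, dz) in FACE_DIRECTIONS
            if (x + dx, y + dy, z + dz) not in voxels
        ]
        if exposed:  # at least one face touches empty space
            result[(x, y, z)] = exposed
    return result


# A solid 3 x 3 x 3 cube: every voxel except the center is exterior.
cube = {(x, y, z) for x in range(3) for y in range(3) for z in range(3)}
surface = exterior_voxels(cube)
```

A smoothing pass could then select a triangulation pattern per voxel from its exposed directions. For example, a corner voxel with three exposed faces would be treated differently from a face voxel with one.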