G06F8/4434

Tracing engine-based software loop escape analysis and mixed differentiation evaluation

Systems and methods are provided for loop escape analysis in executing computer instructions. In one embodiment, a method comprises instructions performed by at least one computer process. The method comprises receiving a set of executable computer instructions stored on a storage medium (e.g., by reading the instructions from a tangible, non-transitory storage medium). The method further comprises analyzing the computer instructions to determine a loop, analyzing the computer instructions to determine at least one new variable in the loop, and storing, in a data structure, at least one of an operation related to the variable or a value related to the variable. The method further comprises determining whether to compress the data structure upon reaching the end of the loop, and, based on the determination, compressing the data structure. Systems and computer-readable media are also provided.

Data structure allocation into storage class memory during compilation

A method, a computer program product, and a system are provided for allocating a variable into storage class memory during compilation of a program. The method includes selecting a variable recorded in a symbol table during compilation and computing a variable size of the variable by analyzing attributes related to the variable. The method further includes computing additional attributes relating to the variable. The method also includes computing a control flow graph and analyzing the control flow graph and the additional attributes to determine an allocation location for the variable. The method further includes allocating the variable into a storage class memory based on the analysis performed.

COMPILER, COMPILATION METHOD, AND COMPILER DEVICE
20220382548 · 2022-12-01

The present disclosure relates to a compiler for causing a computer to execute a process. The process includes generating a first program from a second program that contains a loop. The first program includes a first code that determines whether a first area of a memory, referred to by a process inside the loop during a first execution of the loop body, overlaps a second area of the memory referred to by the process during a second execution of the loop body; a second code that executes the process for the first and second executions in order when the first and second areas are determined to overlap; and a third code that executes the process for the first execution and the process for the second execution in parallel when the first and second areas are determined not to overlap.
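
The run-time check described in this abstract can be illustrated with a minimal Python sketch. It is not the patented compiler output: the function names (`ranges_overlap`, `run_two_iterations`, `increment`), the representation of a memory area as a half-open index range, and the use of a thread pool are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def ranges_overlap(a, b):
    """Return True if half-open index ranges a=(start, stop) and b share an index."""
    return a[0] < b[1] and b[0] < a[1]


def run_two_iterations(body, buf, area1, area2):
    """Run `body` for two loop executions (hypothetical helper).

    If the two memory areas overlap, the executions run in order.
    Otherwise they are submitted to a thread pool and may run in parallel.
    Returns which path was taken.
    """
    if ranges_overlap(area1, area2):
        body(buf, area1)
        body(buf, area2)
        return "sequential"
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(body, buf, area1), pool.submit(body, buf, area2)]
        for future in futures:
            future.result()  # re-raise any exception from the worker
    return "parallel"


def increment(buf, area):
    """Example loop body: add 1 to every element in the given area."""
    for i in range(area[0], area[1]):
        buf[i] += 1
```

Calling `run_two_iterations(increment, buf, (0, 4), (4, 8))` takes the parallel path because the ranges are disjoint, while `(0, 5)` and `(3, 8)` share indices 3 and 4 and fall back to in-order execution.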

DETERMINISTIC MEMORY ALLOCATION FOR REAL-TIME APPLICATIONS
20220365764 · 2022-11-17

Deterministic memory allocation for real-time applications. In an embodiment, bitcode is scanned to detect calls by a memory allocation function to a dummy function. Each call uses parameters comprising an identifier of a memory pool and a size of a data type to be stored in the memory pool. For each detected call, an allocation record, comprising the parameters, is generated. Then, a header file is generated based on the allocation records. The header file may comprise a definition of bucket(s) and a definition of memory pools. Each definition of a memory pool may identify at least one bucket.

DETERMINING MEMORY REQUIREMENTS FOR LARGE-SCALE ML APPLICATIONS TO FACILITATE EXECUTION IN GPU-EMBEDDED CLOUD CONTAINERS

We disclose a system that executes an inferential model in VRAM that is embedded in a set of graphics-processing units (GPUs). The system obtains execution parameters for the inferential model specifying: a number of signals, a number of training vectors, a number of observations and a desired data precision. It also obtains one or more formulae for computing memory usage for the inferential model based on the execution parameters. Next, the system uses the one or more formulae and the execution parameters to compute an estimated memory footprint for the inferential model. The system uses the estimated memory footprint to determine a required number of GPUs to execute the inferential model, and generates code for executing the inferential model in parallel while efficiently using available memory in the required number of GPUs. Finally, the system uses the generated code to execute the inferential model in the set of GPUs.
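
The step from execution parameters to a GPU count can be sketched as below. The abstract does not give the memory formulae, so the formula in `estimate_footprint_bytes` is purely a hypothetical stand-in, as are the function names and the precision table; only the overall flow (parameters, then footprint, then number of GPUs) follows the abstract.

```python
# Bytes per stored value for each precision label (assumed labels).
BYTES_PER_VALUE = {"fp16": 2, "fp32": 4, "fp64": 8}


def estimate_footprint_bytes(n_signals, n_vectors, n_observations, precision):
    """Estimate memory use from the execution parameters.

    Hypothetical formula, not taken from the source: one matrix of training
    vectors, one vector-by-vector matrix, and one matrix of observations.
    """
    values = (
        n_vectors * n_signals
        + n_vectors * n_vectors
        + n_observations * n_signals
    )
    return values * BYTES_PER_VALUE[precision]


def required_gpus(footprint_bytes, vram_per_gpu_bytes):
    """Smallest number of GPUs whose combined VRAM covers the footprint (at least 1)."""
    return max(1, -(-footprint_bytes // vram_per_gpu_bytes))  # ceiling division
```

With 10 signals, 100 training vectors, 1000 observations, and `"fp32"`, the assumed formula gives 84000 bytes, which needs 3 GPUs if each offers 32000 bytes.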

Structured weight based sparsity in an artificial neural network compiler

A system and method are provided for improving power performance and lowering memory requirements of an artificial neural network by packing memory using several structured sparsity mechanisms. The invention applies to neural network (NN) processing engines adapted to implement mechanisms to search for structured sparsity in weights and activations, resulting in considerably reduced memory usage. The sparsity-guided training mechanism synthesizes and generates structured sparsity weights. A compiler mechanism within a software development kit (SDK) manipulates structured weight domain sparsity to generate a sparse set of static weights for the NN. The structured sparsity static weights are loaded into the NN after compilation and utilized by both the structured weight domain sparsity mechanism and the structured activation domain sparsity mechanism. The application of structured sparsity narrows the span of search options and creates a relatively loose coupling between the data and control planes.

PROCESSOR AND INSTRUCTION SET

A processor includes a register file having a plurality of register file addresses, a processing unit, configured to perform processing in accordance with a configuration defined by information stored in the register file, and an instruction sequencer. The instruction sequencer is configured to control the processing unit by retrieving a sequence of instructions from a memory, in which each instruction includes an opcode, and a subset of the instructions includes a data portion. For each instruction in the sequence of instructions, the instruction sequencer performs an action defined by the opcode. The action for the subset of the opcodes includes writing the data portion to a register file address defined by the opcode. The sequence of instructions includes variable length instructions.
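
A minimal sketch of an instruction sequencer handling variable-length instructions follows. The encoding is an assumption, not taken from the abstract: the high bit of the opcode byte marks an instruction that carries a two-byte little-endian data portion, and the low seven bits give the register file address. Instructions without a data portion are one byte long and are treated as no-ops here.

```python
def run_sequencer(program: bytes, n_registers: int = 16):
    """Step through `program` and return the resulting register file.

    Hypothetical encoding:
      1aaaaaaa lo hi  -> write the 16-bit data portion to register address aaaaaaa
      0xxxxxxx        -> one-byte instruction with no data portion (no-op here)
    Addresses are assumed to be below n_registers.
    """
    regs = [0] * n_registers
    pc = 0
    while pc < len(program):
        opcode = program[pc]
        if opcode & 0x80:
            addr = opcode & 0x7F  # register address is defined by the opcode
            regs[addr] = int.from_bytes(program[pc + 1:pc + 3], "little")
            pc += 3  # opcode plus two data bytes
        else:
            pc += 1  # opcode only
    return regs
```

For example, the byte sequence `81 34 12 00 83 FF 00` writes `0x1234` to register 1, skips a one-byte instruction, and writes `0xFF` to register 3.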

TRACING ENGINE-BASED SOFTWARE LOOP ESCAPE ANALYSIS AND MIXED DIFFERENTIATION EVALUATION
20230085300 · 2023-03-16

A method for loop escape analysis includes receiving a set of executable computer instructions stored on a storage medium, and determining a number of inputs to a loop associated with a data structure, storage space that would be saved by compressing the data structure, and a size of new elements required to compress the data structure. Upon reaching an end of the loop, the method determines whether to compress the data structure based on a comparison between the size of the new elements and the saved storage space. In response to determining to compress the data structure, the method compresses the data structure.
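
The end-of-loop decision in this abstract can be sketched as a size comparison. The sketch assumes the data structure is a list of recorded operations (a "tape"), that every entry costs the same amount of storage, and that compression replaces the loop's entries with a dense summary of `n_inputs * n_outputs` elements. The names and the cost model are hypothetical.

```python
def should_compress(saved_space, new_element_size):
    """Compress only if the new elements take less space than compression frees."""
    return new_element_size < saved_space


def end_of_loop(tape, loop_start, n_inputs, n_outputs, entry_size=1):
    """Decide at the end of a loop whether to compress the entries it recorded.

    `tape[loop_start:]` holds the entries recorded inside the loop.
    Returns True if the tape was compressed in place.
    """
    saved = (len(tape) - loop_start) * entry_size
    # Assumed cost model: one summary element per (input, output) pair.
    new_size = n_inputs * n_outputs * entry_size
    if not should_compress(saved, new_size):
        return False
    del tape[loop_start:]
    # Placeholder summary record; a real engine would store the elements themselves.
    tape.append(("loop_summary", n_inputs, n_outputs))
    return True
```

Under this cost model, a loop with few inputs and many recorded operations is compressed, while a short loop with many inputs is left as recorded.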

SYSTEM AND METHOD FOR TRANSITION OF STATIC SCHEMA TO DYNAMIC SCHEMA
20230075578 · 2023-03-09

Systems and methods are provided that transition static schema to dynamic schema while maintaining backwards compatibility. Simply removing static schema elements and replacing them with dynamic schema elements makes third-party code incompatible, since the third-party code references schema entities that no longer exist. Provided is a mechanism to decrease the memory use of non-material static schema entities. Transitioning static schema to dynamic schema allows the database to avoid loading non-material schema entities, thereby decreasing overall memory usage.

ON-TARGET UNIT TESTING
20230071041 · 2023-03-09

The present disclosure is directed to systems and methods for improving the functions of a vehicle. Systems and methods are provided that offer a custom tool that autogenerates a set of software agents, allowing a system to separate the processing, transmission, and receiving of messages to achieve better synchronization. The disclosure also provides a simplified method of key provisioning by designating one client as a server and assigning a symmetric key to every other client, permanently provisioned between that client and the server. Systems and methods are further provided that predict faults in a vehicle. Systems and methods are also provided that preserve data in the event of a system crash. Systems and methods are also provided in which an operating system of a vehicle detects the presence of a new peripheral and pulls the related interface file for that new peripheral. Further, a data synchronization solution is provided that offers optimized levels of synchronization.