G06F8/4441

IMAGE RECOGNITION NEURAL NETWORK PROCESSING METHOD, DEVICE AND SYSTEM
20220319161 · 2022-10-06

An image recognition neural network processing method includes: a compiler segments an image recognition neural network to obtain tiles of at least one network layer group; classifies the tiles of each network layer group; and, for each network layer group, generates assembly code and tile information for the group according to its tile result and classification result. Tiles of the same type correspond to the same assembly function, and each assembly code includes a code segment of the assembly function for each tile type. The tile information includes block information for each tile in the network layer group and is used to instruct a neural network processor to invoke, according to that block information, the corresponding code segment to process the image data of the corresponding tile when a target image is identified by the image recognition neural network.
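The tile-and-classify step can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that tiles are classified by their shape, so that full interior tiles and smaller edge tiles fall into different types; the abstract does not state the actual classification criterion. The names `tile_layer`, `classify_tiles`, and the `segment` field are hypothetical.

```python
def tile_layer(height, width, tile_h, tile_w):
    """Split a height x width feature map into tiles; edge tiles may be smaller."""
    tiles = []
    for y in range(0, height, tile_h):
        for x in range(0, width, tile_w):
            tiles.append((y, x, min(tile_h, height - y), min(tile_w, width - x)))
    return tiles


def classify_tiles(tiles):
    """Assign one code-segment id per tile type (assumption: type == tile shape)."""
    segment_of_shape = {}   # (h, w) -> code-segment id
    tile_info = []          # per-tile block information plus the segment to invoke
    for (y, x, h, w) in tiles:
        if (h, w) not in segment_of_shape:
            segment_of_shape[(h, w)] = len(segment_of_shape)
        tile_info.append({"y": y, "x": x, "h": h, "w": w,
                          "segment": segment_of_shape[(h, w)]})
    return segment_of_shape, tile_info


segments, info = classify_tiles(tile_layer(5, 5, 2, 2))
```

In this sketch a 5 x 5 layer with 2 x 2 tiles yields nine tiles but only four tile types, so only four code segments would need to be generated.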

Method of using multidimensional blockification to optimize computer program and device thereof

Disclosed embodiments relate to a method and device for optimizing compilation of source code. The proposed method receives a first intermediate representation of the source code and analyses each of the basic block instructions it contains for blockification. To blockify identical instructions, one or more groups of basic block instructions are assessed for eligibility. A group determined to be eligible is blockified using either one-dimensional or two-dimensional SIMD vectorization. The method then generates a second intermediate representation of the source code, which is translated into executable target code that can be processed more efficiently.
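A minimal sketch of the one-dimensional case follows, under stated assumptions: instructions are modeled as hypothetical `(op, dst, src_a, src_b)` tuples, and a group is treated as eligible when its members are adjacent, share an opcode, and neither read nor overwrite a register written earlier in the same group. The abstract does not specify the eligibility test, and the two-dimensional case is not shown.

```python
def blockify(instrs, width=4):
    """Pack runs of identical, mutually independent instructions into vector ops."""
    out, i = [], 0
    while i < len(instrs):
        group = [instrs[i]]
        written = {instrs[i][1]}          # registers written inside the group
        j = i + 1
        while (j < len(instrs) and len(group) < width
               and instrs[j][0] == group[0][0]       # same opcode
               and instrs[j][2] not in written       # no read-after-write
               and instrs[j][3] not in written
               and instrs[j][1] not in written):     # no write-after-write
            group.append(instrs[j])
            written.add(instrs[j][1])
            j += 1
        if len(group) > 1:
            # One vector instruction replaces the whole group.
            out.append(("v" + group[0][0],
                        [g[1] for g in group],
                        [g[2] for g in group],
                        [g[3] for g in group]))
        else:
            out.append(group[0])
        i = j
    return out


first_ir = [("add", "r0", "a0", "b0"),
            ("add", "r1", "a1", "b1"),
            ("add", "r2", "r0", "b2"),   # depends on r0, so it is not packed
            ("mul", "r3", "r2", "r1")]
second_ir = blockify(first_ir)
```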

Dynamically replacing a call to a software library with a call to an accelerator

A computer program includes calls to a software library. A virtual function table is built that includes the calls to the software library in the computer program. A programmable device includes one or more currently-implemented accelerators. The available accelerators that are currently-implemented are determined. The calls in the software library that correspond to a currently-implemented accelerator are determined. One or more calls to the software library in the virtual function table are replaced with one or more corresponding calls to a corresponding currently-implemented accelerator. When a call in the software library could be implemented in a new accelerator, an accelerator image for the new accelerator is dynamically generated. The accelerator image is then deployed to create the new accelerator. One or more calls to the software library in the virtual function table are replaced with one or more corresponding calls to the new accelerator.
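The table-patching idea can be sketched in Python, with a dictionary standing in for the virtual function table. All function names here (`lib_fft`, `lib_sort`, `accel_fft`, `patch_vtable`) are hypothetical, and the dynamic generation and deployment of an accelerator image is not modeled.

```python
def lib_fft(data):
    # Placeholder for a software library routine.
    return ("software", "fft", list(data))


def lib_sort(data):
    return ("software", "sort", sorted(data))


def accel_fft(data):
    # Placeholder for a call into a currently-implemented accelerator.
    return ("accelerator", "fft", list(data))


def build_vtable(library):
    """Collect the library calls the program makes into a dispatch table."""
    return dict(library)


def patch_vtable(vtable, accelerators):
    """Replace each table entry that has a currently-implemented accelerator."""
    replaced = []
    for name, accel_call in accelerators.items():
        if name in vtable:
            vtable[name] = accel_call
            replaced.append(name)
    return replaced


vtable = build_vtable({"fft": lib_fft, "sort": lib_sort})
patch_vtable(vtable, {"fft": accel_fft})
```

Because the program always calls through the table, later deploying a new accelerator for `sort` would only require another `patch_vtable` call, with no change to the calling code.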

Register pressure target function splitting

Provided are embodiments for a method of performing register pressure targeted function splitting. The method can include determining a candidate region of a function, the candidate region comprising variables, and determining a number of available registers in a computing system for allocating the variables of the function. The method can also include grouping the variables in the candidate region into first variables and second variables based at least in part on the number of available registers, and splitting the candidate region of the function into split functions based at least in part on the grouping of the variables. Also provided are embodiments for a computer program product and a system for performing register pressure targeted function splitting.
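As a rough illustration, the sketch below splits a candidate region into consecutive pieces so that each piece uses no more distinct variables than there are available registers. This greedy strategy is an assumption made for the example, not the grouping procedure of the disclosure. The `(text, variables)` statement model and the name `split_region` are hypothetical, and passing values between the resulting split functions is ignored.

```python
def split_region(statements, num_registers):
    """Greedily cut a region so each piece uses at most num_registers variables.

    statements: list of (source_text, set_of_variable_names).
    Returns a list of (statement_texts, variables_used) pieces.
    """
    pieces, current, used = [], [], set()
    for text, variables in statements:
        # Start a new split function when adding this statement would
        # need more registers than are available.
        if current and len(used | variables) > num_registers:
            pieces.append((current, used))
            current, used = [], set()
        current.append(text)
        used = used | variables
    if current:
        pieces.append((current, used))
    return pieces


region = [("a = b + c", {"a", "b", "c"}),
          ("d = a * e", {"d", "a", "e"}),
          ("f = g + h", {"f", "g", "h"}),
          ("i = f - g", {"i", "f", "g"})]
split_functions = split_region(region, num_registers=5)
```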

Method and system for software enhancement and management

A software enhancement and management system (E&M System) can include two ways to decompose existing software such that new functionality can be added: functional decomposition and time-affecting linear pathway (TALP) decomposition. Functional decomposition captures the inputs and outputs of the existing software's functions and attaches the new algorithmic constructs presented as other functions that receive the outputs of the existing software's functions. TALP decomposition allows for the generation of time-prediction polynomials that approximate time-complexity functions, speedup, and automatic dynamic loop-unrolling-based parallelization for each TALP.

SIGNATURE-BASED AUTOMATIC OFFLOAD TO HARDWARE ACCELERATORS
20230195471 · 2023-06-22

Technology to automatically recognize library functions and substitute accelerator calls can include scanning program code to detect one or more of a library signature or an accelerator tag, substituting an accelerator call for a library function tagged with the accelerator tag upon detecting the accelerator tag, and upon detecting the library signature, performing one of substituting the accelerator call for the library function associated with the library signature or applying the accelerator tag to the library function associated with the library signature to indicate the accelerator call is to be substituted for the library function. When the accelerator tag is applied to the library function associated with the library signature, a subsequent scan is to be performed to detect the applied tag and substitute the accelerator call for the library function tagged with the accelerator tag.
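The tag-then-substitute flow can be sketched as a scan over source lines. The signature pattern, the `# @accel` tag format, and the names `lib_matmul` and `accel_matmul` are all hypothetical, chosen only to make the two passes visible. In this sketch a detected signature always leads to tagging, which is one of the two alternatives the abstract allows.

```python
import re

# Hypothetical library signature mapped to its accelerator replacement.
SIGNATURES = {r"\blib_matmul\(": "accel_matmul("}
TAG = "# @accel"


def scan(lines):
    """One scan: substitute calls on tagged lines, tag lines matching a signature."""
    out = []
    for line in lines:
        stripped = line.rstrip()
        if stripped.endswith(TAG):
            # Tag detected: drop the tag and substitute the accelerator call.
            code = stripped[: -len(TAG)].rstrip()
            for pattern, replacement in SIGNATURES.items():
                code = re.sub(pattern, replacement, code)
            out.append(code)
        elif any(re.search(pattern, line) for pattern in SIGNATURES):
            # Signature detected: apply the tag for a subsequent scan.
            out.append(stripped + "  " + TAG)
        else:
            out.append(line)
    return out


program = ["y = lib_matmul(a, b)", "z = y + 1"]
tagged = scan(program)       # first scan applies the tag
offloaded = scan(tagged)     # subsequent scan substitutes the accelerator call
```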

PROCESSING A USER QUERY
20230195727 · 2023-06-22

In an example, there is provided a computer-implemented method which comprises: generating an execution plan for a received user query; converting the execution plan into bytecode; compiling the bytecode to unoptimized machine code and beginning execution of the execution plan by executing the unoptimized machine code; compiling optimized machine code from the bytecode whilst the unoptimized machine code is executing; and switching to the optimized machine code to continue executing the execution plan once the optimized machine code has been compiled.
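The start-slow-then-switch pattern can be sketched with a background thread. Python closures stand in for machine code here, and the toy bytecode of `("add", n)` and `("mul", n)` operations is a hypothetical stand-in for a real query plan. Both tiers compute the same result, so the output does not depend on when the switch happens.

```python
import threading


def compile_unoptimized(bytecode):
    """Cheap to build: walks the bytecode for every row."""
    def run(row):
        acc = row
        for op, arg in bytecode:
            acc = acc + arg if op == "add" else acc * arg
        return acc
    return run


def compile_optimized(bytecode):
    """Costlier to build: folds all operations into one multiply-add."""
    scale, offset = 1, 0
    for op, arg in bytecode:
        if op == "add":
            offset += arg
        else:
            scale *= arg
            offset *= arg
    return lambda row: row * scale + offset


def execute(bytecode, rows):
    # Begin execution immediately with the unoptimized tier.
    state = {"fn": compile_unoptimized(bytecode)}
    worker = threading.Thread(
        target=lambda: state.update(fn=compile_optimized(bytecode)))
    worker.start()
    # Each row uses whichever tier is installed at that moment.
    results = [state["fn"](row) for row in rows]
    worker.join()
    return results


results = execute([("add", 2), ("mul", 3)], [1, 2, 3])
```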

CONTAINERIZED DEPLOYMENT OF MICROSERVICES BASED ON MONOLITHIC LEGACY APPLICATIONS

The present disclosure provides a scalable container-based system implemented in computer instructions stored in a non-transitory medium. The present disclosure further provides a method of creating and operating a scalable container-based system.

COMPILER THAT PERFORMS REGISTER PROMOTION OPTIMIZATIONS IN REGIONS OF CODE WHERE MEMORY ALIASING MAY OCCUR
20170351497 · 2017-12-07

Processor hardware detects when memory aliasing occurs, and assures proper operation of the code even in the presence of memory aliasing. Because the hardware can detect and correct for memory aliasing, this allows a compiler to make optimizations such as register promotion even in regions of the code where memory aliasing can occur. The result is code that is more optimized and therefore runs faster.

Supporting dynamic behavior in statically compiled programs

Support for dynamic behavior is provided during static compilation while reducing reliance on JIT compilation and large runtimes. A mapping is created between metadata and native code runtime artifacts, such as between type definition metadata and a runtime type description; between method definition metadata, a runtime type description, and a native code method location; or between field definition metadata, a runtime type description, and a field location. A mapping between runtime artifacts may also be created. Some compilation results include trampoline code to support a reflection invocation of an artifact in the reduced runtime support environment, for virtual method calls, call-time bounds checking, calling convention conversion, or compiler-intrinsic methods. Some results support runtime diagnostics by including certain metadata even when full dynamic behavior is not supported.