G06F8/4434

WORKLOAD ORIENTED CONSTANT PROPAGATION FOR COMPILER
20220147331 · 2022-05-12

An embodiment of a semiconductor package apparatus may include technology to identify workload control variables, add workload flags to respective edges in a static single assignment graph, and propagate constants based on the identified workload control variables and the workload flags. Other embodiments are disclosed and claimed.

Non-transitory computer-readable recording medium, compilation method, and compiler device
11734003 · 2023-08-22

The present disclosure relates to a compiler that causes a computer to execute a process. The process includes generating a first program from a second program that contains a loop. The first program includes a first code that determines whether a first area of memory, which a process inside the loop refers to in a first execution of the loop, overlaps a second area of memory that the same process refers to in a second execution of the loop; a second code that executes the process in the order of the first and second executions when the two areas are determined to overlap; and a third code that executes the process for the first execution and for the second execution in parallel when the two areas are determined not to overlap.
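The runtime check described in this abstract can be illustrated with a minimal sketch. It is not the patented compiler output: the names `areas_overlap`, `run_two_iterations`, `body`, and `area_of` are hypothetical, memory areas are modelled as half-open `(start, end)` address ranges, and a thread pool stands in for whatever parallel execution the generated program would use.

```python
from concurrent.futures import ThreadPoolExecutor


def areas_overlap(first, second):
    """Return True when two half-open (start, end) ranges share an address."""
    return first[0] < second[1] and second[0] < first[1]


def run_two_iterations(body, area_of, i, j):
    """Run body(i) and body(j) in order if their areas overlap, else in parallel.

    body    -- hypothetical loop body, called with an iteration index
    area_of -- hypothetical function mapping an index to the (start, end)
               memory range that iteration refers to
    """
    if areas_overlap(area_of(i), area_of(j)):
        # Overlapping areas: keep the original order of the two iterations.
        body(i)
        body(j)
        return "sequential"
    # Disjoint areas: the two iterations may run at the same time.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(body, i), pool.submit(body, j)]
        for future in futures:
            future.result()  # re-raise any exception from the worker
    return "parallel"
```

For example, with `area_of = lambda k: (k, k + 1)` consecutive iterations touch disjoint ranges and take the parallel path, while `area_of = lambda k: (k, k + 2)` makes them overlap and forces the ordered path.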

Method of using multidimensional blockification to optimize computer program and device thereof

Disclosed embodiments relate to a method and device for optimizing compilation of source code. The method receives a first intermediate representation of the source code and analyzes each of the basic block instructions it contains for blockification. To blockify identical instructions, one or more groups of basic block instructions are assessed for eligibility. Each group determined to be eligible is blockified using either one-dimensional or two-dimensional SIMD vectorization. The method then generates a second intermediate representation of the source code, which is translated into executable target code that is processed more efficiently.

DYNAMIC MEMORY ALLOCATION METHODS AND SYSTEMS
20220137841 · 2022-05-05

In a dynamic memory allocator, a method of allocating memory to a process, the method comprising executing on a processor the steps of: creating one or more arenas within the memory, each arena comprising one or more memory blocks and each arena having an n-byte aligned arena address; upon receiving a memory request from the process, returning a pointer to the process, the pointer having as its value an address of a memory block selected from one of the arenas; upon determining that the memory block is no longer needed by the process, retrieving the address of said memory block from the pointer and releasing the memory block; and, upon a new arena being created, shifting forward the n-byte aligned address of said new arena according to a stored variable such that each memory block of said new arena is also shifted by the stored variable, the stored variable having n bytes and the stored variable having a random value.
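The randomly shifted arena described above can be sketched as a small simulation over integer addresses. This is an illustration under stated assumptions, not the claimed allocator: the constants `ARENA_ALIGN`, `BLOCK_SIZE`, and `MAX_SHIFT`, the `Arena` class, and the fixed block size are all hypothetical, and the random shift is drawn with Python's `secrets.randbelow`.

```python
import secrets

ARENA_ALIGN = 1 << 16  # assumed arena alignment in bytes (hypothetical)
BLOCK_SIZE = 64        # assumed fixed block size in bytes (hypothetical)
MAX_SHIFT = 1 << 12    # assumed upper bound on the random shift (hypothetical)


class Arena:
    """Simulated arena whose blocks all start after a random per-arena shift."""

    def __init__(self, aligned_base):
        if aligned_base % ARENA_ALIGN != 0:
            raise ValueError("arena base must be ARENA_ALIGN-aligned")
        self.aligned_base = aligned_base
        # Stored random value: every block in this arena is moved forward by it.
        self.shift = secrets.randbelow(MAX_SHIFT)
        self.next_index = 0
        self.free_blocks = []

    def allocate(self):
        """Return the address of a free block (the 'pointer' given to the process)."""
        if self.free_blocks:
            return self.free_blocks.pop()
        offset = self.shift + self.next_index * BLOCK_SIZE
        if offset + BLOCK_SIZE > ARENA_ALIGN:
            raise MemoryError("arena exhausted")
        self.next_index += 1
        return self.aligned_base + offset

    def release(self, pointer):
        """Take the block address back from the pointer and mark it free."""
        self.free_blocks.append(pointer)
```

Because the shift differs per arena, block addresses are harder to predict than with a purely aligned layout, while blocks within one arena stay a fixed `BLOCK_SIZE` apart.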

Framework for user-directed profile-driven optimizations
11321061 · 2022-05-03

A method for using profiling to obtain application-specific, preferred parameter values for an application is disclosed. First, a parameter for which to obtain an application-specific value is identified. Code is then augmented for application-specific profiling of the parameter. The parameter is profiled and profile data is collected. The profile data is then analyzed to determine the application's preferred parameter value for the profile parameter.
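The identify, augment, collect, and analyze steps in this abstract can be sketched as follows. The abstract does not say how the preferred value is derived, so this sketch assumes the simplest policy, picking the most frequently observed value; the `ParameterProfile` class and its method names are hypothetical.

```python
from collections import Counter


class ParameterProfile:
    """Collects observed values of one parameter during profiling runs."""

    def __init__(self):
        self.samples = Counter()

    def record(self, value):
        """Called from the augmented (instrumented) code each time the parameter is used."""
        self.samples[value] += 1

    def preferred_value(self, default):
        """Analyze the profile: here, return the most frequently observed value.

        Falls back to `default` when no profile data was collected.
        """
        if not self.samples:
            return default
        return self.samples.most_common(1)[0][0]
```

For instance, an application could call `record(size)` wherever a buffer is created and later build with `preferred_value(default=32)` as its initial buffer size.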

SELECTIVE INJECTION OF GC SAFEPOINTS FOR JNI INVOCATIONS
20220129290 · 2022-04-28

A computer-implemented method is provided for managing Garbage Collection (GC) safepoints. The method includes determining whether a GC safepoint for a target native method can be removed by checking a heap occupancy ratio prior to executing the target native method. The method further includes removing the GC safepoint responsive to the heap occupancy ratio prior to executing the target native method being less than a threshold occupancy amount percentage. The method also includes determining whether the GC safepoint for the target native method can be removed by checking a most recent GC pause time. The method additionally includes removing the GC safepoint responsive to the most recent GC pause time being shorter by a threshold pause time amount percentage than an execution time of the target native method.
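The two removal conditions can be written as a single predicate. This is one reading of the abstract, not the claimed method: it assumes either condition alone is enough to remove the safepoint, and it interprets "shorter by a threshold percentage" as the pause being at most `(1 - pause_margin)` times the native method's execution time. The function name, parameter names, and default thresholds are hypothetical.

```python
def can_remove_safepoint(heap_occupancy, last_gc_pause_ms, native_exec_ms,
                         occupancy_threshold=0.5, pause_margin=0.2):
    """Decide whether the GC safepoint around a native call can be dropped.

    heap_occupancy   -- fraction of the heap in use before the call (0.0 to 1.0)
    last_gc_pause_ms -- duration of the most recent GC pause
    native_exec_ms   -- expected execution time of the target native method
    """
    # Condition 1: the heap is lightly occupied, so a GC is unlikely to be needed.
    if heap_occupancy < occupancy_threshold:
        return True
    # Condition 2: the last GC pause was shorter than the native call by at
    # least the margin (assumed interpretation of the threshold percentage).
    return last_gc_pause_ms <= native_exec_ms * (1.0 - pause_margin)
```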

Monitoring stack memory usage to optimize programs

A computer system determines stack usage. An intercept function is executed to store a stack marker in a stack, wherein the intercept function is invoked when a program enters or exits each function of a plurality of functions of the program. A plurality of stack markers are identified in the stack and a memory address is determined for each stack marker during execution of the program to obtain a plurality of memory addresses. The plurality of memory addresses are analyzed to identify a particular memory address associated with a greatest stack depth. A stack usage of the program is determined based on the greatest stack depth. Embodiments of the present invention further include a method and program product for determining stack usage in substantially the same manner described above.
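The marker scan described above can be illustrated with a simulated stack. This sketch assumes a downward-growing stack modelled as a zero-filled `bytearray`, a single-byte marker value, and hypothetical helper names (`record_marker`, `measure_stack_usage`); a real intercept function would write markers into the actual program stack on function entry or exit.

```python
MARKER = 0xA5            # assumed marker byte (hypothetical)
STACK_SIZE = 256         # size of the simulated stack in bytes
STACK_TOP = STACK_SIZE   # the simulated stack grows downward from this address


def record_marker(stack, stack_pointer):
    """Intercept hook: store a marker at the current stack pointer."""
    stack[stack_pointer] = MARKER


def measure_stack_usage(stack):
    """Find all markers and report usage based on the deepest marked address."""
    marked = [addr for addr, byte in enumerate(stack) if byte == MARKER]
    if not marked:
        return 0
    # On a downward-growing stack the lowest address is the greatest depth.
    deepest = min(marked)
    return STACK_TOP - deepest
```

In this simulation the stack starts zero-filled, so ordinary data cannot be mistaken for a marker; a real implementation would need a marker pattern unlikely to occur in program data.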

Framework For User-Directed Profile-Driven Optimizations
20230244458 · 2023-08-03

A method for using profiling to obtain application-specific, preferred parameter values for an application is disclosed. First, a parameter for which to obtain an application-specific value is identified. Code is then augmented for application-specific profiling of the parameter. The parameter is profiled and profile data is collected. The profile data is then analyzed to determine the application's preferred parameter value for the profile parameter.

Merging Skip-Buffers

A method in a reconfigurable computing system includes connecting a plurality of tensor consumers to their corresponding tensor producers via skip-buffers, which generates a plurality of skip-buffers. The method includes determining that at least one skip-buffer corresponding to a first set of tensor consumers and at least one skip-buffer corresponding to a second set of tensor consumers are compatible for whole or partial merging. The method also includes merging, wholly or partially, the compatible skip-buffers to produce a merged skip-buffer having a minimal buffer depth. The described method may reduce memory unit consumption and latency.

Allocating variables to computer memory
11762641 · 2023-09-19

A method of allocating variables to computer memory includes determining, at compile time, when each of a plurality of variables is live in a memory region, and allocating a memory region to each variable, wherein at least some variables are allocated at compile time to overlapping memory regions so that they are stored in those regions at runtime at non-overlapping times.
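Liveness-based sharing of memory regions can be sketched with a simple greedy placement. The abstract does not specify an algorithm, so this is only one possible approach: variables are given as hypothetical `(first_use, last_use, size)` tuples, and each one is placed at the lowest offset that does not collide with any variable live at the same time.

```python
def allocate(variables):
    """Assign a memory offset to each variable from its compile-time live range.

    variables -- dict mapping name to (first_use, last_use, size), where the
                 live range is inclusive; returns a dict mapping name to offset.
    """
    placed = {}
    # Process variables in order of first use (ties keep insertion order).
    for name, (start, end, size) in sorted(variables.items(),
                                           key=lambda item: item[1][0]):
        # Memory ranges held by already-placed variables that are live
        # at some point during this variable's live range.
        busy = sorted(
            (placed[other], placed[other] + variables[other][2])
            for other in placed
            if variables[other][0] <= end and start <= variables[other][1]
        )
        offset = 0
        for low, high in busy:
            if offset + size <= low:
                break  # the gap before this busy range is large enough
            offset = max(offset, high)
        placed[name] = offset
    return placed
```

With `{"a": (0, 3, 8), "b": (4, 7, 8), "c": (2, 5, 4)}`, `a` and `b` are never live together and both land at offset 0, while `c` overlaps both in time and is placed at offset 8.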