G06F8/41

Device profiling in GPU accelerators by using host-device coordination

A system and method for compiling a program that mixes host code and device code, enabling Profile Guided Optimization (PGO) for device code execution. An exemplary integrated compiler can compile source code programmed to be executed concurrently by a host processor (e.g., a CPU) and a co-processor (e.g., a GPU). The compilation can generate instrumented executable code that includes: profile instrumentation counters for the device functions; and instructions for the host processor to allocate and initialize device memory for the counters and to retrieve the collected profile information from the device memory to generate instrumentation output. The output is fed back to the compiler, which compiles the source code a second time to generate optimized executable code for the device functions defined in the source code.
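The host-device coordination described in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions: `DeviceMemory`, `instrumented_kernel`, and `run_profiled` are hypothetical names, the "device" is an ordinary Python list rather than GPU memory, and the three-counter layout is invented for illustration.

```python
class DeviceMemory:
    """Hypothetical stand-in for device memory (a real system would use the
    accelerator's allocation and copy APIs)."""

    def __init__(self):
        self.buffers = {}

    def alloc_zeroed(self, name, size):
        # Host side: allocate and zero-initialize the counter buffer.
        self.buffers[name] = [0] * size
        return self.buffers[name]


def instrumented_kernel(counters, data):
    """Simulated device function with profile counters inserted.

    Assumed layout: counters[0] = kernel entries,
    counters[1] = branch taken, counters[2] = branch not taken.
    """
    counters[0] += 1
    for x in data:
        if x > 0:
            counters[1] += 1
        else:
            counters[2] += 1


def run_profiled(batches):
    dev = DeviceMemory()
    counters = dev.alloc_zeroed("profile_counters", 3)  # host allocates
    for batch in batches:
        instrumented_kernel(counters, batch)            # device executes
    return list(counters)                               # host retrieves a copy


profile = run_profiled([[1, -2, 3], [4, 5]])
print(profile)  # [2, 4, 1]
```

In a second compilation pass, a compiler could read such counts to decide, for example, which branch of the device function to lay out as the likely path.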

Big data application lifecycle management

Aspects of the present disclosure involve systems, methods, devices, and the like for creating an application lifecycle management platform for big data applications. In one embodiment the lifecycle management platform can include a multiple-layer container file that integrates multiple big-data tools/platforms. The system may create a generic template application, create a build environment for the generic template application, create a test environment for the generic template application, and run the built generic template application in the test environment prior to the user writing any new code in the generic template application. In one embodiment, the test environment includes a container management system or virtual machine that launches the big data application (which may be the generic template application before a developer edits the file) on a separate big-data server cluster.

Systems and methods for automatic data management for an asynchronous task-based runtime

A compilation system can define, at compile time, the data blocks to be managed by an Event Driven Task (EDT) based runtime/platform, and can also guide the runtime/platform on when to create and/or destroy the data blocks, so as to improve the performance of the runtime/platform. The compilation system can also guide, at compile time, how different tasks access the data blocks they need in a manner that can improve the performance of the tasks.

Information processing apparatus, computer-readable recording medium storing compiling program, and compiling method
11579853 · 2023-02-14

An information processing apparatus includes a processor configured to: for each of a plurality of loops, acquire loop information including the number of variables used in the loop, the number of registers, the number of memory commands that transfer variable values between the registers and a main storage device, and the number of arithmetic commands applied to variable values stored in the registers; for each combination of loops that are candidates for loop fusion, calculate the number of variables, the number of registers, the number of memory commands, and the number of arithmetic commands corresponding to that combination; determine, from among the combinations for which these values were calculated, a combination to which the loop fusion is to be applied; and execute the loop fusion on the determined combination.
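A minimal sketch of the per-combination calculation follows. The abstract does not state how the combined counts are computed or which criterion selects the winner, so everything here is an assumption made for illustration: `LoopInfo`, `fused_cost`, and `choose_fusion` are hypothetical names, each variable is assumed to need one register and one memory command per loop, only pairs of loops are considered, and the chosen pair is the one that saves the most memory commands while staying within a register budget.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class LoopInfo:
    name: str
    variables: frozenset  # names of variables used in the loop body
    arithmetic: int       # arithmetic commands in the loop body


def fused_cost(loops):
    """Estimated counts for the loop obtained by fusing `loops`."""
    merged = frozenset().union(*(loop.variables for loop in loops))
    return {
        "variables": len(merged),
        "registers": len(merged),  # assumption: one register per variable
        "memory": len(merged),     # assumption: one memory command per variable
        "arithmetic": sum(loop.arithmetic for loop in loops),
    }


def choose_fusion(loops, register_budget):
    """Return the pair of loop names with the largest memory-command saving
    that fits in the register budget, or None if no pair qualifies."""
    best, best_saving = None, 0
    for a, b in combinations(loops, 2):
        cost = fused_cost([a, b])
        if cost["registers"] > register_budget:
            continue  # fusing would exceed the available registers
        saving = len(a.variables) + len(b.variables) - cost["memory"]
        if saving > best_saving:
            best, best_saving = (a.name, b.name), saving
    return best


loops = [
    LoopInfo("L1", frozenset({"a", "b", "c"}), 4),
    LoopInfo("L2", frozenset({"b", "c", "d"}), 3),
    LoopInfo("L3", frozenset({"e", "f"}), 2),
]
print(choose_fusion(loops, register_budget=4))  # ('L1', 'L2')
```

Loops L1 and L2 share two variables, so fusing them removes two memory commands and needs only four registers; every pair involving L3 would need five.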

Method and system for converting a single-threaded software program into an application-specific supercomputer

The invention comprises (i) a compilation method for automatically converting a single-threaded software program into an application-specific supercomputer, and (ii) the supercomputer system structure generated as a result of applying this method. The compilation method comprises: (a) Converting an arbitrary code fragment from the application into customized hardware whose execution is functionally equivalent to the software execution of the code fragment; and (b) Generating interfaces on the hardware and software parts of the application, which (i) Perform a software-to-hardware program state transfer at the entries of the code fragment; (ii) Perform a hardware-to-software program state transfer at the exits of the code fragment; and (iii) Maintain memory coherence between the software and hardware memories. If the resulting hardware design is large, it is divided into partitions such that each partition can fit into a single chip. Then, a single union chip is created which can realize any of the partitions.

Computing system and method for automated program error repair

This application relates to a computing system and method for automated program error repair. In one aspect, the computing system includes a storage, a preprocessing processor, and an automated error repair processor. The storage stores program code. The preprocessing processor acquires the program code from the storage and preprocesses it. Preprocessing includes tokenizing the program code, converting the tokens into vectors, and adding location information for the tokens. The automated error repair processor receives the preprocessed program code as input from the preprocessing processor, detects an error in the preprocessed program code, corrects the detected error, and outputs the error-corrected program code. Detecting and correcting the error are performed based on a deep learning result and the location information for the tokens.
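The preprocessing step (tokenize, convert to vectors, attach location information) might look roughly like the sketch below. It is an illustration only: the regular expression, the `preprocess` function, and the use of integer vocabulary ids as a stand-in for learned vectors are all assumptions, and the deep-learning detection and repair stages are not modeled.

```python
import re

# Assumed tokenizer: identifiers, integer literals, or any single
# non-whitespace character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def preprocess(source):
    """Return (records, vocab), where each record is
    (token, token_id, (line, column)).

    token_id is a vocabulary index standing in for the vector a real
    system would look up in an embedding table.
    """
    vocab = {}
    records = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in TOKEN_RE.finditer(line):
            token = match.group()
            token_id = vocab.setdefault(token, len(vocab))
            records.append((token, token_id, (line_no, match.start())))
    return records, vocab


records, vocab = preprocess("x = 1\ny = x")
for record in records:
    print(record)
```

Keeping the line and column with each token lets a later stage report or patch an error at its original position in the program code.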

Systems and methods for controlling access to secure debugging and profiling features of a computer system
11580264 · 2023-02-14

The present disclosure describes systems and methods for controlling access to secure debugging and profiling features of a computer system. Some illustrative embodiments include a system that includes a processor, and a memory coupled to the processor (the memory used to store information and an attribute associated with the stored information). At least one bit of the attribute determines a security level, selected from a plurality of security levels, of the stored information associated with the attribute. Asserting at least one other bit of the attribute enables exportation of the stored information from the computer system if the security level of the stored information is higher than at least one other security level of the plurality of security levels.
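One way to picture the attribute is as a small bit field. The layout below is purely an assumption for illustration (two low bits for the security level, one bit for export enable); the abstract does not specify bit positions, the number of levels, or which level serves as the comparison point, and `security_level` and `may_export` are hypothetical names.

```python
LEVEL_MASK = 0b011  # assumed: bits 0-1 encode the security level (0..3)
EXPORT_BIT = 0b100  # assumed: bit 2 enables export when asserted


def security_level(attr):
    """Extract the security level from the attribute."""
    return attr & LEVEL_MASK


def may_export(attr, floor_level=0):
    """Export is permitted only if the export bit is asserted and the
    stored information's level is higher than `floor_level`."""
    return bool(attr & EXPORT_BIT) and security_level(attr) > floor_level


print(may_export(0b110))  # True: level 2, export bit asserted
print(may_export(0b010))  # False: export bit not asserted
print(may_export(0b100))  # False: level 0 is not above the floor
```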

Autonomously re-initializing applications based on detecting periodic changes in device state

Arrangements for autonomously re-initializing one or more applications after a detected change in device state are provided. In some examples, a configuration file may be received from one or more computing devices, such as a server, hosting one or more client-facing applications. In some examples, the configuration file may be modified. For instance, one or more properties or attributes may be modified or added to identify applications that have an always-running status and to identify a custom class having automatic start enabled. A modified configuration file may be generated and transmitted to the one or more devices. Accordingly, upon detecting a change of device state (e.g., reboot, refresh, or the like), the modified configuration file may reboot and cause the identified applications to automatically or autonomously re-load, re-initialize, and recompile prior to receiving a first request for access from a customer or user device.