G06F8/4432

Probabilistic framework for compiler optimization with multithread power-gating controls

A probabilistic framework for compiler optimization with multithread power-gating controls includes scheduling all thread fragments of a multithread computer code with their estimated execution times, logging the time stamps of all events, and sorting and unifying the logged time stamps. Time slices are constructed from adjacent time stamps of each thread fragment. For each time slice, a power-gating time during which a component is turned off is determined. Power-gateable windows that reduce the energy consumption of the time slice are selected according to the power-gating time. The compiler inserts predicated power-gating instructions into the power-gateable computer code at locations corresponding to the selected power-gateable windows.
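
The timestamp-unification and window-selection steps described above can be sketched as follows. This is a minimal illustration, not the patented method; the `BREAK_EVEN` threshold and the `(start, end)` fragment representation are assumptions for the example.

```python
BREAK_EVEN = 5  # assumed minimum idle length for power gating to pay off

def power_gateable_windows(fragments):
    """fragments: list of (start, end) execution intervals of thread fragments.
    Returns idle time slices long enough to power-gate a component."""
    # Log all time stamps of events, then sort and unify them.
    stamps = sorted({t for s, e in fragments for t in (s, e)})
    # Construct time slices from adjacent time stamps.
    slices = list(zip(stamps, stamps[1:]))
    windows = []
    for lo, hi in slices:
        # A slice is power-gateable if no fragment executes during it
        # and it is long enough to amortize the gating overhead.
        busy = any(s <= lo and hi <= e for s, e in fragments)
        if not busy and hi - lo >= BREAK_EVEN:
            windows.append((lo, hi))
    return windows
```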

Systems and methods for minimizing communications

A system for allocating one or more data structures used in a program across a number of processing units takes into account the memory access pattern of each data structure and the total amount of memory available for duplication across the several processing units. Using these parameters, duplication factors are determined for the one or more data structures such that the cost of remote communication is minimized when the data structures are duplicated according to the respective duplication factors, while still allowing parallel execution of the program.
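
One simple way to realize this trade-off is a greedy pass that adds copies of whichever structure saves the most remote-communication cost per byte until the memory budget is exhausted. This is an illustrative sketch under assumed inputs, not the system's actual algorithm.

```python
def choose_duplication_factors(structures, mem_budget, n_units):
    """structures: dict name -> (size, remote_cost_saved_per_extra_copy).
    Greedily raise each structure's duplication factor (1..n_units)
    while memory remains, preferring the largest saving per byte."""
    factors = {name: 1 for name in structures}       # one copy to start
    used = sum(size for size, _ in structures.values())
    while True:
        best = None
        for name, (size, saving) in structures.items():
            if factors[name] < n_units and used + size <= mem_budget:
                score = saving / size                # saving per byte
                if best is None or score > best[0]:
                    best = (score, name, size)
        if best is None:                             # budget or factors maxed
            break
        _, name, size = best
        factors[name] += 1
        used += size
    return factors
```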

Fast partial scalarization
11074667 · 2021-07-27

Methods, systems, and devices for fast partial scalarization are described. A device may generate a representation of a set of vectors and a set of vector instructions associated with the set of vectors. The device may determine information associated with a vector in the set of vectors based on the representation, the information including an indication of splitting the vector and splitting one or more vector instructions associated with the vector. In some aspects, the device may associate the vector with one or more other vectors in the set of vectors based on one or more vector instructions related to the set of vectors. The device may update the information based on the associating and generate partially scalarized instructions based on the updating. The device may generate the partially scalarized instructions by excluding a subset of the vector instructions and generating additional subsets of vector instructions and scalar instructions.
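
The core transformation, splitting marked vectors into per-lane scalar instructions while leaving others intact, can be sketched with a toy tuple-based IR. The instruction format and names (`vadd`, `add`, lane suffixes) are assumptions for illustration, not the patented representation.

```python
def partially_scalarize(instrs, split_vecs):
    """instrs: list of ('vadd', dest, a, b, width) vector instructions.
    Vectors named in split_vecs are excluded and replaced by per-lane
    scalar 'add' instructions; all other instructions are kept."""
    out = []
    for op, dest, a, b, width in instrs:
        if dest in split_vecs:
            # Split the vector instruction into one scalar op per lane.
            for lane in range(width):
                out.append(('add', f'{dest}.{lane}',
                            f'{a}.{lane}', f'{b}.{lane}'))
        else:
            out.append((op, dest, a, b, width))
    return out
```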

METHODS AND SYSTEMS FOR PROGRAM OPTIMIZATION UTILIZING INTELLIGENT SPACE EXPLORATION

Embodiments for program optimization are provided. A program is compiled with respect to a performance result utilizing a set of parameters. Information associated with the compiling of the program is collected. The collected information is external to the performance result. The set of parameters is changed based on the collected information.
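The idea of steering the parameter search with collected information beyond the raw performance result can be sketched as a small exploration loop. The `spills` counter and the `unroll` parameter are hypothetical examples of such collected information, not details from the disclosure.

```python
def explore(compile_fn, params, steps=10):
    """compile_fn(params) -> (runtime, info). info holds data collected
    during compilation that is external to the performance result
    (here, an assumed register-spill count). The spill count, not just
    the runtime, steers the next parameter choice."""
    best_params, (best_time, info) = params, compile_fn(params)
    for _ in range(steps):
        cand = dict(best_params)
        if info.get('spills', 0) > 0 and cand['unroll'] > 1:
            cand['unroll'] //= 2      # spills suggest unrolling less
        else:
            cand['unroll'] *= 2       # headroom: try unrolling more
        t, info = compile_fn(cand)
        if t < best_time:
            best_params, best_time = cand, t
    return best_params, best_time
```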

Power Optimization In An Artificial Intelligence Processor
20210081019 · 2021-03-18

In one embodiment, the present disclosure includes a method of reducing power in an artificial intelligence processor. For each cycle, over a plurality of cycles, an AI model is translated into operations executable on an artificial intelligence processor. The translating is based on power parameters that correspond to power consumption and performance of the artificial intelligence processor. The AI processor is configured with the executable operations, and input activation data sets are processed. Accordingly, result sets, power consumption data, and performance data are generated and stored over the plurality of cycles. The method further includes training an AI algorithm using the stored parameters, the power consumption data, and the performance data. A trained AI algorithm outputs a plurality of optimized parameters to reduce power consumption of the AI processor. The AI model is then translated into optimized executable operations based on the plurality of optimized parameters.
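
The translate/run/log loop described above can be sketched as follows. For simplicity this sketch returns the best logged parameters directly; the disclosure instead trains an AI algorithm on the log to output optimized parameters. The parameter names and the latency bound are assumptions.

```python
def tune_power_params(translate, run, candidates, latency_bound=1.0):
    """For each cycle, translate the model under one candidate set of
    power parameters, execute it, and log (params, energy, latency);
    then return the logged parameters with the lowest energy that met
    the latency bound."""
    log = []
    for params in candidates:           # one translation/run per cycle
        ops = translate(params)         # AI model -> executable operations
        energy, latency = run(ops)      # process activations, measure cost
        log.append((params, energy, latency))
    feasible = [r for r in log if r[2] <= latency_bound] or log
    return min(feasible, key=lambda r: r[1])[0]
```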

INFORMATION PROCESSING APPARATUS, RECORDING MEDIUM FOR INFORMATION PROCESSING PROGRAM, AND INFORMATION PROCESSING METHOD
20210081143 · 2021-03-18

An information processing apparatus includes a computation processing device that includes a memory and a processor coupled to the memory, and a storage device that stores a program. The processor is configured to: store, in the memory, a first storage area for first data that is assigned to a computation target by a data definition for the computation target written in the program, and a second storage area for second data that is assigned to the computation target instead of the first data; simplify the program; when the data definition for the computation target is omitted by executing the simplified program, output the second data; and perform the computation by using the output second data.

Managing contact status updates in a presence management system

A system configured to receive, via a network communication interface, an indication of a power event occurring at a first device associated with an online identity. The power event causes the first device to switch from an external power source to an internal battery. The first device represents that the online identity is online while it receives power from the internal battery. The system is further configured to hold, at a second device, at least one status update for an online contact of the online identity while the first device receives power from the internal battery, and to release, for transmission to the first device, the at least one status update in response to determining that the first device has switched back to the external power source.
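
The hold-and-release behavior can be sketched as a small state machine: updates queue while the watched device is on battery and flush when it returns to external power. Class and field names are illustrative, not from the disclosure.

```python
class PresenceRelay:
    """Queues contact status updates while the watched device runs on
    battery and releases them when external power is restored."""
    def __init__(self):
        self.on_battery = False
        self.held = []        # updates held at the second device
        self.delivered = []   # updates released to the first device

    def power_event(self, on_battery):
        self.on_battery = on_battery
        if not on_battery:                 # back on external power
            self.delivered.extend(self.held)
            self.held.clear()

    def status_update(self, update):
        (self.held if self.on_battery else self.delivered).append(update)
```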

Power optimization in an artificial intelligence processor
10884485 · 2021-01-05

In one embodiment, the present disclosure includes a method of reducing power in an artificial intelligence processor. For each cycle, over a plurality of cycles, an AI model is translated into operations executable on an artificial intelligence processor. The translating is based on power parameters that correspond to power consumption and performance of the artificial intelligence processor. The AI processor is configured with the executable operations, and input activation data sets are processed. Accordingly, result sets, power consumption data, and performance data are generated and stored over the plurality of cycles. The method further includes training an AI algorithm using the stored parameters, the power consumption data, and the performance data. A trained AI algorithm outputs a plurality of optimized parameters to reduce power consumption of the AI processor. The AI model is then translated into optimized executable operations based on the plurality of optimized parameters.

SYSTEMS AND METHODS FOR ENERGY PROPORTIONAL SCHEDULING

A compilation system using an energy model based on a set of generic and practical hardware and software parameters is presented. The model can represent the major trends in energy consumption spanning potential hardware configurations using only parameters available at compilation time. Experimental verification indicates that the model is nimble yet sufficiently precise, allowing efficient selection of one or more parameters of a target computing system so as to minimize power/energy consumption of a program while achieving other performance related goals. A voltage and/or frequency optimization and selection is presented which can determine an efficient dynamic hardware configuration schedule at compilation time. In various embodiments, the configuration schedule is chosen based on its predicted effect on energy consumption. A concurrency throttling technique based on the energy model can exploit the power-gating features exposed by the target computing system to increase the energy efficiency of programs.

Game rendering method, terminal device, and non-transitory computer-readable storage medium

The present disclosure provides a game rendering method and a terminal device. The terminal device includes a JS layer, a bridge layer, and a system framework layer. The method includes the following. The JS layer transmits drawing instructions cached in an instruction set to the bridge layer when the number of drawing instructions cached in the instruction set is greater than or equal to a first threshold. The bridge layer obtains a rendering result by processing the drawing instructions using an OpenGL capability, and transmits the rendering result to the system framework layer. The system framework layer performs rendering based on the rendering result.
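
The threshold-based batching at the JS layer can be sketched as follows: instructions accumulate in a cache and are only handed across the layer boundary once the threshold is reached, reducing cross-layer calls. The class and method names are hypothetical; the real layers are JavaScript and native code, not Python.

```python
class JSLayer:
    """Caches drawing instructions and flushes them to the bridge layer
    only when the cache reaches the first threshold."""
    def __init__(self, bridge, threshold):
        self.bridge = bridge          # bridge layer: renders via OpenGL
        self.threshold = threshold    # the "first threshold"
        self.cache = []               # the instruction set

    def draw(self, instruction):
        self.cache.append(instruction)
        if len(self.cache) >= self.threshold:
            self.bridge.process(self.cache)   # batch crosses the boundary
            self.cache = []
```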