G06F9/45516

Dynamically-imposed field and method type restrictions for managed execution environments

A data structure (e.g., field, method parameter, or method return value) is defined by a descriptor to be of a particular type, which imposes a first set of restrictions on values assumable by the data structure. Separately, the data structure is associated with a type restriction that defines a second set of restrictions that further restricts the values assumable by the data structure. The descriptor and type restriction are encoded separately in a program binary. Responsive to identifying a value for the data structure that (a) is not forbidden by the first set of restrictions defined the descriptor and (b) is forbidden by the second set of restrictions defined by the type restriction, a runtime environment may perform a restrictive operation, such as: blocking storage of the value to a field; blocking passing of the value to a method parameter; or blocking return of the value from a method.

REAL-TIME PERFORMANCE TRACKING USING DYNAMIC COMPILATION

Systems, apparatuses, and methods for performing real-time tracking of performance targets using dynamic compilation. A performance target is specified in a service level agreement. A dynamic compiler analyzes a software application executing in real-time and determine which high-level application metrics to track. The dynamic compiler then inserts instructions into the code to increment counters associated with the metrics. A power optimization unit then utilizes the counters to determine if the system is currently meeting the performance target. If the system is exceeding the performance target, then the power optimization unit reduces the power consumption of the system while still meeting the performance target.

ON-DEMAND BINARY TRANSLATION STATE MAP GENERATION
20170371634 · 2017-12-28 · ·

The present disclosure is directed to a system for on-demand binary translation state map generation. Instead of interpreting the native code to be executed, binary translation circuitry (BT circuitry) may execute a binary translation (BT) in place of the native code. When a stop occurs (e.g., due to an interrupt, a modification of the native code, etc.), the BT circuitry may generate a binary translation state map (BT state map) that allows the location of the stop to be mapped back to the native code. Generation of the BT state map may involve determining a location and offset for the stop, performing region formation based on the location, loading instructions from the region (e.g., while accounting for the need to emulate instructions), forming the BT state map based at least on the size of the loaded instructions, and then mapping the stop back to the native code utilizing the offset.

AUTOMATED USE OF COMPUTATIONAL MOTIFS VIA DEEP LEARNING DETECTION
20230205517 · 2023-06-29 ·

A system and method are described for efficiently utilizing optimized implementations of computational patterns in an application. In various implementations, a computing system includes at least one or more processors, and these one or more processors and other hardware resources of the computing system process a variety of applications. Sampled, dynamic values of hardware performance counters are sent to a trained data model. The data model provides characterization of the computational patterns being used and the types of workloads being processed. The data model also indicates whether the identified computational patterns already use an optimized version. Later, a selected processor determines a given unoptimized computational pattern is no longer running and replaces this computational pattern with an optimized version. Although the application is still running, the processor performs a static replacement. On a next iteration of the computational pattern, the optimized version is run.

Rendering optimisation by recompiling shader instructions
11688123 · 2023-06-27 · ·

A rendering optimisation identifies a draw call within a current render (which may be the first draw call in the render or a subsequent draw call in the render) and analyses a last shader in the series of shaders used by the draw call to determine whether the last shader samples from the one or more buffers at coordinates matching a current fragment location. If this determination is positive, the method further recompiles the last shader to replace an instruction that reads data from one of the one or more buffers at coordinates matching a current fragment location with an instruction that reads from the one or more buffers at coordinates stored in on-chip registers.

Implementing optional specialization when executing code

A compiler is capable of compiling instructions that do or do not supply specialization information for a generic type. The generic type is compiled into an unspecialized type. If specialization information was supplied, the unspecialized type is adorned with information indicating type restrictions for application programming interface (API) points associated with the unspecialized type, which becomes a specialized type. A runtime environment is capable of executing calls to a same API point that do or do not indicate a specialized type, and is capable of executing calls to a same API point of objects of an unspecialized type or of objects of a specialized type. When the call to an API point indicates a specialized type, and the specialized type matches that of the object (if the API point belongs to an object), then a runtime environment may perform optimized accesses based on type restrictions derived from the specialized type.

Flexible acceleration of code execution
09836316 · 2017-12-05 · ·

Technologies for performing flexible code acceleration on a computing device includes initializing an accelerator virtual device on the computing device. The computing device allocates memory-mapped input and output (I/O) for the accelerator virtual device and also allocates an accelerator virtual device context for a code to be accelerated. The computing device accesses a bytecode of the code to be accelerated and determines whether the bytecode is an operating system-dependent bytecode. If not, the computing device performs hardware acceleration of the bytecode via the memory-mapped I/O using an internal binary translation module. However, if the bytecode is operating system-dependent, the computing device performs software acceleration of the bytecode.

Methods, systems, and media for binary compatible graphics support in mobile operating systems

Methods, systems, and media for binary compatible graphics support in mobile operating systems are provided. In some embodiments, binary compatible graphics support can be provided by extending diplomatic functions to perform library-wide prelude and postlude operations in the context of the foreign operating system before and after domestic library usage. In some embodiments, binary compatible graphics support can be provided by using thread impersonation approaches that allow one thread to temporarily take on the persona of another thread to perform some action that may be tread-dependent. In some embodiments, binary compatible graphics support can be provided by using dynamic library replication approaches that load multiple, independent instances of a single library within the same process.

Ahead of time compilation of content pages
09830307 · 2017-11-28 · ·

Systems and methods are described which provide improved memory and resource allocation for content pages accessed using a browser. A content page may be accessed and compiled into machine code, such as executable files or an application. The machine code may then be executed on a user device in a process separate from the browser to cause display of the content page, such as in a standalone application. Content pages may be pre-compiled and provided in response to user requests to access the content pages. A content page may be compiled faster than the browser may process the native content page, and the compiled machine code may require less memory than the native content page and associated resources. In some embodiments an intermediary system may be used to perform content page compilation and may provide the compiled machine code, instead of the native content, to the user device.

METHOD OF IMPLEMENTING AN ARM64-BIT FLOATING POINT EMULATOR ON A LINUX SYSTEM
20230176869 · 2023-06-08 ·

The present invention provides a method of implementing an ARM64-bit floating point emulator on a Linux system, which includes: running an ARM64-bit instruction on the Linux system; applying an instruction classifier to a first feature code of a machine code indicated by the ARM64-bit instruction to determine whether the ARM64-bit instruction is an ARM64-bit floating point instruction; and, if the ARM64-bit instruction is an ARM64-bit floating point instruction, applying the instruction classifier to a second feature code of the machine code indicated by the ARM64-bit instruction to determine the ARM64-bit floating instruction to be a specific ARM64-bit floating instruction.