Patent classifications
G06F8/441
TECHNOLOGIES FOR UNTRUSTED CODE EXECUTION WITH PROCESSOR SANDBOX SUPPORT
Technologies for untrusted code execution include a computing device having a processor with sandbox support. The computing device executes code included in a native domain in a non-privileged, native processor mode. The computing device may invoke a sandbox jump processor instruction during execution of the code in the native domain to enter a sandbox domain. The computing device executes code in the sandbox domain in a non-privileged, sandbox processor mode in response to invoking the sandbox jump instruction. While executing in the sandbox processor mode, the processor denies access to memory outside of the sandbox domain and may deny execution of one or more prohibited instructions. From the sandbox domain, the computing device may execute a sandbox exit instruction to exit the sandbox domain and resume execution in the native domain. The computing device may execute processor instructions to configure the sandbox domain. Other embodiments are described and claimed.
Concept for Handling Memory Spills
Examples provide an apparatus, device, method, computer program and non-transitory machine-readable storage medium including program code for processing memory spill code during compilation of a computer program. The non-transitory machine-readable storage medium includes program code for processing memory spill code during compilation of a computer program, when executed, to cause a machine to perform identifying a plurality of instructions related to scalar memory spill code during compilation of a computer program, and transforming at least a subset of the plurality of instructions into vectorized code.
Handling Interrupts from a Virtual Function in a System with a Multi-Die Reconfigurable Processor
A system is presented that includes a communication link, a runtime processor, and a reconfigurable processor. The reconfigurable processor is adapted for generating an interrupt to the runtime processor in response to a predetermined event and includes first and second dies arranged in a package, having respective first and second arrays of coarse-grained reconfigurable (CGR) units, and respective first and second communication link interfaces coupled to the communication link. The runtime processor is adapted for configuring the first and second communication link interfaces to provide access to the first and second arrays of coarse-grained reconfigurable units from first and second physical function drivers and from at least one virtual function driver, and the reconfigurable processor is adapted for sending the interrupt to the first or to the second physical function driver and for sending the interrupt to a virtual function driver of the at least one virtual function driver.
METHODS AND APPARATUS FOR MACHINE LEARNING-GUIDED COMPILER OPTIMIZATIONS FOR REGISTER-BASED HARDWARE ARCHITECTURES
Methods, apparatus, systems, and articles of manufacture are disclosed that perform machine learning-guided compiler optimizations for register-based hardware architectures. Examples disclosed herein include a non-transitory computer readable medium comprising instructions that, when executed, cause a machine to at least select a register-based compiler transformation to apply to source code at a current position in a search tree, determine whether the search tree is in need of pruning based on an output of a query to a machine learning (ML) model, in response to determining the search tree is in need of pruning, prune the search tree at the current position, in response to applying the selected register-based compiler transformation to the source code, generate a code variant, calculate a score associated with the source code at the current position in the search tree, and update parameters of the machine learning (ML) model to include the calculated score.
Configurable Access to a Multi-Die Reconfigurable Processor by a Virtual Function
A data processing system is presented that includes a communication link, a runtime processor, and one or more reconfigurable processors. A reconfigurable processor includes first and second dies arranged in a package, having respective K and L arrays of coarse-grained reconfigurable (CGR) units, and respective first and second communication link interfaces coupled to the communication link. The runtime processor is adapted for configuring the first communication link interface to provide access to the K arrays of CGR units through the communication link from a first physical function driver and from up to M virtual function drivers, and for configuring the second communication link interface to provide access to the K arrays of CGR units of the first die and to the L arrays of CGR units of the second die through the communication link from a second physical function driver and from up to N virtual function drivers.
Allocating variables to computer memory
A method of allocating variables to computer memory includes determining at compile time when each of the plurality of variables is live in a memory region and allocating a memory region to each variable wherein at least some variables are allocated at compile time to overlapping memory regions to be stored in those memory regions at runtime at non-overlapping times.
GPR optimization in a GPU based on a GPR release mechanism
This disclosure provides systems, devices, apparatus and methods, including computer programs encoded on storage media, for GPR optimization in a GPU based on a GPR release mechanism. More specifically, a GPU may determine at least one unutilized branch within an executable shader based on constants defined for the executable shader. Based on the at least one unutilized branch, the GPU may further determine a number of GPRs that can be deallocated from previously allocated GPRs. The GPU may deallocate, for a subsequent thread within a draw call, the number of GPRs from the previously allocated GPRs during execution of the executable shader based on the determined number of GPRs to be deallocated.
Obsoleting values stored in registers in a processor based on processing obsolescent register-encoded instructions
Obsoleting values stored in registers in a processor based on processing obsolescent register-encoded instructions is disclosed. The processor is configured to support execution of read and/or write instructions that include obsolescence encoding indicating that one or more of its source and/or target register operands are to be obsoleted by the processor. A register encoded as obsolescent means the data value stored in such register will not be used by subsequent instructions in an instruction stream, and thus does not need to be retained. Thus, such register can be set as being in an obsolescent state so that the data value stored in such register can be ignored to improve performance. As one example, data values for registers having an obsolescent state can be ignored and thus not stored in a saved context for a process being switched out, thus conserving memory and improving processing time for a process switch.
Method and apparatus for reusable and relative indexed register resource allocation in function calls
The disclosed systems, apparatuses and methods are directed to optimizing by a compiler register resource allocation for functions of a module, using a Register File comprising a limited number of registers. After performing interprocedural analysis in the module, the compiler computes the number of registers used by each function, and compiles the function to final machine code, except at callsites where a call is detected to be made to another function. At each callsite and for each called function, the compiler expands call instructions to final machine code after computing and setting a relative index to be used by a called function for running in an available part of the Register File. The relative index optimizes register resource allocation by minimizing the number of spilled registers before a function is called.
NON-TRANSITORY COMPUTER-READABLE MEDIUM, ASSEMBLY INSTRUCTION CONVERSION METHOD AND INFORMATION PROCESSING APPARATUS
A non-transitory computer-readable medium having stored therein a program for causing a computer to execute a process. The process includes storing a plurality of generation instructions in a storage area for each of a plurality of first assembly instructions, each generation instruction instructing the generation of a machine language of a second assembly instruction that executes processing equivalent to each first assembly instruction, and generating machine languages of a plurality of second assembly instructions so that the machine languages of the second assembly instructions having a dependency relationship do not appear adjacent to each other, according to the plurality of generation instructions in the storage area.