G06F8/4451

METHOD AND SYSTEM FOR OPTIMIZING DATA TRANSFER FROM ONE MEMORY TO ANOTHER MEMORY
20230120354 · 2023-04-20

A method and system for moving data from a source memory to a destination memory by a processor are disclosed. The processor has a plurality of registers and the source memory stores a sequence of instructions that include one or more load instructions and one or more store instructions. The processor moves the load instructions from the source memory to the destination memory. Then, the processor initiates execution of the load instructions from the destination memory in order to load the data from the source memory to one or more registers in the processor. Execution then returns to the sequence of instructions stored in the source memory, and the processor stores the data from the registers to the destination memory.

Thread-safe development in a multi-threaded system

A method for thread-safe development of a computer program configured for parallel thread execution comprises maintaining a digital record of read or write access to a data object from each of a plurality of sibling threads executing on a computer system. Pursuant to each instance of read or write access from a given sibling thread, an entry comprising an indicator of the access type is added to the digital record. The method further comprises assessing the thread safety of the read or write access corresponding to each entry in the digital record and identifying one or more thread-unsafe instances of read or write access based on the assessment of thread safety.
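The access record described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how thread safety is assessed, so the rule used here (two accesses from different threads conflict when at least one is a write and at least one is not lock-guarded) is an assumption, as are the names `AccessEntry`, `record_access`, `find_unsafe`, and the `lock_held` field.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AccessEntry:
    thread_id: int     # sibling thread that performed the access
    access_type: str   # indicator of the access type: "read" or "write"
    lock_held: bool    # assumption: whether the access was guarded by a lock


def record_access(record: List[AccessEntry], thread_id: int,
                  access_type: str, lock_held: bool = False) -> None:
    """Add one entry to the digital record for a read or write access."""
    record.append(AccessEntry(thread_id, access_type, lock_held))


def conflicts(a: AccessEntry, b: AccessEntry) -> bool:
    """Assumed safety rule: different threads, at least one write,
    and at least one of the two accesses not guarded by a lock."""
    return (a.thread_id != b.thread_id
            and "write" in (a.access_type, b.access_type)
            and not (a.lock_held and b.lock_held))


def find_unsafe(record: List[AccessEntry]) -> List[AccessEntry]:
    """Return every entry that conflicts with at least one other entry."""
    return [a for i, a in enumerate(record)
            if any(conflicts(a, b) for j, b in enumerate(record) if i != j)]


record: List[AccessEntry] = []
record_access(record, 1, "read")
record_access(record, 2, "read")
record_access(record, 2, "write")
for entry in find_unsafe(record):
    print(entry)
```

Under the assumed rule, the read from thread 1 and the write from thread 2 are flagged, while the read from thread 2 is not, because it only coincides with another read and with a write from its own thread.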

Method and apparatus for minimally intrusive instruction pointer-aware processing resource activity profiling

Systems and methods for minimally intrusive instruction pointer-aware processing resource activity profiling are disclosed. In one embodiment, a graphics processor includes a grouping of processing resources and control logic that is associated with the grouping of processing resources. The control logic is configured to sample a state of at least one processing resource of the grouping of processing resources and to determine activity data from the state with the activity data including at least one of stalls and reason counts for stalling activity, instruction types, pipeline utilization, thread utilization, and shader activity.
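As a rough software analogue of the sampling described here, the sketch below aggregates sampled states into per-instruction-pointer activity counts. The abstract describes hardware control logic, not this code; the sample format (an instruction pointer paired with a stall reason, or `None` when the resource is active) and the function name `aggregate_samples` are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Assumed sample format: (instruction pointer, stall reason or None if active).
Sample = Tuple[int, Optional[str]]


def aggregate_samples(samples: Iterable[Sample]) -> Dict[int, Counter]:
    """Build per-instruction-pointer counts of stall reasons and active samples."""
    profile: Dict[int, Counter] = defaultdict(Counter)
    for ip, reason in samples:
        profile[ip]["active" if reason is None else reason] += 1
    return profile


samples = [
    (0x40, None),
    (0x40, "memory"),
    (0x40, "memory"),
    (0x48, "sync"),
]
for ip, counts in aggregate_samples(samples).items():
    print(hex(ip), dict(counts))
```

Keying the counts by instruction pointer is what lets stall activity be attributed to a specific instruction rather than to the processing resource as a whole.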

METHOD AND APPARATUS FOR MINIMALLY INTRUSIVE INSTRUCTION POINTER-AWARE PROCESSING RESOURCE ACTIVITY PROFILING

Systems and methods for minimally intrusive instruction pointer-aware processing resource activity profiling are disclosed. In one embodiment, a graphics processor includes a grouping of processing resources and control logic that is associated with the grouping of processing resources. The control logic is configured to sample a state of at least one processing resource of the grouping of processing resources and to determine activity data from the state with the activity data including at least one of stalls and reason counts for stalling activity, instruction types, pipeline utilization, thread utilization, and shader activity.

Information processing apparatus, non-transitory computer-readable medium, and information processing method
11163570 · 2021-11-02

An information processing apparatus includes a memory and a processor. The processor is configured to: acquire an instruction sequence including plural instructions; generate plural candidate instruction sequences that produce the same execution result as the instruction sequence, by replacing at least some of the nop instructions included in the instruction sequence with a wait instruction that waits for completion of all preceding instructions; delete any one of the nop instructions and the wait instruction from each candidate when the execution result does not change as a result of the deletion; and select, from among the candidates subjected to the deletion, one candidate that includes no more than a certain number of instructions and has the smallest number of execution cycles.
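The candidate generation, deletion, and selection steps can be illustrated with a toy model. Everything about the machine below is an assumption, not taken from the patent: one instruction issues per cycle, a `nop` costs one cycle, a `wait` stalls until all preceding instructions have completed, and "the execution result does not change" is modelled as "no instruction issues before its source registers are ready". The names `simulate`, `prune`, and `optimize` are hypothetical.

```python
from itertools import combinations

NOP, WAIT = "nop", "wait"
# A real instruction is a tuple: (destination register, source registers, latency).


def simulate(seq):
    """Return (hazard_free, total_cycles) for a sequence under the toy model."""
    ready = {}       # register -> cycle at which its value becomes available
    cycle = 0        # cycle at which the next instruction issues
    last_done = 0    # completion cycle of the slowest instruction so far
    ok = True
    for ins in seq:
        if ins == NOP:
            cycle += 1
        elif ins == WAIT:
            cycle = max(cycle + 1, last_done)
        else:
            dst, srcs, latency = ins
            if any(ready.get(s, 0) > cycle for s in srcs):
                ok = False  # a source is read before its producer completes
            ready[dst] = cycle + latency
            last_done = max(last_done, cycle + latency)
            cycle += 1
    return ok, max(cycle, last_done)


def prune(seq):
    """Delete nop/wait instructions one at a time while the sequence stays hazard-free."""
    seq = list(seq)
    changed = True
    while changed:
        changed = False
        for i, ins in enumerate(seq):
            if ins in (NOP, WAIT):
                trial = seq[:i] + seq[i + 1:]
                if simulate(trial)[0]:
                    seq, changed = trial, True
                    break
    return seq


def optimize(seq, max_len):
    """Try every subset of nops replaced by wait, prune, and keep the fastest fit."""
    nop_positions = [i for i, ins in enumerate(seq) if ins == NOP]
    best = None
    for k in range(len(nop_positions) + 1):
        for chosen in combinations(nop_positions, k):
            cand = [WAIT if i in chosen else ins for i, ins in enumerate(seq)]
            if not simulate(cand)[0]:
                continue
            cand = prune(cand)
            cycles = simulate(cand)[1]
            if len(cand) <= max_len and (best is None or cycles < best[1]):
                best = (cand, cycles)
    return best


load = ("r1", (), 4)        # r1 becomes available 4 cycles after issue
add = ("r2", ("r1",), 1)    # reads r1
print(optimize([load, NOP, NOP, NOP, add], max_len=4))
```

In this example the original five-instruction sequence needs all three nops, so it exceeds the limit of four instructions. Replacing one nop with a wait makes the remaining nops redundant, leaving a three-instruction sequence with the same cycle count under the toy model. The exhaustive subset search is exponential in the number of nops and is used here only for clarity.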

BACKGROUND PROCESSING DURING REMOTE MEMORY ACCESS
20220214827 · 2022-07-07

An apparatus for executing a software program, comprising at least one hardware processor configured for: identifying in a plurality of computer instructions at least one remote memory access instruction and a following instruction following the at least one remote memory access instruction; executing after the at least one remote memory access instruction a sequence of other instructions, where the sequence of other instructions comprises a return instruction to execute the following instruction; and executing the following instruction; wherein executing the sequence of other instructions comprises executing an updated plurality of computer instructions produced by at least one of: inserting into the plurality of computer instructions the sequence of other instructions or at least one flow-control instruction to execute the sequence of other instructions; and replacing the at least one remote memory access instruction with at least one non-blocking memory access instruction.
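A loose software analogy for this idea can be written with Python's `asyncio`: the blocking remote access is replaced by a non-blocking one, other work runs while the access is in flight, and control then returns to the instruction that follows the access. This is not the patented instruction-rewriting mechanism; `remote_read`, its simulated latency, and the "background" work are all hypothetical stand-ins.

```python
import asyncio


async def remote_read(address: int) -> int:
    """Stand-in for a remote memory access; the sleep models its latency."""
    await asyncio.sleep(0.01)
    return address * 2


async def run() -> list:
    log = []
    # Non-blocking replacement for the remote access: start it, do not wait yet.
    pending = asyncio.create_task(remote_read(21))
    # The "sequence of other instructions" executed while the access is in flight.
    for i in range(3):
        log.append(f"background {i}")
        await asyncio.sleep(0)  # yield so the pending access can make progress
    # The "return": resume at the instruction that follows the remote access.
    value = await pending
    log.append(f"following instruction uses {value}")
    return log


for line in asyncio.run(run()):
    print(line)
```

The point of the sketch is the ordering: the background steps are logged before the following instruction consumes the remotely read value, so the access latency is overlapped with useful work instead of being spent idle.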

THREAD-SAFE DEVELOPMENT IN A MULTI-THREADED SYSTEM

A method for thread-safe development of a computer program configured for parallel thread execution comprises maintaining a digital record of read or write access to a data object from each of a plurality of sibling threads executing on a computer system. Pursuant to each instance of read or write access from a given sibling thread, an entry comprising an indicator of the access type is added to the digital record. The method further comprises assessing the thread safety of the read or write access corresponding to each entry in the digital record and identifying one or more thread-unsafe instances of read or write access based on the assessment of thread safety.

Background processing during remote memory access
11144238 · 2021-10-12

An apparatus for executing a software program, comprising at least one hardware processor configured for: identifying in a plurality of computer instructions at least one remote memory access instruction and a following instruction following the at least one remote memory access instruction; executing after the at least one remote memory access instruction a sequence of other instructions, where the sequence of other instructions comprises a return instruction to execute the following instruction; and executing the following instruction; wherein executing the sequence of other instructions comprises executing an updated plurality of computer instructions produced by at least one of: inserting into the plurality of computer instructions the sequence of other instructions or at least one flow-control instruction to execute the sequence of other instructions; and replacing the at least one remote memory access instruction with at least one non-blocking memory access instruction.

INFORMATION PROCESSING APPARATUS, NON-TRANSITORY COMPUTER-READABLE MEDIUM, AND INFORMATION PROCESSING METHOD
20200249945 · 2020-08-06

An information processing apparatus includes a memory and a processor. The processor is configured to: acquire an instruction sequence including plural instructions; generate plural candidate instruction sequences that produce the same execution result as the instruction sequence, by replacing at least some of the nop instructions included in the instruction sequence with a wait instruction that waits for completion of all preceding instructions; delete any one of the nop instructions and the wait instruction from each candidate when the execution result does not change as a result of the deletion; and select, from among the candidates subjected to the deletion, one candidate that includes no more than a certain number of instructions and has the smallest number of execution cycles.