G06F9/30083

SYSTEMS AND METHODS TO SKIP INCONSEQUENTIAL MATRIX OPERATIONS

Disclosed embodiments relate to systems and methods to skip inconsequential matrix operations. In one example, a processor includes decode circuitry to decode an instruction having fields to specify an opcode and locations of first source, second source, and destination matrices, the opcode indicating that the processor is to multiply each element at row M and column K of the first source matrix with a corresponding element at row K and column N of the second source matrix, and accumulate a resulting product with previous contents of a corresponding element at row M and column N of the destination matrix, the processor to skip multiplications that, based on detected values of corresponding multiplicands, would generate inconsequential results; scheduling circuitry to schedule execution of the instruction; and execution circuitry to execute the instructions as per the opcode.

Apparatus and method for lowering power dissipation based on operation between dual processors

The present disclosure discloses a dual-processor electronic apparatus operation method used in a dual-processor electronic apparatus that includes steps outlined below. A first processor is activated in an initialization procedure. A second processor is activated by the first processor to enter an operation mode. The first processor is deactivated in the operation mode, and the second processor executes a predetermined procedure. Whether a predetermined event occurs during the execution of the predetermined procedure is determined by the second processor such that event information is stored when the predetermined event occurs and the first processor is activated. The event information is accessed and processed by the first processor.

METHOD AND APPARATUS FOR IMPLEMENTING POWER MODES IN MICROCONTROLLERS USING POWER PROFILES
20170371661 · 2017-12-28 ·

A method and apparatus for implementing power modes in microcontrollers (MCUs) using power profiles. In one embodiment of the method, a central processing unit (CPU) of the MCU executes a first instruction for calling a subroutine stored in a memory of the MCU, wherein the first instruction comprises a first parameter to be passed to the subroutine. Thereafter the CPU writes a first value to a first special function register (SFR) of the MCU in response to executing the first instruction, wherein the first value is related to the first parameter. The MCU operates in a first power mode in response to the CPU writing the first value to the first SFR. The CPU also executes a second instruction for calling the subroutine, wherein the second instruction comprises a second parameter to be passed to the subroutine. In response the CPU writes a second value to a second SFR of the MCU in response to executing the second instruction, wherein the second value is related to the second parameter. The MCU operates in a second power mode in response to the CPU writing the second value to the second SFR. The MCU consumes more power operating in the first power mode than it does when operating in the second power mode.

SYSTEM AND METHOD FOR ASSOCIATIVE POWER AND CLOCK MANAGEMENT WITH INSTRUCTION GOVERNED OPERATION FOR POWER EFFICIENT COMPUTING
20170351318 · 2017-12-07 ·

A system includes an ARM core processor, a programmable regulator, a compiler, and a control unit, where the compiler uses a performance association outcome to generate a 2-bit regulator control values encoded into each individual instruction. The system can provide associative low power operation where instructions govern the operation of on-chip regulators or clock generator in real time. Based on explicit association between long delay instruction patterns and hardware performance, an instruction based power management scheme with energy models are formulated for deriving the energy efficiency of the associative operation. An integrated voltage regulator or clock generator is dynamically controlled based on instructions existing in the current pipeline stages leading to additional power saving. A compiler optimization strategy can further improve the energy efficiency.

VLIW Power Management

VLIW directed Power Management is described. In accordance with described techniques, a program is compiled to generate instructions for execution by a very long instruction word machine. During the compiling, power configurations for the very long instruction word machine to execute the instructions are determined, and fields of the instructions are populated with the power configurations. In one or more implementations, an instruction that includes a power configuration for the very long instruction word machine and operations for execution by the very long instruction word machine is obtained. A power setting of the very long instruction word machine is adjusted based on the power configuration of the instruction, and the operations of the instruction are executed by the very long instruction word machine.

POWER MANAGEMENT OF BRANCH PREDICTORS IN A COMPUTER PROCESSOR
20170344377 · 2017-11-30 ·

A computer processor includes a branch prediction unit that includes a local branch predictor and a global branch predictor. Managing power consumption in such a computer processor includes, for each of a plurality of branch instructions: performing, by the local branch predictor, a local branch prediction; performing, by each of the global branch predictors, a global branch prediction; determining to utilize the local branch prediction over the global branch predictions as a branch prediction for the branch instruction; incrementing a value of a counter; determining whether the value of the counter exceeds a predetermined threshold; and if the value of the counter exceeds the predetermined threshold, powering down at least one of the global branch predictors and configuring the branch prediction unit to bypass the powered down global branch predictor for branch predictions of subsequent branch instructions.

Power management synchronization messaging system

A multi-die package for a microprocessor provides a power management synchronization system. The package has a plurality of dies. Each die has a plurality of cores, including a single master core. A plurality of sideband non-system-bus inter-die communication wires communicatively couple the dies to each other for a purpose of synchronizing power management. The master core of each die is configured to use one and only one of the inter-die communication wires to transmit power management synchronization messages to each of the other master cores. The master core of each die is also configured to receive power management synchronization messages from each of the other master cores via one or more inter-die communication wires. The cores use this system of inter-die communication wires to synchronize management of resources that affect both the performance and power consumption of the cores.

Apparatus and method to preclude X86 special bus cycle load replays in an out-of-order processor

An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory, where the specified load instruction comprises a load instruction resulting from execution of an x86 special bus cycle. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has retrieved the operand.

DYNAMIC PIPELINE THROTTLING USING CONFIDENCE-BASED WEIGHTING OF IN-FLIGHT BRANCH INSTRUCTIONS

Systems and methods for operating a processor include determining confidence levels, such as high, low, and medium confidence levels, associated with in-flight branch instructions in an instruction pipeline of the processor, based on counters used for predicting directions of the in-flight branch instructions. Numbers of in-flight branch instructions associated with each of confidence levels are determined. A weighted sum of the numbers weighted with weights corresponding to the confidence levels is calculated and the weighted sum is compared with a threshold. A throttling signal may be asserted to indicate that instructions are to be throttled in a pipeline stage of the instruction pipeline based on the comparison.

Mechanism for saving and retrieving micro-architecture context

Disclosed embodiments relate to processing logic for performing function operations. In one example, and apparatus includes an execution unit within a processor to execute a code block, power management hardware coupled to the execution unit, wherein the power management hardware is to monitor a first execution of the code block, store a micro-architectural context of the processor in a metadata block associated with the code block, the micro-architectural context including performance data resulting from the first execution of the code block, the performance data comprising power and energy usage data, and power management related parameters, read the associated metadata block upon a second execution of the code block, and tune the second execution based on the performance data stored in the associated metadata block to increase efficiency of executing the code block.