Patent classifications
G06F9/30054
SCALABLE TOGGLE POINT CONTROL CIRCUITRY FOR A CLUSTERED DECODE PIPELINE
Systems, methods, and apparatuses relating to circuitry to implement toggle point insertion for a clustered decode pipeline are described. In one example, a hardware processor core includes a first decode cluster comprising a plurality of decoder circuits, a second decode cluster comprising a plurality of decoder circuits, and a toggle point control circuit to toggle between sending instructions requested for decoding between the first decode cluster and the second decode cluster, wherein the toggle point control circuit is to: determine a location in an instruction stream as a candidate toggle point to switch the sending of the instructions requested for decoding between the first decode cluster and the second decode cluster, track a number of times a characteristic of multiple previous decodes of the instruction stream is present for the location, and cause insertion of a toggle point at the location, based on the number of times, to switch the sending of the instructions requested for decoding between the first decode cluster and the second decode cluster.
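The counting behavior described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name, the threshold value, and the choice to switch clusters just before the instruction at the toggle location are hypothetical and are not taken from the abstract, which describes hardware circuitry.

```python
from collections import defaultdict


class TogglePointTracker:
    """Hypothetical software model of the toggle point control circuit."""

    def __init__(self, threshold=3):
        # Assumed threshold; the abstract only says insertion is
        # "based on the number of times".
        self.threshold = threshold
        self.counts = defaultdict(int)   # candidate location -> times seen
        self.toggle_points = set()       # locations where a toggle is inserted

    def observe(self, location, characteristic_present):
        """Record one previous decode of the stream at a candidate location."""
        if characteristic_present:
            self.counts[location] += 1
            if self.counts[location] >= self.threshold:
                self.toggle_points.add(location)

    def steer(self, addresses, cluster=0):
        """Assign each address to decode cluster 0 or 1.

        Assumption: the switch takes effect at the toggle location itself.
        """
        assignment = []
        for addr in addresses:
            if addr in self.toggle_points:
                cluster ^= 1
            assignment.append((addr, cluster))
        return assignment
```

In this model a location becomes a toggle point only after the characteristic has been observed there `threshold` times, so a single unusual decode does not change how the stream is split.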
Method performed by a microcontroller for managing a NOP instruction and corresponding microcontroller
Disclosed herein is a method for managing NOP instructions in a microcontroller, the method comprising: duplicating all jump instructions causing a NOP instruction to form a new instruction set; inserting an internal NOP instruction into each of the jump instructions; when a jump instruction is executed, executing a subsequent instruction of the new instruction set; and executing the internal NOP instruction when execution of the subsequent instruction is skipped.
System of Multiple Stacks in a Processor Devoid of an Effective Address Generator
In one implementation devoid of an effective address generator, a method of call operation comprises pushing one or more parameters onto a first stack, pushing the contents of one or more registers onto a second stack, popping off the first stack one or more of the parameters into one or more of the registers whose contents were pushed onto the second stack, performing register-to-register operations on the one or more registers whose contents were pushed onto the second stack, with a result of the register-to-register operations being stored in a result register, the result register being one of the one or more registers whose contents were pushed onto the second stack, popping off the second stack the contents of all the one or more registers into their respective registers from which they came, and returning control to an instruction following the call.
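The call sequence in this abstract can be traced with two Python lists standing in for the two stacks. This is a minimal sketch: the register names `r0` and `r1`, the addition, and the function name are hypothetical. The abstract does not say how the result survives the final register restore, so the sketch simply captures it before restoring, which is an assumption.

```python
def call_add(regs, param_stack, save_stack):
    """Hypothetical callee: adds two parameters using the two-stack scheme."""
    used = ["r0", "r1"]  # registers the callee will overwrite (assumed names)

    # Push the current contents of the registers onto the second stack.
    for name in used:
        save_stack.append(regs[name])

    # Pop the parameters off the first stack into those same registers.
    for name in used:
        regs[name] = param_stack.pop()

    # Register-to-register operation; r0 serves as the result register.
    regs["r0"] = regs["r0"] + regs["r1"]
    result = regs["r0"]  # assumption: result is captured before the restore

    # Restore every saved register, in reverse push order.
    for name in reversed(used):
        regs[name] = save_stack.pop()

    return result  # control returns to the instruction following the call
```

The caller pushes its parameters onto `param_stack` before the call. After the call both stacks are back to their earlier depth and the caller's registers hold their original values.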
Apparatuses, Devices, Methods, Computer Systems and Computer Programs for Handling Remote Procedure Calls
Various examples relate to apparatuses, devices, methods, computer systems and computer programs for handling remote procedure calls. A non-transitory, computer-readable medium comprises machine-readable instructions that, when executed on a processor of a requesting host, cause the processor to provide an interface for locally receiving remote procedure calls from a plurality of threads of a computer program, and to forward, upon receiving a remote procedure call from one of the threads of the computer program, the remote procedure call to a providing host that provides the functionality associated with the remote procedure call, wherein the remote procedure call is forwarded together with information on the thread having issued the remote procedure call.
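A requesting-host interface of this kind might look roughly like the following. This is a sketch only: the `RpcForwarder` class, the message layout, and the `transport` callable are hypothetical stand-ins, and the thread name is used here as the "information on the thread", which the abstract does not specify.

```python
import threading


class RpcForwarder:
    """Hypothetical local interface that forwards RPCs to a providing host."""

    def __init__(self, transport):
        # transport: any callable that delivers one message to the providing
        # host and returns its reply (assumed, not from the abstract).
        self._transport = transport
        self._lock = threading.Lock()

    def call(self, method, *args):
        # Attach information on the issuing thread to the forwarded call.
        message = {
            "method": method,
            "args": args,
            "thread": threading.current_thread().name,
        }
        # Serialize access so several threads can share one transport.
        with self._lock:
            return self._transport(message)
```

Because the thread information travels with each call, the providing host could, for example, keep per-thread state or preserve per-thread ordering; the abstract does not say which use is intended.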
Processors, methods, systems, and instructions to protect shadow stacks
A processor of an aspect includes a decode unit to decode an instruction. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the instruction, is to determine that an attempted change due to the instruction, to a shadow stack pointer of a shadow stack, would cause the shadow stack pointer to exceed an allowed range. The execution unit is also to take an exception in response to determining that the attempted change to the shadow stack pointer would cause the shadow stack pointer to exceed the allowed range. Other processors, methods, systems, and instructions are disclosed.
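The range check described here can be modeled in a few lines. This is a sketch under assumptions: the class and exception names are hypothetical, the allowed range is taken to be inclusive, and the stack is assumed to start at the upper bound and grow downward. The abstract states only that an out-of-range change to the shadow stack pointer causes an exception.

```python
class ShadowStackRangeError(Exception):
    """Stands in for the exception taken by the execution unit."""


class ShadowStack:
    def __init__(self, base, limit):
        self.base = base    # lowest allowed pointer value (assumed inclusive)
        self.limit = limit  # highest allowed pointer value (assumed inclusive)
        self.ssp = limit    # assumption: the stack grows downward from limit

    def adjust(self, delta):
        """Attempt to change the shadow stack pointer by delta bytes."""
        new_ssp = self.ssp + delta
        if not (self.base <= new_ssp <= self.limit):
            # The pointer is left unchanged when the check fails.
            raise ShadowStackRangeError(
                f"SSP {new_ssp:#x} outside [{self.base:#x}, {self.limit:#x}]"
            )
        self.ssp = new_ssp
```

Checking the new value before committing it means a faulting instruction never leaves the pointer outside the allowed range.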
METHOD AND APPARATUS FOR IMPLEMENTING POWER MODES IN MICROCONTROLLERS USING POWER PROFILES
A method and apparatus for implementing power modes in microcontrollers (MCUs) using power profiles. In one embodiment of the method, a central processing unit (CPU) of the MCU executes a first instruction for calling a subroutine stored in a memory of the MCU, wherein the first instruction comprises a first parameter to be passed to the subroutine. Thereafter, the CPU writes a first value to a first special function register (SFR) of the MCU in response to executing the first instruction, wherein the first value is related to the first parameter. The MCU operates in a first power mode in response to the CPU writing the first value to the first SFR. The CPU also executes a second instruction for calling the subroutine, wherein the second instruction comprises a second parameter to be passed to the subroutine. The CPU then writes a second value to a second SFR of the MCU in response to executing the second instruction, wherein the second value is related to the second parameter. The MCU operates in a second power mode in response to the CPU writing the second value to the second SFR. The MCU consumes more power when operating in the first power mode than when operating in the second power mode.
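The idea of one subroutine that maps a parameter to an SFR write can be sketched as a lookup table. Everything named here is hypothetical: the register names, the profile names, and the values are placeholders chosen for illustration and do not correspond to any real MCU.

```python
# Simulated special function registers (hypothetical names).
SFR = {"PWR_CTRL_A": 0x0, "PWR_CTRL_B": 0x0}

# Assumed power profiles: parameter -> (SFR to write, value to write).
POWER_PROFILES = {
    "active": ("PWR_CTRL_A", 0x3),  # first, higher-power mode
    "sleep": ("PWR_CTRL_B", 0x1),   # second, lower-power mode
}


def set_power_profile(param):
    """Single subroutine called with different parameters."""
    register, value = POWER_PROFILES[param]
    SFR[register] = value  # the write is what moves the MCU to the new mode
    return register, value
```

Calling `set_power_profile("active")` and later `set_power_profile("sleep")` mirrors the two calls in the abstract: the same subroutine, two parameters, two different SFR writes.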
Providing code sections for matrix of arithmetic logic units in a processor
The present invention relates to a processor having a trace cache and a plurality of ALUs arranged in a matrix, comprising an analyser unit located between the trace cache and the ALUs, wherein the analyser unit analyses the code in the trace cache, detects loops, transforms the code, and issues to the ALUs sections of the code combined to blocks for joint execution for a plurality of clock cycles.
CONTROL FLOW PREDICTION
A data processing apparatus is provided that includes bimodal control flow prediction circuitry for performing a prediction of whether a conditional control flow instruction will be taken. Storage circuitry stores, in association with the control flow instruction, a stored state of the data processing apparatus and reversal circuitry reverses the prediction in dependence on the stored state of the data processing apparatus corresponding with a current state of the data processing apparatus when execution of the control flow instruction is to be performed.
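The interplay between a bimodal predictor and state-based reversal can be illustrated as follows. This is a sketch, not the claimed circuitry: the 2-bit saturating counter is the conventional bimodal scheme, while the rule for when a state is stored or cleared is an assumption made for this example, since the abstract does not give one. The `state` argument is a hypothetical stand-in for whatever apparatus state is stored.

```python
class ReversingBimodalPredictor:
    """Hypothetical model: bimodal prediction plus state-dependent reversal."""

    def __init__(self):
        self.counters = {}      # pc -> 2-bit saturating counter (0..3)
        self.stored_state = {}  # pc -> state in which the prediction is reversed

    def _bimodal(self, pc):
        # Counters start at 1 (weakly not taken); 2 or 3 predicts taken.
        return self.counters.get(pc, 1) >= 2

    def predict(self, pc, state):
        prediction = self._bimodal(pc)
        # Reverse when the stored state corresponds with the current state.
        if pc in self.stored_state and self.stored_state[pc] == state:
            return not prediction
        return prediction

    def update(self, pc, taken, state):
        bimodal = self._bimodal(pc)
        if bimodal != taken:
            # Assumed policy: remember the state in which bimodal was wrong.
            self.stored_state[pc] = state
        elif self.stored_state.get(pc) == state:
            # Assumed policy: forget a state for which reversal would be wrong.
            del self.stored_state[pc]
        counter = self.counters.get(pc, 1)
        self.counters[pc] = min(3, counter + 1) if taken else max(0, counter - 1)
```

With this policy a branch that is usually taken but is not taken in one particular state is predicted as not taken in that state, without retraining the counter.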
Neural network processing element incorporating compute and local memory elements
A novel and useful neural network (NN) processing core is adapted to implement artificial neural networks (ANNs) and incorporates processing circuits having compute and local memory elements. The NN processor is constructed from self-contained computational units organized in a hierarchical architecture. The homogeneity enables simpler management and control of similar computational units, aggregated in multiple levels of hierarchy. Computational units are designed with as little overhead as possible, and additional features and capabilities are aggregated at higher levels in the hierarchy. On-chip memory provides storage for content inherently required for basic operation at a particular hierarchy level and is coupled with the computational resources in an optimal ratio. Lean control provides just enough signaling to manage only the operations required at a particular hierarchical level. Dynamic resource assignment is provided and can be adjusted as required depending on resource availability and capacity of the device.