Patent classifications
G06F9/26
Apparatus and method to preclude X86 special bus cycle load replays in an out-of-order processor
An apparatus including first and second reservation stations. The first reservation station dispatches a load micro instruction, and indicates on a hold bus if the load micro instruction is a specified load micro instruction directed to retrieve an operand from a prescribed resource other than on-core cache memory, where the specified load instruction comprises a load instruction resulting from execution of an x86 special bus cycle. The second reservation station is coupled to the hold bus, and dispatches one or more younger micro instructions therein that depend on the load micro instruction for execution after a number of clock cycles following dispatch of the first load micro instruction, and if it is indicated on the hold bus that the load micro instruction is the specified load micro instruction, the second reservation station is configured to stall dispatch of the one or more younger micro instructions until the load micro instruction has received the operand, and is configured to preclude assertion of any indications that would otherwise result in a replay event.
Hardware processors and methods for extended microcode patching and reloading
Hardware processors and methods for extended microcode patching through on-die and off-die secure storage are described. In one embodiment, the additional storage resources used for storing micro-operations are section(s) of a cache that are unused at runtime and/or unused by a configuration of a processor. For example, the additional storage resources may be a section of a cache that is used to store context information from a core when the core is transitioned to a power state that shuts off voltage to the core. Non-limiting examples of such sections are one or more sections for: storage of context information for a transition of a thread to idle or off, storage of context information for a transition of a core for a multiple core processor to idle or off, or storage of coherency information for a transition of a cache coherency circuit (e.g., cache box (CBo)) to idle or off.
Instruction Writing Method and Apparatus, and Network Device
An instruction writing method, apparatus, and network device are provided to reduce a requirement for a storage space of a microcode processor. The method includes obtaining, by a first device, first indication information, where the first indication information indicates the first device to enable a first service function, and writing, by the first device, a first microcode instruction set corresponding to the first service function into an unused storage space of a target microcode processor in a network processor, where a size of the unused storage space is greater than or equal to a size of the first microcode instruction set.
Efficient interruption routing for a multithreaded processor
A system and method of implementing a modified priority routing of an input/output (I/O) interruption. The system and method determines whether the I/O interruption is pending for a core and whether any of a plurality of guest threads of the core is enabled for guest thread processing of the interruption in accordance with the determining that the I/O interruption is pending. Further, the system and method determines whether at least one of the plurality of guest threads enabled for guest thread processing is in a wait state and, in accordance with the determining that the at least one of the plurality of guest threads enabled for guest thread processing is in the wait state, routes the I/O interruption to a guest thread enabled for guest thread processing and in the wait state.
SAFETY SUPERVISED GENERAL PURPOSE COMPUTING DEVICES
A computing device including a plurality of sensors, a system-on-module, a safety microcontroller and a plurality of communication interfaces communicatively coupling the system-on-module, the safety microcontroller and the plurality of sensors together. The system-on module can include an integrated interconnection of a plurality of different types of cores and one or more different types of memory. The system-on module can be configured to control operation and or performance of a system. The safety microcontroller can be configured to provide safety supervision of the system-on-module.
METHODS, SYSTEMS, AND APPARATUSES FOR A SCALABLE RESERVATION STATION IMPLEMENTING A SINGLE UNIFIED SPECULATION STATE PROPAGATION AND EXECUTION WAKEUP MATRIX CIRCUIT IN A PROCESSOR
Systems, methods, and apparatuses relating to a scalable reservation station circuit implementing a single unified speculation state propagation and execution wakeup matrix in a processor are described. In one embodiment, a hardware processor core includes a decoder circuit to decode one or more instructions into a first micro-operation to load data from a data cache, a second micro-operation dependent on the first micro-operation, and a third micro-operation dependent on the second micro-operation; an execution circuit to execute the first micro-operation, the second micro-operation, and the third micro-operation; and a reservation station circuit comprising a load speculation tracker circuit and coupled between the decoder circuit and the execution circuit, the load speculation tracker circuit to, for a reservation station entry of the third micro-operation, track progress of the first micro-operation in the data cache to generate a cancellation indication for the third micro-operation in response to a miss of the data in the data cache for the first micro-operation, wherein the load speculation tracker circuit is to begin to track the progress of the first micro-operation in the data cache in response to a dispatch of the first micro-operation into the data cache.
Scheduler for amp architecture using a closed loop performance and thermal controller
Systems and methods are disclosed for scheduling threads on an asymmetric multiprocessing system having multiple core types. Each core type can run at a plurality of selectable voltage and frequency scaling (DVFS) states. Threads from a plurality of processes can be grouped into thread groups. Execution metrics are accumulated for threads of a thread group and fed into a plurality of tunable controllers. A closed loop performance control (CLPC) system determines a control effort for the thread group and maps the control effort to a recommended core type and DVFS state. A closed loop thermal and power management system can limit the control effort determined by the CLPC for a thread group, and limit the power, core type, and DVFS states for the system. Metrics for workloads offloaded to co-processors can be tracked and integrated into metrics for the offloading thread group.
Filtering micro-operations for a micro-operation cache in a processor
A processor includes a micro-operation cache having a plurality of micro-operation cache entries for storing micro-operations decoded from instruction groups and a micro-operation filter having a plurality of micro-operation filter table entries for storing identifiers of instruction groups for which the micro-operations are predicted dead on fill if stored in the micro-operation cache. The micro-operation filter receives an identifier for an instruction group. The micro-operation filter then prevents a copy of the micro-operations from the first instruction group from being stored in the micro-operation cache when a micro-operation filter table entry includes an identifier that matches the first identifier.
Apparatus and method for secure, efficient microcode patching
An apparatus and method for efficient microcode patching. For example, one embodiment of an apparatus comprises: a package comprising one or more integrated circuit dies, the one or more integrated circuit dies comprising: a plurality of cores; and a security controller coupled to the plurality of cores, a first core of the plurality of cores comprising: a decoder to decode a microcode patching instruction, the microcode patching instruction comprising an operand to be used to identify an address; and execution circuitry to execute the microcode patching instruction, wherein responsive to the microcode patching instruction, the execution circuitry and/or security controller are to: retrieve a microcode patch from a location in memory based on the address, validate the microcode patch, apply the microcode patch to update or replace microcode associated with the one or more integrated circuit dies, and transmit the microcode patch to a persistent storage device; wherein the microcode patch is to be subsequently retrieved from the persistent storage device by one or more external security controllers of one or more external integrated circuit dies, the one or more external security controllers to cause the microcode patch to be applied to update or replace microcode associated with the one or more external integrated circuit dies.
MULTI-INDEXED MICRO-OPERATIONS CACHE FOR A PROCESSOR
Various example embodiments for supporting a multi-indexed micro-operations cache (MI-UC) in a processor are presented. Various example embodiments for supporting an MI-UC in a processor may be configured to support an MI-UC in which, for a UC line of the MI-UC, multiple indexes into the UC line, for multiple sets of micro-operations (UOPs) stored in the UC line based on decoding of multiple instructions, are supported.