Patent classifications
G06F9/38
UN-MARK INSTRUCTIONS ON AN INSTRUCTION MATCH TO REDUCE RESOURCES REQUIRED TO MATCH A GROUP OF INSTRUCTIONS
A method of performing instruction marking in a computer processor architecture includes fetching instructions from a memory unit by a fetching unit in the computer processor architecture. Instruction groups for marking are determined. Fetched instructions are matched to instruction groups for marking. The fetched instructions are marked. Some of the marked instructions are selectively unmarked. The marked and unmarked instructions are forwarded to a queue of instructions for processing in the computer processor architecture.
Frame parser executing subsets of instructions in parallel for processing a frame header
An integrated circuit (IC) may include a set of instruction list engines (ILEs) that execute in parallel, where each ILE stores a subset of a set of instructions for processing a header of a frame, and where each ILE generates an ILE result based on executing the subset of the set of instructions. The IC may include a circuit to determine a result of parsing the header of the frame based on merging ILE results generated by the set of ILEs.
Convolutional layer acceleration unit, embedded system having the same, and method for operating the embedded system
Disclosed herein are a convolutional layer acceleration unit, an embedded system having the convolutional layer acceleration unit, and a method for operating the embedded system. The method for operating an embedded system, the embedded system performing an accelerated processing capability programmed using a Lightweight Intelligent Software Framework (LISF), includes initializing and configuring, by a parallelization managing function entity (FE), entities present in resources for performing mathematical operations in parallel, and processing in parallel, by an acceleration managing FE, the mathematical operations using the configured entities.
System and method for multi-level classification of branches
A processor including a processor pipeline having one or more execution units configured to execute branch instructions, a branch predictor coupled to the processor pipeline and configured to predict a branch instruction outcome, and a branch classification unit coupled to the processor pipeline and the branch prediction unit. The branch classification unit is configured to, in response to detecting a branch instruction, classify the branch instruction as at least one of the following: static taken branch, static not-taken branch, simple easy-to-predict branch, flip flop hard-to-predict (HTP) branch, dynamic HTP branch, biased positive HTP branch, biased negative HTP branch, and other HTP branch.
Variable-length instruction buffer management
A vector processor is disclosed including a variety of variable-length instructions. Computer-implemented methods are disclosed for efficiently carrying out a variety of operations in a time-conscious, memory-efficient, and power-efficient manner. Methods for more efficiently managing a buffer by controlling the threshold based on the length of delay line instructions are disclosed. Methods for disposing multi-type and multi-size operations in hardware are disclosed. Methods for condensing look-up tables are disclosed. Methods for in-line alteration of variables are disclosed.
Broadside random access memory for low cycle memory access and additional functions
A computational system includes one or more processors. Each processor has multiple registers, as well attached memory to hold instructions. The processor is coupled to one or more broadside interfaces. A broadside interface allows the processor to load or store an entire widget state in a single clock cycle of the processor. The broadside interface also allows the processor to move and store 32 bytes of information into RAM in less than four to five clock cycles of the processor while the processor concurrently performs one or more mathematical operations on the information while the move and store operation is taking place.
Determine whether to perform action on computing device based on analysis of endorsement information of a security co-processor
Examples disclosed herein relate to a computing device that includes a central processing unit, a management controller separate from the central processing unit, and a security co-processor. The management controller is powered using an auxiliary power rail that provides power to the management controller while the computing device is in an auxiliary power state. The security co-processor includes device unique data. The management controller receives the device unique data and stores a representation at a secure location. At a later time, the management controller receives endorsement information from an expected location of the security co-processor. The management controller determines whether to perform an action on the computing device based on an analysis of the endorsement information and the stored representation of the device unique data.
Dynamic graphical processing unit register allocation
Systems, apparatuses, and methods for dynamic graphics processing unit (GPU) register allocation are disclosed. A GPU includes at least a plurality of compute units (CUs), a control unit, and a plurality of registers for each CU. If a new wavefront requests more registers than are currently available on the CU, the control unit spills registers associated with stack frames at the bottom of a stack since they will not likely be used in the near future. The control unit has complete flexibility determining how many registers to spill based on dynamic demands and can prefetch the upcoming necessary fills without software involvement. Effectively, the control unit manages the physical register file as a cache. This allows younger workgroups to be dynamically descheduled so that older workgroups can allocate additional registers when needed to ensure improved fairness and better forward progress guarantees.
Correspondence of external operations to containers and mutation events
A method is provided for determining command-to-process correspondence. The method includes identifying, by the hardware processor, initial processes resulting from executions of container immutability change events for each of multiple containers in a cluster, based on an execution time, a process identifier and a process group identifier for each of the container immutability change events. The method further includes checking, by the hardware processor, if an initial process from among the identified initial processes matches an entry in a database that stores external container commands and at least one respective process resulting from executing each of the external container commands. The method also includes designating, by the hardware processor, a particular external command, from among the external container commands stored in the database, as having a correspondence to the initial process, responsive to the initial process matching the at least one respective process resulting from executing the particular external command.
Method for replenishing a thread queue with a target instruction of a jump instruction
Methods and electronic circuits for executing instructions in a central processing unit (CPU) are provided. One of the methods includes forming an instruction block by sequentially fetching, from a current thread queue, one or more instructions including one jump instruction, wherein the jump instruction is the last instruction in the instruction block; transmitting the instruction block to a CPU execution unit for execution; replenishing the current thread queue with at least one instruction to form a thread queue to be executed; determining a target instruction of the jump instruction according to an execution result of the CPU execution unit; determining whether the target instruction is contained in the thread queue to be executed; and if not, flushing the thread queue to be executed, obtaining the target instruction and adding the target instruction to the thread queue to be executed.