Patent classifications
G06F9/3879
Processor having multiple cores, shared core extension logic, and shared core extension utilization instructions
An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
Processing system with a main processor pipeline and a co-processor pipeline
A system comprising a data memory, a first processor with first execution pipeline, and a co-processor with second execution pipeline branching from the first pipeline via an inter-processor interface. The first pipeline can decode instructions from an instruction set comprising first and second instruction subsets. The first subset comprises a load instruction which loads data from the memory into a register file, and a compute instruction of a first type which performs a compute operation on such loaded data. The second subset includes a compute instruction of a second type which does not require a separate load instruction to first load data from memory into a register file, but instead reads data from the memory directly and performs a compute operation on that data, this reading being performed in a pipeline stage of the second pipeline that is aligned with the memory access stage of the first pipeline.
Architecture of crossbar of inference engine
A programmable hardware system for machine learning (ML) includes a core and an inference engine. The core receives commands from a host. The commands are in a first instruction set architecture (ISA) format. The core divides the commands into a first set for performance-critical operations, in the first ISA format, and a second set of performance non-critical operations, in the first ISA format. The core executes the second set to perform the performance non-critical operations of the ML operations and streams the first set to inference engine. The inference engine generates a stream of the first set of commands in a second ISA format based on the first set of commands in the first ISA format. The first set of commands in the second ISA format programs components within the inference engine to execute the ML operations to infer data.
Electronic devices generating verification vector for verifying semiconductor circuit and methods of operating the same
An electronic device configured to generate a verification vector for verifying a semiconductor circuit including a first circuit block and a second circuit block includes a duplicate command eliminator configured to receive a first input vector including a plurality of commands and to provide a first converted vector, in which ones of the plurality of commands that generate the same state transition are changed into idle commands, based on a state transition of the first circuit block obtained by performing a simulation operation on the first input vector, a reduced vector generator configured to provide a first reduced vector in which a number of repetitions of the idle commands included in the first converted vector is reduced, and a verification vector generator configured to output the first reduced vector having a coverage that coincides with a target coverage among a plurality of first reduced vectors as a first verification vector.
Hardware accelerator configuration by a translation of configuration data
A microprocessor circuit may include a software programmable microprocessor core and a data memory accessible via a data memory bus. The data memory may include sets of configuration data structured according to respective predetermined data structure specifications for configurable math hardware accelerators, and sets of input data for configurable math hardware accelerators, each configured to apply a predetermined signal processing function to the set of input data according to received configuration data. A configuration controller is coupled to the data memory via the data memory bus and to the configurable math hardware accelerators. The configuration controller may fetch the configuration data for each math hardware accelerator from the data memory and translate the configuration data. The configuration controller may transmit each set of configuration data to the corresponding configurable math hardware accelerator and write the configuration data to configuration registers of the math hardware accelerator.
Framework to provide time bound execution of co-processor commands
When a main processor issues a command to co-processor, a timeout value is included in the command. As the co-processor attempts to execute the command, it is determined whether the attempt is taking time beyond what is permitted by the timeout value. If the timeout is exceeded then responsive action is taken, such as the generation of a command timeout type failure message. The receipt of the command with the timeout value, and the consequent determination of a timeout condition for the command, may be determined by: the co-processor that receives the command, or a watchdog timer that is separate from the co-processor. Also, detection of co-processor hang and/or hung co-processor conditions during the time that a co-processor is executing a command for the main processor.
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR OPERATING SYSTEM TRANSPARENT INSTRUCTION STATE MANAGEMENT OF NEW INSTRUCTIONS FOR APPLICATION THREADS
Systems, methods, and apparatuses relating to an instruction for operating system transparent instruction state management of new instructions for application threads are described. In one embodiment, a hardware processor includes a decoder to decode a single instruction into a decoded single instruction, and an execution circuit to execute the decoded single instruction to cause a context switch from a current state to a state comprising additional state data that is not supported by an execution environment of an operating system that executes on the hardware processor.
Coprocessor context priority
A system may include a plurality of processors and a coprocessor. A plurality of coprocessor context priority registers corresponding to a plurality of contexts supported by the coprocessor may be included. The plurality of processors may use the plurality of contexts, and may program the coprocessor context priority register corresponding to a context with a value specifying a priority of the context relative to other contexts. An arbiter may arbitrate among instructions issued by the plurality of processors based on the priorities in the plurality of coprocessor context priority registers. In one embodiment, real-time threads may be assigned higher priorities than bulk processing tasks, improving bandwidth allocated to the real-time threads as compared to the bulk tasks.
Methods of hardware and software-coordinated opt-in to advanced features on hetero ISA platforms
The present disclosure relates to a processor that includes one or more processing elements associated with one or more instruction set architectures. The processor is configured to receive a request from an application executed by a first processing element of the one or more processing elements to enable a feature associated with an instruction set architecture. Additionally, the processor is configured to enable the application to utilize the feature without a system call occurring when the feature is associated with an instruction set architecture associated with the first processing element.
Methods and apparatus for automatically transforming software process recordings into dynamic automation scripts
An apparatus includes a processor and a memory storing instructions to cause the processor to receive files including a table database and sequence record data. The processor is further caused to generate a script by mapping each screen feature from a set of screen features extracted from the sequence record data to a table from the table database. The processor is further caused to generate a screen schema based on the script, by attempting to correlate addresses associated with the sequence record data to the screen features, to generate the screen schema including a table of dynamic. The processor is further caused to automatically generate a generic automation script based on the sequence record data and the screen schema to be consumed by a software bot to execute user actions in the sequence record data in an automated fashion.