Patent classifications
G06F9/30145
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR A HARDWARE ASSISTED HETEROGENEOUS INSTRUCTION SET ARCHITECTURE DISPATCHER
Systems, methods, and apparatuses to support instructions for a hardware assisted heterogeneous instruction set architecture dispatcher are described. In one embodiment, a hardware processor includes a plurality of processor cores comprising a first type of processor core that supports a first instruction set architecture and a second type of processor core that supports a second different instruction set architecture, a decoder circuit of a processor core of the plurality of processor cores to decode a single instruction into a decoded single instruction, the single instruction including a field that identifies a requested core type and an opcode that indicates an execution circuit of the processor core is to: read a register to determine a core type of the processor core, cause the processor core to enter a first mode, that only permits execution of the first instruction set architecture by the processor core, when the requested core type and the core type of the processor core are the first type, cause the processor core to enter a second mode, that only permits execution of the second different instruction set architecture by the processor core, when the requested core type and the core type of the processor core are the second type, cause the processor core to enter a third mode, that only permits execution of the first instruction set architecture by the processor core, when the requested core type is the second type and the core type of the processor core is the first type, and cause the processor core to enter a fourth mode, that only permits execution of the second different instruction set architecture by the processor core, when the requested core type is the first type and the core type of the processor core is the second type, and the execution circuit of the processor core to execute the decoded single instruction according to the opcode.
System, Apparatus And Methods For Minimum Serialization In Response To Non-Serializing Register Write Instruction
In one embodiment, a processor includes: a plurality of registers; a front end circuit to fetch and decode a non-serializing register write instruction, the non-serializing register write instruction to cause a value to be stored in a first register of the plurality of registers; and an execution circuit coupled to the front end circuit. The execution circuit, in response to the non-serializing register write instruction, is to determine an amount of serialization for the non-serializing register write instruction and execute the non-serializing register write instruction according to the amount of serialization. Other embodiments are described and claimed.
METHOD FOR ACCELERATING DEEP NEURAL NETWORKS EXECUTION WITH ADVANCED OPERATOR FUSION
This disclosure has presented a new loop fusion framework called DNNFusion. The key advantages of DNNFusion include: 1) a new high-level abstraction comprising mapping type of operators and their combinations and the Extended Computational Graph, and analyses on these abstractions, 2) a novel mathematical-property-based graph rewriting, and 3) an integrated fusion plan generation. DNNFusion is extensively evaluated on 15 diverse DNN models on multiple mobile devices, and evaluation results show that it outperforms four state-of-the-art DNN execution frameworks by up to 8.8× speedup, and for the first time allows many cutting-edge DNN models not supported by prior end-to-end frameworks to execute on mobile devices efficiently (even in real-time). In addition, DNNFusion improves both cache performance and device utilization, enabling execution on devices with more restricted resources. It also reduces performance tuning time during compilation.
SYSTEM, APPARATUS AND METHODS FOR PERFORMANT READ AND WRITE OF PROCESSOR STATE INFORMATION RESPONSIVE TO LIST INSTRUCTIONS
In one embodiment, a processor includes: a front end circuit to fetch and decode a read list instruction, the read list instruction to cause storage to a memory of a software-provided list of processor state information; and an execution circuit coupled to the front end circuit. The execution circuit, in response to the decoded read list instruction, is to read the processor state information stored in the processor and store each datum of the processor state information into an entry of a data table in the memory. Other embodiments are described and claimed.
Computing device and method
A computing device, comprising: a computing module, comprising one or more computing units; and a control module, comprising a computing control unit, and used for controlling shutdown of the computing unit of the computing module according to a determining condition. Also provided is a computing method. The computing device and method have the advantages of low power consumption and high flexibility, and can be combined with the upgrading mode of software, thereby further increasing the computing speed, reducing the computing amount, and reducing the computing power consumption of an accelerator.
Data sharing system and data sharing method therefor
The application provides an information processing device, system and method. The information processing device mainly includes a storage module and a data processing module, where the storage module is configured to receive and store input data, instruction and output data, and the input data includes one or more key features; the data processing module is configured to identify the key features included in the input data and score the input data in the storage module according to a judgment result. The information processing device, system and method provided by the application automatically scores text, pictures, audio, video, and the like instead of manually scoring, which is more accurate and faster.
Compression assist instructions
In an embodiment, a processor supports one or more compression assist instructions which may be employed in compression software to improve the performance of the processor when performing compression/decompression. That is, the compression/decompression task may be performed more rapidly and consume less power when the compression assist instructions are employed then when they are not. In some cases, the cost of a more effective, more complex compression algorithm may be reduced to the cost of a less effective, less complex compression algorithm.
COUNT TO EMPTY FOR MICROARCHITECTURAL RETURN PREDICTOR SECURITY
An embodiment of an integrated circuit may comprise a return stack buffer (RSB), a speculative return stack buffer (SRSB), and circuitry coupled to the RSB and the SRSB, the circuitry to track a count until the SRSB is empty at a time of a prediction by a branch prediction unit, and return an output from the branch prediction unit that corresponds to one of the RSB and the SRSB based at least in part on the count until the SRSB is empty. Other embodiments are disclosed and claimed.
Apparatus and method for multi-adapter encoding
An apparatus and method for multi-adapter and/or multi-pass encoding on dual graphics processors. For example, one embodiment of a processor comprises: a central processor integrated on a first die, the central processor comprising a plurality of cores to execute instructions and process data; an first graphics processor integrated on the first die, the first graphics processor comprising media processing circuitry to perform one or more preliminary lookahead operations on video content to generate lookahead statistics; an interconnect to couple the first graphics processor to a lookahead buffer, the first graphics processor to transmit the lookahead statistics over the interconnect to the lookahead buffer; wherein the lookahead statistics are to be used by a second graphics processor to encode the video content to generate encoded video.
DECOUPLED ACCESS-EXECUTE PROCESSING
An apparatus comprises first instruction execution circuitry, second instruction execution circuitry, and a decoupled access buffer. Instructions of an ordered sequence of instructions are issued to one of the first and second instruction execution circuitry for execution in dependence on whether the instruction has a first type label or a second type label. An instruction with the first type label is an access-related instruction which determines at least one characteristic of a load operation to retrieve a data value from a memory address. Instruction execution by the first instruction execution circuitry of instructions having the first type label is prioritised over instruction execution by the second instruction execution circuitry of instructions having the second type label. Data values retrieved from memory as a result of execution of the first type instructions are stored in the decoupled access buffer.