Patent classifications
G06F9/3856
TASK EXECUTION ORDER DETERMINATION SYSTEM AND TASK EXECUTION METHOD
A technique for evaluating human cognitive and motor functions by a plurality of hand movement tasks is disclosed. A task execution method determines the execution order of a plurality of tasks which a test subject is caused to execute to acquire a characteristic quantity. A test subject group task database includes scores given in advance and characteristic quantities obtained from a plurality of tasks stored as past data corresponding to each of a plurality of test subjects. In a storage device, (1) a differentiation precision database for a case in which test subjects are divided into two groups by predetermined threshold value scores differentiated by the characteristic quantities, or (2) an estimation precision database for a case in which a score is estimated using the characteristic quantity for a predetermined score value is prepared for each of the tasks on the basis of the test subject group task database.
Computer architecture with synergistic heterogeneous processors
A computer architecture employs multiple special-purpose processors having different affinities for program execution to execute substantial portions of general-purpose programs to provide improved performance with respect to a general-purpose processor executing the general-purpose program alone.
Pipeline set selection based on duty cycle estimation
A computer-implemented system is described for assigning executable jobs to pipeline sets, where the jobs may be network-based computer jobs. The assigning includes generating a weight for each pipeline set of multiple pipeline sets to obtain multiple weights. Generating a weight includes obtaining duty cycle metrics for pipeline software threads in the pipeline set. The duty cycle metrics include a measure of an amount of time that a corresponding pipeline thread is executing and actively processing data. Generating the weight further includes determining the weight for the pipeline set based at least in part on the duty cycle metrics. The method further includes assigning a job request to a target pipeline set selected from the pipeline sets according to a weighted random algorithm, wherein the weighted random algorithm uses the weights.
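The selection step in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the weight formula (idle fraction averaged over a set's threads, floored at a small positive value), the function names, and the input layout are all assumptions made for illustration. The abstract only states that each weight is based at least in part on the duty cycle metrics and that the target set is chosen by a weighted random algorithm.

```python
import random


def pipeline_set_weight(duty_cycles, floor=0.01):
    """Hypothetical weight: mean idle fraction of the set's pipeline threads.

    duty_cycles holds, per thread, the fraction of time in [0, 1] that the
    thread spends executing and actively processing data.
    """
    if not duty_cycles:
        return floor
    mean_duty = sum(duty_cycles) / len(duty_cycles)
    # A small floor keeps fully busy sets selectable (assumption).
    return max(floor, 1.0 - mean_duty)


def select_pipeline_set(metrics, rng=random):
    """Pick a pipeline set name by weighted random choice.

    metrics maps a set name to its list of per-thread duty cycles.
    """
    names = list(metrics)
    weights = [pipeline_set_weight(metrics[name]) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

Under this assumed formula, lightly loaded sets receive most new job requests while busy sets still receive an occasional one, which avoids starving any set of work.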
APPARATUS AND METHOD FOR IDENTIFYING AND PRIORITIZING CERTAIN INSTRUCTIONS IN A MICROPROCESSOR INSTRUCTION PIPELINE
A microprocessor improves Memory Level Parallelism (MLP) with minimal added complexity and without requiring segregated storage or management of instructions, by marking memory instructions and related instructions as urgent, and dispatching marked and unmarked instructions into common queuing circuitry for scheduled execution within scheduling circuitry that is configured to prioritize the execution of marked instructions. Instruction marking may be limited to the span of the renaming stage or may be extended to the span of the reorder buffer for additional gains in MLP.
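The idea of a single common queue with priority for marked instructions can be sketched in Python as follows. This is a software illustration only, not the hardware design: the `Instr` fields, the oldest-first tie-break, and the function names are assumptions. The abstract says only that marked and unmarked instructions share common queuing circuitry and that the scheduler prioritizes marked ones.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    age: int              # smaller value = older in program order
    name: str
    urgent: bool = False  # set when the instruction was marked as urgent
    ready: bool = True    # operands available, eligible to issue


def pick_next(queue):
    """Return the ready instruction to issue next, or None if none is ready.

    Marked instructions win over unmarked ones; ties break oldest-first
    (the tie-break is an assumption).
    """
    ready = [instr for instr in queue if instr.ready]
    if not ready:
        return None
    return min(ready, key=lambda instr: (not instr.urgent, instr.age))


def issue_order(queue):
    """Drain one shared queue and return instruction names in issue order."""
    pending = list(queue)
    order = []
    while True:
        nxt = pick_next(pending)
        if nxt is None:
            break
        pending.remove(nxt)
        order.append(nxt.name)
    return order
```

Because the mark is only a field on each entry, no separate queue for urgent instructions is needed, which mirrors the abstract's point about avoiding segregated storage.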
Tile assignment to processing cores within a graphics processing unit
A graphics processing unit configured to process graphics data using a rendering space which is sub-divided into a plurality of tiles, the graphics processing unit comprising: a plurality of processing cores configured to render graphics data; cost indication logic configured to obtain a cost indication for each of a plurality of sets of one or more tiles of the rendering space, wherein the cost indication for a set of one or more tiles is suggestive of a cost of processing the set of one or more tiles; similarity indication logic configured to obtain similarity indications between sets of one or more tiles of the rendering space, wherein the similarity indication between two sets of one or more tiles is indicative of a level of similarity between the two sets of tiles according to at least one processing metric; and scheduling logic configured to assign the sets of one or more tiles to the processing cores for rendering in dependence on the cost indications and the similarity indications.
Reconfigurable processor circuit architecture
A representative reconfigurable processing circuit and a reconfigurable arithmetic circuit are disclosed, each of which may include input reordering queues; a multiplier shifter and combiner network coupled to the input reordering queues; an accumulator circuit; and a control logic circuit, along with a processor and various interconnection networks. A representative reconfigurable arithmetic circuit has a plurality of operating modes, such as floating point and integer arithmetic modes, logical manipulation modes, Boolean logic, shift, rotate, conditional operations, and format conversion, and is configurable for a wide variety of multiplication modes. Dedicated routing connecting multiplier adder trees allows multiple reconfigurable arithmetic circuits to be reconfigurably combined, in pair or quad configurations, for larger adders, complex multiplies and general sum of products use, for example.
Conditional branching control for a multi-threaded, self-scheduling reconfigurable computing fabric
Representative apparatus, method, and system embodiments are disclosed for configurable computing. A representative system includes an interconnection network; a processor; and a plurality of configurable circuit clusters. Each configurable circuit cluster includes a plurality of configurable circuits arranged in an array; a synchronous network coupled to each configurable circuit of the array; and an asynchronous packet network coupled to each configurable circuit of the array. A representative configurable circuit includes a configurable computation circuit and a configuration memory having a first, instruction memory storing a plurality of data path configuration instructions to configure a data path of the configurable computation circuit; and a second, instruction and instruction index memory storing a plurality of spoke instructions and data path configuration instruction indices for selection of a master synchronous input, a current data path configuration instruction, and a next data path configuration instruction for a next configurable computation circuit.
Programmable re-order buffer for decompression
Examples described herein relate to a decompression engine that can request compressed data to be transferred over a memory bus. In some cases, the memory bus has a width that requires multiple data transfers to transfer the requested data. In a case that requested data is to be presented in-order to the decompression engine, a re-order buffer can be used to store entries of data. When a head-of-line entry is received, the entry can be provided to the decompression engine. When a last entry in a group of one or more entries is received, all entries in the group are presented in-order to the decompression engine. In some examples, a decompression engine can borrow memory resources allocated for use by another memory client to expand the size of the re-order buffer available for use. For example, a memory client with excess capacity and the slowest growth rate can be chosen to borrow memory resources from.
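The in-order release behavior of a re-order buffer can be illustrated with a short Python sketch. The use of integer sequence numbers and the class and method names are assumptions for illustration, and the borrowing of memory from other clients is not modeled.

```python
class ReorderBuffer:
    """Hold out-of-order entries and release them strictly in sequence."""

    def __init__(self):
        self.next_seq = 0   # sequence number of the head-of-line entry
        self.pending = {}   # seq -> data for entries that arrived early

    def receive(self, seq, data):
        """Store one entry and return the entries that can now be released.

        An arriving head-of-line entry is released immediately, together
        with any contiguous run of entries already waiting behind it.
        """
        self.pending[seq] = data
        released = []
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released
```

In this sketch, an entry that arrives ahead of the head of the line simply waits in `pending`, so the consumer never sees data out of order.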
System and Method for Implementing Strong Load Ordering in a Processor Using a Circular Ordering Ring
A system and corresponding method enforce strong load ordering in a processor. The system comprises an ordering ring that stores entries corresponding to in-flight memory instructions associated with a program order, scanning logic, and recovery logic. The scanning logic scans the ordering ring in response to execution or completion of a given load instruction of the in-flight memory instructions and detects an ordering violation in the event that at least one entry of the entries indicates that a younger load instruction has completed and is associated with an invalidated cache line. In response to the ordering violation, the recovery logic allows the given load instruction to complete, flushes the younger load instruction, and restarts execution of the processor after the given load instruction in the program order, causing data returned by the given and younger load instructions to be returned consistent with execution according to the program order to satisfy strong load ordering.
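The scan-and-recover flow in this abstract can be approximated in Python as below. This is a simplified software model under stated assumptions: the ring is a plain list without wrap-around, program order is given by an integer `seq`, and the entry fields and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RingEntry:
    seq: int                        # position in program order (assumed)
    is_load: bool
    completed: bool = False
    line_invalidated: bool = False  # cache line was invalidated after the access


def find_violation(ring, load_seq):
    """Scan for a younger, completed load whose cache line was invalidated.

    Returns the seq of the oldest such load, or None if ordering holds.
    """
    offenders = [
        e.seq for e in ring
        if e.seq > load_seq and e.is_load and e.completed and e.line_invalidated
    ]
    return min(offenders, default=None)


def recover(ring, load_seq):
    """Keep the given load and older entries; restart right after the load.

    Returns (surviving entries, seq at which execution restarts).
    """
    kept = [e for e in ring if e.seq <= load_seq]
    return kept, load_seq + 1
```

In this model the given load is allowed to complete, everything younger is discarded, and execution resumes at the next instruction in program order, so the re-executed younger load reads fresh data.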
TIGHTLY-COUPLED SLICE TARGET FILE DATA
A system may determine that two instructions may be combined based on a processing power of the processor and a size of the instructions, fuse the two instructions into a pair, map the two instructions with two register tags, write the two register tags into a mapper, write the fused instruction pair into an issue queue, issue the fused instruction pair to a vector-scalar transformation unit (VSU), and execute the two instructions.