G06F9/30105

Controlling the number of powered vector lanes via a register field

The vector data path is divided into smaller vector lanes. A register, such as a memory-mapped control register, stores a vector lane number (VLX) indicating the number of vector lanes to be powered. A decoder converts this VLX into a vector lane control word, each bit controlling the ON or OFF state of the corresponding vector lane. The indicated number of contiguous least significant vector lanes is powered. In the preferred embodiment the stored data VLX indicates that 2^VLX contiguous least significant vector lanes are to be powered. Thus the number of vector lanes powered is limited to an integral power of 2. This manner of coding produces a very compact controlling bit field while obtaining substantially all the power saving advantage of individually controlling the power of all vector lanes.
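
The decoding step described in this abstract can be illustrated with a minimal Python sketch. The function name `lane_control_word`, the default of 16 lanes, and the convention that bit i set means lane i is powered are assumptions made for illustration; the abstract only states that VLX selects 2 to the power VLX contiguous least significant lanes.

```python
def lane_control_word(vlx: int, total_lanes: int = 16) -> int:
    """Decode a VLX field into a lane control word (hypothetical helper).

    Bit i of the result is 1 when lane i is powered (assumed convention).
    The 2**vlx least significant lanes are switched on.
    """
    if vlx < 0:
        raise ValueError("VLX must be non-negative")
    powered = 1 << vlx  # number of lanes to power: 2**VLX
    if powered > total_lanes:
        raise ValueError("VLX selects more lanes than exist")
    return (1 << powered) - 1  # 'powered' contiguous low-order 1 bits


if __name__ == "__main__":
    for vlx in range(5):
        print(vlx, format(lane_control_word(vlx), "016b"))
```

With 16 lanes, a 3-bit VLX field is enough to select 1, 2, 4, 8, or 16 powered lanes, which is the compactness the abstract refers to.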

Method and apparatus for data-ready memory operations

Disclosed embodiments relate to a new instruction for performing data-ready memory access operations. In one example, a system includes circuits to fetch, decode, and execute an instruction that includes an opcode, at least one memory location identifier identifying at least one data element, a register identifier, a data readiness indicator identifying at least one data access condition, and a data readiness mask, wherein the execution circuit is to, for each data element of the at least one data element, determine whether a memory request for the data element satisfies the at least one data access condition identified by the data readiness indicator, and in response to determining that the memory request does not satisfy the data access condition: generate a prefetch request for the data element, and set a value in a corresponding data element position of the data readiness mask to indicate that the memory request for the data element does not satisfy the at least one data access condition.
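
The per-element behavior can be sketched in Python as follows. This is an illustration only: it assumes the data access condition is "the element is already present in a cache", models the cache as a dictionary, and uses a mask value of 1 to mean "not ready". The name `data_ready_load` and all of these conventions are hypothetical and are not taken from the abstract.

```python
def data_ready_load(addresses, cache):
    """Model a data-ready load over several elements (hypothetical sketch).

    addresses: memory locations of the requested data elements.
    cache: dict mapping address -> value; presence in it is the assumed
           data access condition.
    Returns (values, not_ready_mask, prefetch_requests).
    """
    values, not_ready_mask, prefetch_requests = [], [], []
    for addr in addresses:
        if addr in cache:
            # Condition satisfied: deliver the element, mask bit stays 0.
            values.append(cache[addr])
            not_ready_mask.append(0)
        else:
            # Condition not satisfied: request a prefetch and flag the element.
            values.append(None)
            not_ready_mask.append(1)
            prefetch_requests.append(addr)
    return values, not_ready_mask, prefetch_requests
```

Software could then retry only the elements whose mask position is set, once the prefetches have had time to complete.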

Information processing apparatus, recording medium for information processing program, and information processing method
11354068 · 2022-06-07

An information processing apparatus includes a computation processing device that includes a memory and a processor coupled to the memory, and a storage device that stores a program. The processor is configured to: store, in the memory, a first storage area for first data that is assigned to a computation target by a data definition for the computation target written in the program, and a second storage area for second data that is assigned to the computation target instead of the first data; simplify the program; when the data definition for the computation target is omitted by executing the simplified program, output the second data; and perform the computation by using the output second data.

Apparatus and method for complex by complex conjugate multiplication

An apparatus and method for multiplying packed real and imaginary components of complex numbers are described. A processor embodiment includes: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction. The execution circuitry includes: multiplier circuitry to select real and imaginary data elements in the first source register and second source register, multiply each selected imaginary data element in the first source register with a selected real data element in the second source register, and multiply each selected real data element in the first source register with a selected imaginary data element in the second source register to generate a plurality of imaginary products; adder circuitry to add a first subset of the plurality of imaginary products and subtract a second subset of the plurality of imaginary products to generate a first temporary result, and to add a third subset of the plurality of imaginary products and subtract a fourth subset of the plurality of imaginary products to generate a second temporary result; and accumulation circuitry to combine the first temporary result with first data from a destination register to generate a first final result, combine the second temporary result with second data from the destination register to generate a second final result, and store the first final result and second final result back in the destination register.
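
The arithmetic behind the imaginary products can be sketched in Python. For a = ar + ai·i and b = br + bi·i, the imaginary part of a times the conjugate of b is ai·br − ar·bi. The sketch below is a simplification: it assumes an interleaved [re, im, re, im, ...] packing, one complex pair per accumulated result, and plain addition as the accumulation step. The function name `conj_mul_imag_accumulate` is hypothetical, and the abstract's grouping of products into four subsets is not modeled.

```python
def conj_mul_imag_accumulate(src1, src2, dest):
    """Accumulate Im(a * conj(b)) for each packed complex pair (sketch).

    src1, src2: lists packed as [re0, im0, re1, im1, ...] (assumed layout).
    dest: one accumulator value per complex pair.
    Returns the updated accumulator values as a new list.
    """
    if len(src1) != len(src2) or len(src1) != 2 * len(dest):
        raise ValueError("operand sizes do not match")
    out = list(dest)
    for k in range(len(dest)):
        ar, ai = src1[2 * k], src1[2 * k + 1]
        br, bi = src2[2 * k], src2[2 * k + 1]
        # imaginary-by-real product added, real-by-imaginary product subtracted
        out[k] += ai * br - ar * bi
    return out
```

For example, (1 + 2i) times the conjugate of (3 + 4i) equals 11 + 2i, so accumulating onto a destination value of 10 yields 12.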

Apparatus and methods for vector operations

Aspects for vector operations in a neural network are described herein. The aspects may include a vector caching unit configured to store a first vector and a second vector, wherein the first vector includes one or more first elements and the second vector includes one or more second elements. The aspects may further include one or more adders and a combiner. The one or more adders may be configured to respectively add each of the first elements to a corresponding one of the second elements to generate one or more addition results. The combiner may be configured to combine the one or more addition results into an output vector.
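
A software analogue of the adders and combiner might look like the short Python sketch below. The name `vector_add` and the use of Python lists are assumptions for illustration; the hardware described performs the additions in parallel, whereas this sketch runs them sequentially.

```python
def vector_add(first, second):
    """Element-wise addition of two equal-length vectors (illustrative only)."""
    if len(first) != len(second):
        raise ValueError("vectors must have the same length")
    # "Adders": one addition per pair of corresponding elements.
    addition_results = [a + b for a, b in zip(first, second)]
    # "Combiner": gather the individual results into a single output vector.
    return list(addition_results)
```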

Apparatuses, methods, and systems for hardware-assisted lockstep of processor cores
11340960 · 2022-05-24

Systems, methods, and apparatuses relating to circuitry to implement lockstep of processor cores are described. In one embodiment, a hardware processor comprises a first processor core comprising a first control flow signature register and a first execution circuit, a second processor core comprising a second control flow signature register and a second execution circuit, and at least one signature circuit to perform a first state history compression operation on a first instruction that executes on the first execution circuit of the first processor core to produce a first result, store the first result in the first control flow signature register, perform a second state history compression operation on a second instruction that executes on the second execution circuit of the second processor core to produce a second result, and store the second result in the second control flow signature register.
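
The idea of compressing execution history into a per-core signature can be sketched in Python. This sketch assumes CRC-32 (via the standard `zlib.crc32`) as a stand-in for the state history compression operation, and it adds a final comparison of the two signatures as an assumed use; neither detail is specified in the abstract. The name `run_core` and the byte-string encoding of instructions are hypothetical.

```python
import zlib


def run_core(instruction_stream, signature=0):
    """Fold each executed instruction into a running signature (sketch).

    instruction_stream: iterable of bytes objects, one per instruction.
    signature: initial value of the control flow signature register.
    """
    for instruction in instruction_stream:
        # CRC-32 stands in for the compression operation (assumption).
        signature = zlib.crc32(instruction, signature)
    return signature


if __name__ == "__main__":
    stream = [b"\x01\x02", b"\x03\x04"]
    sig_core0 = run_core(stream)
    sig_core1 = run_core(stream)
    print("cores agree:", sig_core0 == sig_core1)
```

Comparing two compact signatures, rather than every intermediate state, is one way a divergence between cores could be detected cheaply.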

Logic circuitry

A logic circuitry package for a replaceable print apparatus component comprises an interface to communicate with a print apparatus logic circuit, and at least one logic circuit. The logic circuit may be configured to identify, from a command stream received from the print apparatus, parameters including a class parameter, and/or identify, from the command stream, a read request, and output, via the interface, a count value in response to a read request, the count value based on identified received parameters.

Memory apparatus and data processing system including the same
11334357 · 2022-05-17

A memory apparatus may include at least one memory, and a memory controller configured to receive an address signal and a command through shared pins and store data, provided from an external source, within the memory controller when a write command is inputted without the address signal.

Systems and methods for maintaining pooled time-dependent resources in a multilateral distributed register

The present disclosure is directed to a novel system for using a distributed register to generate, manage, and store data for interest-pooled time deposit resource accounts. The invention leverages a pooled resource account approach, allowing for multiple disparate resource accounts to benefit from an enhanced interest return by pooling resource accounts. The system components of the invention contemplate the use of distributed register technology to provide a verified ledger of information related to one or more resource accounts, as well as store system data, user data, and metadata related to the movement and management of resources. By using a distributed register approach to store and verify data related to time-dependent resource account services, the invention provides an automated system and methods for enhancing the flow of sensitive verified information, reducing the need for manual review and increasing the speed at which various resource account services can be validated and executed.

Method and apparatus for balancing binary instruction burstization and chaining

A method for grouping computer instructions includes receiving a set of computer instructions, grouping the set of computer instructions by register dependencies, identifying a plurality of single-definition-use flow (SDF) bundles based on a burstization criterion and a chaining criterion, and, based on the SDF bundles, transforming the set of computer instructions. The transformation may include splitting one of the set of computer instructions and setting a burst parameter for the one of the set of computer instructions. The transformation may include grouping a plurality of the set of computer instructions and replacing a pair of register file accesses with a pair of temporary register accesses.