G06F9/30101

Apparatus and method for injecting spin echo micro-operations in a quantum processor
11704588 · 2023-07-18

Apparatus and method for injecting spin echo sequences in a quantum processor. For example, one embodiment of a processor includes a decoder to decode quantum instructions to generate quantum micro-operations (uops) and to decode non-quantum instructions to generate non-quantum uops, execution circuitry to execute the quantum uops and non-quantum uops, and a corrective sequence data structure to identify and/or store corrective sets of uops for one or more of the quantum instructions. The decoder is to query the corrective sequence data structure upon receiving a first quantum instruction to determine whether one or more corrective uops exist, and if the one or more corrective uops exist, the decoder is to submit the one or more corrective uops for execution by the execution circuitry.
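
As an illustration of the lookup the decoder performs, the sketch below models the corrective sequence data structure as a simple table keyed by quantum opcode; the opcode names, uop strings, and echo sequence are hypothetical stand-ins, not the patented implementation.

# Minimal sketch of the corrective-sequence lookup described above.
# The table contents and uop names are illustrative assumptions.
CORRECTIVE_SEQUENCES = {
    "QWAIT": ["X_PULSE", "DELAY", "X_PULSE"],   # a simple echo around an idle period
}

def decode(instruction):
    """Decode one instruction into uops, injecting corrective uops when
    the corrective sequence data structure has an entry for its opcode."""
    opcode = instruction["opcode"]
    uops = [f"UOP_{opcode}"]
    corrective = CORRECTIVE_SEQUENCES.get(opcode)
    if corrective:                      # query hit: submit corrective uops too
        uops.extend(corrective)
    return uops

print(decode({"opcode": "QWAIT"}))      # ['UOP_QWAIT', 'X_PULSE', 'DELAY', 'X_PULSE']
print(decode({"opcode": "ADD"}))        # ['UOP_ADD'] -- no injection for this opcode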

Performing speculative address translation in processor-based devices

Performing speculative address translation in processor-based devices is disclosed herein. In one exemplary embodiment, a processor-based device provides a processing element (PE) that defines a speculative translation instruction such as an enqueue instruction for offloading operations to a peripheral device. The speculative translation instruction references a plurality of bytes including one or more virtual memory addresses. After receiving the speculative translation instruction, an instruction decode stage of an execution pipeline circuit of the PE transmits a request for address translation of the virtual memory address to a memory management unit (MMU) of the PE. The MMU then performs speculative address translation of the virtual memory address into a corresponding translated memory address. In some embodiments, any address translation errors encountered are raised to an appropriate exception level, and may be raised synchronously or asynchronously with respect to an operation performed when the speculative translation instruction is executed.
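
The sketch below walks through this flow at a high level, assuming a toy page table and a hypothetical instruction payload of virtual addresses; the real MMU interface, exception levels, and enqueue semantics are not modeled.

# Illustrative sketch of speculative translation at decode time.
PAGE_SIZE = 4096
PAGE_TABLE = {0x1000 // PAGE_SIZE: 0x8000}   # one mapped virtual page (assumed)

def mmu_translate(vaddr):
    """Translate a virtual address, returning (paddr, error)."""
    base = PAGE_TABLE.get(vaddr // PAGE_SIZE)
    if base is None:
        return None, f"translation fault at {hex(vaddr)}"
    return base + (vaddr % PAGE_SIZE), None

def decode_enqueue(payload_vaddrs):
    """At decode, speculatively request translations for every virtual
    address the instruction references; faults are recorded so they can
    be raised later (asynchronously) instead of stalling decode."""
    translated, deferred_errors = [], []
    for vaddr in payload_vaddrs:
        paddr, err = mmu_translate(vaddr)
        if err:
            deferred_errors.append(err)
        else:
            translated.append(paddr)
    return translated, deferred_errors

print(decode_enqueue([0x1004, 0x9000]))   # one translation, one deferred fault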

Auto-calibrating crossbar-based apparatuses
11705196 · 2023-07-18

Aspects of the present disclosure provide a method for calibrating crossbar-based apparatuses. The method includes obtaining output data of a crossbar-based apparatus that may include a plurality of cross-point devices with tunable conductance, where the output data of the crossbar-based apparatus represents computing results of at least one operation performed by the crossbar-based apparatus, and where the output data corresponds to a plurality of settings of a plurality of analog components of the crossbar-based apparatus. The method also includes obtaining, by a processing device, one or more calibration parameters based on the output data of the crossbar-based apparatus, where the one or more calibration parameters correspond to one or more errors associated with one or more of the analog components of the crossbar-based apparatus. The method further includes calibrating the crossbar-based apparatus using the one or more calibration parameters.
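
One plausible reading of the calibration step is sketched below, assuming a simple linear (gain/offset) error model: fit calibration parameters from observed outputs against expected results, then invert the model on later outputs. The error model and fitting method are illustrative, not taken from the disclosure.

# Fit per-output gain/offset calibration parameters, then correct raw outputs.
def fit_gain_offset(observed, expected):
    """Least-squares fit of observed = gain * expected + offset."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(expected) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, expected))
    var = sum((e - mean_e) ** 2 for e in expected)
    gain = cov / var
    offset = mean_o - gain * mean_e
    return gain, offset

def calibrate(raw, gain, offset):
    """Invert the fitted error model to recover the intended result."""
    return (raw - offset) / gain

gain, offset = fit_gain_offset(observed=[1.1, 2.05, 3.2], expected=[1.0, 2.0, 3.0])
print(round(calibrate(2.05, gain, offset), 3))   # corrected value close to 2.0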

Processor Power Management Using Instruction Throttling
20230019271 · 2023-01-19

Systems and methods are disclosed for processor power management using instruction throttling. For example, an integrated circuit may include a processor core including a processor pipeline configured to execute instructions; a register configured to store a power dial value that indicates a portion of available clock cycles for throttling of instruction flow through the processor pipeline; and an instruction throttling circuit configured to periodically stall removal of instructions from a queue in the processor pipeline for a number of clock cycles that is determined based on the power dial value.
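
A rough software model of the throttling behavior is sketched below; the window length, the encoding of the power dial value, and the stall pattern are assumptions chosen only to make the idea concrete.

# Periodically stall instruction dequeue for a number of cycles set by the power dial.
WINDOW = 16                       # hypothetical throttling window, in clock cycles

def run(queue, power_dial, cycles):
    """Simulate dequeue with periodic stalls; power_dial in [0, WINDOW]
    gives the number of stalled cycles per window."""
    retired = []
    for cycle in range(cycles):
        stalled = (cycle % WINDOW) < power_dial   # first power_dial cycles of each window stall
        if not stalled and queue:
            retired.append(queue.pop(0))
        # else: instruction flow through the pipeline is throttled this cycle
    return retired

work = list(range(32))
print(len(run(work.copy(), power_dial=0, cycles=32)))   # 32: no throttling
print(len(run(work.copy(), power_dial=8, cycles=32)))   # 16: half the cycles stalled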

Computing device and method

The present disclosure provides a computation device. The computation device is configured to perform a machine learning computation, and includes a storage unit, a controller unit, an operation unit, and a conversion unit. The storage unit is configured to obtain input data and a computation instruction. The controller unit is configured to extract and parse the computation instruction from the storage unit to obtain one or more operation instructions, and to send the one or more operation instructions and the input data to the operation unit. The operation unit is configured to perform operations on the input data according to the one or more operation instructions to obtain a computation result of the computation instruction. In the examples of the present disclosure, the input data involved in machine learning computations is represented by fixed-point data, thereby improving the processing speed and efficiency of training operations.
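
To make the fixed-point representation concrete, the sketch below quantizes floating-point inputs to integers with a shared fractional scale and performs a dot product in integer arithmetic; the 8-bit fractional width and the dot-product workload are illustrative assumptions, not the device's actual format.

# Fixed-point quantization and an integer dot product with a final rescale.
FRAC_BITS = 8
SCALE = 1 << FRAC_BITS

def to_fixed(x):
    """Quantize a float to a fixed-point integer with FRAC_BITS fractional bits."""
    return round(x * SCALE)

def fixed_dot(a, b):
    """Dot product of two fixed-point vectors; the raw accumulator carries
    2 * FRAC_BITS fractional bits, so it is rescaled once at the end."""
    acc = sum(x * y for x, y in zip(a, b))
    return acc / (SCALE * SCALE)

a = [to_fixed(v) for v in [0.5, -1.25, 2.0]]
b = [to_fixed(v) for v in [1.0, 0.5, 0.25]]
print(fixed_dot(a, b))        # 0.5*1.0 - 1.25*0.5 + 2.0*0.25 = 0.375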

Apparatus and method for writing data in a memory

A device for writing data to a memory, the device including: a first write buffer having a first data width that matches a width of write data included in a write request, the first write buffer being configured to store the write data as first data; a second write buffer having a second data width that matches a data width of the memory and is greater than the first data width; and a controller configured to, based on a write address included in the write request and an address of second data stored in the second write buffer, write the first data stored in the first write buffer to the second write buffer and write the second data stored in the second write buffer to the memory.
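
The sketch below models the two-buffer write path in software: narrow write data is merged into a memory-width line buffer, and the line is written back when an incoming address falls outside it. The widths and the flush-on-line-change policy are assumptions, not the claimed controller logic.

# Coalesce narrow writes into a wide line buffer before writing to memory.
FIRST_WIDTH = 4          # bytes per write request (first buffer width, assumed)
SECOND_WIDTH = 16        # memory data width (second buffer width, assumed)

memory = {}              # line address -> bytes, standing in for the memory
line_addr = None
line = bytearray(SECOND_WIDTH)

def write(addr, data):
    """Merge a narrow write into the wide buffer; flush on a line change."""
    global line_addr, line
    new_line = addr // SECOND_WIDTH
    if line_addr is not None and new_line != line_addr:
        memory[line_addr] = bytes(line)          # write second buffer to memory
        line = bytearray(SECOND_WIDTH)
    line_addr = new_line
    off = addr % SECOND_WIDTH
    line[off:off + FIRST_WIDTH] = data           # first buffer -> second buffer

write(0x00, b"\x01\x02\x03\x04")
write(0x04, b"\x05\x06\x07\x08")
write(0x10, b"\x09\x0a\x0b\x0c")                 # new line: previous line is flushed
print(memory[0])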

MEMORY WITH ARTIFICIAL INTELLIGENCE MODE
20230215490 · 2023-07-06

The present disclosure includes apparatuses and methods related to an artificial intelligence accelerator in memory. An example apparatus can include a number of registers configured to enable the apparatus to operate in an artificial intelligence mode to perform artificial intelligence operations, and an artificial intelligence (AI) accelerator configured to perform the artificial intelligence operations using data stored in a number of memory arrays. The AI accelerator can include hardware, software, and/or firmware configured to perform operations associated with AI operations. The hardware can include circuitry configured as an adder and/or multiplier to perform operations, such as logic operations, associated with AI operations.
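
A minimal software model of the register-gated behavior appears below, assuming a hypothetical AI_MODE register and a multiply-accumulate operation over data held in the memory arrays; the register name and array layout are illustrative only.

# Mode register gating an adder/multiplier datapath over in-memory data.
registers = {"AI_MODE": 0}                       # hypothetical mode register
memory_arrays = {"weights": [1, 2, 3], "inputs": [4, 5, 6]}

def ai_multiply_accumulate():
    """Multiply-accumulate performed only when the AI mode is enabled."""
    if not registers["AI_MODE"]:
        raise RuntimeError("device is in normal memory mode")
    acc = 0
    for w, x in zip(memory_arrays["weights"], memory_arrays["inputs"]):
        acc += w * x                             # multiplier feeding an adder
    return acc

registers["AI_MODE"] = 1                         # enter AI mode via the register
print(ai_multiply_accumulate())                  # 1*4 + 2*5 + 3*6 = 32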

Multimedia Compressed Frame Aware Cache Replacement Policy

Various embodiments include methods and devices for implementing a criterion-aware cache replacement policy by a computing device. Embodiments may include updating a staling counter, writing a value of a local counter to a system cache in association with a location in the system cache for associated data, in which the value of the local counter is the value of the staling counter at the time the associated data is written to the system cache, and using the value of the local counter of the associated data to determine whether the associated data is stale.
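
The sketch below models the counters in software under a few assumptions: the staling counter advances once per event of interest (e.g., per compressed frame), each cache entry records the counter value at write time, and an entry is treated as stale when its recorded value lags the current counter by more than a threshold. The threshold and the update trigger are illustrative.

# Global staling counter plus per-entry local counters for staleness checks.
STALE_THRESHOLD = 2
staling_counter = 0
cache = {}                                   # location -> (data, local_counter)

def write_entry(location, data):
    """Store data with a snapshot of the staling counter as its local counter."""
    cache[location] = (data, staling_counter)

def is_stale(location):
    _, local = cache[location]
    return staling_counter - local > STALE_THRESHOLD

write_entry("lineA", b"frame0")
for _ in range(3):
    staling_counter += 1                     # e.g., advanced once per new frame
write_entry("lineB", b"frame3")

print(is_stale("lineA"), is_stale("lineB"))  # True False -> lineA is a replacement candidate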

PROGRAMMABLE SIGNAL AGGREGATOR
20230214292 · 2023-07-06

In an embodiment, an electronic circuit includes: a plurality of signal channels; a signal collection circuit configured to determine an action of the electronic circuit based on channel signals from the plurality of signal channels; and a first signal management circuit coupled between the plurality of signal channels and the signal collection circuit, the first signal management circuit including: a set of internal registers, a set of user registers, and a decoder configured to program the set of internal registers based on a content of the set of user registers, where the first signal management circuit is configured to receive the channel signals via the plurality of signal channels, generate first aggregated signals based on the received channel signals and a content of the set of internal registers, and transmit the first aggregated signals to the signal collection circuit.
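
The sketch below gives a toy version of this path, assuming the user register packs a channel-enable mask and a combine mode that the decoder expands into internal registers; the field layout and the AND/OR aggregation modes are illustrative assumptions.

# Decode user-register content into internal registers, then aggregate channels.
user_registers = {"config": 0b1_0101}        # bit 4: mode, bits 3..0: enable mask (assumed layout)

def decode_user_registers():
    """Decoder: expand packed user-register content into internal registers."""
    cfg = user_registers["config"]
    return {"enable_mask": cfg & 0xF, "mode_and": bool(cfg >> 4)}

def aggregate(channel_signals, internal):
    """Reduce the enabled channel signals per the programmed mode."""
    enabled = [s for i, s in enumerate(channel_signals)
               if internal["enable_mask"] & (1 << i)]
    return all(enabled) if internal["mode_and"] else any(enabled)

internal_registers = decode_user_registers()
print(aggregate([1, 0, 1, 0], internal_registers))   # AND of channels 0 and 2 -> True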

Deep neural networks (DNN) hardware accelerator and operation method thereof

A DNN hardware accelerator and an operation method of the DNN hardware accelerator are provided. The DNN hardware accelerator includes: a network distributor for receiving input data and distributing respective bandwidths among a plurality of data types of a target data amount based on a plurality of bandwidth ratios of the target data amount; and a processing element array coupled to the network distributor, for communicating data of the data types of the target data amount with the network distributor based on the distributed bandwidths of the data types.
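
The bandwidth-distribution step can be pictured as a proportional split, as in the sketch below; the data-type names and ratio values are illustrative assumptions rather than the disclosed configuration.

# Split a total bandwidth among data types in proportion to configured ratios.
def distribute_bandwidth(total_bw, ratios):
    """Return each data type's share of total_bw according to its ratio."""
    denom = sum(ratios.values())
    return {dtype: total_bw * r / denom for dtype, r in ratios.items()}

ratios = {"input_activations": 2, "weights": 1, "partial_sums": 1}
print(distribute_bandwidth(total_bw=64, ratios=ratios))
# {'input_activations': 32.0, 'weights': 16.0, 'partial_sums': 16.0}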