G06F7/49

Bit string conversion invoking bit strings having a particular data pattern
11496149 · 2022-11-08

Systems, apparatuses, and methods related to bit string conversion are described. A memory resource and/or logic circuitry may be used in performance of bit string conversion operations. The logic circuitry can perform operations on bit strings, such as universal number and/or posit bit strings, to alter a level of precision (e.g., a dynamic range, resolution, etc.) of the bit strings. For instance, the memory resource can receive data comprising a bit string having a first quantity of bits that correspond to a first level of precision. The logic circuitry can determine that the bit string having the first quantity of bits has a particular data pattern and alter the first quantity of bits to a second quantity of bits that correspond to a second level of precision based, at least in part, on the determination that the bit string has the particular data pattern.
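The abstract does not say which data pattern triggers the change in bit count. As a minimal sketch, assume the pattern is "all of the low-order bits that would be dropped are zero"; for posit bit strings that share the same exponent-size parameter, dropping such trailing zero bits should leave the encoded value unchanged. The function name and the choice of pattern are illustrative assumptions, not the patented method.

```python
def narrow_if_trailing_zeros(bits: int, width: int, new_width: int) -> tuple[int, int]:
    """Shrink a bit string from `width` to `new_width` bits when the bits that
    would be discarded are all zero (the assumed "particular data pattern").

    Returns (bit_string, bit_count). If the pattern is absent, the input is
    returned unchanged at its original width.
    """
    if not 0 < new_width < width:
        raise ValueError("new_width must be positive and smaller than width")
    if not 0 <= bits < (1 << width):
        raise ValueError("bits does not fit in width")

    drop = width - new_width
    low_mask = (1 << drop) - 1          # selects the bits that would be dropped
    if bits & low_mask == 0:            # pattern detected: trailing bits are zero
        return bits >> drop, new_width  # lower precision, same leading bits
    return bits, width                  # pattern absent: keep the first precision
```

For example, `narrow_if_trailing_zeros(0x4000, 16, 8)` yields the 8-bit string `0x40`, while `0x4001` stays at 16 bits because its lowest bit is set.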

CONDITIONAL MODULAR SUBTRACTION INSTRUCTION

One embodiment provides a processor comprising first circuitry to decode an instruction into a decoded instruction, the instruction to indicate a first source operand and a second source operand, and second circuitry including a processing resource to execute the decoded instruction. Responsive to the decoded instruction, the processing resource is to output the result of the first source operand data minus the second source operand data when it determines that the first source operand data is greater than or equal to the second source operand data; otherwise, the processing resource is to output the first source operand data.
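The semantics described above can be sketched in a few lines. The sketch assumes unsigned integer operands. The `add_mod` helper is a hypothetical use case suggested by the title, not something the abstract states.

```python
def cond_mod_sub(a: int, b: int) -> int:
    """Return a - b when a >= b; otherwise return a unchanged."""
    return a - b if a >= b else a


def add_mod(x: int, y: int, q: int) -> int:
    """Hypothetical use: modular addition for x, y already in [0, q).

    x + y lies in [0, 2q), so a single conditional subtraction of q
    brings the sum back into [0, q).
    """
    return cond_mod_sub(x + y, q)
```

With these definitions, `cond_mod_sub(7, 5)` gives 2, `cond_mod_sub(3, 5)` gives 3, and `add_mod(6, 6, 7)` gives 5.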

System and method of performing discrete frequency transform for receivers using single-bit analog to digital converters
11601133 · 2023-03-07

A system and method for performing discrete frequency transform including a pair of single-bit analog to digital converters (ADCs), a phase converter, a memory, a discrete frequency transform converter and summation circuitry. The ADCs convert an analog input signal into N pairs of binary in-phase and quadrature component samples each being one of four values at a corresponding one of four phases. The phase converter determines a phase value for each pair of component samples. The memory stores a set of discrete frequency transform coefficient values based on N. The discrete frequency transform converter uses a phase value and a pair of discrete frequency transform coefficient values retrieved from the memory for a selected frequency bin to determine a discrete frequency component for each pair of phase component samples. The summation circuitry sums the corresponding N frequency domain components for determining a frequency domain value for the selected frequency bin.
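The data flow in this abstract can be sketched in software. The sketch makes several assumptions that the abstract does not state: a sample bit of 1 means +1 and 0 means -1, the four phases are numbered counterclockwise by quadrant, and the stored coefficients are (cos, sin) pairs of 2πm/N. All function names are illustrative. Because I and Q are each ±1, every per-sample product reduces to sign changes and one addition.

```python
import math


def phase_value(i_bit: int, q_bit: int) -> int:
    """Map a single-bit (I, Q) pair to one of four phase indices (assumed encoding)."""
    return {(1, 1): 0, (0, 1): 1, (0, 0): 2, (1, 0): 3}[(i_bit, q_bit)]


def coefficient_table(n: int) -> list[tuple[float, float]]:
    """Precompute (cos, sin) of 2*pi*m/n for m = 0..n-1, standing in for the memory."""
    return [(math.cos(2 * math.pi * m / n), math.sin(2 * math.pi * m / n))
            for m in range(n)]


def component(phase: int, c: float, s: float) -> tuple[float, float]:
    """Compute (I + jQ) * (c - j*s) for I, Q in {+1, -1}, using only adds and negations."""
    if phase == 0:   # I = +1, Q = +1
        return c + s, c - s
    if phase == 1:   # I = -1, Q = +1
        return -c + s, c + s
    if phase == 2:   # I = -1, Q = -1
        return -c - s, -c + s
    return c - s, -c - s  # phase 3: I = +1, Q = -1


def dft_bin(pairs: list[tuple[int, int]], k: int) -> tuple[float, float]:
    """Sum the N per-sample components to get the (real, imag) value of bin k."""
    n = len(pairs)
    table = coefficient_table(n)
    re_sum = im_sum = 0.0
    for idx, (i_bit, q_bit) in enumerate(pairs):
        c, s = table[(k * idx) % n]
        re, im = component(phase_value(i_bit, q_bit), c, s)
        re_sum += re
        im_sum += im
    return re_sum, im_sum
```

As a check, the four-sample sequence `[(1, 1), (0, 1), (0, 0), (1, 0)]` is a phasor that advances one quadrant per sample, so its energy lands in bin 1 and bin 0 sums to approximately zero.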

Computing device and method

The present disclosure provides a computation device. The computation device is configured to perform a machine learning computation, and includes an operation unit, a controller unit, and a conversion unit. The storage unit is configured to obtain input data and a computation instruction. The controller unit is configured to extract and parse the computation instruction from the storage unit to obtain one or more operation instructions, and to send the one or more operation instructions and the input data to the operation unit. The operation unit is configured to perform operations on the input data according to one or more operation instructions to obtain a computation result of the computation instruction. In the examples of the present disclosure, the input data involved in machine learning computations is represented by fixed-point data, thereby improving the processing speed and efficiency of training operations.
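To illustrate what representing input data as fixed-point data can look like, here is a minimal sketch. The 16-bit signed format with 8 fractional bits, the saturation behavior, and the function names are assumptions chosen for illustration; the abstract does not specify a format.

```python
def to_fixed(x: float, frac_bits: int = 8, total_bits: int = 16) -> int:
    """Convert a float to a signed fixed-point integer, saturating on overflow."""
    scaled = round(x * (1 << frac_bits))
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, scaled))


def from_fixed(v: int, frac_bits: int = 8) -> float:
    """Convert a fixed-point integer back to a float."""
    return v / (1 << frac_bits)


def fixed_mul(a: int, b: int, frac_bits: int = 8) -> int:
    """Multiply two fixed-point values using integer arithmetic only.

    The raw product carries 2 * frac_bits fractional bits, so it is shifted
    right by frac_bits (this rounds toward negative infinity).
    """
    return (a * b) >> frac_bits
```

For example, `to_fixed(1.5)` is 384, and `fixed_mul(to_fixed(1.5), to_fixed(2.0))` equals `to_fixed(3.0)`, so the multiplication never touches floating-point hardware.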

METHOD AND APPARATUS FOR SIMULTANEOUS PROCESSING OF MULTIPLE FUNCTIONS

Electronic logic gates that operate using N logic state levels, where N is greater than 2, and methods of operating such gates. The electronic logic gates operate according to truth tables. At least two input signals each having a logic state that can range over more than two logic states are provided to the logic gates. The logic gates each provide an output signal that can have one of N logic states. Examples of gates described include NAND/NAND gates having two inputs A and B and NAND/NAND gates having three inputs A, B, and C, where A, B and C can take any of four logic states. Systems using such gates are described, and their operation illustrated. Optical logic gates that operate using N logic state levels are also described.

High performance merge sort with scalable parallelization and full-throughput reduction

Disclosed herein is a novel multi-way merge network, referred to herein as a Hybrid Comparison Look Ahead Merge (HCLAM), which consumes significantly fewer resources when scaled to handle larger problems. In addition, a parallelization scheme is disclosed, referred to herein as Parallelization by Radix Pre-sorter (PRaP), which enables an increase in streaming throughput of the merge network. Furthermore, a high-performance reduction scheme is disclosed to achieve full throughput.
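As a software analogy only, the sketch below shows a multi-way merge followed by a reduction step. It uses Python's standard `heapq.merge`, not the HCLAM network or the PRaP scheme. It also assumes that "reduction" means combining records that share a key (here by summing their values), which the abstract does not confirm.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_and_reduce(*sorted_streams):
    """Merge several key-sorted streams of (key, value) records, then
    sum the values of records that share the same key.

    Both stages are lazy, so records flow through one at a time.
    """
    merged = heapq.merge(*sorted_streams, key=itemgetter(0))
    for key, group in groupby(merged, key=itemgetter(0)):
        yield key, sum(value for _, value in group)
```

Calling `list(merge_and_reduce([(1, 10), (3, 30)], [(1, 1), (2, 2)], [(3, 3)]))` produces `[(1, 11), (2, 2), (3, 33)]`.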