G06F7/52

HYBRID FIXED LOGIC FOR PERFORMING MULTIPLICATION
20230030495 · 2023-02-02

A fixed logic circuit configured to perform a multiplication operation a*x, where a is an integer constant, x is an integer variable in the range 0 to $2^m - 1$, and m is a positive integer. The fixed logic circuit includes division logic configured to determine a predetermined number of one or more most significant bits of the result of the division operation

$$\left\lfloor \frac{2^{i} x}{q} \right\rfloor$$

where q and i are selected such that

$$a \cdot x = \left\lfloor \frac{2^{i} x}{q} \right\rfloor .$$

Multiplication logic determines a predetermined number of one or more least significant bits of the result of the multiplication operation a*x; and output logic combines the predetermined number of one or more most significant bits of the result of the division operation with the predetermined number of one or more least significant bits of the result of the multiplication operation so as to provide an output for the multiplication operation a*x.
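
The abstract gives no concrete values, so the following is only a minimal software sketch of the identity it relies on, not the patented circuit. The function names, the search strategy, and the example values (a = 3, m = 4, k = 2 low bits) are all assumptions made for illustration. It searches for a pair (i, q) satisfying the stated condition, then rebuilds a*x from the high bits of the division result and the low bits of the product.

```python
def find_division_params(a, m, max_i=64):
    """Search for (i, q) with a*x == (2**i * x) // q for every x in [0, 2**m - 1].

    Hypothetical helper: the abstract only states that such q, i are selected.
    """
    x_max = (1 << m) - 1
    for i in range(max_i + 1):
        q, r = divmod(1 << i, a)
        # 2**i == a*q + r, so (2**i * x) // q == a*x + (r*x) // q.
        # The extra term vanishes for all x in range when r * x_max < q.
        if q > 0 and r * x_max < q:
            return i, q
    raise ValueError("no (i, q) found within max_i")


def hybrid_multiply(a, x, i, q, k):
    """Combine the MSBs of floor(2**i * x / q) with the k LSBs of a*x."""
    high = ((x << i) // q) >> k        # most significant bits, division path
    low = (a * x) & ((1 << k) - 1)     # k least significant bits, multiplication path
    return (high << k) | low           # output logic: concatenate the two parts


i, q = find_division_params(a=3, m=4)
products = [hybrid_multiply(3, x, i, q, k=2) for x in range(16)]
```

In hardware, each path would presumably only need to produce its own share of the output bits; the sketch computes both results in full purely to show that the recombined value equals a*x.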

Machine learning sparse computation mechanism for arbitrary neural networks, arithmetic compute microarchitecture, and sparsity for training mechanism

An apparatus to facilitate processing of a sparse matrix for arbitrary graph data is disclosed. The apparatus includes a graphics processing unit having a data management unit (DMU) that includes a scheduler for scheduling matrix operations, an active logic for tracking active input operands, and a skip logic for tracking unimportant input operands to be skipped by the scheduler. Processing circuitry is coupled to the DMU. The processing circuitry comprises a plurality of processing elements including logic to read operands and a multiplication unit to multiply two or more operands for the arbitrary graph data.
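
As a rough software analogy for the skip logic described above, the sketch below computes a dot product while bypassing operand pairs that cannot contribute to the result. Treating "unimportant" as "zero" is an assumption, as is the function name; the abstract does not define the criterion or the scheduling details.

```python
def sparse_dot(weights, activations):
    """Dot product that skips operand pairs in which either value is zero.

    Returns (result, number_of_skipped_pairs).
    """
    total = 0
    skipped = 0
    for w, a in zip(weights, activations):
        if w == 0 or a == 0:
            skipped += 1       # skipped pair: never sent to the multiplier
            continue
        total += w * a         # active pair: multiplied and accumulated
    return total, skipped


result, skipped = sparse_dot([1, 0, 3, 0], [4, 5, 0, 7])
```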

Processing-in-memory (PIM) device and PIM system including the PIM device
11474718 · 2022-10-18

A processing-in-memory (PIM) device includes a plurality of memory banks and a plurality of multiplication/accumulation (MAC) operators. The MAC operators perform MAC arithmetic operations using data output from the plurality of memory banks and input into the MAC operators. A page is allocated to have a first page size in the plurality of memory banks in a memory mode. The page is allocated to have a second page size, which is greater than the first page size, in the plurality of memory banks in a MAC arithmetic mode.

Digital approximate multipliers for machine learning and artificial intelligence applications
11467805 · 2022-10-11

Digital approximate multipliers (aMULT) utilizing interpolative apparatuses, circuits, and methods are described in this disclosure. The disclosed aMULT interpolative methods can be arranged or programmed to operate asynchronously and/or synchronously. For applications where less precision is acceptable, fewer interpolations yield less precise multiplication results, but the approximate multiplication can be computed faster and at lower power consumption. Conversely, for applications where higher precision is required, more interpolations can generate more precise multiplication results. As such, by utilizing the disclosed aMULT method, the resolution and precision objectives of an approximate multiplication function can be pre-programmed or adjusted in real time and/or on the fly. This enables a flexible trade-off between power consumption and multiplication speed, and it allows an approximate multiplier's die size and cost to be optimized in accordance with cost-performance objectives.

Computer architecture for performing multiplication using correlithm objects in a correlithm object processing system
11645096 · 2023-05-09

A system includes a memory and a node. The memory stores first and second log string correlithm objects. The node receives first and second real-world numerical values, and identifies a first sub-string correlithm object from the first log string correlithm object that corresponds to the first real-world numerical value. The node aligns the first and second log string correlithm objects such that the first sub-string correlithm object aligns with a sub-string correlithm object from the second log string correlithm object representing the logarithmic value of one. The node identifies a second sub-string correlithm object from the second log string correlithm object that corresponds to the second real-world numerical value, and determines which sub-string correlithm object from the first log string correlithm object aligns with the second sub-string correlithm object from the second log string correlithm object. The node outputs the determined sub-string correlithm object.
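
The alignment procedure above resembles multiplication on a slide rule, where adding positions on two logarithmic scales multiplies the marked values. The sketch below illustrates only that analogy: plain integers on a powers-of-two scale stand in for sub-string correlithm objects, and the function name and scale are assumptions rather than anything specified in the abstract.

```python
def slide_rule_multiply(a, b, scale):
    """Multiply a and b by aligning two copies of one logarithmic scale.

    scale[p] is the value marked at position p; position 0 marks the value 1.
    """
    offset = scale.index(a)     # find a on the first scale
    # Slide the second scale so that its "1" mark sits at `offset`,
    # then find b on the second scale.
    pos_b = scale.index(b)
    # The mark on the first scale that lines up with b is the product.
    return scale[offset + pos_b]


scale = [2 ** p for p in range(16)]   # hypothetical log scale: 1, 2, 4, ...
product = slide_rule_multiply(4, 8, scale)
```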

METHODS & SYSTEMS FOR IMPROVING CORRELATION
20170366219 · 2017-12-21

Systems and methods for improving correlation are disclosed. In at least one system and method, a signal is received and divided into a plurality of slices. Each of the slices is divided into a plurality of sub-slices. A plurality of chips of a PN code are generated, and sub-slice correlation results are generated in parallel. Summation of the sub-slice correlation results generates a slice correlation result, and the accumulated slice correlation results provide a correlation result.
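
The slicing scheme can be sketched in software as a correlation computed hierarchically. This is a minimal illustration under stated assumptions: the function name, the slice and sub-slice lengths, and the ±1 sample values are hypothetical, and the sub-slice results are computed sequentially here, whereas the abstract describes them as generated in parallel.

```python
def sliced_correlation(signal, pn_chips, slice_len, sub_len):
    """Correlate signal against PN chips via slices and sub-slices."""
    if len(signal) != len(pn_chips):
        raise ValueError("signal and PN code must have the same length")
    total = 0
    for s in range(0, len(signal), slice_len):
        slice_end = min(s + slice_len, len(signal))
        # Each sub-slice result is independent of the others, so hardware
        # could evaluate them in parallel.
        sub_results = [
            sum(x * c for x, c in zip(signal[t:min(t + sub_len, slice_end)],
                                      pn_chips[t:min(t + sub_len, slice_end)]))
            for t in range(s, slice_end, sub_len)
        ]
        slice_result = sum(sub_results)   # summation of sub-slice results
        total += slice_result             # accumulation across slices
    return total


pn = [1, -1, 1, 1, -1, -1, 1, -1]
peak = sliced_correlation(pn, pn, slice_len=4, sub_len=2)
```

Because addition is associative, the hierarchical result equals the direct dot product of the signal with the PN chips; the partitioning only changes how the work is organised.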