G06F7/4981

CONNECTIVITY IN COARSE GRAINED RECONFIGURABLE ARCHITECTURE
20230053439 · 2023-02-23 ·

A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. The tiles can be arranged in an array or grid and can be communicatively coupled. In an example, the tiles can be arranged in a one-dimensional array and each tile can be coupled to its respective adjacent neighbor tiles using a direct bus coupling. Each tile can be further coupled to at least one non-adjacent neighbor tile that is one tile, or device space, away using a passthrough bus. The passthrough bus can extend through intervening tiles.
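
The connectivity described in this abstract can be sketched as a small adjacency table. This is only an illustration, not the patented design: the function name `tile_links` and the dictionary layout are hypothetical, and the sketch assumes that "one tile away" means the two tiles are separated by exactly one intervening tile (index distance 2).

```python
def tile_links(num_tiles):
    """Map each tile index in a 1-D array to its direct and passthrough peers.

    Assumption: a passthrough bus crosses exactly one intervening tile,
    so it connects tile t to tiles t-2 and t+2 when those exist.
    """
    links = {}
    for t in range(num_tiles):
        # Direct bus: immediately adjacent neighbors.
        direct = [n for n in (t - 1, t + 1) if 0 <= n < num_tiles]
        # Passthrough bus: skips over the adjacent tile.
        passthrough = [n for n in (t - 2, t + 2) if 0 <= n < num_tiles]
        links[t] = {"direct": direct, "passthrough": passthrough}
    return links


if __name__ == "__main__":
    for tile, peers in tile_links(5).items():
        print(tile, peers)
```

Under this assumption, an interior tile has up to four links (two direct, two passthrough), while tiles at the ends of the array have fewer.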

iNecklace & iAlphabet & iUniverse
20170220326 · 2017-08-03 ·

A method for iNecklace-iAlphabet-iUniverse comprises receiving a request for intuitive structures from the environment; constructing the iAlphabet; computing the identity; performing algebraic, categorical, and homotopy type constructions; performing constructions with measures; recalling relevant instances with reconstructions; identifying the analytic device; and enabling composability to construct iUniverse.

Compute in/near memory (CIM) circuit architecture for unified matrix-matrix and matrix-vector computations

A memory circuit includes a number (X) of multiply-accumulate (MAC) circuits that are dynamically configurable. The MAC circuits can either compute an output based on computations of X elements of an input vector with a weight vector, or compute the output based on computations of a single element of the input vector with the weight vector, with each element having a one-bit or multibit length. A first memory can hold the input vector having a width of X elements and a second memory can store the weight vector. The MAC circuits include a MAC array on chip with the first memory.
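
The two configurations can be illustrated with a software model of one accumulation step. This is a rough sketch under assumptions the abstract does not spell out: the function `mac_step`, the mode names, and the per-MAC accumulator list are all hypothetical. In the first mode each of the X MAC circuits consumes its own input element; in the second mode one input element is shared by all X circuits.

```python
def mac_step(acc, inputs, weights, mode, index=0):
    """Return updated accumulators after one step of X modeled MAC circuits.

    mode == "all_elements":   MAC j computes acc[j] + inputs[j] * weights[j]
    mode == "single_element": MAC j computes acc[j] + inputs[index] * weights[j]
    (Both mode names and this interface are assumptions for illustration.)
    """
    if len(acc) != len(weights):
        raise ValueError("need one accumulator per weight")
    if mode == "all_elements":
        if len(inputs) != len(weights):
            raise ValueError("need X input elements for X MAC circuits")
        return [a + x * w for a, x, w in zip(acc, inputs, weights)]
    if mode == "single_element":
        x = inputs[index]  # one input element shared by every MAC circuit
        return [a + x * w for a, w in zip(acc, weights)]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    zero = [0, 0, 0, 0]
    print(mac_step(zero, [1, 2, 3, 4], [5, 6, 7, 8], "all_elements"))
    print(mac_step(zero, [1, 2, 3, 4], [5, 6, 7, 8], "single_element", index=1))
```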

COMPUTING APPARATUS, METHOD, BOARD CARD AND COMPUTER-READABLE STORAGE MEDIUM
20220253280 · 2022-08-11 ·

The present disclosure provides a computing device for processing a multi-bit width value, an integrated circuit board card, a method, and a computer-readable storage medium. The computing device is included in a combined processing apparatus, which further includes a general interconnection interface and other processing devices. The computing device interacts with the other processing devices to jointly complete a computing operation specified by a user. The combined processing apparatus further includes a storage device that is connected to the computing device and the other processing devices and is configured to store data of the computing device and the other processing devices. The solution of the present disclosure can split the multi-bit width value so that the processing capability of the processor is not influenced by the bit width.
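
The abstract does not say how the value is split. One common way to make arithmetic independent of operand width is to cut each operand into fixed-width chunks and combine the partial products, which the sketch below shows for non-negative integers. The helper names `split_value` and `multiply_by_parts`, the 8-bit chunk size, and the choice of multiplication as the operation are assumptions made for illustration only.

```python
def split_value(value, chunk_bits, num_chunks):
    """Split a non-negative integer into little-endian chunks of chunk_bits bits."""
    mask = (1 << chunk_bits) - 1
    return [(value >> (i * chunk_bits)) & mask for i in range(num_chunks)]


def multiply_by_parts(a, b, chunk_bits=8, num_chunks=4):
    """Multiply two non-negative integers using only chunk-sized multiplications.

    Each operand must fit in chunk_bits * num_chunks bits.
    """
    a_parts = split_value(a, chunk_bits, num_chunks)
    b_parts = split_value(b, chunk_bits, num_chunks)
    total = 0
    for i, pa in enumerate(a_parts):
        for j, pb in enumerate(b_parts):
            # Chunk i of a and chunk j of b carry a weight of 2**((i + j) * chunk_bits).
            total += (pa * pb) << ((i + j) * chunk_bits)
    return total


if __name__ == "__main__":
    print(split_value(0x12345678, 8, 4))
    print(multiply_by_parts(123456, 654321) == 123456 * 654321)
```

In this sketch the multiplier only ever sees 8-bit operands, so wider values cost more partial products rather than wider hardware.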

Connectivity in coarse grained reconfigurable architecture
11841823 · 2023-12-12 ·

A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. The tiles can be arranged in an array or grid and can be communicatively coupled. In an example, the tiles can be arranged in a one-dimensional array and each tile can be coupled to its respective adjacent neighbor tiles using a direct bus coupling. Each tile can be further coupled to at least one non-adjacent neighbor tile that is one tile, or device space, away using a passthrough bus. The passthrough bus can extend through intervening tiles.

Multi-addend adder circuit for stochastic computing

A multi-addend adder circuit is used for multi-addend addition in a polar representation in stochastic computing. The multi-addend adder circuit includes a buffer circuit and a computing circuit. The buffer circuit is configured to store to-be-buffered data for at least one cycle and output buffer data. The computing circuit is configured to process a plurality of pieces of first bitstream data and the buffer data, and to output one piece of bitstream data and the to-be-buffered data. The piece of output bitstream data is a quotient of dividing a sum of summation data and the buffer data by a scale-down coefficient; the output to-be-buffered data is a remainder of dividing a sum of all summation data until a current cycle by the scale-down coefficient; and the summation data is a quantity of bits whose values are 1 in the plurality of pieces of first bitstream data.
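
The quotient-and-remainder behavior in this abstract can be modeled cycle by cycle. The sketch below is a software illustration, not the circuit itself: the function name `multi_addend_add` and the list-based streams are hypothetical. It relies on the fact that carrying the remainder forward each cycle gives the same value as the remainder of the running total of all summation data.

```python
def multi_addend_add(streams, scale):
    """Model a scaled multi-addend adder over equal-length bitstreams.

    streams: list of bit lists (0/1), one per addend.
    scale:   the scale-down coefficient. Assumption: when scale equals the
             number of streams, every per-cycle output is a single bit.
    Returns (output_stream, final_buffer).
    """
    buffer = 0
    output = []
    for bits in zip(*streams):
        count = sum(bits)             # summation data: number of 1 bits this cycle
        total = count + buffer
        output.append(total // scale)  # quotient becomes the output for this cycle
        buffer = total % scale         # remainder is buffered for the next cycle
    return output, buffer


if __name__ == "__main__":
    streams = [[1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 0, 1]]
    print(multi_addend_add(streams, scale=3))
```

Across all cycles, `scale * sum(output) + final_buffer` equals the total number of 1 bits in the inputs, so no counts are lost to rounding within the stream.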

MULTI-ADDEND ADDER CIRCUIT FOR STOCHASTIC COMPUTING
20200394018 · 2020-12-17 ·

A multi-addend adder circuit is used for multi-addend addition in a polar representation in stochastic computing. The multi-addend adder circuit includes a buffer circuit and a computing circuit. The buffer circuit is configured to store to-be-buffered data for at least one cycle and output buffer data. The computing circuit is configured to process a plurality of pieces of first bitstream data and the buffer data, and to output one piece of bitstream data and the to-be-buffered data. The piece of output bitstream data is a quotient of dividing a sum of summation data and the buffer data by a scale-down coefficient; the output to-be-buffered data is a remainder of dividing a sum of all summation data until a current cycle by the scale-down coefficient; and the summation data is a quantity of bits whose values are 1 in the plurality of pieces of first bitstream data.

MULTIPLY AND ACCUMULATE (MAC) UNIT AND A METHOD OF ADDING NUMBERS
20200034117 · 2020-01-30 ·

A method and a MAC unit are provided; the MAC unit may include an accumulation unit and a multiplier. The accumulation unit includes a first part, a second part, and a third part. The first part may calculate a truncated sum. The second part may be configured to (a) receive, during each calculation cycle, a carry out of an add operation performed during the calculation cycle; (b) receive a sign bit of an intermediate product calculated during the calculation cycle; (c) calculate, by counter logic, a counter logic value; and (d) convert, after a start of a last calculation cycle of the calculation cycles, an output value of the counter logic to an intermediate value having a two's complement format. The third part may be configured to calculate an output value of the MAC unit based on the intermediate value and the truncated sum calculated by the first part of the accumulation unit.
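
One way to read the three parts is as a narrow adder plus a counter that tracks the upper bits. The sketch below models that reading in software. It is an assumption, not the claimed circuit: the abstract does not state how the counter combines its inputs, so the rule used here (add each carry out, subtract each sign bit) is a hypothetical choice that happens to reconstruct the exact sum, and the name `mac_accumulate` is illustrative.

```python
def mac_accumulate(pairs, width=8):
    """Multiply-accumulate with a width-bit truncated sum and a signed counter.

    pairs: iterable of (a, b) whose product fits in a signed width-bit value.
    Assumption: the counter adds each carry out and subtracts each sign bit.
    """
    mask = (1 << width) - 1
    limit = 1 << (width - 1)
    truncated = 0  # first part: low-order sum kept to width bits
    counter = 0    # second part: tracks what the truncated adder drops
    for a, b in pairs:
        product = a * b
        if not -limit <= product < limit:
            raise ValueError("product does not fit in the modeled width")
        low = product & mask          # two's complement encoding of the product
        sign = low >> (width - 1)     # sign bit of the intermediate product
        total = truncated + low
        carry = total >> width        # carry out of the width-bit add
        truncated = total & mask
        counter += carry - sign
    # Third part: combine the counter value (upper bits) with the truncated sum.
    return (counter << width) + truncated


if __name__ == "__main__":
    pairs = [(3, 4), (-5, 6), (10, 12), (-7, -9), (-11, 11)]
    print(mac_accumulate(pairs), sum(a * b for a, b in pairs))
```

Python integers are already signed, so the conversion to two's complement in step (d) has no separate counterpart in this model.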

COMPUTE IN/NEAR MEMORY (CIM) CIRCUIT ARCHITECTURE FOR UNIFIED MATRIX-MATRIX AND MATRIX-VECTOR COMPUTATIONS

A memory circuit includes a number (X) of multiply-accumulate (MAC) circuits that are dynamically configurable. The MAC circuits can either compute an output based on computations of X elements of an input vector with a weight vector, or compute the output based on computations of a single element of the input vector with the weight vector, with each element having a one-bit or multibit length. A first memory can hold the input vector having a width of X elements and a second memory can store the weight vector. The MAC circuits include a MAC array on chip with the first memory.

CONNECTIVITY IN COARSE GRAINED RECONFIGURABLE ARCHITECTURE
20240086355 · 2024-03-14 ·

A reconfigurable compute fabric can include multiple nodes, and each node can include multiple tiles with respective processing and storage elements. The tiles can be arranged in an array or grid and can be communicatively coupled. In an example, the tiles can be arranged in a one-dimensional array and each tile can be coupled to its respective adjacent neighbor tiles using a direct bus coupling. Each tile can be further coupled to at least one non-adjacent neighbor tile that is one tile, or device space, away using a passthrough bus. The passthrough bus can extend through intervening tiles.