Patent classifications
G06F7/53
METHOD AND APPARATUS FOR VECTOR SORTING
A method for sorting of a vector in a processor is provided that includes performing, by the processor in response to a vector sort instruction, sorting of values stored in lanes of the vector to generate a sorted vector, wherein the values are sorted in an order indicated by the vector sort instruction, and storing the sorted vector in a storage location.
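The behavior described in this abstract can be sketched in software. The following is a minimal model, not the patented hardware: the names `SortOrder`, `vector_sort`, and `register_file` are hypothetical, and the "storage location" is modeled as an entry in a dictionary.

```python
from enum import Enum
from typing import List


class SortOrder(Enum):
    """Hypothetical encoding of the order field carried by the instruction."""
    ASCENDING = 0
    DESCENDING = 1


def vector_sort(lanes: List[int], order: SortOrder) -> List[int]:
    """Return the lane values sorted in the order the instruction indicates."""
    return sorted(lanes, reverse=(order is SortOrder.DESCENDING))


# The storage location is modeled here as a named entry in a register file.
register_file = {"v0": [7, -2, 9, 0, 3, 3, -8, 5]}
register_file["v1"] = vector_sort(register_file["v0"], SortOrder.DESCENDING)
```

In this sketch the source vector in `v0` is left unchanged and the sorted result is written to `v1`; whether real hardware sorts in place is not stated in the abstract.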
METHOD AND APPARATUS FOR VECTOR PERMUTATION
A method is provided that includes performing, by a processor in response to a vector permutation instruction, permutation of values stored in lanes of a vector to generate a permuted vector, wherein the permutation is responsive to a control storage location storing permute control input for each lane of the permuted vector, wherein the permute control input corresponding to each lane of the permuted vector indicates a value to be stored in the lane of the permuted vector, wherein the permute control input for at least one lane of the permuted vector indicates a value of a selected lane of the vector is to be stored in the at least one lane, and storing the permuted vector in a storage location indicated by an operand of the vector permutation instruction.
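A small software model may help make the per-lane control input concrete. This is a sketch under stated assumptions: the abstract only says that each control entry indicates the value to store in its output lane, so the encoding used here (a non-negative entry selects a source lane, and the hypothetical code `ZERO_FILL` writes zero) is an illustration, not the patent's encoding.

```python
from typing import List

ZERO_FILL = -1  # hypothetical control code: store 0 instead of copying a lane


def vector_permute(src: List[int], control: List[int]) -> List[int]:
    """Build the permuted vector, one control entry per output lane."""
    if len(control) != len(src):
        raise ValueError("need exactly one control entry per output lane")
    out = []
    for code in control:
        if code == ZERO_FILL:
            out.append(0)              # value supplied by the control code itself
        elif 0 <= code < len(src):
            out.append(src[code])      # copy the value of the selected source lane
        else:
            raise ValueError(f"invalid control entry: {code}")
    return out


src = [10, 20, 30, 40]
control = [3, 3, ZERO_FILL, 0]   # lane 3 is selected twice, lane 1 is never used
permuted = vector_permute(src, control)
```

Because every output lane has its own control entry, a source lane may be duplicated or dropped, which is what distinguishes this from a strict reordering.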
MULTICHIP SYSTEM AND DATA PROCESSING METHOD ADAPTED TO THE SAME FOR IMPLEMENTING NEURAL NETWORK APPLICATION
A data processing method, a multichip system, and a non-transitory computer-readable medium for implementing a neural network application are provided. The data processing method includes: allocating corresponding chips to process a corresponding part of first stage data and a corresponding part of second stage data; transmitting, by a first chip, a first part of the first stage data to a second chip through a channel; transmitting, by the second chip, a second part of the first stage data to the first chip through the channel; computing, by the first chip, the first stage data with a first part of weight values to obtain a first result; and computing, by the second chip, the first stage data with a second part of weight values to obtain a second result, where the first result and the second result are each part of the second stage data.
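The exchange-then-compute pattern in this abstract can be illustrated with two simulated chips. This is only a sketch: the layer is assumed to be a plain matrix-vector product, the weight matrix is split by output rows, and all names (`matvec`, `weights_chip1`, and so on) are hypothetical.

```python
from typing import List


def matvec(weights: List[List[float]], x: List[float]) -> List[float]:
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


# Each chip initially holds only its own part of the first stage data.
stage1_part_a = [1.0, 2.0]   # held by chip 1
stage1_part_b = [3.0, 4.0]   # held by chip 2

# After the exchange over the channel, both chips see the full first stage data.
chip1_view = stage1_part_a + stage1_part_b
chip2_view = stage1_part_a + stage1_part_b

# The weights are partitioned by output row: each chip owns half of the outputs.
weights = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [2, 0, 0, 0],
]
weights_chip1 = weights[:2]
weights_chip2 = weights[2:]

result1 = matvec(weights_chip1, chip1_view)   # first result
result2 = matvec(weights_chip2, chip2_view)   # second result

# Each result is one part of the second stage data.
stage2 = result1 + result2
```

Splitting by output rows means neither chip needs the other's weights, only the other's share of the input, which matches the data movement the abstract describes.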
SEMICONDUCTOR DEVICE, DATA GENERATION METHODS USED FOR THE SAME, AND METHOD OF CONTROLLING THE SAME
A semiconductor device includes: a local memory outputting a plurality of pieces of weight data in parallel; a plurality of product-sum operation units corresponding to the plurality of pieces of weight data; and a plurality of unit selectors corresponding to the product-sum operation units. Each unit selector is supplied with a plurality of pieces of input data in parallel, selects one piece of input data from the supplied pieces according to a piece of additional information indicating the position, within the pieces of input data, of the input data to be calculated by the corresponding product-sum operation unit, and outputs the selected input data. Each of the plurality of product-sum operation units performs a product-sum operation between a different piece of the weight data and the input data outputted from the corresponding unit selector.
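The selector-plus-accumulator arrangement can be modeled in a few lines. The sketch below is an assumption-laden illustration: `select_and_mac` is a hypothetical name, one call stands for one cycle, and the additional information is modeled as a list of integer positions, one per product-sum unit.

```python
from typing import List


def select_and_mac(
    inputs: List[int],
    weights: List[int],
    positions: List[int],
    accumulators: List[int],
) -> List[int]:
    """One cycle: each unit selects one input by position and accumulates w * x."""
    updated = list(accumulators)
    for unit, (weight, pos) in enumerate(zip(weights, positions)):
        selected = inputs[pos]              # the unit selector picks one input
        updated[unit] += weight * selected  # the product-sum unit accumulates
    return updated


# Two product-sum units share four parallel inputs.
acc = [0, 0]
acc = select_and_mac([5, 0, 7, 2], weights=[3, -1], positions=[0, 2], accumulators=acc)
acc = select_and_mac([1, 1, 1, 1], weights=[2, 2], positions=[1, 3], accumulators=acc)
```

Because each unit only ever multiplies the input named by its position entry, inputs that no unit selects cost no multiplications in this model.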
Full adder cell with improved power efficiency
An adder circuit provides a first operand input and a second operand input to an XNOR cell. The XNOR cell transforms these inputs to a propagate signal that is applied to an OAI cell to produce a carry out signal. A third OAI cell transforms a third operand input and the propagate signal into a sum output signal.
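The logic function of such an adder can be checked with a behavioral model. This sketch does not reproduce the patent's gate-level cells or wiring; it only assumes that the propagate signal is the XNOR of the two operands and shows that the sum and carry out can both be derived from that signal and the third input. The function names are hypothetical.

```python
from typing import Tuple


def xnor(a: int, b: int) -> int:
    """Single-bit XNOR: 1 when the inputs are equal."""
    return 1 - (a ^ b)


def full_adder(a: int, b: int, cin: int) -> Tuple[int, int]:
    """Return (sum, carry_out) for single-bit inputs."""
    equal = xnor(a, b)            # propagate signal in inverted form
    # If a == b the carry out equals a; otherwise the carry in passes through.
    cout = a if equal else cin
    # XNOR of an XNOR and cin equals a XOR b XOR cin.
    s = xnor(equal, cin)
    return s, cout


# Exhaustive truth table over all eight input combinations.
truth_table = {
    (a, b, c): full_adder(a, b, c)
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
}
```

With only eight input combinations, an exhaustive table is enough to confirm that the model matches ordinary binary addition.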
Method and apparatus for performing field programmable gate array packing with continuous carry chains
A method for designing a system on a target device includes identifying a length for a carry chain that is supported by predefined quanta of a resource on the target device. A plurality of logical adders is mapped onto a single logical adder implemented on the carry chain subject to the identified length to increase logic utilization in a design for the system.
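One way to picture several logical adders sharing a single carry chain is to pack their operands side by side into one wide addition. The sketch below is an illustration, not the patent's mapping algorithm: it assumes a guard bit is placed after each adder so that its carry out cannot leak into the next one, and the names `pack_adders` and `chain_length` are hypothetical.

```python
from typing import List, Tuple


def pack_adders(adders: List[Tuple[int, int, int]], chain_length: int) -> List[int]:
    """Evaluate several (a, b, width) adders with one addition on a shared chain.

    Each adder occupies width + 1 chain positions; the extra guard position
    receives its carry out. Returns each sum including the carry-out bit.
    """
    packed_a = packed_b = 0
    offset = 0
    slots = []
    for a, b, width in adders:
        if a >> width or b >> width:
            raise ValueError("operand does not fit in the declared width")
        if offset + width + 1 > chain_length:
            raise ValueError("adders do not fit on a chain of this length")
        packed_a |= a << offset
        packed_b |= b << offset
        slots.append((offset, width))
        offset += width + 1
    total = packed_a + packed_b   # a single pass through the shared carry chain
    return [(total >> off) & ((1 << (w + 1)) - 1) for off, w in slots]


# Three logical adders (4, 3 and 8 bits wide) need 5 + 4 + 9 = 18 positions.
sums = pack_adders([(13, 7, 4), (5, 6, 3), (255, 1, 8)], chain_length=20)
```

The length check stands in for the constraint that the chain length must be one the target device's resources actually support.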
Methods and apparatus to estimate population reach from different marginal ratings and/or unions of marginal ratings based on impression data
Example methods, apparatus, and articles of manufacture are disclosed to estimate population reach. An example apparatus includes an association controller to generate a tree association corresponding to a union of a first margin of media and a second margin of the media; and one or more commercial solvers to determine first multipliers by solving first equations corresponding to panelist impressions and panelist audience totals, the second margin, or the union; perform first parallel computations with a processor to determine second multipliers to solve second equations corresponding to the tree association using the first multipliers; discard the first multipliers; perform second parallel computations to determine third multipliers by solving third equations corresponding to the tree association using database proprietor impression totals; and determine an estimate for a population reach of the media for at least one of the first margin, the second margin, or the union based on the third multipliers.
Systems and methods for data placement for in-memory-compute
According to one embodiment, a memory module includes: a memory die including dynamic random access memory (DRAM) banks, each including: an array of DRAM cells arranged in pages; a row buffer to store values of one of the pages; an input/output (IO) module; and an in-memory compute (IMC) module including: an arithmetic logic unit (ALU) to receive operands from the row buffer or the IO module and to compute an output based on the operands and one of a plurality of ALU operations; and a result register to store the output of the ALU; and a controller to: receive, from a host processor, operands and an instruction; determine, based on the instruction, a data layout; supply the operands to the DRAM banks in accordance with the data layout; and control an IMC module to perform one of the ALU operations on the operands in accordance with the instruction.
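The controller flow described here (receive operands, choose a layout, distribute to banks, compute in place) can be sketched as a toy simulation. Everything in it is an assumption made for illustration: the `Bank` class, the `ALU_OPS` table, and the round-robin layout are hypothetical, and a real controller would pick the layout based on the instruction rather than use a fixed one.

```python
import operator
from typing import List

ALU_OPS = {"add": operator.add, "mul": operator.mul}  # hypothetical operation set


class Bank:
    """Toy model of one DRAM bank with a row buffer and an IMC result register."""

    def __init__(self) -> None:
        self.row_buffer: List[tuple] = []
        self.result_register: List[int] = []

    def compute(self, op: str) -> None:
        # The ALU reads operand pairs from the row buffer and stores its outputs.
        self.result_register = [ALU_OPS[op](x, y) for x, y in self.row_buffer]


def run_instruction(op: str, a: List[int], b: List[int], num_banks: int) -> List[int]:
    """Controller: lay out operands across banks, compute, gather in order."""
    if len(a) != len(b):
        raise ValueError("operand vectors must have the same length")
    banks = [Bank() for _ in range(num_banks)]
    # Assumed layout: element pair i is placed in bank i % num_banks, so both
    # operands of an element-wise operation sit in the same bank.
    for i, pair in enumerate(zip(a, b)):
        banks[i % num_banks].row_buffer.append(pair)
    for bank in banks:
        bank.compute(op)
    return [banks[i % num_banks].result_register[i // num_banks] for i in range(len(a))]


result = run_instruction("add", [1, 2, 3, 4, 5], [10, 20, 30, 40, 50], num_banks=2)
```

Keeping both operands of each element in the same bank is the point of the layout step in this model: no bank has to fetch data from another bank before its ALU can run.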
Controlling carry-save adders in multiplication
A multiplier circuit is provided to multiply a first operand and a second operand. The multiplier circuit includes a carry-save adder network comprising a plurality of carry-save adders to perform partial product additions to reduce a plurality of partial products to a redundant result value that represents a product of the first operand and the second operand. A number of the carry-save adders that is used to generate the redundant result value is controllable and is dependent on a width of at least one of the first operand and the second operand.
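The idea of reducing partial products to a redundant (sum, carry) pair, with the amount of adder work depending on operand width, can be shown with a short model. This is a sketch rather than the patented circuit: it assumes unsigned operands, a simple serial chain of carry-save adders instead of a tree, and hypothetical function names.

```python
from typing import Tuple


def carry_save_add(x: int, y: int, z: int) -> Tuple[int, int]:
    """3:2 compressor: returns (s, c) such that s + c == x + y + z."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def multiply_redundant(a: int, b: int, width: int) -> Tuple[int, int, int]:
    """Return (sum_word, carry_word, number_of_carry_save_adders_used)."""
    if a < 0 or b < 0 or b >> width:
        raise ValueError("operands must be unsigned and b must fit in width bits")
    # One partial product per bit position of the second operand.
    partials = [(a << i) if (b >> i) & 1 else 0 for i in range(width)]
    csa_used = 0
    while len(partials) > 2:
        s, c = carry_save_add(*partials[:3])
        partials = partials[3:] + [s, c]   # three words in, two words out
        csa_used += 1
    while len(partials) < 2:
        partials.append(0)
    return partials[0], partials[1], csa_used


s4, c4, n4 = multiply_redundant(13, 11, width=4)
s8, c8, n8 = multiply_redundant(200, 150, width=8)
```

In this model a narrower operand produces fewer partial products and therefore uses fewer carry-save adders; the final product is obtained only when the redundant pair is added together.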