Patent classifications
G06F2205/00
COMPARE AND SWAP FUNCTIONALITY FOR KEY-VALUE AND OBJECT STORES
Embodiments provide compare and swap (CAS) functionality for key-value storage, allowing multi-threaded applications to share storage devices and synchronize multiple concurrent threads or processes. A key-value application programming interface (API) is modified to include a CAS API in addition to the standard Put and Get APIs. The CAS function takes a key, an expected old value, and a new value, and swaps in the new value for an existing key only if the key's current value equals the expected old value. Hash values of the key value and expected old value may be used by the CAS function to improve performance and reduce bandwidth.
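The CAS semantics described in this abstract can be illustrated with a minimal in-memory sketch. The class name `KVStore`, the method names, the use of a single lock, and the choice of SHA-256 for the hashed variant are all assumptions made for illustration; the abstract does not specify an implementation or a hash function.

```python
import hashlib
import threading


class KVStore:
    """Toy in-memory key-value store with Put, Get, and CAS (hypothetical API)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()  # makes compare + swap one atomic step

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def cas(self, key, expected_old, new_value):
        """Store new_value only if the current value equals expected_old."""
        with self._lock:
            if self._data.get(key) != expected_old:
                return False  # another writer got there first
            self._data[key] = new_value
            return True

    def cas_hashed(self, key, expected_old_digest, new_value):
        """Variant: the caller sends a digest of the expected old value
        instead of the value itself (SHA-256 is an assumption)."""
        with self._lock:
            current = self._data.get(key)
            if current is None:
                return False
            if hashlib.sha256(current).digest() != expected_old_digest:
                return False
            self._data[key] = new_value
            return True
```

A caller would typically read a value with `get`, compute an update, and retry the `cas` call until it returns `True`. The hashed variant expects values stored as `bytes`, and it trades a digest computation for sending less data when old values are large.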
Methods and apparatus for performing fixed-point normalization using floating-point functional blocks
An integrated circuit may include normalization circuitry that can be used when converting a fixed-point number to a floating-point number. The normalization circuitry may include at least a floating-point generation circuit that receives the fixed-point number and that creates a corresponding floating-point number. The normalization circuitry may then leverage an embedded digital signal processing (DSP) block on the integrated circuit to perform an arithmetic operation by removing the leading one from the created floating-point number. The resulting number may have a fractional component and an exponent value, which can then be used to derive the final normalized value.
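One way to picture the idea in this abstract is a well-known software trick: place the fixed-point bits in the fraction field of a double whose exponent encodes 2**52, then subtract 2**52. The subtraction removes the leading one, and the floating-point unit renormalizes the result, so the exponent reveals the position of the highest set bit. The sketch below assumes IEEE 754 double precision and unsigned inputs; the function names and the 32-bit default width are hypothetical, and this is not claimed to be the patented circuit.

```python
import math
import struct

BIAS_52 = 1023 + 52  # biased exponent of 2**52 in IEEE 754 double precision


def fixed_to_float(x):
    """Convert an unsigned integer below 2**52 to a double via one subtraction."""
    if not 0 <= x < (1 << 52):
        raise ValueError("x must fit in the 52-bit fraction field")
    # Place x in the fraction field: the bit pattern encodes 2**52 + x.
    bits = (BIAS_52 << 52) | x
    created = struct.unpack("<d", struct.pack("<Q", bits))[0]
    # Subtracting 2**52 removes the leading one; the floating-point
    # subtraction renormalizes the result, leaving exactly float(x).
    return created - 2.0 ** 52


def normalize(x, width=32):
    """Return (normalized, shift): x shifted left so its top bit is set."""
    if x == 0:
        return 0, 0
    _, exp = math.frexp(fixed_to_float(x))  # x = m * 2**exp, 0.5 <= m < 1
    shift = width - exp                     # number of leading zeros
    return x << shift, shift
```

For example, `normalize(40)` returns `(40 << 26, 26)`, because 40 has its highest set bit at position 5 in a 32-bit word.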
System, Method, and recording medium for mirroring matrices for batched Cholesky decomposition on a graphic processing unit
A batched Cholesky decomposition method, system, and non-transitory computer readable medium for a Graphics Processing Unit (GPU), include mirroring a second problem matrix of a second problem to a first problem matrix of a first problem as paired matrices and shifting the second problem matrix by N+1 and combining the first problem matrix and the mirrored second problem matrix into one matrix of (N+1)N by merging the first problem matrix and the mirrored second problem matrix. The first problem matrix and the second problem matrix are symmetric and positive definite matrices.
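The pairing of two symmetric matrices into a single (N+1) by N array can be sketched as follows. Because each symmetric matrix is fully described by one triangle, two triangles fit into N(N+1) entries with no wasted space. The exact layout below (first matrix's lower triangle shifted down one row, second matrix's triangle mirrored into the upper part) is an assumption chosen for illustration; the abstract's "shifting by N+1" may refer to a different addressing scheme. The function names `pack_pair` and `unpack_pair` are hypothetical.

```python
def pack_pair(a, b):
    """Pack two symmetric N x N matrices into one (N+1) x N matrix."""
    n = len(a)
    m = [[0.0] * n for _ in range(n + 1)]
    for i in range(n):
        for j in range(i + 1):
            m[i + 1][j] = a[i][j]  # lower triangle of a, shifted down one row
            m[j][i] = b[i][j]      # lower triangle of b, mirrored across the diagonal
    return m


def unpack_pair(m):
    """Recover both symmetric matrices from the packed (N+1) x N matrix."""
    n = len(m) - 1
    a = [[0.0] * n for _ in range(n)]
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            a[i][j] = a[j][i] = m[i + 1][j]
            b[i][j] = b[j][i] = m[j][i]
    return a, b
```

The two triangles never collide: entries of the first matrix land strictly below the diagonal of the packed array, and entries of the second land on or above it.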
SYSTEM, METHOD, AND RECORDING MEDIUM FOR MIRRORING MATRICES FOR BATCHED CHOLESKY DECOMPOSITION ON A GRAPHIC PROCESSING UNIT
A batched Cholesky decomposition method, system, and non-transitory computer readable medium for a Graphics Processing Unit (GPU), include mirroring a second problem matrix of a second problem to a first problem matrix of a first problem as paired matrices and shifting the second problem matrix by N+1 and combining the first problem matrix and the mirrored second problem matrix into one matrix of (N+1)N, where the first problem shared memory comprises regular intervals, where the second problem shared memory is continuous, and where the GPU performs batched dense Cholesky decomposition with the one matrix from the combining to accelerate the Cholesky decomposition.
Signed division in memory
Examples of the present disclosure provide apparatuses and methods for performing signed division operations. An apparatus can include a first group of memory cells coupled to a sense line and to a number of first access lines. The apparatus can include a second group of memory cells coupled to the sense line and to a number of second access lines. The apparatus can include a controller configured to operate sensing circuitry to divide a signed dividend element stored in the first group of memory cells by a signed divisor element stored in the second group of memory cells by performing a number of operations.
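The abstract does not say which operations the controller performs, so the following is only a generic sketch of signed division built from unsigned steps: divide the magnitudes with shift-and-subtract (restoring) division, then apply the sign. Truncation toward zero, the 8-bit default width, and the function name `signed_divide` are assumptions, and nothing here models sensing circuitry or memory cells.

```python
def signed_divide(dividend, divisor, width=8):
    """Divide two signed integers using unsigned shift-and-subtract steps."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    # The result is negative exactly when the operand signs differ.
    negative = (dividend < 0) != (divisor < 0)
    n, d = abs(dividend), abs(divisor)
    quotient, remainder = 0, 0
    for bit in range(width - 1, -1, -1):
        # Bring down the next dividend bit, then try to subtract the divisor.
        remainder = (remainder << 1) | ((n >> bit) & 1)
        if remainder >= d:
            remainder -= d
            quotient |= 1 << bit
    return -quotient if negative else quotient
```

With these assumptions, `signed_divide(-7, 2)` yields -3 rather than Python's floor-division result of -4.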
HARDWARE ACCELERATED MACHINE LEARNING
A machine learning hardware accelerator architecture and associated techniques are disclosed. The architecture features multiple memory banks of very wide SRAM that may be concurrently accessed by a large number of parallel operational units. Each operational unit supports an instruction set specific to machine learning, including optimizations for performing tensor operations and convolutions. Optimized addressing, an optimized shift reader, and variations on a multicast network that permutes and copies data and associates with an operational unit that supports those operations are also disclosed.
SYSTEM, METHOD, AND RECORDING MEDIUM FOR MIRRORING MATRICES FOR BATCHED CHOLESKY DECOMPOSITION ON A GRAPHIC PROCESSING UNIT
A batched Cholesky decomposition method, system, and non-transitory computer readable medium for a Graphics Processing Unit (GPU), include mirroring a second problem matrix of a second problem to a first problem matrix of a first problem as paired matrices and shifting the second problem matrix by N+1 and combining the first problem matrix and the mirrored second problem matrix into one matrix of (N+1)N by merging the first problem matrix and the mirrored second problem matrix. The first problem matrix and the second problem matrix are symmetric and positive definite matrices.
System, method, and recording medium for mirroring matrices for batched Cholesky decomposition on a graphic processing unit
A batched Cholesky decomposition method, system, and non-transitory computer readable medium for a Graphics Processing Unit (GPU), include mirroring a second problem matrix of a second problem to a first problem matrix of a first problem as paired matrices and shifting the second problem by N+1, combining the first problem matrix and the mirrored second problem matrix into one matrix of (N+1)N, and reading the fixed size data length of the one square matrix with a fixed data interval for both the first problem and the second problem.
Determining unsigned normalized integer representations of a number in data processing systems
A method of operating a data processing system when determining a b-bit unsigned normalized integer representation U of a number x is disclosed. When the number x has a value between 0 and 1, the method comprises determining the integer part I of (x · 2^b), and determining whether to use the integer part I, an incremented version of the integer part I, or a decremented version of the integer part I for the unsigned normalized integer representation U of the number x based on a comparison that uses the fractional part F of (x · 2^b) and the number x.
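One plausible reading of this comparison follows from simple algebra. The usual target is U = round(x · (2^b − 1)), and x · (2^b − 1) = x · 2^b − x = I + (F − x). Since F − x lies between −1 and 1, rounding can only yield I − 1, I, or I + 1, and the choice depends on how F − x compares with ±0.5. The sketch below uses exact rational arithmetic and rounds ties upward; the tie rule, the thresholds, and the function name `to_unorm` are assumptions rather than details taken from the abstract.

```python
from fractions import Fraction


def to_unorm(x, b):
    """Return the b-bit unsigned normalized integer for x in [0, 1]."""
    x = Fraction(x)  # exact, also for float inputs
    if not 0 <= x <= 1:
        raise ValueError("x must lie in [0, 1]")
    scaled = x * 2 ** b
    i = scaled.numerator // scaled.denominator  # integer part I of x * 2**b
    f = scaled - i                              # fractional part F of x * 2**b
    d = f - x                                   # x * (2**b - 1) == i + d
    if d >= Fraction(1, 2):
        return i + 1  # incremented version of I
    if d < Fraction(-1, 2):
        return i - 1  # decremented version of I
    return i
```

For example, with b = 8 the input x = 1 gives I = 256 and F − x = −1, so the decremented value 255 is chosen, which is the expected maximum code.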
Data processing systems
A method of operating a data processing system when determining an unsigned normalized integer representation U of a number x is disclosed. When the number x has a value between 0 and 1, it is determined whether the number x is greater than or equal to 0.5. When it is determined that the number x is greater than or equal to 0.5, the bit of the binary representation of the number x that represents the value 0.5 is inverted, and the unsigned normalized integer representation U of the number x is determined using the value of the binary representation of the number x having its bit that represents the value 0.5 inverted.