G06T1/60

Guaranteed data compression using intermediate compressed data
11716094 · 2023-08-01

Methods for converting an n-bit number into an m-bit number for situations where n>m and also for situations where n<m, where n and m are integers. The methods use truncation or bit replication followed by the calculation of an adjustment value which is applied to the replicated number.
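
The two directions described above can be illustrated with a minimal Python sketch. It covers only the truncation (n>m) and bit-replication (n<m) steps; the abstract does not say how the adjustment value is calculated, so that step is omitted here. The function names `truncate_bits` and `replicate_bits` are hypothetical and not taken from the patent.

```python
def truncate_bits(value: int, n: int, m: int) -> int:
    """Reduce an n-bit number to m bits (n > m) by dropping the low-order bits."""
    if not 0 <= value < (1 << n):
        raise ValueError("value does not fit in n bits")
    if n <= m:
        raise ValueError("truncation requires n > m")
    return value >> (n - m)


def replicate_bits(value: int, n: int, m: int) -> int:
    """Expand an n-bit number to m bits (n < m) by repeating its bit pattern.

    The pattern is copied from the most significant end until m bits are
    filled; the final partial copy uses the top bits of the pattern.
    No adjustment value is applied (assumption: not specified in the abstract).
    """
    if not 0 <= value < (1 << n):
        raise ValueError("value does not fit in n bits")
    if n >= m:
        raise ValueError("replication requires n < m")
    out = 0
    remaining = m
    while remaining >= n:
        out = (out << n) | value
        remaining -= n
    if remaining:
        out = (out << remaining) | (value >> (n - remaining))
    return out
```

With replication alone, the n-bit maximum maps to the m-bit maximum and zero maps to zero, which is why it is a common starting point before any finer correction.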

Techniques to perform fast Fourier transform

Apparatuses, systems, and techniques to perform a fast Fourier transform operation. In at least one embodiment, a fast Fourier transform operation is performed based on one or more parameters, wherein the one or more parameters indicate information about one or more operands of the fast Fourier transform.

Mapping Multi-Dimensional Coordinates to a 1D Space
20230026788 · 2023-01-26

A circuit for mapping N coordinates to a 1D space receives N input bit-strings representing respective coordinates, which can be of different sizes; produces a grouped bit-string therefrom, in which the bits, including non-data bits, are grouped into groups of bits originating from the same bit position per group; and demultiplexes this into n = 1, ..., N demultiplexed bit-strings, and sends each to a respective n-coordinate channel. The nth demultiplexed bit-string includes a respective part of the grouped bit-string that has n coordinate data bits and N-n non-data bits per group, with all other groups filled with null bits. Each channel except the N-coordinate channel includes bit-packing circuitry which packs down the respective demultiplexed bit-string by removing the non-data bits, and removing the same number of bits per group from the null bits. The packed bit-strings are then aligned relative to one another according to the corresponding bit positions, and combined.
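
The overall effect of such a mapping can be approximated in software as a bit interleave over coordinates of unequal width. The Python sketch below is only a functional analogue under stated assumptions: it does not model the demultiplexing channels or bit-packing circuitry, and the name `interleave_coords` and the choice to start from the least significant bit are illustrative, not taken from the abstract.

```python
from typing import Sequence


def interleave_coords(coords: Sequence[int], widths: Sequence[int]) -> int:
    """Interleave coordinates of possibly different bit widths into one integer.

    Bits are grouped by bit position, starting from the least significant bit.
    A coordinate that has no bit at a given position contributes nothing to
    that group, so no padding ("non-data") bits appear in the result.
    """
    if len(coords) != len(widths):
        raise ValueError("coords and widths must have the same length")
    for c, w in zip(coords, widths):
        if not 0 <= c < (1 << w):
            raise ValueError("coordinate does not fit in its declared width")
    out = 0
    pos = 0  # next free bit position in the 1D result
    for bit in range(max(widths)):
        for c, w in zip(coords, widths):
            if bit < w:
                out |= ((c >> bit) & 1) << pos
                pos += 1
    return out
```

Because every input bit lands in exactly one output position, the result uses exactly as many bits as the coordinates have in total, and distinct coordinate tuples map to distinct 1D values.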

PERFORMING GLOBAL MEMORY ATOMICS IN A PRIVATE CACHE OF A SUB-CORE OF A GRAPHICS PROCESSING UNIT

Embodiments are directed to systems and methods for performing global memory atomics in a private cache of a sub-core of a GPU. An embodiment of a GPU includes multiple sub-cores each including a load/store pipeline. The load/store pipeline is operable to receive information specifying an atomic operation to be performed within a primary data cache of the load/store pipeline. The load/store pipeline is also operable to read data to be modified by the atomic operation into the primary data cache from a memory hierarchy shared by the multiple sub-cores. The load/store pipeline is further operable to produce an atomic result of the atomic operation by modifying the data within the primary data cache based on the atomic operation.

Thread group scheduling for graphics processing

Embodiments are generally directed to thread group scheduling for graphics processing. An embodiment of an apparatus includes a plurality of processors including a plurality of graphics processors to process data; a memory; and one or more caches for storage of data for the plurality of graphics processors, wherein the one or more processors are to schedule a plurality of groups of threads for processing by the plurality of graphics processors, the scheduling of the plurality of groups of threads including the plurality of processors to apply a bias for scheduling the plurality of groups of threads according to a cache locality for the one or more caches.

CACHE-BASED WARP ENGINE

The present invention relates to an image warping system capable of quickly performing image warping with low costs by using a cache memory, and a method thereof. The image warping system is provided to generate a transformed image by warping an input image with the help of a cache based WARP engine. The WARP engine accesses the input image and loads a portion of the image to the cache memory for speeding up the engine process. The WARP engine performs interpolation on the input image to generate an output image which is devoid of distortions. The output image obtained is then stored in the DDR of an electronic device.
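
As a rough illustration of the interpolation step in a warp, the Python sketch below resamples an image through a coordinate mapping. The abstract does not name the interpolation method, so bilinear interpolation is an assumption here; the cache and DDR storage are not modeled, and the names `bilinear_sample` and `warp` are hypothetical.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]


def bilinear_sample(img: Image, x: float, y: float) -> float:
    """Sample img at fractional position (x, y), clamping to the image border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1)
    y = min(max(y, 0.0), h - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Blend horizontally along the two neighbouring rows, then vertically.
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(img: Image, mapping: Callable[[int, int], Tuple[float, float]]) -> Image:
    """Build an output image of the same size as img.

    mapping(u, v) returns the source position (x, y) to sample for
    output pixel (u, v), i.e. this is a backward warp.
    """
    h, w = len(img), len(img[0])
    return [[bilinear_sample(img, *mapping(u, v)) for u in range(w)]
            for v in range(h)]
```

A backward mapping is used so that every output pixel receives exactly one value; nearby output pixels tend to read nearby input pixels, which is the access pattern a cache-backed engine would benefit from.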

SYSTEM AND METHOD OF CONVOLUTIONAL NEURAL NETWORK

A method includes the following operations: downscaling an input image to generate a scaled image; performing, on the scaled image, a first convolutional neural network (CNN) modeling process with first non-local operations, to generate global parameters; and performing, on the input image, a second CNN modeling process with second non-local operations that are performed with the global parameters, to generate an output image corresponding to the input image. A system is also disclosed herein.