Patent classifications
H03M7/6017
Technologies for dividing work across accelerator devices
Technologies for dividing work across one or more accelerator devices include a compute device. The compute device is to determine a configuration of each of multiple accelerator devices of the compute device, receive a job to be accelerated from a requester device remote from the compute device, and divide the job into multiple tasks for a parallelization of the multiple tasks among the one or more accelerator devices, as a function of a job analysis of the job and the configuration of each accelerator device. The compute engine is further to schedule the tasks to the one or more accelerator devices based on the job analysis and execute the tasks on the one or more accelerator devices for the parallelization of the multiple tasks to obtain an output of the job.
COMPRESSION AND DECOMPRESSION ENGINES AND COMPRESSED DOMAIN PROCESSORS
Compressed domain processors configured to perform operations on data compressed in a format that preserves order. The Compressed domain processors may include operations such as addition, subtraction, multiplication, division, sorting, and searching. In some cases, compression engines for compressing the data into the desired formats are provided.
METHODS AND APPARATUS FOR SPARSE TENSOR STORAGE FOR NEURAL NETWORK ACCELERATORS
Methods, apparatus, systems and articles of manufacture are disclosed for sparse tensor storage for neural network accelerators. An example apparatus includes sparsity map generating circuitry to generate a sparsity map corresponding to a tensor, the sparsity map to indicate whether a data point of the tensor is zero, static storage controlling circuitry to divide the tensor into one or more storage elements, and a compressor to perform a first compression of the one or more storage elements to generate one or more compressed storage elements, the first compression to remove zero points of the one or more storage elements based on the sparsity map and perform a second compression of the one or more compressed storage elements, the second compression to store the one or more compressed storage elements contiguously in memory.
Operation accelerator and compression method
The present disclosure discloses example operation accelerators and compression methods. One example operation accelerator performs operations, including storing, in a first buffer, first input data. In a second buffer, weight data can be stored. A computation result is obtained by performing matrix multiplication on the first input data and the weight data by an operation circuit connected to the input buffer and the weight buffer. The computation result is compressed by a compression module to obtain compressed data. The compressed data can be stored into a memory outside the operation accelerator by a direct memory access controller (DMAC) connected to the compression module.
METHOD OF INPUT DATA COMPRESSION, ASSOCIATED COMPUTER PROGRAM PRODUCT, COMPUTER SYSTEM AND EXTRACTION METHOD
A method of data compression performed by at least one core communicating with a central memory. The input data presents a two-dimensional input array formed by a plurality data items stored contiguously in the central memory according to a contiguous direction. The method comprises a step of wavelet transform comprising the following sub-steps: forming from the input array at least one tile comprising a plurality of consecutive data block columns, each data block column being formed by a plurality of lines of consecutive data items according to the contiguous direction, the length of each line being a multiple of the cache line length; and for each data block column computing dot products between a filter vector and each group of N lines using fused multiply-add instructions for the core.
Mixing software based compression requests with hardware accelerated requests
A computer program product for data compression is provided. The computer program product includes a computer readable storage medium having program instructions embodied therewith. The program instructions are readable and executable by a processing circuit to cause the processing circuit to execute software compression for first requests for data compression that have respective sizes below a predefined threshold, forward second requests for data compression having respective sizes above the predefined threshold to a hardware accelerator and maintain a persistence of a compression dictionary used for executing the second requests across executions of the first and second requests.
TECHNOLGIES FOR MILLIMETER WAVE RACK INTERCONNECTS
Racks and rack pods to support a plurality of sleds are disclosed herein. Switches for use in the rack pods are also disclosed herein. A rack comprises a plurality of sleds and a plurality of electromagnetic waveguides. The plurality of sleds are vertically spaced from one another. The plurality of electromagnetic waveguides communicate data signals between the plurality of sleds.
TECHNOLOGIES FOR DIVIDING WORK ACROSS ACCELERATOR DEVICES
Technologies for dividing work across one or more accelerator devices include a compute device. The compute device is to determine a configuration of each of multiple accelerator devices of the compute device, receive a job to be accelerated from a requester device remote from the compute device, and divide the job into multiple tasks for a parallelization of the multiple tasks among the one or more accelerator devices, as a function of a job analysis of the job and the configuration of each accelerator device. The compute engine is further to schedule the tasks to the one or more accelerator devices based on the job analysis and execute the tasks on the one or more accelerator devices for the parallelization of the multiple tasks to obtain an output of the job.
Computerized methods of data compression and analysis
A computerized method and apparatus compresses symbolic information, such as text. Symbolic information is compressed by recursively identifying pairs of symbols (e.g., pairs of words or characters) and replacing each pair with a respective replacement symbol. The number of times each symbol pair appears in the uncompressed text is counted, and pairs are only replaced if they appear more than a threshold number of times. In recursive passes, each replaced pair can include a previously substituted replacement symbol. The method and apparatus can achieve high compression especially for large datasets. Metadata, such as the number of times each pair appears, generated during compression of the documents can be used to analyze the documents and find similarities between two documents.
Compression and decompression engines and compressed domain processors
Compressed domain processors configured to perform operations on data compressed in a format that preserves order. The Compressed domain processors may include operations such as addition, subtraction, multiplication, division, sorting, and searching. In some cases, compression engines for compressing the data into the desired formats are provided.