H03M7/6058

Dynamic sequencing of data partitions for optimizing memory utilization and performance of neural networks

Optimized memory usage and management are crucial to the overall performance of a neural network (NN) or deep neural network (DNN) computing environment. Using various characteristics of the input data's dimensions, an apportionment sequence is calculated for the input data to be processed by the NN or DNN that optimizes the use of the local and external memory components. The apportionment sequence can describe how to parcel the input data (and its associated processing parameters, e.g., processing weights) into one or more portions, as well as how those portions of input data (and their associated processing parameters) are passed between the local memory, external memory, and processing unit components of the NN or DNN. Additionally, the apportionment sequence can include instructions to store generated output data in the local and/or external memory components so as to make efficient use of those components.
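
As a rough illustration of the apportionment idea, below is a minimal Python sketch. It assumes a simplified cost model in which the weights stay resident in local memory and each portion of input rows must fit in the remaining budget; the function name, the flat per-row byte counts, and the local-versus-external output rule are hypothetical stand-ins for the characteristics the abstract leaves unspecified.

    # Hypothetical sketch: compute an apportionment sequence for input rows.
    def apportionment_sequence(num_rows, row_bytes, out_row_bytes,
                               weight_bytes, local_capacity):
        usable = local_capacity - weight_bytes   # weights stay resident locally
        if usable <= 0:
            raise ValueError("weights alone exceed local memory")
        rows_per_portion = max(1, usable // row_bytes)
        sequence, start = [], 0
        while start < num_rows:
            end = min(start + rows_per_portion, num_rows)
            leftover = usable - (end - start) * row_bytes
            sequence.append({
                "rows": (start, end),
                # store output locally only if it fits beside the staged input
                "output_target": "local"
                    if (end - start) * out_row_bytes <= leftover else "external",
            })
            start = end
        return sequence

    for step in apportionment_sequence(num_rows=10_000, row_bytes=4_096,
                                       out_row_bytes=1_024,
                                       weight_bytes=2 << 20,
                                       local_capacity=8 << 20):
        print(step)

Under these toy numbers, full portions fill local memory and route their output externally, while the final partial portion leaves room to keep its output local.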

Managing memory fragmentation in hardware-assisted data compression

Systems, devices, and methods are described for managing fragmentation in hardware-assisted compression of data in physical computer memory, which may result in reduced internal fragmentation. An example computer-implemented method comprises: providing, by a memory management program to compression hardware, a compression command including an address in physical computer memory of the data to be compressed and a list of at least two available buffers for storing compressed data; using, by the compression hardware, the address included in the compression command to retrieve the uncompressed data; compressing the uncompressed data; and selecting, by the compression hardware, from the list of at least two available buffers, at least two buffers for storing the compressed data based on the amount of space that would remain if the compressed data were stored in them, wherein each of the selected buffers differs in size from at least one other selected buffer.
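
As a rough sketch of the selection step, the Python below stands in for the compression hardware: it picks, from the supplied buffer list, a pair of differently sized buffers that minimizes the space left over after storing the compressed output. The function name, the pair-only search, and the example sizes are assumptions; the abstract does not fix the selection heuristic.

    # Hypothetical sketch of the hardware's buffer-selection step.
    from itertools import combinations

    def select_buffers(compressed_size, buffer_sizes):
        """Return the pair of differently sized buffers whose combined
        capacity holds `compressed_size` bytes with the least waste."""
        best, best_waste = None, None
        for pair in combinations(buffer_sizes, 2):
            if pair[0] == pair[1]:
                continue                  # selected buffers must differ in size
            capacity = sum(pair)
            if capacity < compressed_size:
                continue                  # pair too small to hold the output
            waste = capacity - compressed_size
            if best_waste is None or waste < best_waste:
                best, best_waste = pair, waste
        return best, best_waste

    buffers = [512, 1024, 2048, 4096]     # available buffer list from the OS
    print(select_buffers(2500, buffers))  # -> ((512, 2048), 60)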

Lossy statistical data compression

A method performed in real time includes receiving and storing time-based data over a specific time period and dividing that time period into a plurality of time windows. The method further includes determining that the data associated with two or more proximate time windows are within a predetermined variance of one another and, responsive to that determination: generating a mathematical function representative of the data associated with those time windows, deleting that data, and generating a representation of the deleted data from the mathematical function. In certain embodiments, the data comprises empirical network telemetry data.
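
A minimal Python sketch of this window-merging scheme follows, assuming "within a predetermined variance" means each new window's mean stays within a tolerance of the previous window's mean, and taking a least-squares line as the representative mathematical function. Both choices, and all names, are illustrative rather than taken from the patent.

    # Hypothetical sketch: merge proximate windows into fitted line segments.
    from statistics import mean

    def compress_windows(samples, window, tolerance):
        windows = [samples[i:i + window] for i in range(0, len(samples), window)]
        records, run = [], [windows[0]]
        for w in windows[1:]:
            if abs(mean(w) - mean(run[-1])) <= tolerance:
                run.append(w)              # proximate windows within the variance
            else:
                records.append(_fit(run))  # close out the run with one function
                run = [w]
        records.append(_fit(run))
        return records

    def _fit(run):
        flat = [x for w in run for x in w]
        n, y_bar = len(flat), mean(flat)
        x_bar = (n - 1) / 2
        denom = sum((x - x_bar) ** 2 for x in range(n))
        num = sum((x - x_bar) * (y - y_bar) for x, y in zip(range(n), flat))
        slope = num / denom if denom else 0.0
        return n, slope, y_bar - slope * x_bar   # (length, slope, intercept)

    def reconstruct(records):
        """Regenerate a representation of the deleted data from the fits."""
        out = []
        for n, slope, intercept in records:
            out.extend(intercept + slope * i for i in range(n))
        return out

    data = [10.0] * 8 + [10.2] * 8 + [50.0] * 8
    print(compress_windows(data, window=4, tolerance=1.0))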

Power-efficient deep neural network module configured for parallel kernel and parallel input processing

A deep neural network (DNN) module utilizes parallel kernel and parallel input processing to decrease bandwidth utilization, reduce power consumption, improve neuron multiplier stability, and provide other technical benefits. Parallel kernel processing enables the DNN module to load input data only once for processing by multiple kernels. Parallel input processing enables the DNN module to load kernel data only once for processing with multiple input data. The DNN module can implement other power-saving techniques, such as clock-gating (i.e., removing the clock from) and power-gating (i.e., removing the power from) banks of accumulators based upon usage of the accumulators. For example, individual banks of accumulators can be power-gated when none of the accumulators in a bank is in use and none stores data for a future calculation. Banks of accumulators can also be clock-gated when none of the accumulators in a bank is in use but some still store data for a future calculation.
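
The gating policy in the last two sentences reduces to a three-way decision per bank, as the minimal Python sketch below shows; the Bank record and policy names are hypothetical stand-ins for the hardware status bits.

    # Hypothetical sketch of the per-bank gating decision.
    from dataclasses import dataclass

    @dataclass
    class Bank:
        any_in_use: bool         # some accumulator is currently accumulating
        holds_future_data: bool  # some accumulator stores data needed later

    def gating_state(bank):
        if bank.any_in_use:
            return "active"       # clock and power both required
        if bank.holds_future_data:
            return "clock-gated"  # stop the clock, keep power to retain state
        return "power-gated"      # nothing to retain: cut power entirely

    banks = [Bank(True, True), Bank(False, True), Bank(False, False)]
    print([gating_state(b) for b in banks])
    # -> ['active', 'clock-gated', 'power-gated']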

Method for compressing digital signal data and signal compressor module

A method of compressing digital signal data obtained from a signal is described. The method includes: receiving digital signal data associated with a signal and/or generating digital signal data based on a signal; transforming the digital signal data into a transform domain, thereby generating transformed digital signal data; determining, by an artificial intelligence circuit, at least one characteristic parameter based on the transformed digital signal data; detecting and/or classifying, by the artificial intelligence circuit, at least one wanted signal portion based on the at least one characteristic parameter; and storing only the subset of the digital signal data that is associated with the at least one wanted signal portion. A signal compressor circuit for compressing digital signal data obtained from a signal, and a corresponding computer program, are also described.
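
As an illustration of the pipeline, the Python sketch below uses a block-wise magnitude spectrum as the transform and a bare energy threshold where the artificial intelligence circuit would sit. The block size, threshold, and the max-magnitude "characteristic parameter" are all assumptions; the patent leaves the AI model unspecified.

    # Hypothetical sketch: keep only blocks classified as wanted signal.
    import numpy as np

    def compress(signal, block=256, threshold=10.0):
        kept = {}
        for start in range(0, len(signal) - block + 1, block):
            chunk = signal[start:start + block]
            spectrum = np.abs(np.fft.rfft(chunk))  # transform-domain data
            characteristic = spectrum.max()        # characteristic parameter
            if characteristic > threshold:         # stand-in for AI classifier
                kept[start] = chunk                # store only wanted portions
        return kept

    rng = np.random.default_rng(0)
    noise = rng.normal(0, 0.1, 1024)
    noise[512:768] += np.sin(np.linspace(0, 40 * np.pi, 256))  # embed a tone
    print(sorted(compress(noise).keys()))  # only the tone block (512) is kept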

On-board data storage method and system

This invention is directed to an on-board data storage method and system. The method includes: obtaining, in a current time period, the pieces of time-sequence data to be written into the on-board database; for each piece, determining the node of the on-board database in which it is to be stored and the partition within that node, and writing the piece into the corresponding time sequence in the determined partition; and writing each piece stored in each partition into its corresponding in-memory data bucket and merging the time-sequence data within each bucket.
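
A minimal Python sketch of the write path follows, assuming the node and partition are derived by hashing the series key and that each (node, partition) pair owns one in-memory bucket; the routing rule and the sort-based merge are hypothetical simplifications of the on-board database's behavior.

    # Hypothetical sketch of routing, bucketing, and merging time-sequence data.
    import zlib
    from collections import defaultdict

    NODES, PARTITIONS = 4, 8

    def route(series_key):
        """Pick the node and partition for one piece of time-sequence data."""
        h = zlib.crc32(series_key.encode())
        return h % NODES, (h // NODES) % PARTITIONS

    class OnBoardStore:
        def __init__(self):
            # one in-memory bucket per (node, partition)
            self.buckets = defaultdict(list)

        def write(self, series_key, timestamp, value):
            self.buckets[route(series_key)].append((series_key, timestamp, value))

        def merge(self):
            """Merge each bucket by sorting on series then time, as a
            stand-in for the database's bucket-merge step."""
            for bucket in self.buckets.values():
                bucket.sort(key=lambda rec: (rec[0], rec[1]))

    store = OnBoardStore()
    store.write("engine_temp", 1, 90.5)
    store.write("engine_temp", 0, 88.0)
    store.merge()
    print(store.buckets[route("engine_temp")])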

Enhancing processing performance of a DNN module by bandwidth control of fabric interface

An exemplary computing environment having a DNN module can maintain one or more bandwidth-throttling mechanisms. Illustratively, a first throttling mechanism can specify the number of cycles to wait between transactions on a cooperating fabric component (e.g., a data bus). Illustratively, a second throttling mechanism can be a transaction count limiter that operatively sets a threshold on the number of transactions to be processed during a given transaction sequence and limits the number of transactions, such as multiple transactions in flight, so that it does not exceed the set threshold. In an illustrative operation, executing these two calculated throttling parameters limits both the average and the peak bandwidth usage. Operatively, with this fabric bandwidth control, the processing units of the DNN are optimized to process data across each transaction cycle, resulting in enhanced processing and lower power consumption.
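
The two mechanisms compose naturally, as the cycle-level Python sketch below illustrates: gap_cycles models the first mechanism (cycles to wait between transactions) and max_in_flight the second (the transaction count limiter). The class and parameter names are hypothetical.

    # Hypothetical sketch of combined inter-transaction gap and in-flight cap.
    class FabricThrottle:
        def __init__(self, gap_cycles, max_in_flight):
            self.gap_cycles = gap_cycles
            self.max_in_flight = max_in_flight
            self.cooldown = 0
            self.in_flight = 0

        def tick(self, want_issue):
            """Advance one cycle; return True if a transaction may issue."""
            if self.cooldown > 0:
                self.cooldown -= 1
            if (want_issue and self.cooldown == 0
                    and self.in_flight < self.max_in_flight):
                self.in_flight += 1
                self.cooldown = self.gap_cycles  # first mechanism: wait cycles
                return True
            return False

        def complete(self):
            self.in_flight -= 1                  # second mechanism frees a slot

    throttle = FabricThrottle(gap_cycles=3, max_in_flight=2)
    issued = [throttle.tick(True) for _ in range(10)]
    print(issued)
    # -> [True, False, False, True, False, ...]: one issue per 4 cycles,
    #    then no further issues while 2 transactions remain in flight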

Electronic device performing outlier-aware approximation coding and method thereof

An electronic device includes a coding module that determines whether a parameter of an artificial neural network is an outlier depending on the value of the parameter, and that compresses the parameter by truncating a first bit of the parameter when the parameter is a non-outlier and truncating a second bit of the parameter when the parameter is an outlier, and a decoding module that decodes the compressed parameter.
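
Reading "a first bit" and "a second bit" as different numbers of truncated low-order bits, a minimal Python sketch might look as follows; the magnitude threshold and both bit widths are illustrative assumptions, not values from the patent.

    # Hypothetical sketch of outlier-aware truncation of small integer params.
    OUTLIER_THRESHOLD = 64   # |param| at or above this is treated as an outlier
    NONOUT_DROP = 4          # non-outliers: drop more low bits (coarser code)
    OUTLIER_DROP = 1         # outliers: drop fewer bits to preserve magnitude

    def encode(param):
        outlier = abs(param) >= OUTLIER_THRESHOLD
        drop = OUTLIER_DROP if outlier else NONOUT_DROP
        return outlier, param >> drop   # truncated code plus outlier flag

    def decode(outlier, code):
        drop = OUTLIER_DROP if outlier else NONOUT_DROP
        return code << drop             # approximate reconstruction

    for p in (7, 42, 100):
        flag, code = encode(p)
        print(p, "->", decode(flag, code), "(outlier)" if flag else "")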