G06F12/0875

Memory system

A memory system includes: a first memory module including first volatile memories; a second memory module including second volatile memories, non-volatile memories and a module controller; a memory controller controlling the first and second memory modules through second and third control buses, respectively; and a switch array electrically coupling the second and third control buses, wherein the module controller controls the switch array to electrically couple the second and third control buses in a backup operation for backing up data of the first volatile memories to the non-volatile memories, wherein the first and second memory modules include one or more first memory stacks and one or more second memory stacks, respectively, wherein the first volatile memories are stacked in the first memory stacks, and wherein the second volatile memories, the non-volatile memories and the module controller are stacked in the second memory stacks.

ZERO-COPY SPARSE MATRIX FACTORIZATION SYNTHESIS FOR HETEROGENEOUS COMPUTE SYSTEMS
20230024035 · 2023-01-26 ·

A system, method, and computer-readable medium for synthesizing zero-copy sparse matrix factorization operations in heterogeneous compute systems are provided. The system includes a host and an accelerator device. The host device is configured to divide an input matrix into a plurality of blocks which are transferred to a memory of the accelerator device. The host device is also configured to generate at least one index buffer that includes pointers to the block in the accelerator's memory, where each index buffer represents a frontal matrix associated with a matrix decomposition algorithm. The host processor is configured to receive one or more kernels configured to process the index buffer(s) on an accelerator device. The index buffers are processed by the accelerator device and the modified block data is written back to a memory of the host device to generate a factorized output matrix.

METHOD AND APPARATUS TO SORT A VECTOR FOR A BITONIC SORTING ALGORITHM
20230229448 · 2023-07-20 ·

A method is provided that includes performing, by a processor in response to a vector sort instruction, sorting of values stored in lanes of the vector to generate a sorted vector, wherein the values in a first portion of the lanes are sorted in a first order indicated by the vector sort instruction and the values in a second portion of the lanes are sorted in a second order indicated by the vector sort instruction; and storing the sorted vector in a storage location.

METHOD AND APPARATUS FOR DISTRIBUTED AND COOPERATIVE COMPUTATION IN ARTIFICIAL NEURAL NETWORKS

An apparatus and method are described for distributed and cooperative computation in artificial neural networks. For example, one embodiment of an apparatus comprises: an input/output (I/O) interface; a plurality of processing units communicatively coupled to the I/O interface to receive data for input neurons and synaptic weights associated with each of the input neurons, each of the plurality of processing units to process at least a portion of the data for the input neurons and synaptic weights to generate partial results; and an interconnect communicatively coupling the plurality of processing units, each of the processing units to share the partial results with one or more other processing units over the interconnect, the other processing units using the partial results to generate additional partial results or final results. The processing units may share data including input neurons and weights over the shared input bus.

METHOD AND APPARATUS FOR DISTRIBUTED AND COOPERATIVE COMPUTATION IN ARTIFICIAL NEURAL NETWORKS

An apparatus and method are described for distributed and cooperative computation in artificial neural networks. For example, one embodiment of an apparatus comprises: an input/output (I/O) interface; a plurality of processing units communicatively coupled to the I/O interface to receive data for input neurons and synaptic weights associated with each of the input neurons, each of the plurality of processing units to process at least a portion of the data for the input neurons and synaptic weights to generate partial results; and an interconnect communicatively coupling the plurality of processing units, each of the processing units to share the partial results with one or more other processing units over the interconnect, the other processing units using the partial results to generate additional partial results or final results. The processing units may share data including input neurons and weights over the shared input bus.

Computing device and method

The present disclosure provides a computation device. The computation device is configured to perform a machine learning computation, and includes an operation unit, a controller unit, and a conversion unit. The storage unit is configured to obtain input data and a computation instruction. The controller unit is configured to extract and parse the computation instruction from the storage unit to obtain one or more operation instructions, and to send the one or more operation instructions and the input data to the operation unit. The operation unit is configured to perform operations on the input data according to one or more operation instructions to obtain a computation result of the computation instruction. In the examples of the present disclosure, the input data involved in machine learning computations is represented by fixed-point data, thereby improving the processing speed and efficiency of training operations.

Message Cloud

A method for error management is provided. The method comprises receiving a message call request regarding an error event generated by a software application. The message call request comprises a message ID associated with an error type. In response to the call request a message cache is searched for the message ID. If the ID is in the cache, an error message associated with the ID is returned. The error message provides a description of the error and suggested remedial action. If the message ID is not in the cache, the error message is fetched from a message repository that contains error messages corresponding to respective message IDs. The fetched error message is loaded into the cache and returned. Message call request data is stored in a metrics repository. The message call request data comprises frequency metrics that describe how often the message ID is received.

Message Cloud

A method for error management is provided. The method comprises receiving a message call request regarding an error event generated by a software application. The message call request comprises a message ID associated with an error type. In response to the call request a message cache is searched for the message ID. If the ID is in the cache, an error message associated with the ID is returned. The error message provides a description of the error and suggested remedial action. If the message ID is not in the cache, the error message is fetched from a message repository that contains error messages corresponding to respective message IDs. The fetched error message is loaded into the cache and returned. Message call request data is stored in a metrics repository. The message call request data comprises frequency metrics that describe how often the message ID is received.

SYSTEMS AND METHODS FOR ENERGY-EFFICIENT DATA PROCESSING

An energy-efficient sequencer comprising inline multipliers and adders causes a read source that contains matching values to output an enable signal to enable a data item prior to using a multiplier to multiply the data item with a weight to obtain a product for use in a matrix-multiplication in hardware. A second enable signal causes the output to be written to the data item.

NEVER STALE CACHING OF EFFECTIVE PROPERTIES
20230222115 · 2023-07-13 · ·

The technology disclosed relates to maintaining a cache of effective properties in an identity management system employing a graph. In particular, it relates to handling vertex/edge and/or graph topology updates in accordance with update notification requirements configured from a schema and, in conjunction with detecting updating of vertex/edge attributes and/or graph topology, recalculating effective attributes in accordance with the configured notification requirements.