Patent classifications
G06F9/3877
SYSTEM ON CHIP AND METHOD FOR OPERATING SYSTEM ON CHIP
A system on chip and a method for operating a system on chip are provided. The system on chip a plurality of intellectual property (IP) cores including a first IP core configured to process data in real-time, a buffer including a plurality of queues, and processing circuitry configured to, generate first traffic data corresponding to first data output from the first IP core, and reserve at least one queue of the plurality of queues as a first dedicated area based on the first traffic data, the first dedicated area configured to be used as a queue for transmission of the first data.
Signal-processing apparatus including a second processor that, after receiving an instruction from a first processor, independantly controls a second data processing unit without further instruction from the first processor
A signal-processing apparatus includes an instruction-parallel processor, a first data-parallel processor, a second data-parallel processor, and a motion detection unit, a de-blocking filtering unit and a variable-length coding/decoding unit which are dedicated hardware. With this structure, during signal processing of an image compression and decompression algorithm needing a large amount of processing, the load is distributed between software and hardware, so that the signal-processing apparatus can realize high processing capability and flexibility.
Accelerator for computing combinatorial cost function
A computing device, including memory, an accelerator device, and a processor. The processor may generate a plurality of data packs that each indicate an update to a variable of one or more variables of a combinatorial cost function. The processor may transmit the plurality of data packs to the accelerator device. The accelerator device may, for each data pack, retrieve a variable value of the variable indicated by the data pack and generate an updated variable value. The accelerator device may generate an updated cost function value based on the updated variable value. The accelerator device may be further configured to determine a transition probability using a Monte Carlo algorithm and may store the updated variable value and the updated cost function value with the transition probability. The accelerator device may output a final updated cost function value to the processor.
MACHINE-LEARNING TRAINING SERVICE FOR SYNTHETIC DATA
Various embodiments, methods and systems for implementing a distributed computing system machine-learning training service are provided. Initially a machine learning model is accessed. A plurality of synthetic data assets are accessed, where a synthetic data asset is associated with asset-variation parameters that are programmable for machine-learning. The machine learning model is retrained using the plurality of synthetic data assets. The machine-learning training service is further configured for executing real-time calls to generate an on-the-fly-generated synthetic data asset such that the on-the-fly-generated synthetic data asset is rendered in real-time to preclude pre-rendering and storing the on-the-fly-generated synthetic data asset. The machine-learning training service further supports hybrid-based machine learning training, where the machine learning model is trained based on a combination of the plurality of synthetic data assets, a plurality of non-synthetic data assets, and synthetic data asset metadata associated with the plurality of synthetic data assets.
Graphics systems and methods for accelerating synchronization using fine grain dependency check and scheduling optimizations based on available shared memory space
Accelerated synchronization operations using fine grain dependency check are disclosed. A graphics multiprocessor includes a plurality of execution units and synchronization circuitry that is configured to determine availability of at least one execution unit. The synchronization circuitry to perform a fine grain dependency check of availability of dependent data or operands in shared local memory or cache when at least one execution unit is available.
Apparatus and method for injecting spin echo micro-operations in a quantum processor
Apparatus and method for injected spin echo sequences in a quantum processor. For example, one embodiment of a processor includes a decoder to decode quantum instructions to generate quantum microoperations (uops) and to decode non-quantum instructions to generate non-quantum uops, execution circuitry to execute the quantum uops and non-quantum uops, and a corrective sequence data structure to identify and/or store corrective sets of uops for one or more of the quantum instructions. The decoder is to query the corrective sequence data structure upon receiving a first quantum instruction to determine if one or more corrective uops exist, and if the one or more corrective uops exist, the decoder is to submit the one or more corrective uops for execution by the execution circuitry.
Deep learning-based variant classifier
The technology disclosed directly operates on sequencing data and derives its own feature filters. It processes a plurality of aligned reads that span a target base position. It combines elegant encoding of the reads with a lightweight analysis to produce good recall and precision using lightweight hardware. For instance, one million training examples of target base variant sites with 50 to 100 reads each can be trained on a single GPU card in less than 10 hours with good recall and precision. A single GPU card is desirable because it a computer with a single GPU is inexpensive, almost universally within reach for users looking at genetic data. It is readily available on could-based platforms.
Adjusting virtual machine GPU refresh rate to remote desktop stream frame rate
A system and method of adjusting a refresh rate to match a given remote desktop stream frame rate is described. The system may include a processing device to transmit, as a media stream, a portion of a remote desktop image with a frame rate that matches a refresh rate to a remote desktop client.
Datapath load distribution for a RIC
To provide a low latency near RT RIC, some embodiments separate the RIC's functions into several different components that operate on different machines (e.g., execute on VMs or Pods) operating on the same host computer or different host computers. Some embodiments also provide high speed interfaces between these machines. Some or all of these interfaces operate in non-blocking, lockless manner in order to ensure that critical near RT RIC operations (e.g., datapath processes) are not delayed due to multiple requests causing one or more components to stall. In addition, each of these RIC components also has an internal architecture that is designed to operate in a non-blocking manner so that no one process of a component can block the operation of another process of the component. All of these low latency features allow the near RT RIC to serve as a high speed IO between the E2 nodes and the xApps.
Computing device and method
The present disclosure provides a computation device. The computation device is configured to perform a machine learning computation, and includes an operation unit, a controller unit, and a conversion unit. The storage unit is configured to obtain input data and a computation instruction. The controller unit is configured to extract and parse the computation instruction from the storage unit to obtain one or more operation instructions, and to send the one or more operation instructions and the input data to the operation unit. The operation unit is configured to perform operations on the input data according to one or more operation instructions to obtain a computation result of the computation instruction. In the examples of the present disclosure, the input data involved in machine learning computations is represented by fixed-point data, thereby improving the processing speed and efficiency of training operations.