Patent classifications
G06F9/3555
Data processing apparatus and related products
The present disclosure provides a data processing apparatus and related products. The products include a control module including an instruction caching unit, an instruction processing unit, and a storage queue unit. The instruction caching unit is configured to store computation instructions associated with an artificial neural network operation; the instruction processing unit is configured to parse the computation instructions to obtain a plurality of operation instructions; and the storage queue unit is configured to store an instruction queue, where the instruction queue includes a plurality of operation instructions or computation instructions to be executed in the sequence of the queue. By adopting the above-mentioned method, the present disclosure can improve the operation efficiency of related products when performing operations of a neural network model.
REGISTER BASED SIMD LOOKUP TABLE OPERATIONS
An approach is provided for implementing register based single instruction, multiple data (SIMD) lookup table operations. According to the approach, an instruction set architecture (ISA) can support one or more SIMD instructions that enable vectors or multiple values in source data registers to be processed in parallel using a lookup table or truth table stored in one or more function registers. The SIMD instructions can be flexibly configured to support functions with inputs and outputs of various sizes and data formats. Various approaches are also described for supporting very large lookup tables that span multiple registers.
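As a rough illustration of the idea in this abstract, the sketch below models a truth table held in a single integer "function register" and applies it to every lane of a source vector. The function name `simd_lut`, the `index_bits` parameter, and the one-bit-per-entry layout are assumptions made for this example. The abstract does not specify an instruction encoding, and a software loop only imitates what hardware would do in parallel.

```python
from typing import List


def simd_lut(source: List[int], function_register: int, index_bits: int = 2) -> List[int]:
    """Look up one output bit per lane from a truth table packed into an integer.

    Hypothetical software model: bit i of `function_register` is the table
    entry for input value i.
    """
    mask = (1 << index_bits) - 1
    # The low `index_bits` bits of each lane select one bit of the table.
    return [(function_register >> (lane & mask)) & 1 for lane in source]


# Truth table for XOR of two input bits: entries for 0b00, 0b01, 0b10, 0b11
# are 0, 1, 1, 0, packed with entry 0 in the least significant bit.
XOR_TABLE = 0b0110

lanes = [0b00, 0b01, 0b10, 0b11]
result = simd_lut(lanes, XOR_TABLE)
print(result)  # [0, 1, 1, 0]
```

Changing only the value of the function register changes the function applied to all lanes, which is the flexibility the abstract attributes to table-driven SIMD instructions.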
SERVER, CONTROL DEVICE FOR VEHICLE, AND MACHINE LEARNING SYSTEM FOR VEHICLE
A server including a processor configured to: receive a data set from a vehicle; create a plurality of learned models of different scales by performing machine learning using the data set; receive from the vehicle information on the computing power of an electronic control unit that controls the vehicle by applying the learned model; and transmit the learned model to the vehicle, wherein the processor is configured to transmit a larger-scale learned model to a vehicle whose electronic control unit has high computing power than to a vehicle whose electronic control unit has low computing power.
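One possible reading of the selection step is sketched below: the server picks the largest learned model that fits the computing power reported by the vehicle. The function `select_model`, the `MODELS` table, and the unitless cost numbers are hypothetical. The abstract states only that vehicles with more computing power receive larger models, not how the match is made.

```python
from typing import Dict, Optional


def select_model(model_costs: Dict[str, float], ecu_power: float) -> Optional[str]:
    """Return the name of the largest model whose cost fits within `ecu_power`.

    Hypothetical rule: "scale" and "computing power" are expressed in the same
    arbitrary unit, and a model fits if its cost does not exceed the power.
    """
    feasible = {name: cost for name, cost in model_costs.items() if cost <= ecu_power}
    if not feasible:
        return None  # no model is small enough for this control unit
    return max(feasible, key=feasible.get)


# Made-up model scales for illustration only.
MODELS = {"small": 1.0, "medium": 4.0, "large": 16.0}

print(select_model(MODELS, 5.0))   # medium
print(select_model(MODELS, 20.0))  # large
```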
Scaling performance across a large number of customer nodes
Described are systems and methods for scaling performance across a large number of customer nodes by delegating management of execution of one or more tasks to the customer nodes. An example method may commence with ascertaining a set of the customer nodes eligible for delegation of the one or more tasks. The method may continue with deploying one or more control agents to the eligible set of the customer nodes. The one or more control agents may be configured to coordinate and execute the one or more tasks on the eligible set of customer nodes and selectively take one or more actions based on results of the execution of the one or more tasks.
Thermal state inference based frequency scaling
The systems and methods monitor thermal states associated with a device and set thermal thresholds for the device, inferring the thresholds from information gathered by a client application running on the device. When one of the monitored thermal states violates one of the thermal thresholds, the systems and methods implement a stored policy associated with that violation.
DYNAMIC WORKFLOW OPTIMIZATION USING MACHINE LEARNING TECHNIQUES
A system and a method are disclosed for recommending a change to improve performance of a target workflow. A workflow management system receives the target workflow intended to be used in a particular context to achieve a target result. The target workflow has a structure with a plurality of steps performed in a predefined order, but there may be options for modifying the workflow to lead to better performance (e.g., change type of action performed in a step, change order of steps, add a new step). The workflow management system identifies candidate workflows that are similar to the target workflow and identifies historical changes that have been made to these candidate workflows. Using a machine learning model, the workflow management system determines a change from one of the historical changes made to the candidate workflows associated with the highest expected impact when applied to the target workflow.
Scaling Performance Across a Large Number of Customer Nodes
Described are systems and methods for scaling performance across a large number of customer nodes by delegating management of execution of one or more tasks to the customer nodes. An example method may commence with ascertaining a set of the customer nodes eligible for delegation of the one or more tasks. The method may continue with deploying one or more control agents to the eligible set of the customer nodes. The one or more control agents may be configured to coordinate and execute the one or more tasks on the eligible set of customer nodes and selectively take one or more actions based on results of the execution of the one or more tasks.
Initialization of Parameters for Machine-Learned Transformer Neural Network Architectures
An online system trains a transformer architecture by an initialization method which allows the transformer architecture to be trained without normalization layers or learning rate warmup, resulting in significant improvements in computational efficiency for transformer architectures. Specifically, an attention block included in an encoder or a decoder of the transformer architecture generates the set of attention representations by applying a key matrix to the input key, a query matrix to the input query, and a value matrix to the input value to generate an output, and applying an output matrix to the output to generate the set of attention representations. The initialization method may be performed by scaling the parameters of the value matrix and the output matrix with a factor that is inverse to a number of the set of encoders or a number of the set of decoders.
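The scaling step can be sketched as follows. The abstract says only that the factor is "inverse to" the number of encoders or decoders, so the exact form `1 / num_layers` used here is an assumption, as are the helper names, the Gaussian base initialization, and the `std` value.

```python
import random
from typing import Dict, List

Matrix = List[List[float]]


def random_matrix(rows: int, cols: int, rng: random.Random, std: float = 0.02) -> Matrix:
    """Draw a rows x cols matrix from a zero-mean Gaussian (assumed base init)."""
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


def scale(matrix: Matrix, factor: float) -> Matrix:
    """Multiply every entry of `matrix` by `factor`."""
    return [[w * factor for w in row] for row in matrix]


def init_attention_block(d_model: int, num_layers: int, seed: int = 0) -> Dict[str, Matrix]:
    """Initialize one attention block, scaling only the value and output matrices."""
    rng = random.Random(seed)
    factor = 1.0 / num_layers  # assumed form of the "inverse" factor
    return {
        "key": random_matrix(d_model, d_model, rng),
        "query": random_matrix(d_model, d_model, rng),
        "value": scale(random_matrix(d_model, d_model, rng), factor),
        "output": scale(random_matrix(d_model, d_model, rng), factor),
    }


shallow = init_attention_block(d_model=4, num_layers=1)
deep = init_attention_block(d_model=4, num_layers=4)
# With the same seed, the deep block's value weights are 1/4 of the shallow ones,
# while the key and query weights are unchanged.
print(deep["value"][0][0] / shallow["value"][0][0])
```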
Autoscaling and throttling in an elastic cloud service
Techniques described herein can optimize usage of computing resources in a data system. Dynamic throttling can be performed locally on a computing resource in the foreground and autoscaling can be performed in a centralized fashion in the background. Dynamic throttling can lower the load without overshooting while minimizing oscillation and reducing the throttle quickly. Autoscaling may involve scaling in or out the number of computing resources in a cluster as well as scaling up or down the type of computing resources to handle different types of situations.
Processing of a temporary-register-using instruction including determining whether to process a register move micro-operation for transferring data from a first register file to a second register file based on whether a temporary variable is still available in the second register file
An apparatus has a processing pipeline, and first and second register files. A temporary-register-using instruction is supported which controls the pipeline to perform an operation using a temporary variable derived from an operand stored in the first register file. In response to the instruction, when a predetermined condition is not satisfied, the pipeline processes at least one register move micro-operation to transfer data from the at least one source register of the first register file to at least one newly allocated temporary register of the second register file. When the condition is satisfied, the operation can be performed using a temporary variable already stored in the temporary register of the second register file used by an earlier temporary-register-using instruction specifying the same source register for determining the temporary variable, in the absence of an intervening instruction for rewriting the source register.
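The reuse condition in this abstract can be modeled with a small bookkeeping sketch. The `Pipeline` class, its method names, and the string form of the micro-operations are hypothetical. The sketch tracks only which temporary register holds a copy of which source register, and drops that record when the source is rewritten.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Pipeline:
    """Toy model: tracks temporaries in the second register file by source register."""
    temp_for_source: Dict[str, str] = field(default_factory=dict)
    next_temp: int = 0
    micro_ops: List[str] = field(default_factory=list)

    def write_source(self, src: str) -> None:
        # An intervening write to the source register invalidates any cached copy.
        self.temp_for_source.pop(src, None)
        self.micro_ops.append(f"write {src}")

    def temp_using(self, src: str) -> str:
        temp = self.temp_for_source.get(src)
        if temp is None:
            # Condition not satisfied: allocate a new temporary and move the data.
            temp = f"t{self.next_temp}"
            self.next_temp += 1
            self.temp_for_source[src] = temp
            self.micro_ops.append(f"move {src} -> {temp}")
        # Either way, the operation itself uses the temporary register.
        self.micro_ops.append(f"op using {temp}")
        return temp


p = Pipeline()
p.temp_using("r1")    # first use: a register move is needed
p.temp_using("r1")    # same source, no rewrite in between: the move is skipped
p.write_source("r1")  # rewrite of the source register
p.temp_using("r1")    # the copy is stale, so a new move is issued
print(p.micro_ops)
```

In this model the second instruction saves one register move micro-operation, which is the saving the abstract describes for repeated use of the same source register.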