Patent classifications
G06F9/28
Software assisted power management
Embodiments include an apparatus comprising an execution unit coupled to a memory, a microcode controller, and a hardware controller. The microcode controller is to identify a global power and performance hint in an instruction stream that includes first and second instruction phases to be executed in parallel, identify a first local hint based on synchronization dependence in the first instruction phase, and use the first local hint to balance power consumption between the execution unit and the memory during parallel executions of the first and second instruction phases. The hardware controller is to use the global hint to determine an appropriate voltage level of a compute voltage and a frequency of a compute clock signal for the execution unit during the parallel executions of the first and second instruction phases. The first local hint includes a processing rate for the first instruction phase or an indication of the processing rate.
Analytic workload partitioning for security and performance optimization
The present disclosure provides privacy preservation of analytic workflows based on splitting the workflow into sub-workflows each with different privacy-preserving characteristics. Libraries are generated that provide for formatting and/or encrypting data for use in the sub-workflows and also for compiling a machine learning algorithm for the sub-workflows. Subsequently, the sub-workflows can be executed using the compiled algorithm and formatted data.
Automatic scaling of microservices applications
A device may receive information identifying a set of tasks to be executed by a microservices application that includes a plurality of microservices. The device may determine an execution time of the set of tasks based on a set of parameters and a model. The set of parameters may include a first parameter that identifies a first number of instances of a first microservice of the plurality of microservices, and a second parameter that identifies a second number of instances of a second microservice of the plurality of microservices. The device may compare the execution time and a threshold. The threshold may be associated with a service level agreement. The device may selectively adjust the first number of instances or the second number of instances based on comparing the execution time and the threshold.
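The compare-and-adjust loop described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the model, so the sketch assumes a hypothetical two-stage cost model in which each microservice's share of the execution time shrinks in proportion to its instance count. The names `Params`, `predict_execution_time`, `scale_to_meet_sla`, the per-task costs, and the instance cap are all illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass

# Hypothetical per-task costs in seconds for each microservice (assumed values).
FIRST_COST = 2.0
SECOND_COST = 1.0


@dataclass
class Params:
    first_instances: int   # number of instances of the first microservice
    second_instances: int  # number of instances of the second microservice


def predict_execution_time(num_tasks: int, params: Params) -> float:
    """Assumed model: each stage's time is inversely proportional to its instances."""
    per_task = FIRST_COST / params.first_instances + SECOND_COST / params.second_instances
    return num_tasks * per_task


def scale_to_meet_sla(num_tasks: int, params: Params, sla_seconds: float,
                      max_instances: int = 16) -> Params:
    """Add instances to the slower stage until the predicted time meets the threshold."""
    current = Params(params.first_instances, params.second_instances)
    while predict_execution_time(num_tasks, current) > sla_seconds:
        first_share = FIRST_COST / current.first_instances
        second_share = SECOND_COST / current.second_instances
        if first_share >= second_share and current.first_instances < max_instances:
            current.first_instances += 1
        elif current.second_instances < max_instances:
            current.second_instances += 1
        elif current.first_instances < max_instances:
            current.first_instances += 1
        else:
            break  # both microservices are at the cap; the threshold cannot be met
    return current


if __name__ == "__main__":
    print(scale_to_meet_sla(100, Params(1, 1), sla_seconds=100.0))
```

Growing only the stage that currently dominates the predicted time is one simple way to "selectively adjust" one count or the other; a real system could just as well scale down when the predicted time is far below the threshold.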
Technologies for dynamic accelerator selection
Technologies for dynamic accelerator selection include a compute sled. The compute sled includes a network interface controller to communicate with a remote accelerator of an accelerator sled over a network, where the network interface controller includes a local accelerator and a compute engine. The compute engine is to obtain network telemetry data indicative of a level of bandwidth saturation of the network. The compute engine is also to determine whether to accelerate a function managed by the compute sled. The compute engine is further to determine, in response to a determination to accelerate the function, whether to offload the function to the remote accelerator of the accelerator sled based on the telemetry data. The compute engine is also to assign, in response to a determination not to offload the function to the remote accelerator, the function to the local accelerator of the network interface controller.
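The decision sequence in this abstract can be sketched as a small function. The abstract states only that the offload decision is based on telemetry indicating bandwidth saturation; the fixed `saturation_limit` threshold, the 0.0 to 1.0 saturation scale, and the names `Target` and `select_accelerator` are assumptions made for illustration.

```python
from enum import Enum


class Target(Enum):
    NONE = "none"      # function is not accelerated
    LOCAL = "local"    # accelerator on the network interface controller
    REMOTE = "remote"  # accelerator on the accelerator sled


def select_accelerator(should_accelerate: bool,
                       bandwidth_saturation: float,
                       saturation_limit: float = 0.8) -> Target:
    """Pick where to run a function given network saturation in the range 0.0 to 1.0.

    The threshold comparison is an assumed policy; the source only says the
    decision is based on the telemetry data.
    """
    if not should_accelerate:
        return Target.NONE
    if bandwidth_saturation < saturation_limit:
        # The network has headroom, so offload to the remote accelerator.
        return Target.REMOTE
    # The network is saturated, so keep the work on the local accelerator.
    return Target.LOCAL


if __name__ == "__main__":
    print(select_accelerator(True, 0.35))
    print(select_accelerator(True, 0.95))
```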
NPU implemented for artificial neural networks to process fusion of heterogeneous data received from heterogeneous sensors
A neural processing unit (NPU) includes a controller including a scheduler, the controller configured to receive from a compiler a machine code of an artificial neural network (ANN) including a fusion ANN, the machine code including data locality information of the fusion ANN, and receive heterogeneous sensor data from a plurality of sensors corresponding to the fusion ANN; at least one processing element configured to perform fusion operations of the fusion ANN including a convolution operation and at least one special function operation; a special function unit (SFU) configured to perform a special function operation of the fusion ANN; and an on-chip memory configured to store operation data of the fusion ANN, wherein the scheduler is configured to control the at least one processing element and the on-chip memory such that all operations of the fusion ANN are processed in a predetermined sequence according to the data locality information.