Patent classifications
G06F9/5094
ALLOCATION OF DATA SUB-TENSORS ONTO HARDWARE SUB-ARRAYS
Certain aspects of the present disclosure provide techniques for improved hardware utilization. An input data tensor is divided into a first plurality of sub-tensors, and a plurality of logical sub-arrays in a physical multiply-and-accumulate (MAC) array is identified. For each respective sub-tensor of the first plurality of sub-tensors, the respective sub-tensor is mapped to a respective logical sub-array of the plurality of logical sub-arrays, and the respective sub-tensor is processed using the respective logical sub-array.
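As a rough illustration of the technique described above, the sketch below divides a 2-D tensor into tiles and maps each tile to a logical sub-array object. The NumPy tile sizes, the round-robin mapping, and the matrix-product stand-in for MAC processing are assumptions for illustration, not the patented method.

```python
import numpy as np

def split_into_subtensors(tensor, tile_rows, tile_cols):
    """Divide a 2-D input tensor into equally sized sub-tensors (tiles)."""
    rows, cols = tensor.shape
    return [
        tensor[r:r + tile_rows, c:c + tile_cols]
        for r in range(0, rows, tile_rows)
        for c in range(0, cols, tile_cols)
    ]

class LogicalSubArray:
    """Stand-in for one logical sub-array carved out of a physical MAC array."""
    def __init__(self, sub_array_id):
        self.sub_array_id = sub_array_id

    def process(self, sub_tensor, weights):
        # A MAC sub-array fundamentally performs multiply-and-accumulate;
        # a matrix product is used here as a software stand-in.
        return sub_tensor @ weights

# Hypothetical sizes: a 16x16 input tensor, 4x4 tiles, four logical sub-arrays.
input_tensor = np.random.rand(16, 16)
weights = np.random.rand(4, 4)
sub_tensors = split_into_subtensors(input_tensor, 4, 4)
sub_arrays = [LogicalSubArray(i) for i in range(4)]

# Map each sub-tensor to a logical sub-array (round-robin for illustration)
# and process it using that sub-array.
results = [
    sub_arrays[i % len(sub_arrays)].process(st, weights)
    for i, st in enumerate(sub_tensors)
]
print(len(results), "sub-tensor results computed")
```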
CACHE RESIZING BASED ON PROCESSOR WORKLOAD
A processor sets the size of a processor cache based on an identified workload executing at the processor. The cache size is set in response to the processor exiting a low-power mode. By setting the size of the cache based on the workload, the processor is able to tailor the size of the cache to the characteristics of a particular workload while also reducing, for at least some workloads, the overhead associated with entering or exiting the low-power mode.
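A minimal sketch of the described control flow, assuming a hypothetical table of workload-to-cache-size preferences and a placeholder `set_cache_ways` hardware hook; the workload names and way counts are illustrative.

```python
# Hypothetical mapping from an identified workload class to a cache size
# (expressed here as a number of enabled ways); values are illustrative.
CACHE_WAYS_BY_WORKLOAD = {
    "streaming": 4,    # small footprint, little reuse -> small cache
    "database": 16,    # large working set -> full cache
    "idle": 2,         # minimize retained state and flush overhead
}

def set_cache_ways(ways: int) -> None:
    """Placeholder for the hardware hook that resizes the cache."""
    print(f"cache resized to {ways} ways")

def on_low_power_exit(workload_id: str) -> None:
    # Resize the cache when leaving the low-power mode, using the
    # identified workload to pick the target size (default: full size).
    ways = CACHE_WAYS_BY_WORKLOAD.get(workload_id, 16)
    set_cache_ways(ways)

on_low_power_exit("streaming")
```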
POWER CONSUMPTION OPTIMIZATION ON THE CLOUD
A power consumption optimization system includes a virtual machine (VM) provisioned on a host, a memory, a server, and a processor in communication with the memory. The processor causes the server to store a power consumption profile of the VM. The VM runs at a processor frequency state. Additionally, the processor causes the server to receive a request to lower a processor frequency for the VM from an original processor frequency state to a reduced processor frequency state. The request has request criteria indicating a time duration associated with the request. The server validates the request criteria and a requirement of another tenant on the host. Responsive to validating the request criteria and the requirement of the other tenant on the host, the server confirms the request to lower the processor frequency. Additionally, the server lowers the processor frequency to the reduced processor frequency state during the time duration.
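The sketch below follows the validation flow described above under assumed data structures: the request fields, the co-tenant check, and the frequency values are illustrative and not taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class FrequencyRequest:
    vm_id: str
    original_mhz: int
    reduced_mhz: int
    duration_s: float            # time duration associated with the request

def other_tenants_satisfied(host_id: str, reduced_mhz: int) -> bool:
    """Placeholder for validating the requirement of the other tenant on the host."""
    return reduced_mhz >= 1200   # illustrative co-tenant minimum

def handle_request(req: FrequencyRequest, host_id: str) -> bool:
    # Validate the request criteria and the co-tenant requirement before
    # confirming the request and lowering the frequency for the duration.
    if req.duration_s <= 0 or req.reduced_mhz >= req.original_mhz:
        return False
    if not other_tenants_satisfied(host_id, req.reduced_mhz):
        return False
    print(f"{req.vm_id}: lowered to {req.reduced_mhz} MHz for {req.duration_s} s,"
          f" then restored to {req.original_mhz} MHz")
    return True

handle_request(FrequencyRequest("vm-1", 2400, 1600, 30.0), "host-a")
```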
ADAPTIVELY REALLOCATING RESOURCES OF RESOURCE-CONSTRAINED DEVICES
Implementations are disclosed for adaptively reallocating computing resources of resource-constrained devices between tasks performed in situ by those resource-constrained devices. In various implementations, while the resource-constrained device is transported through an agricultural area, computing resource usage of the resource-constrained device may be monitored. Additionally, phenotypic output generated by one or more phenotypic tasks performed onboard the resource-constrained device may be monitored. Based on the monitored computing resource usage and the monitored phenotypic output, a state may be generated and processed based on a policy model to generate a probability distribution over a plurality of candidate reallocation actions. Based on the probability distribution, candidate reallocation action(s) may be selected and performed to reallocate at least some computing resources between a first phenotypic task of the one or more phenotypic tasks and a different task while the resource-constrained device is transported through the agricultural area.
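A toy sketch of the state-to-action step described above, assuming a linear softmax policy and made-up state features and action names; a trained policy model would replace the random weights used here.

```python
import numpy as np

# Candidate reallocation actions (illustrative): shift compute between a
# phenotypic task and a different onboard task, or leave allocation alone.
ACTIONS = ["shift_to_phenotyping", "shift_to_navigation", "no_change"]

def policy_model(state: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Toy linear policy: logits = W @ state, softmax -> action probabilities."""
    logits = weights @ state
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

rng = np.random.default_rng(0)
# State built from monitored resource usage and phenotypic output (assumed
# features: CPU load, memory load, phenotypic detections per second).
state = np.array([0.9, 0.7, 2.5])
weights = rng.normal(size=(len(ACTIONS), state.size))

probs = policy_model(state, weights)
action = rng.choice(ACTIONS, p=probs)   # select an action from the distribution
print(dict(zip(ACTIONS, probs.round(3))), "->", action)
```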
CLOUD SERVICE SYSTEM AND OPERATION METHOD THEREOF
A cloud service system and an operation method thereof are provided. The cloud service system includes a first computing resource pool, a second computing resource pool, and a task dispatch server. Each computing platform in the first computing resource pool does not have a co-processor. Each computing platform in the second computing resource pool has at least one co-processor. The task dispatch server is configured to receive a plurality of tasks. The task dispatch server checks a task attribute of a task to be dispatched currently among the tasks. The task dispatch server chooses to dispatch the task to be dispatched currently to the first computing resource pool or to the second computing resource pool for execution according to the task attribute.
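A minimal sketch of the attribute-based dispatch, assuming a boolean `needs_coprocessor` task attribute and trivial placement within the chosen pool; pool and node names are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    needs_coprocessor: bool    # illustrative task attribute

# Two resource pools: platforms without a co-processor, and platforms
# with at least one co-processor (node names are illustrative).
FIRST_POOL = ["cpu-node-1", "cpu-node-2"]
SECOND_POOL = ["gpu-node-1"]

def dispatch(task: Task) -> str:
    # Check the task attribute and choose the pool accordingly.
    pool = SECOND_POOL if task.needs_coprocessor else FIRST_POOL
    return pool[0]             # trivial placement within the chosen pool

print(dispatch(Task("transcode", needs_coprocessor=True)))
print(dispatch(Task("log-rotate", needs_coprocessor=False)))
```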
Power controller communication latency mitigation
In controlling power in a portable computing device (“PCD”), a power supply input to a PCD subsystem may be modulated with a modulation signal when an over-current condition is detected. Detection of the modulation signal may indicate to a processing core of the subsystem to reduce its processing load. Compensation for the modulation signal in the power supply input may be applied so that the processing core is essentially unaffected by the modulation signal.
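The toy simulation below sketches the feedback loop described above: a supply controller asserts a modulation signal on an over-current condition, and the processing core throttles when it detects that signal. The current model, threshold, and 20% throttle step are illustrative assumptions.

```python
OVER_CURRENT_LIMIT_A = 5.0     # illustrative over-current threshold

def supply_controller(measured_current_a: float) -> bool:
    """Return True when a modulation signal is applied to the supply input."""
    return measured_current_a > OVER_CURRENT_LIMIT_A

def core_step(modulation_detected: bool, load: float) -> float:
    # The core detects the modulation signal on its supply and reduces its
    # processing load; a 20% reduction per step is an illustrative response.
    return load * 0.8 if modulation_detected else load

load, current = 1.0, 6.2
for step in range(3):
    modulated = supply_controller(current)
    load = core_step(modulated, load)
    current *= load            # crude model: supply current tracks processing load
    print(f"step {step}: load={load:.2f}, current={current:.2f} A")
```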
ACCELERATOR SCHEDULING
An information handling system may include at least one central processing unit (CPU); and a plurality of special-purpose processing units. The information handling system may be configured to: receive information regarding cooling characteristics of the plurality of special-purpose processing units; and assign identification (ID) numbers to each of the plurality of special-purpose processing units in an order that is determined based at least in part on the cooling characteristics.
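A minimal sketch of the ID-assignment step, assuming a single scalar cooling metric per accelerator and that lower IDs are preferred by the default scheduling order; the slot names and values are illustrative.

```python
# Hypothetical cooling characteristic per accelerator: higher value means
# better cooling (e.g., closer to the chassis fans); values are illustrative.
accelerators = [
    {"slot": "PCIe-3", "cooling": 0.6},
    {"slot": "PCIe-1", "cooling": 0.9},
    {"slot": "PCIe-2", "cooling": 0.4},
]

# Assign ID numbers in an order determined by the cooling characteristics,
# so that lower IDs (scheduled first by default) land on better-cooled devices.
ranked = sorted(accelerators, key=lambda a: a["cooling"], reverse=True)
for new_id, acc in enumerate(ranked):
    acc["id"] = new_id

print(sorted(accelerators, key=lambda a: a["id"]))
```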
Method and apparatus for differentially optimizing quality of service (QoS)
A method and apparatus for differentially optimizing quality of service (QoS) includes: establishing a system model of a multi-task offloading framework; acquiring the mode in which users execute a computation task and, according to that mode, executing the system model of the multi-task offloading framework; and optimizing QoS on the basis of a multi-objective optimization method for multi-agent deep reinforcement learning. According to the present invention, an offloading policy is computed on the basis of multi-user differentiated QoS using multi-agent deep reinforcement learning. Taking into account the differentiated QoS requirements of different users in the system, a global offloading decision is made according to task performance requirements and the network resource state, and performance optimization is differentiated across user requirements, thereby effectively improving system resource utilization and user quality of service.
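The sketch below illustrates only the differentiated-QoS idea with a scalarized cost per user; the latency/energy models and per-user weights are assumptions, and in the described scheme the offload/local decision would come from a trained multi-agent deep reinforcement learning policy rather than this closed-form comparison.

```python
# Each user agent has its own (illustrative) QoS weighting between latency
# and energy, reflecting differentiated QoS requirements.
USERS = {
    "user-A": {"w_latency": 0.8, "w_energy": 0.2},   # latency-sensitive
    "user-B": {"w_latency": 0.3, "w_energy": 0.7},   # energy-sensitive
}

def cost(latency, energy, weights):
    """Scalarized multi-objective cost for one candidate decision."""
    return weights["w_latency"] * latency + weights["w_energy"] * energy

def decide(user, network_load):
    # Illustrative latency/energy models for executing locally vs. offloading;
    # the offload latency grows with the current network resource load.
    local = cost(latency=0.50, energy=0.90, weights=USERS[user])
    offload = cost(latency=0.20 + 0.6 * network_load, energy=0.30, weights=USERS[user])
    return "offload" if offload < local else "local"

for user in USERS:
    print(user, decide(user, network_load=1.0))
```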
COMPUTER-READABLE RECORDING MEDIUM STORING PROGRAM AND MANAGEMENT METHOD
A recording medium stores a program for causing a computer to execute processing including: acquiring, when the process executed by a first processor core is switched from a first process to a second process, the execution time of the first process and the energy consumed by the first processor core during that execution time; specifying, from among process groups each of which is a group of processes, one or more processes of a first process group to which the first process belongs, and calculating an index that indicates the energy consumption per unit time involved in execution of the first process group, based on the execution time and the energy consumption acquired for the specified one or more processes; and controlling an operation of a processor core to which a process is allocated according to a comparison of the index calculated for the first process group with a threshold.
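A small worked example of the index calculation, assuming per-process (execution time, energy) measurements and a threshold in watts; the control action taken on the comparison (which core the group is steered to) is also an assumption for illustration.

```python
# Per-process measurements taken when the core switches away from each
# process: (execution time in seconds, energy consumed in joules).
# Values are illustrative.
measurements = {
    "procA": (2.0, 10.0),
    "procB": (1.0, 8.0),
}
process_groups = {"group1": ["procA", "procB"]}

def group_power_index(group: str) -> float:
    """Energy consumption per unit time across the processes in the group."""
    total_time = sum(measurements[p][0] for p in process_groups[group])
    total_energy = sum(measurements[p][1] for p in process_groups[group])
    return total_energy / total_time          # joules per second (watts)

THRESHOLD_W = 5.0                             # illustrative threshold
index = group_power_index("group1")
# Compare the index with the threshold to control the processor core to
# which the group's processes are allocated (the direction of the choice
# here is an assumption).
core = "efficiency-core" if index > THRESHOLD_W else "performance-core"
print(f"index = {index:.1f} W -> allocate to {core}")
```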
Compiler for implementing memory shutdown for neural network implementation configuration
Some embodiments provide a compiler for optimizing the implementation of a machine-trained network (e.g., a neural network) on an integrated circuit (IC). The compiler of some embodiments receives a specification of a machine-trained network including multiple layers of computation nodes and generates a graph representing options for implementing the machine-trained network in the IC. In some embodiments, the graph includes nodes representing options for implementing each layer of the machine-trained network and edges between nodes for different layers representing different implementations that are compatible. The compiler of some embodiments is also responsible for generating instructions relating to shutting down (and waking up) memory units of cores. In some embodiments, the memory units to shut down are determined by the compiler based on the data that is stored or will be stored in the particular memory units.
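A minimal sketch of the shutdown-decision step, assuming a hypothetical table of per-core memory units annotated with the layer data the compiler has placed (or will place) in them; units that hold no data receive a shutdown instruction. The unit names, layer names, and instruction format are illustrative.

```python
# Hypothetical per-core memory units and the layers whose data the compiler
# has placed or scheduled in each unit (placement and names are illustrative).
memory_units = {
    "core0_mem0": {"layers": ["conv1"]},
    "core0_mem1": {"layers": []},          # nothing stored or scheduled here
    "core1_mem0": {"layers": ["conv2", "fc1"]},
}

def shutdown_instructions(units: dict) -> list[str]:
    """Emit shutdown instructions for units that hold no (future) data."""
    return [f"SHUTDOWN {name}" for name, info in units.items() if not info["layers"]]

print(shutdown_instructions(memory_units))
```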