
DYNAMIC WORKFLOW SELECTION USING STRUCTURE AND CONTEXT FOR SCALABLE OPTIMIZATION
20210383289 · 2021-12-09

A system and a method are disclosed for recommending a change to improve performance of a target workflow. A workflow management system receives the target workflow intended to be used in a particular context to achieve a target result. The target workflow has a structure with a plurality of steps performed in a predefined order, but there may be options for modifying the workflow that lead to better performance (e.g., changing the type of action performed in a step, changing the order of steps, or adding a new step). The workflow management system identifies candidate workflows that are similar to the target workflow and identifies historical changes that have been made to these candidate workflows. Using a machine learning model, the workflow management system selects, from the historical changes made to the candidate workflows, the change associated with the highest expected impact when applied to the target workflow.
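The retrieve-similar-then-score flow described above can be sketched as follows. The similarity measure (step-set overlap) and the scoring rule (impact weighted by similarity, standing in for the machine learning model) are illustrative assumptions, not the patented method.

```python
# Sketch: recommend the historical change with the highest expected impact,
# drawn from workflows structurally similar to the target.
from dataclasses import dataclass, field

@dataclass
class Workflow:
    steps: tuple              # ordered step types, e.g. ("extract", "clean", "load")
    history: list = field(default_factory=list)  # (change, observed_impact) pairs

def similarity(a: Workflow, b: Workflow) -> float:
    """Jaccard overlap of step types stands in for the structural comparison."""
    sa, sb = set(a.steps), set(b.steps)
    return len(sa & sb) / len(sa | sb)

def recommend_change(target: Workflow, candidates: list, min_sim: float = 0.5):
    """Return the historical change with the highest score, where the score
    (similarity x impact) is a stand-in for the ML model's prediction."""
    best_change, best_score = None, float("-inf")
    for cand in candidates:
        sim = similarity(target, cand)
        if sim < min_sim:
            continue  # only consider sufficiently similar candidate workflows
        for change, impact in cand.history:
            score = sim * impact
            if score > best_score:
                best_change, best_score = change, score
    return best_change
```

A real system would replace both the similarity function and the scoring rule with learned components; the control flow stays the same.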

Thermal state inference based frequency scaling
11368558 · 2022-06-21

The systems and methods monitor thermal states associated with a device. The systems and methods set thermal thresholds associated with the device. The systems and methods infer the thermal thresholds from information gathered by a client application running on the device. The systems and methods implement a stored policy associated with a violation of one of the thermal thresholds by one of the monitored thermal states.
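The threshold-and-policy mechanism above can be sketched as a small lookup: thresholds per sensor, and a stored policy keyed by the violated threshold. The sensor names, limits, and actions here are invented for illustration.

```python
# Sketch: map thermal threshold violations to stored policy actions.
THERMAL_THRESHOLDS = {"cpu": 85.0, "battery": 45.0}   # degrees C, illustrative

POLICIES = {  # stored policy associated with each threshold
    "cpu": lambda: "scale_frequency_down",
    "battery": lambda: "pause_background_work",
}

def check_thermal_state(readings: dict) -> list:
    """Return the policy actions triggered by any threshold violations."""
    actions = []
    for sensor, value in readings.items():
        limit = THERMAL_THRESHOLDS.get(sensor)
        if limit is not None and value > limit:
            actions.append(POLICIES[sensor]())
    return actions
```

In the patent the thresholds themselves are inferred from client-application data rather than hard-coded; this sketch only shows the violation-to-policy step.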

Hierarchical workload allocation in a storage system
11366700 · 2022-06-21

A method for hierarchical workload allocation in a storage system, the method may include determining to reallocate a compute workload of a current compute core of the storage system; wherein the current compute core is responsible for executing a workload allocation unit that comprises one or more first type shards; and reallocating the compute workload by (a) maintaining the responsibility of the current compute core for executing the workload allocation unit, and (b) reallocating at least one first type shard of the one or more first type shards to a new workload allocation unit that is allocated to one of one or more new compute cores.
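The two-part reallocation can be sketched directly: the current core keeps its workload allocation unit, while selected shards move into a new unit on a new core. The class and field names here are assumptions for illustration.

```python
# Sketch: move shards to a new allocation unit while the original unit
# (and the core responsible for it) stays in place.
class AllocationUnit:
    def __init__(self, core: str, shards: list):
        self.core = core              # compute core responsible for this unit
        self.shards = list(shards)    # first-type shards it comprises

def reallocate(unit: AllocationUnit, shards_to_move: set, new_core: str) -> AllocationUnit:
    """(a) keep `unit` on its current core; (b) move the selected shards
    into a new allocation unit assigned to `new_core`."""
    moved = [s for s in unit.shards if s in shards_to_move]
    unit.shards = [s for s in unit.shards if s not in shards_to_move]
    return AllocationUnit(new_core, moved)
```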

Autoscaling and throttling in an elastic cloud service

Techniques described herein can optimize usage of computing resources in a data system. Dynamic throttling can be performed locally on a computing resource in the foreground, while autoscaling is performed in a centralized fashion in the background. Dynamic throttling can lower the load without overshooting, minimize oscillation, and release the throttle quickly. Autoscaling may involve scaling in or out the number of computing resources in a cluster as well as scaling up or down the type of computing resources to handle different types of situations.
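The local throttling behavior described above (lower load without overshooting, release quickly) can be sketched as a simple controller: cut the admission limit multiplicatively in proportion to the overload, and relax it quickly once load falls. The factors, bounds, and units are illustrative assumptions, not the patented control law.

```python
# Sketch: a local dynamic throttle that avoids overshoot on decrease
# and releases quickly on recovery.
class Throttle:
    def __init__(self, limit: float = 100.0, floor: float = 10.0, ceiling: float = 100.0):
        self.limit = limit        # current admitted load
        self.floor = floor        # never throttle below this
        self.ceiling = ceiling    # never admit more than this

    def update(self, load: float, target: float) -> float:
        if load > target:
            # proportional multiplicative decrease: lands at the target
            # rather than overshooting past it
            self.limit = max(self.floor, self.limit * target / load)
        else:
            # fast multiplicative release reduces the throttle quickly
            self.limit = min(self.ceiling, self.limit * 1.25)
        return self.limit
```

The centralized autoscaler would observe sustained throttling and add (or upgrade) resources in the background, which this sketch omits.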

SCALING SYSTEM AND CALCULATION METHOD
20220156101 · 2022-05-19

A scaling system spans a first base, in which a first calculating apparatus having first and second worker nodes operates, and a second base, in which a storage apparatus connected to the first base by a network is installed. The storage apparatus includes first and second network ports, and first and second volumes accessed by the first and second worker nodes, respectively. While the first and second volumes communicate with the first calculating apparatus through the first network port, if the transfer rate of the first calculating apparatus or of the first network port exceeds a predetermined threshold, a second calculating apparatus is installed in the first base, the second worker node is moved to the second calculating apparatus to operate as a third worker node, and the second volume communicates with the third worker node through the second network port.
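The threshold-triggered split can be sketched as a placement decision: while either transfer rate stays below the threshold, both workers share the first apparatus and port; once the threshold is exceeded, the second worker is rehomed as a third worker on a new apparatus using the second port. All names and the threshold value are illustrative.

```python
# Sketch: decide worker/volume placement from measured transfer rates.
THRESHOLD_MBPS = 800.0  # illustrative threshold

def plan_placement(apparatus_rate: float, port_rate: float) -> dict:
    """Map each worker to its (calculating apparatus, network port)."""
    if apparatus_rate > THRESHOLD_MBPS or port_rate > THRESHOLD_MBPS:
        # split: worker 2 becomes worker 3 on the second apparatus,
        # reaching its volume through the second network port
        return {"worker1": ("apparatus1", "port1"),
                "worker3": ("apparatus2", "port2")}
    # below threshold: both workers share the first apparatus and port
    return {"worker1": ("apparatus1", "port1"),
            "worker2": ("apparatus1", "port1")}
</```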

MESSAGE BASED MULTI-PROCESSOR SYSTEM AND METHOD OF OPERATING THE SAME
20230266974 · 2023-08-24

The present application discloses a message based multi-processor system (1) comprising a message exchange network (R, L) and a plurality of processor clusters (Ci,j) capable of mutually exchanging messages via the message exchange network. A processor cluster (Ci,j) comprises one or more processor cluster elements (PCE) and a message generator (MG). The message based multi-processor system (1) is configured as a neural network processor system having a plurality of neural network processing layers (e.g. NL1, . . . , NL5), each being assigned one or more of the processor clusters, with their associated processor cluster elements serving as neural network processing elements. The message generator (MG) of a processor cluster (Ci,j) associated with a neural network processing layer comprises a logic module (MGL) and an associated message generator control storage space (MGM). In response to an activation signal (Sact([X,Y])) of a processor cluster element, the logic module of a message generator selectively generates and transmits a message for each of a set of destination processor clusters in accordance with respective message generation control data (CD1, CD2, CD3) for said destination processor clusters stored in the message generator control storage space (MGM).
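The activation-to-messages step can be sketched as follows: the control storage maps each destination cluster to its control data, and one message is emitted per destination on activation. The data structures are illustrative stand-ins for the MGM/MGL hardware.

```python
# Sketch: on an activation signal from element (x, y), emit one message per
# destination cluster, carrying that destination's control data.
def generate_messages(activation_xy: tuple, control_store: dict) -> list:
    """control_store maps destination cluster id -> control data
    (e.g. a routing tag or weight, like CD1/CD2/CD3 in the abstract)."""
    messages = []
    for dest, control in control_store.items():
        messages.append({"dest": dest, "src": activation_xy, "data": control})
    return messages
```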

Hierarchical workload allocation in a storage system
11726827 · 2023-08-15

A method for hierarchical workload allocation in a storage system, the method may include determining to reallocate a compute workload of a current compute core of the storage system; wherein the current compute core is responsible for executing a workload allocation unit that comprises one or more first type shards; and reallocating the compute workload by (a) maintaining the responsibility of the current compute core for executing the workload allocation unit, and (b) reallocating at least one first type shard of the one or more first type shards to a new workload allocation unit that is allocated to one of one or more new compute cores.

Initialization of Parameters for Machine-Learned Transformer Neural Network Architectures

An online system trains a transformer architecture with an initialization method that allows the architecture to be trained without normalization layers or learning rate warmup, resulting in significant improvements in computational efficiency for transformer architectures. Specifically, an attention block included in an encoder or a decoder of the transformer architecture generates the set of attention representations by applying a key matrix to the input key, a query matrix to the input query, and a value matrix to the input value to generate an output, then applying an output matrix to the output to generate the set of attention representations. The initialization method may be performed by scaling the parameters of the value matrix and the output matrix with a factor that is inverse to the number of encoders or the number of decoders.
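The scaling step can be sketched as follows: after a standard initialization, only the value and output matrices are rescaled. The abstract says the factor is "inverse to" the encoder/decoder count, so this sketch uses 1/N; the exact exponent and base initialization are assumptions.

```python
# Sketch: initialize K/Q/V/O attention matrices, scaling V and O by
# 1 / num_layers (a stand-in for the factor described in the abstract).
import numpy as np

def init_attention_weights(d_model: int, num_layers: int, seed: int = 0):
    rng = np.random.default_rng(seed)
    std = d_model ** -0.5                       # conventional baseline init
    k = rng.normal(0.0, std, (d_model, d_model))
    q = rng.normal(0.0, std, (d_model, d_model))
    # value and output projections are down-scaled by the layer count,
    # which keeps the residual stream's magnitude bounded at depth
    v = rng.normal(0.0, std, (d_model, d_model)) / num_layers
    o = rng.normal(0.0, std, (d_model, d_model)) / num_layers
    return k, q, v, o
```

The intuition for the design: each layer's contribution to the residual stream shrinks with depth, so no normalization layer or warmup is needed to keep early training stable.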

Logic scaling sets for cloud-like elasticity of legacy enterprise applications
11323389 · 2022-05-03

Methods, systems, and computer-readable storage media for: determining, by an instance manager and from a pattern associated with a system executing within a landscape, that a status of the system is to change to scaled-in, the pattern being absent any reference to instances of systems executed within landscapes; in response, identifying, by the instance manager and from a logic scaling set that is associated with the system, one or more instances of the system that are able to be scaled-in; selecting, by the instance manager, at least one instance of the one or more instances; and executing, by the instance manager, scaling of the system based on the at least one instance.
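The key indirection above is that the pattern never names instances; the instance manager resolves them through a logic scaling set. A minimal sketch, with invented names and a last-added selection rule:

```python
# Sketch: resolve scale-in targets via a logic scaling set, keeping the
# pattern itself free of any instance references.
def scale_in(pattern_status: str, logic_scaling_set: dict, system: str) -> list:
    """Return the instance(s) to stop when the pattern signals 'scaled-in'."""
    if pattern_status != "scaled-in":
        return []
    # the logic scaling set maps a system to its scale-in-eligible instances
    eligible = logic_scaling_set.get(system, [])
    # select at least one instance; here, the most recently added
    return eligible[-1:]
```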

METHOD AND APPARATUS FOR CONTENTION WINDOW SIZE ADJUSTMENT

Methods and apparatus for contention window size adjustment are described. A method includes: receiving a first group of feedback signals indicating a first state corresponding to data transmitted in a first band used by a carrier, the first group of feedback signals comprising at least one first feedback signal; receiving a second group of feedback signals indicating a second state corresponding to data transmitted in the first band used by the carrier; determining a first scaling factor α for one of the at least one first feedback signal; determining a ratio of the first group of feedback signals to the combined first and second groups of feedback signals; and comparing the ratio to a threshold to determine a contention window size (CWS) for the first band, wherein the first scaling factor α substitutes for the weight of the one of the at least one first feedback signal in the determination of the ratio.
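A worked sketch of the ratio test: one first-group signal contributes weight α instead of 1, the weighted share of first-group signals is compared to a threshold, and the CWS either grows or resets. The grow/reset rule, threshold, and bounds are illustrative assumptions; the abstract specifies only the weighted-ratio comparison.

```python
# Sketch: contention window size adjustment from a weighted feedback ratio.
def adjust_cws(n_first: int, n_second: int, alpha: float, cws: int,
               threshold: float = 0.8, cws_min: int = 15, cws_max: int = 1023) -> int:
    # one first-group feedback signal contributes weight alpha instead of 1
    weighted_first = (n_first - 1) + alpha
    ratio = weighted_first / (weighted_first + n_second)
    if ratio >= threshold:
        # too many first-state (e.g. failure) signals: grow the window
        return min(2 * cws + 1, cws_max)
    # otherwise reset to the minimum contention window
    return cws_min
```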