Patent classifications
G06F2209/509
TECHNOLOGIES FOR MULTI-TENANT AUTOMATIC LOCAL BREAKOUT SWITCHING AND DATA PLANE DYNAMIC LOAD BALANCING
Technologies for providing multi-tenant local breakout switching and dynamic load balancing include a network device that receives network traffic including a packet associated with a tenant. Upon a determination that the packet is encrypted, a secret key associated with the tenant is retrieved. The network device decrypts the payload of the packet using the secret key. The payload is indicative of one or more characteristics of the network traffic. The network device evaluates those characteristics and determines whether the network traffic is associated with a workload requesting compute from a service hosted by a network platform. If so, the network device forwards the network traffic to the service.
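The per-tenant breakout decision described above can be sketched as follows. This is an illustrative stand-in, not the patented implementation: the names (`TENANT_KEYS`, `decrypt`, `route`) and the trivial "decryption" are assumptions for demonstration.

```python
# Hypothetical sketch of tenant-keyed local breakout routing.
TENANT_KEYS = {"tenant-a": b"secret-key-a"}

def decrypt(payload: bytes, key: bytes) -> dict:
    # Stand-in for a real AEAD decryption step; it only parses the
    # payload into the workload characteristics the device evaluates.
    return {"workload": payload.decode()}

def route(packet: dict, local_services: dict) -> str:
    """Forward traffic to a locally hosted service when the decrypted
    payload indicates a compute workload served locally; otherwise pass
    the traffic upstream (or drop it for an unknown tenant)."""
    key = TENANT_KEYS.get(packet["tenant"])
    if key is None:
        return "drop"
    traits = decrypt(packet["payload"], key)
    service = local_services.get(traits["workload"])
    return service if service is not None else "upstream"
```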
SERVER CONTROL DEVICE, SERVER CONTROL METHOD, AND PROGRAM
A server control apparatus 100 includes a request and configuration collection unit 120, an optimization arithmetic unit 130, and a configuration determination unit 140. The collection unit 120 acquires, for each application in a server 30, a request to offload a certain process of the application to an accelerator, together with the configurations of the accelerator and the application. The optimization arithmetic unit 130 refers to the acquired requests and configurations of the server 30, determines the ratio of processing performance to each request, and optimizes the allocation of the accelerator so that the variance of this ratio across the applications is equal to or less than a predetermined threshold. The configuration determination unit 140 uses the arithmetic result from the optimization arithmetic unit 130 and a predetermined policy to determine a configuration suggestion for the server 30, and commands the server 30 to execute it.
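The optimization step can be sketched as a greedy allocation, a minimal illustrative stand-in for the patented arithmetic; the function names and the one-unit-at-a-time policy are assumptions, not taken from the patent.

```python
# Greedily hand out accelerator units so the performance/request ratio
# is balanced across applications, driving its variance down.
def ratios(alloc, requests):
    return [a / r for a, r in zip(alloc, requests)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def balance(requests, total_units):
    """Assign each accelerator unit to the application whose
    performance/request ratio is currently lowest."""
    alloc = [0] * len(requests)
    for _ in range(total_units):
        scored = [(alloc[i] / requests[i], i) for i in range(len(requests))]
        alloc[min(scored)[1]] += 1
    return alloc
```

For proportional requests and enough units, the resulting ratios coincide and the variance constraint is met with variance zero.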
METHOD FOR ALLOCATING DATA PROCESSING TASKS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
A method for allocating data processing tasks, an electronic device, and a readable storage medium are provided, which relate to the fields of computer vision and artificial intelligence. The method includes: determining a plurality of data processing tasks of a target application for a graphics processor; and allocating, by using a load balancing strategy, the plurality of data processing tasks to a plurality of worker processes created for the target application, wherein the plurality of worker processes are pre-configured with a corresponding graphics processor resource.
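A minimal sketch of the allocation step, under assumed names: tasks of the target application are spread over pre-created worker processes (each pre-bound to a graphics processor resource) using a least-loaded balancing strategy. The strategy choice is illustrative; the patent does not fix a particular one.

```python
from collections import defaultdict

def allocate(tasks, workers):
    """Assign each data processing task to the worker process currently
    holding the fewest tasks (a simple load balancing strategy)."""
    load = {w: 0 for w in workers}
    plan = defaultdict(list)
    for t in tasks:
        w = min(load, key=load.get)   # least-loaded worker wins ties first
        plan[w].append(t)
        load[w] += 1
    return dict(plan)
```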
DETERMINATION OF HARDWARE RESOURCE UTILIZATION
A method may include determining a first weight for a first type of operation and a second weight for a second type of operation. The first weight may correspond to a first quantity of the first type of operation a hardware resource is capable of performing during a time interval. The second weight may correspond to a second quantity of the second type of operation the hardware resource is capable of performing during the time interval. Utilization of the hardware resource may correspond to a weighted sum of the respective quantities of the first type of operation and the second type of operation offloaded to the hardware resource. Allocation of hardware resources may be adjusted based on utilization. Related systems and articles of manufacture are also provided.
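The weighted metric above reduces to a short formula. In this sketch (names assumed), each weight is the reciprocal of the per-interval capacity for that operation type, so a fully loaded resource scores 1.0.

```python
def utilization(offloaded: dict, capacity: dict) -> float:
    """Weighted sum of offloaded operation counts, where each operation
    type's weight is 1 / (quantity the hardware resource can perform of
    that type during the time interval)."""
    return sum(n / capacity[op] for op, n in offloaded.items())

# A device performing 50 of a possible 100 type-A operations and 25 of a
# possible 100 type-B operations in the interval is 75% utilized.
```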
WORKLOAD DATA TRANSFER SYSTEM AND METHOD
A system including a management application and a satellite container. The management application is configured to manage workload operations of a customer computer cluster and a plurality of third-party compute systems. The satellite container and the customer computer cluster are on a customer network; the management application is outside the customer network. The satellite container is configured to: provide configuration data of the customer computer cluster and authorization data to the management application; provide a first data request to the management application; receive workload data from the management application in response to the first data request; and convey the workload data from the satellite container to the customer computer cluster.
TECHNIQUES FOR MULTI-SOURCE TO MULTI-DESTINATION WEIGHTED ROUND ROBIN ARBITRATION
Examples include techniques to arbitrate a plurality of input requests received from input clients that request data to be stored or placed in a destination. An arbiter may be arranged to grant an input request based on an assigned weight and based on an indication that the destination is ready to receive the data to be stored or placed in the destination.
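The arbitration rule can be sketched in a few lines. This is an illustrative model, not the described hardware: names are assumed, and each client's weight is modeled as a per-round grant credit, with a grant issued only when the destination signals ready.

```python
def arbitrate(requests, weights, dest_ready):
    """requests: list of (client, destination) input requests.
    Grant a request only if the client still has weight credit this
    round AND the destination is ready to receive the data."""
    credits = dict(weights)
    grants = []
    for client, dest in requests:
        if credits.get(client, 0) > 0 and dest_ready.get(dest, False):
            grants.append((client, dest))
            credits[client] -= 1
    return grants
```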
LOGICAL NODE DISTRIBUTED SIGNATURE DECISION SYSTEM AND A METHOD THEREOF
The present disclosure provides a logical node distributed signature decision system for a distributed data processing system. The system includes an initial logical node generating assembly, configured to receive task configuration data input by a user and generate an initial logical node topology for the distributed data processing system, wherein a source logical node has a specified logical distributed signature and each initial logical node is attached with a candidate logical distributed signature set based on the task configuration data. The system further includes a logical distributed signature selecting assembly, configured to, for each candidate logical distributed signature of a current logical node, compute the cost of the data transmission required to transform the distributed descriptor of the tensor at the output end of each upstream logical node, for which a logical distributed signature is already determined, into the distributed descriptor required at the corresponding input end of the current logical node.
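The selection rule reduces to a minimum-cost search over the candidate set. In this sketch the descriptors are plain strings and `cost` is an assumed stand-in (free when descriptors already match, unit cost otherwise); a real system would price the actual data transfer each transformation implies.

```python
def cost(src_descriptor, dst_descriptor):
    # Assumed transmission-cost model: no transfer needed if the
    # upstream output descriptor already matches the required one.
    return 0 if src_descriptor == dst_descriptor else 1

def select_signature(upstream_outputs, candidates):
    """Pick the candidate signature minimizing the total cost of
    transforming every upstream output descriptor into it."""
    return min(
        candidates,
        key=lambda sig: sum(cost(d, sig) for d in upstream_outputs),
    )
```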
Computing 2-body statistics on graphics processing units (GPUs)
Disclosed are various embodiments for computing 2-body statistics on graphics processing units (GPUs). Various types of two-body statistics (2-BS) are regarded as essential components of data analysis in many scientific and computing domains; however, the quadratic complexity of these computations hinders timely processing of data. Accordingly, various embodiments of the present disclosure involve parallel algorithms for 2-BS computation on GPUs. Although typical 2-BS problems can be summarized into a straightforward parallel computing pattern, traditional wisdom from (general) parallel computing often falls short of delivering the best possible performance. Therefore, various embodiments involve techniques to decompose 2-BS problems and methods for effective use of computing resources on GPUs. Analytical models are also developed that guide users toward the appropriate parameters of a GPU program. Although 2-BS problems share the same core computations, each 2-BS problem carries its own characteristics that call for different strategies in code optimization. Accordingly, various embodiments involve a software framework that automatically generates high-performance GPU code based on a few parameters and short primer code input.
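For context, a minimal CPU sketch of one such 2-body statistic, a pairwise-distance histogram on 1-D points: every pair of inputs contributes once, which is the quadratic cost the embodiments parallelize on GPUs. The example is illustrative only and not drawn from the disclosed framework.

```python
from collections import Counter

def pairwise_distance_histogram(points, bin_width):
    """Count point pairs by distance bin; O(n^2) in the number of points."""
    hist = Counter()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = abs(points[i] - points[j])
            hist[int(d // bin_width)] += 1
    return dict(hist)
```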
Estimating resource requests for workloads to offload to host systems in a computing environment
Provided are a computer program product, system, and method for estimating resource requests for workloads to offload to host systems in a computing environment. A calculation is made of the computational resources required to complete processing a plurality of unfinished workloads. A determination is made of allocated resources that have not yet been provisioned to workloads. The required resources are reduced by these allocated-but-unprovisioned resources to determine the resources to provision, and the resources to provision for the unfinished workloads are requested.
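The calculation above fits in a few lines (names assumed): resources still required by unfinished workloads, less allocations that are approved but not yet provisioned, give what must actually be requested.

```python
def to_provision(required_per_workload, allocated_unprovisioned):
    """Resources to request = total required by unfinished workloads
    minus resources already allocated but not yet provisioned."""
    required = sum(required_per_workload)
    return max(0, required - allocated_unprovisioned)
```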
ALLOCATION OF DATA SUB-TENSORS ONTO HARDWARE SUB-ARRAYS
Certain aspects of the present disclosure provide techniques for improved hardware utilization. An input data tensor is divided into a first plurality of sub-tensors, and a plurality of logical sub-arrays in a physical multiply-and-accumulate (MAC) array is identified. For each respective sub-tensor of the first plurality of sub-tensors, the respective sub-tensor is mapped to a respective logical sub-array of the plurality of logical sub-arrays, and the respective sub-tensor is processed using the respective logical sub-array.
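The tiling step can be sketched as below, with assumed shapes and names: a square 2-D tensor is divided into equal sub-tensors, and each sub-tensor is mapped onto a logical sub-array identifier of the physical MAC array (here by simple round-robin, an illustrative choice).

```python
def split(tensor, tile):
    """Divide a square 2-D list 'tensor' into tile x tile sub-tensors,
    in row-major order."""
    n = len(tensor)
    return [
        [row[c:c + tile] for row in tensor[r:r + tile]]
        for r in range(0, n, tile)
        for c in range(0, n, tile)
    ]

def map_to_subarrays(sub_tensors, num_subarrays):
    """Map each sub-tensor to a logical sub-array id for processing."""
    return [(i % num_subarrays, st) for i, st in enumerate(sub_tensors)]
```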