G06F2209/5019

DYNAMIC CROSS-ARCHITECTURE APPLICATION ADAPTION
20230014741 · 2023-01-19

Embodiments described herein are generally directed to improving performance of high-performance computing (HPC) or artificial intelligence (AI) workloads on cluster computer systems. According to one embodiment, a section of an HPC or AI workload executing on a cluster computer system is identified as significant to a figure of merit (FOM) of the workload. An alternate placement among multiple heterogeneous compute resources of a node of the cluster computer system is determined for a portion of the section currently executing on a given compute resource of the multiple heterogeneous compute resources. After predicting an improvement to the FOM based on the alternate placement, the portion is relocated to the alternate placement.
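
The relocation decision described above can be sketched as follows. This is a minimal illustration, not the method of the application: the function name `choose_placement`, the `min_gain` threshold, and the assumption that a higher FOM is better are all hypothetical, and the predicted FOM values are taken as given inputs rather than computed.

```python
def choose_placement(current, predicted_fom, min_gain=0.0):
    """Return the compute resource on which the portion should run.

    current       -- name of the resource the portion runs on now
    predicted_fom -- dict mapping resource name to predicted FOM
                     (assumption: higher is better)
    min_gain      -- hypothetical threshold the predicted improvement
                     must exceed before relocating
    """
    # Pick the resource with the best predicted FOM.
    best = max(predicted_fom, key=predicted_fom.get)
    gain = predicted_fom[best] - predicted_fom[current]
    # Relocate only when an improvement is actually predicted.
    if best != current and gain > min_gain:
        return best
    return current


if __name__ == "__main__":
    forecast = {"cpu": 100.0, "gpu": 140.0, "fpga": 120.0}
    print(choose_placement("cpu", forecast))
```

The threshold is one way to avoid relocating for a marginal gain; whether the application uses anything like it is not stated in the abstract.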

Resource usage prediction for cluster provisioning

A system for provisioning resources includes a processor and a memory. The processor is configured to receive a time series of past usage data. The past usage data comprises process usage data and instance usage data. The processor is further configured to determine an upcoming usage data based at least in part on the time series of the past usage data, and provision a computing system according to the upcoming usage data.
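
As a rough illustration of the flow above, the sketch below forecasts the next value of each series with a moving average and derives an instance count from it. The moving average is a stand-in chosen for brevity; the abstract does not say which predictor is used, and the names `forecast_next` and `plan_provisioning` are hypothetical.

```python
import math


def forecast_next(series, window=3):
    """Moving-average forecast of the next value (assumed predictor)."""
    if not series:
        raise ValueError("series must not be empty")
    recent = series[-window:]
    return sum(recent) / len(recent)


def plan_provisioning(process_usage, instance_usage, window=3):
    """Forecast upcoming usage from both past-usage series.

    The instance forecast is rounded up because a fractional
    instance cannot be provisioned.
    """
    return {
        "process_usage": forecast_next(process_usage, window),
        "instances": math.ceil(forecast_next(instance_usage, window)),
    }


if __name__ == "__main__":
    print(plan_provisioning([10, 20, 30, 40], [1, 2, 2, 3]))
```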

Enhanced selection of cloud architecture profiles

This document describes modeling and simulation techniques to select a cloud architecture profile based on correlations between application workloads and resource utilization. In some aspects, a method includes obtaining infrastructure data specifying utilization of computing resources of an existing computing system. Application workload data specifying tasks performed by one or more applications running on the existing computing system is obtained. One or more models are generated based on the infrastructure data and the application workload data. The model(s) define an impact on utilization of each computing resource in response to changes in workloads of the application(s). A workload is simulated, using the model(s), on a candidate cloud architecture profile that specifies a set of computing resources. A simulated utilization of each computing resource of the candidate cloud architecture profile is determined based on the simulation. An updated cloud architecture profile is generated based on the simulated utilization.

SYSTEMS, METHODS, AND APPARATUS FOR WORKLOAD OPTIMIZED CENTRAL PROCESSING UNITS (CPUS)

Systems, methods, and apparatus for workload optimized central processing units are disclosed herein. An example apparatus includes a workload analyzer to determine an application ratio associated with a workload, the application ratio based on an operating frequency to execute the workload; a hardware configurator to configure, before execution of the workload, at least one of (i) one or more cores of processor circuitry based on the application ratio or (ii) uncore logic of the processor circuitry based on the application ratio; and a hardware controller to initiate the execution of the workload with the at least one of the one or more cores or the uncore logic.

Capacity management in a cloud computing system using virtual machine series modeling

A method for minimizing allocation failures in a cloud computing system without overprovisioning may include determining a predicted supply for a virtual machine series in a system unit of the cloud computing system during an upcoming time period. The predicted supply may be based on a shared available current capacity and a shared available future added capacity for the virtual machine series in the system unit. The method may also include predicting an available capacity for the virtual machine series in the system unit during the upcoming time period. The predicted available capacity may be based at least in part on a predicted demand for the virtual machine series in the system unit during the upcoming time period and the predicted supply. The method may also include taking at least one mitigation action in response to determining that the predicted demand exceeds the predicted supply during the upcoming time period.
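
The quantities named above can be related in a short sketch. It assumes the predicted supply is simply the sum of current and future added capacity and that available capacity is supply minus demand; the abstract only says these values are "based on" one another, so the arithmetic and the class name `SeriesForecast` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class SeriesForecast:
    """Forecast for one virtual machine series in one system unit."""
    current_capacity: int
    future_added_capacity: int
    predicted_demand: int

    @property
    def predicted_supply(self):
        # Assumption: supply is current plus future added capacity.
        return self.current_capacity + self.future_added_capacity

    @property
    def predicted_available(self):
        # Assumption: available capacity is supply minus demand.
        return self.predicted_supply - self.predicted_demand


def needs_mitigation(forecast):
    """True when predicted demand exceeds predicted supply."""
    return forecast.predicted_demand > forecast.predicted_supply


if __name__ == "__main__":
    f = SeriesForecast(current_capacity=100,
                       future_added_capacity=20,
                       predicted_demand=150)
    print(f.predicted_supply, f.predicted_available, needs_mitigation(f))
```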

Utilizing machine learning to concurrently optimize computing resources and licenses in a high-performance computing environment

A device may receive a job request that requests performance of one or more operations by resources of a high-performance computing environment, and may process the job request, with a policy execution model trained with policy parameters, to identify policies to apply during execution of the job request. The device may process the job request, with a forecast object model trained with job data and profile data, to generate a forecast of resources and licenses required from the high-performance computing environment. The device may process the job request, other job requests, one or more of the policies, and the forecast, with a heuristic model, to determine a schedule for the job request, and may process the schedule and current constraints on the resources and the licenses, with a linear programming model, to determine an optimized schedule for the job request.

Using delayed autocorrelation to improve the predictive scaling of computing resources

Techniques are described for filtering and normalizing training data used to build a predictive auto scaling model used by a service provider network to proactively scale users' computing resources. Further described are techniques for identifying collections of computing resources that exhibit suitably predictable usage patterns such that a predictive auto scaling model can be used to forecast future usage patterns with reasonable accuracy and to scale the resources based on such generated forecasts. The filtering of training data and the identification of suitably predictable collections of computing resources are based in part on autocorrelation analyses, and in particular on “delayed” autocorrelation analyses, of time series data, among other techniques described herein.
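
To make the autocorrelation idea concrete, the sketch below computes the ordinary sample autocorrelation of a usage series at a given lag and uses it as a predictability test. It shows plain lagged autocorrelation only; the abstract does not define its "delayed" variant, and the function names and the 0.5 threshold are hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at `lag` (standard estimator)."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mu = sum(series) / n
    var = sum((x - mu) ** 2 for x in series)
    if var == 0:
        # A constant series carries no pattern to correlate.
        return 0.0
    cov = sum((series[i] - mu) * (series[i + lag] - mu)
              for i in range(n - lag))
    return cov / var


def is_predictable(series, lag, threshold=0.5):
    """Assumed rule: usage repeating at `lag` is 'suitably predictable'."""
    return autocorrelation(series, lag) >= threshold


if __name__ == "__main__":
    usage = [1, 5, 1, 5, 1, 5, 1, 5]  # repeats every 2 samples
    print(autocorrelation(usage, 2), is_predictable(usage, 2))
```

A resource group whose usage fails the test at every candidate lag would be left out of predictive scaling under this assumed rule.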

Technologies for assigning workloads to balance multiple resource allocation objectives

Technologies for allocating resources of managed nodes to workloads to balance multiple resource allocation objectives include an orchestrator server to receive resource allocation objective data indicative of multiple resource allocation objectives to be satisfied. The orchestrator server is additionally to determine an initial assignment of a set of workloads among the managed nodes and receive telemetry data from the managed nodes. The orchestrator server is further to determine, as a function of the telemetry data and the resource allocation objective data, an adjustment to the assignment of the workloads to increase an achievement of at least one of the resource allocation objectives without decreasing an achievement of another of the resource allocation objectives, and apply the adjustment to the assignment of the workloads among the managed nodes as the workloads are performed. Other embodiments are also described and claimed.
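
The acceptance condition described above (raise at least one objective, lower none) is a Pareto-improvement test, which can be sketched as below. The achievement scores are assumed to be numbers where higher is better, and the function names and the objective names in the example are hypothetical.

```python
def is_pareto_improvement(current, candidate):
    """True if `candidate` raises some objective and lowers none.

    Both arguments map objective name to achievement score
    (assumption: higher is better).
    """
    if current.keys() != candidate.keys():
        raise ValueError("both score sets must cover the same objectives")
    no_worse = all(candidate[k] >= current[k] for k in current)
    better = any(candidate[k] > current[k] for k in current)
    return no_worse and better


def choose_adjustment(current, candidates):
    """Return the first candidate adjustment that qualifies, else None."""
    for name, scores in candidates.items():
        if is_pareto_improvement(current, scores):
            return name
    return None


if __name__ == "__main__":
    now = {"power": 0.6, "performance": 0.7}
    options = {
        "move_a": {"power": 0.7, "performance": 0.6},  # trades one for the other
        "move_b": {"power": 0.6, "performance": 0.8},  # strictly no worse
    }
    print(choose_adjustment(now, options))
```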

Resource management based on ranking of importance of applications

This application provides a method for managing a resource in a computer system and a terminal device. The method includes: obtaining data, where the data includes application sequence feature data related to a current foreground application, and the data further includes at least one of the following real-time data: a system time of the computer system, current status data of the computer system, and current location data of the computer system; selecting, from a plurality of machine learning models based on at least one of the real-time data, a target machine learning model that matches the real-time data; inputting the obtained data into the target machine learning model to rank importance of a plurality of applications installed in the computer system; and performing resource management based on a result of the importance ranking.

System and method for infrastructure scaling
11693698 · 2023-07-04

A method, system and computer program product, the method comprising: determining properties of a set of containers that are deployed over a computer infrastructure, wherein the computer infrastructure is provisioned via an infrastructure management service; determining properties of one or more headroom containers, wherein the one or more headroom containers are not deployed over the computer infrastructure; simulating a container orchestrator using the properties of the set of containers and the properties of the headroom containers, for obtaining an expected deployment of the set of containers together with the one or more headroom containers; based on the expected deployment, determining whether the computer infrastructure is sufficient for deploying the set of containers together with the one or more headroom containers; and subject to the computer infrastructure being insufficient, issuing a request to the infrastructure management service to allocate additional computer infrastructure.
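
The headroom check can be sketched with a toy orchestrator. This example assumes a single resource dimension (for instance CPU units) and a first-fit placement policy; a real container orchestrator weighs many more properties, and the function names are hypothetical.

```python
def simulate_placement(node_capacities, container_requests):
    """First-fit simulation; returns the requests that could not be placed."""
    free = list(node_capacities)  # copy so the caller's list is untouched
    unplaced = []
    for request in container_requests:
        for i, capacity in enumerate(free):
            if request <= capacity:
                free[i] -= request
                break
        else:
            # No node had room for this container.
            unplaced.append(request)
    return unplaced


def needs_more_infrastructure(node_capacities, deployed, headroom):
    """True if deployed plus headroom containers do not all fit."""
    return bool(simulate_placement(node_capacities, deployed + headroom))


if __name__ == "__main__":
    nodes = [4, 4]
    deployed = [2, 2, 2]
    print(needs_more_infrastructure(nodes, deployed, headroom=[3]))
    print(needs_more_infrastructure(nodes, deployed, headroom=[2]))
```

When the check returns true, the method above would issue a request to the infrastructure management service; that call is left out here because its interface is not described.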