Patent classifications
G06F9/505
Logical Slot to Hardware Slot Mapping for Graphics Processors
Disclosed techniques relate to work distribution in graphics processors. In some embodiments, an apparatus includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. The circuitry may determine different distribution rules for first and second sets of graphics work and map logical slots to distributed hardware slots based on the distribution rules. In various embodiments, disclosed techniques may advantageously distribute work efficiently across distributed shader processors for graphics kicks of various sizes.
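The mapping step can be illustrated with a minimal sketch. It assumes only two distribution rules, named "single" and "all" here for illustration; the abstract does not name its rules. The function name and the free-slot representation are likewise hypothetical and are not the claimed circuitry.

```python
def map_logical_slot(free_slots, rule):
    """Map one logical slot to distributed hardware slots.

    free_slots[i] is the list of free hardware slot ids in sub-unit i.
    rule is a hypothetical distribution rule: "single" places the work on
    one sub-unit, "all" places it on every sub-unit.
    Returns {sub_unit_index: hardware_slot_id}, or None if the rule cannot
    be satisfied with the slots that are currently free.
    """
    if rule == "single":
        candidates = [i for i, free in enumerate(free_slots) if free]
        if not candidates:
            return None
        # Assumed policy: prefer the sub-unit with the most free slots.
        targets = [max(candidates, key=lambda i: len(free_slots[i]))]
    elif rule == "all":
        # Every sub-unit must have at least one free hardware slot.
        if not all(free_slots):
            return None
        targets = list(range(len(free_slots)))
    else:
        raise ValueError(f"unknown distribution rule: {rule}")
    # Claim the first free hardware slot in each target sub-unit.
    return {i: free_slots[i].pop(0) for i in targets}
```

In this sketch a small kick would use "single" so that it occupies one sub-unit, while a large kick would use "all" to spread across every sub-unit.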
NOISY-NEIGHBOR DETECTION AND REMEDIATION
Noisy-neighbor detection and remediation is provided by: performing real-time monitoring of workload processing and associated resource consumption of application components that use one or more shared resources of a computing environment; determining workload and shared resource consumption patterns for each of the application components; for each application, of a plurality of applications, that includes at least one of the application components, correlating the determined patterns of that application's components to determine a correlated shared resource usage pattern for the application; performing impact analysis to determine the impact of the applications on each other; and identifying one or more noisy neighbors that use the one or more shared resources and automatically raising an alert indicating those noisy neighbors.
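One way to picture the impact-analysis step is a sketch that flags an application as a noisy neighbor when its shared-resource usage is strongly correlated with another application's latency. Pearson correlation, the 0.8 threshold, and all names below are assumptions made for illustration; the abstract does not specify how impact is measured.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def find_noisy_neighbors(usage, latency, threshold=0.8):
    """Return applications whose usage pattern tracks another application's latency.

    usage[app] and latency[app] are per-application time series sampled
    at the same instants (hypothetical inputs).
    """
    noisy = set()
    for source, used in usage.items():
        for victim, observed in latency.items():
            if source != victim and pearson(used, observed) >= threshold:
                noisy.add(source)
    return sorted(noisy)
```

A real system would presumably raise an alert for each returned application; the sketch only returns their names.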
AI VIDEO PROCESSING METHOD AND APPARATUS
An AI video processing method comprises: connecting to a plurality of AI computing boards in an AI processing resource pool and a plurality of video encoding and decoding boards in a video processing resource pool by means of a unified high-speed interface; allocating a specified number of AI computing boards and of video encoding and decoding boards, based on the resources and bandwidth required to complete a processing task, to form a temporary cooperation relationship for the processing task; in response to resource overflow or insufficiency in the AI processing resource pool or the video processing resource pool caused by a change in the processing task, bringing additional AI computing boards or video encoding and decoding boards into use, or ceasing to use redundant AI computing boards or video encoding and decoding boards; and performing the processing task using the allocated AI computing boards or video encoding and decoding boards, then releasing the temporary cooperation relationship.
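The allocate, resize, and release cycle can be sketched as follows. The class names, board identifiers, and the first-free allocation order are hypothetical; the sketch models only the bookkeeping, not the high-speed interface or bandwidth accounting.

```python
class BoardPool:
    """A pool of interchangeable boards (AI computing or video codec)."""

    def __init__(self, board_ids):
        self.free = list(board_ids)

    def acquire(self, count):
        if count > len(self.free):
            raise RuntimeError("not enough free boards in pool")
        taken, self.free = self.free[:count], self.free[count:]
        return taken

    def release(self, boards):
        self.free.extend(boards)


class ProcessingTask:
    """Temporary cooperation between AI boards and video boards for one task."""

    def __init__(self, ai_pool, video_pool, ai_needed, video_needed):
        self.ai_pool, self.video_pool = ai_pool, video_pool
        self.ai = ai_pool.acquire(ai_needed)
        try:
            self.video = video_pool.acquire(video_needed)
        except RuntimeError:
            ai_pool.release(self.ai)  # do not leak boards on partial failure
            raise

    def resize_ai(self, needed):
        """Grow or shrink the AI allocation when the task changes."""
        if needed > len(self.ai):
            self.ai += self.ai_pool.acquire(needed - len(self.ai))
        else:
            surplus, self.ai = self.ai[needed:], self.ai[:needed]
            self.ai_pool.release(surplus)

    def finish(self):
        """Release the temporary cooperation relationship."""
        self.ai_pool.release(self.ai)
        self.video_pool.release(self.video)
        self.ai, self.video = [], []
```

A matching `resize_video` method would follow the same pattern and is omitted for brevity.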
SERVICE PROCESSING METHOD AND APPARATUS, AND STORAGE MEDIUM
A service processing method, performed by a cloud application management server, includes: upon receiving an allocation request from a target terminal, acquiring N pieces of selection reference information corresponding to a pending edge server and related to the target terminal and running reference information, the pending edge server being one of P edge servers connected to the cloud application management server; upon determining that the pending edge server meets a requirement of providing a running service of a target cloud application for the target terminal, determining a connection reference score corresponding to the pending edge server; storing the connection reference score and identification information about the pending edge server into a candidate set; and transmitting the candidate set to the target terminal.
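The candidate-set construction can be sketched as a filter followed by a scoring step. The record fields (`latency_ms`, `free_instances`), the scoring formula, and the final sort are assumptions for illustration; the abstract states only that qualifying servers are scored and stored with their identifiers.

```python
def meets_requirement(info):
    """Hypothetical check: the server can still host one more application instance."""
    return info["free_instances"] > 0

def connection_score(info):
    """Hypothetical score: lower latency to the terminal yields a higher score."""
    return 100 - info["latency_ms"]

def build_candidate_set(edge_servers):
    """Return (score, server_id) pairs for every edge server that qualifies.

    edge_servers maps a server id to its reference information for the
    requesting terminal. Sorting best-first is an added convenience.
    """
    candidates = []
    for server_id, info in edge_servers.items():
        if meets_requirement(info):
            candidates.append((connection_score(info), server_id))
    candidates.sort(reverse=True)
    return candidates
```

The management server would transmit the resulting list to the terminal, which could then pick a server from it.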
APPARATUSES AND METHODS FOR SCHEDULING COMPUTING RESOURCES
Apparatuses and methods for scheduling computing resources are disclosed that facilitate cooperation between resource managers in the resource layer and workload schedulers in the workload layer, so that the resource managers can efficiently manage and schedule resources for horizontal and vertical scaling on physical hosts shared among the workload schedulers to run workloads.
PREDICTIVE SCALING OF CONTAINER ORCHESTRATION PLATFORMS
Systems, methods, and computer program products leverage recurrent neural network architectures to proactively predict workload demand of container orchestration platforms. The platform continuously collects metric data from clusters of the platform and trains multiple parallel neural networks with different architectures to predict future platform workload demands. At periodic intervals, the registered neural networks in consideration for controlling the scaling operations of the platform are compared against one another to identify the neural network demonstrating the highest performance and/or most accurate workload prediction strategy for scaling the orchestration platform. The selected neural network is enforced as the controller for the platform to implement the workload prediction strategy. The neural network controller enforced by the platform predictively scales up or down the number of pods within nodes of the platform and/or the number of clusters providing computational resources to the platform, in anticipation of future increased or decreased end user demand.
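The periodic compare-and-select loop can be sketched with simple stand-in predictors in place of trained recurrent networks. The three predictor functions, the mean-absolute-error metric, and the pods-per-load formula are all assumptions chosen to keep the example self-contained.

```python
import math

# Stand-ins for trained models: each maps a load history to a next-step forecast.
def last_value(history):
    return history[-1]

def moving_average(history, window=3):
    recent = history[-window:]
    return sum(recent) / len(recent)

def linear_trend(history):
    return history[-1] + (history[-1] - history[-2])

def mean_abs_error(predictor, series, warmup=3):
    """Replay the series and measure one-step-ahead forecast error."""
    errors = [abs(predictor(series[:t]) - series[t])
              for t in range(warmup, len(series))]
    return sum(errors) / len(errors)

def select_controller(predictors, series):
    """Pick the name of the predictor with the lowest recent error."""
    return min(predictors, key=lambda name: mean_abs_error(predictors[name], series))

def pods_needed(predicted_load, capacity_per_pod):
    """Translate a load forecast into a pod count (at least one pod)."""
    return max(1, math.ceil(predicted_load / capacity_per_pod))
```

Re-running `select_controller` at each interval on the latest metrics lets a different predictor take over whenever it starts forecasting more accurately.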
Software Control Techniques for Graphics Hardware that Supports Logical Slots
Disclosed embodiments relate to software control of graphics hardware that supports logical slots. In some embodiments, a GPU includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. Control circuitry may determine mappings between logical slots and distributed hardware slots for different sets of graphics work. Various mapping aspects may be software-controlled. For example, software may specify one or more of the following: priority information for a set of graphics work, whether to retain the mapping after completion of the work, a distribution rule, a target group of sub-units, a sub-unit mask, a scheduling policy, whether to reclaim hardware slots from another logical slot, etc. Software may also query the status of the work.
WORKLOAD PERFORMANCE PREDICTION AND REAL-TIME COMPUTE RESOURCE RECOMMENDATION FOR A WORKLOAD USING PLATFORM STATE SAMPLING
Embodiments described herein are generally directed to improving predictions regarding workload performance to facilitate dynamic auto device selection. In an example, based on telemetry samples collected from a computer system in real-time and indicative of a state of the computer system, one or more workload performance prediction models are built or updated for a heterogeneous set of computer resources of the computer system with reference to one or more optimization goals. At a time of execution of a workload, a particular computer resource of the heterogeneous set of computer resources on which to dispatch the workload is dynamically determined by: (i) generating multiple predicted performance scores each corresponding to one of the computer resources based on the state of the computer system and the one or more workload performance prediction models; and (ii) selecting the particular computer resource based on the predicted performance scores.
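The dispatch-time decision in steps (i) and (ii) can be sketched as scoring every resource against the current platform state and taking the best. The two models below are fixed hypothetical formulas standing in for prediction models built from telemetry, and the state fields are assumed utilization percentages.

```python
def select_resource(models, state):
    """Score each compute resource for the current state and pick the best.

    models maps a resource name to a function state -> predicted score
    (higher is better). Returns (best_resource, all_scores).
    """
    scores = {name: model(state) for name, model in models.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical models: peak performance discounted by current utilization.
MODELS = {
    "cpu": lambda state: 100 - state["cpu_util"],
    "gpu": lambda state: 3 * (100 - state["gpu_util"]),
}
```

With these models a busy GPU loses to an idle CPU, while an idle GPU wins because of its higher assumed peak score.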
CLOUD-BASED SYSTEMS FOR OPTIMIZED MULTI-DOMAIN PROCESSING OF INPUT PROBLEMS USING MACHINE LEARNING SOLVER TYPE SELECTION
Various embodiments of the present disclosure provide methods, apparatuses, systems, computing devices, computing entities, and/or the like for determining optimized solutions to input problems in a containerized, cloud-based (e.g., serverless) manner. In one embodiment, an example method is provided. The method comprises: receiving a problem type of an input problem originating from a client computing entity; mapping the problem type to one or more selected solver types; generating one or more container instances of one or more compute containers, each compute container corresponding to a selected solver type; generating a problem output using the one or more container instances; and providing the problem output comprising a solution to the input problem to the client computing entity. In various embodiments, optimized solutions for input problems are determined using a cloud-based multi-domain solver system configured to dynamically allocate computing and processing resources between different solution-determining tasks.
Kickslot Manager Circuitry for Graphics Processors
Disclosed embodiments relate to controlling sets of graphics work (e.g., kicks) assigned to graphics processor circuitry. In some embodiments, tracking slot circuitry implements entries for multiple tracking slots. Slot manager circuitry may store, using an entry of the tracking slot circuitry, software-specified information for a set of graphics work, where the information includes: type of work, dependencies on other sets of graphics work, and location of data for the set of graphics work. The slot manager circuitry may prefetch, from the location and prior to allocating shader core resources for the set of graphics work, configuration register data for the set of graphics work. Control circuitry may program configuration registers for the set of graphics work using the prefetched data and initiate processing of the set of graphics work by the graphics processor circuitry according to the dependencies. Disclosed techniques may reduce kick-to-kick transition time, in some embodiments.
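The ordering of prefetch, register programming, and dependency-gated start can be sketched in software. The `TrackingSlot` fields mirror the three pieces of software-specified information in the abstract; the dictionary used as memory and the function name are hypothetical, and real hardware would overlap these steps rather than run them sequentially.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TrackingSlot:
    kick_id: str
    work_type: str          # type of work, e.g. "compute" or "render"
    depends_on: frozenset   # kick ids that must complete first
    config_addr: int        # location of the kick's configuration data
    prefetched: Optional[str] = None

def run_kicks(slots, memory):
    """Prefetch configuration for every slot, then start kicks in dependency order.

    Returns the list of (kick_id, config) pairs in the order kicks were started.
    """
    # Prefetch before any shader resources would be allocated.
    for slot in slots:
        slot.prefetched = memory[slot.config_addr]

    done, order, pending = set(), [], list(slots)
    while pending:
        ready = [s for s in pending if s.depends_on <= done]
        if not ready:
            raise ValueError("unsatisfiable dependencies among kicks")
        for slot in ready:
            # "Program" the registers from prefetched data and start the kick.
            order.append((slot.kick_id, slot.prefetched))
            done.add(slot.kick_id)
            pending.remove(slot)
    return order
```

Because the configuration is already fetched when a kick becomes ready, the transition to it only waits on its dependencies, which is the effect the abstract attributes to prefetching.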