G06F9/5044

Methods, systems, articles of manufacture and apparatus to map workloads

Methods, apparatus, systems and articles of manufacture are disclosed to map workloads. An example apparatus includes a constraint definer to define performance characteristic targets of a neural network, an action determiner to apply a first resource configuration to candidate resources corresponding to the neural network, a reward determiner to calculate a results metric based on (a) resource performance metrics and (b) the performance characteristic targets, and a layer map generator to generate a resource mapping file, the mapping file including respective resource assignments for respective corresponding layers of the neural network, the resource assignments selected based on the results metric.
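
The selection of per-layer resource assignments from a results metric can be illustrated with a minimal Python sketch. It is not the disclosed apparatus: the `Targets` fields, the scoring formula in `results_metric`, and the exhaustive per-layer comparison in `generate_layer_map` are all assumptions made for illustration, standing in for the action determiner and reward determiner described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Targets:
    """Hypothetical performance characteristic targets."""
    max_latency_ms: float
    max_power_w: float


def results_metric(latency_ms, power_w, targets):
    """Assumed scoring rule: the sum of normalized headroom against each target.

    Higher is better; a term goes negative when its target is missed.
    """
    latency_term = (targets.max_latency_ms - latency_ms) / targets.max_latency_ms
    power_term = (targets.max_power_w - power_w) / targets.max_power_w
    return latency_term + power_term


def generate_layer_map(measurements, targets):
    """Pick, for every layer, the candidate resource with the best metric.

    measurements maps layer name -> {resource name: (latency_ms, power_w)}.
    """
    return {
        layer: max(
            per_resource,
            key=lambda name: results_metric(*per_resource[name], targets),
        )
        for layer, per_resource in measurements.items()
    }
```

The returned dictionary plays the role of the resource mapping file: one resource assignment per layer, chosen by comparing the metric across candidate resources.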

Method for executing task by scheduling device, and computer device and storage medium

A method for executing a task by a scheduling device, belonging to the technical field of electronics. The method includes: acquiring a target algorithm corresponding to a target task to be executed; acquiring an execution environment condition for the target algorithm, and current execution environment information of various execution devices; determining, among the execution devices, a target execution device whose execution environment information satisfies the execution environment condition; and sending a control message for executing the target task to the target execution device.
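
The four steps of this method can be sketched in Python as follows. The dictionary layouts, the "at least this much of each quantity" reading of the execution environment condition, and the names `satisfies`, `pick_target_device` and `schedule` are assumptions for illustration only; the abstract does not specify them.

```python
def satisfies(env_info, condition):
    """Assumed rule: the device must offer at least the required amount of
    every quantity named in the condition (e.g. free memory, free cores)."""
    return all(env_info.get(key, 0) >= needed for key, needed in condition.items())


def pick_target_device(devices, condition):
    """Return the first device whose current environment meets the condition."""
    for device_id, env_info in devices.items():
        if satisfies(env_info, condition):
            return device_id
    return None


def schedule(task, algorithms, conditions, devices):
    """Look up the algorithm, find a matching device, build a control message."""
    algorithm = algorithms[task]
    device_id = pick_target_device(devices, conditions[algorithm])
    if device_id is None:
        return None  # no execution device currently satisfies the condition
    return {"device": device_id, "task": task, "algorithm": algorithm}
```

In this sketch the control message is returned as a plain dictionary rather than sent over a network, which keeps the example self-contained.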

NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM, INFORMATION PROCESSING APPARATUS, AND MULTIPLEX CONTROL METHOD

An information processing apparatus that uses a graphical processing unit (GPU) for inference processing includes a processor. The processor is configured to monitor a message output from an application that executes the inference processing. The processor is configured to determine, from a pattern of the message, the timing of a start and an end of core processing that uses the GPU, the core processing serving as a core of the inference processing. When the timing of the start of the core processing is determined, the processor is configured to start the core processing if no process is executing other core processing, and to accumulate, in a queue, a process identifier that identifies the process of the core processing if a process is executing other core processing.
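
A minimal Python sketch of this gating logic is shown below. The message patterns `core start` and `core end`, the class name `CoreGate`, and the choice to hand the GPU to the next queued process when the running one ends are assumptions; the abstract only states that start and end are detected from a message pattern and that waiting process identifiers are queued.

```python
import re
from collections import deque

# Hypothetical log patterns; a real application would emit its own messages.
START = re.compile(r"core start")
END = re.compile(r"core end")


class CoreGate:
    """Allow one process at a time to run GPU core processing."""

    def __init__(self):
        self.running = None     # process id currently using the GPU
        self.waiting = deque()  # process ids queued for the GPU

    def on_message(self, pid, message):
        """Update state from one monitored message; return the running pid."""
        if START.search(message):
            if self.running is None:
                self.running = pid          # GPU is free: start immediately
            else:
                self.waiting.append(pid)    # GPU is busy: queue the process id
        elif END.search(message) and self.running == pid:
            # Assumed behavior: promote the oldest waiting process, if any.
            self.running = self.waiting.popleft() if self.waiting else None
        return self.running
```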

Specifying behavior among a group of computing tasks

A method of specifying behavior among a group of computing tasks included in a request to be performed in a domain of computing resources is disclosed. Method steps include receiving, at a scheduler operably coupled to the domain, a p/f request, the received p/f request including a first group and a first relationship, the first group comprising at least a first p/f group element and a second p/f group element, the first relationship defining a desired behavior of the first and second p/f group elements with respect to each other during performance of the p/f request; determining whether the domain includes available computing resources capable of satisfying the first relationship; and in response to a determination that the domain includes available computing resources capable of satisfying the first relationship, allocating, with the scheduler, at least one available computing resource to fulfill the p/f request.

Dynamic task allocation for neural networks

The subject technology provides for dynamic task allocation for neural network models. The subject technology determines an operation performed at a node of a neural network model. The subject technology assigns an annotation to indicate whether the operation is better performed on a CPU or a GPU based at least in part on hardware capabilities of a target platform. The subject technology determines whether the neural network model includes a second layer. In response to determining that the neural network model includes a second layer, the subject technology determines, for each node of the second layer, a second operation performed at the node. Further, the subject technology assigns a second annotation to indicate whether the second operation is better performed on the CPU or the GPU based at least in part on the hardware capabilities of the target platform.
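
The per-node annotation pass can be sketched in Python as follows. The model layout (layer name to node name to operation), the `GPU_FRIENDLY` set, and the `has_gpu` capability flag are hypothetical; the abstract does not say which operations or capabilities drive the decision.

```python
# Assumed set of operations that benefit from a GPU when one is present.
GPU_FRIENDLY = {"conv2d", "matmul"}


def annotate(model, platform):
    """Walk every layer and node, tagging each operation as CPU or GPU.

    model maps layer name -> {node name: operation name}.
    platform describes the target hardware, e.g. {"has_gpu": True}.
    """
    annotations = {}
    for layer_name, nodes in model.items():
        for node_name, operation in nodes.items():
            use_gpu = platform.get("has_gpu", False) and operation in GPU_FRIENDLY
            annotations[(layer_name, node_name)] = "GPU" if use_gpu else "CPU"
    return annotations
```

Because the annotation depends on the platform description, the same model yields different assignments on different targets.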

Method for allocating and managing cluster resource on cloud platform
11520639 · 2022-12-06

The present invention provides a method for allocating and managing a cluster resource on a cloud platform, the method comprising the steps of: generating service group information, by means of a cloud platform system, when a cluster resource allocation request is input; adding/deleting a user of the service group; selecting a cluster to be allocated to the user of the service group and inputting allocation information; querying a cluster resource availability; registering resource allocation information in the service group and allocating a resource to complete cluster resource allocation, when the resource is available as a result of the querying of the cluster resource availability; checking whether a cluster resource can be added when the resource is insufficient as a result of the querying of the cluster resource availability; registering the further cluster resource when the cluster resource can be added; registering resource allocation information in the service group and allocating the resource to complete cluster resource allocation; and reselecting an available cluster when the cluster resource cannot be added.
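
The availability check, the attempt to add a cluster resource, and the reselection of another cluster can be sketched in Python as follows. The data shapes and the function name `allocate` are assumptions for illustration; service group creation and user management are omitted.

```python
def allocate(group, cluster_order, free, request, expandable):
    """Try clusters in order until one can satisfy the request.

    group         -- dict holding the service group's registered allocations
    cluster_order -- clusters to try, in the order they would be selected
    free          -- cluster name -> currently free resource units
    request       -- resource units requested
    expandable    -- cluster name -> extra units that can still be added
    """
    for cluster in cluster_order:
        # Resource insufficient: check whether a cluster resource can be added.
        if free[cluster] < request and cluster in expandable:
            free[cluster] += expandable.pop(cluster)  # register the added resource
        if free[cluster] >= request:
            free[cluster] -= request
            group.setdefault("allocations", []).append((cluster, request))
            return cluster
        # Still insufficient: fall through and reselect another cluster.
    return None
```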

System and method to dynamically and automatically sharing resources of coprocessor AI accelerators
11521042 · 2022-12-06

A system and method for dynamically and automatically sharing resources of a coprocessor AI accelerator based on workload changes during training and inference of a plurality of neural networks. The method comprises the steps of receiving a plurality of requests from each neural network and from high-performance computing (HPC) applications through a dynamic adaptive scheduler module. The dynamic adaptive scheduler module morphs the received requests into threads, dimensions and memory sizes. The method then receives the morphed requests from the dynamic adaptive scheduler module through client units. Each of the neural network applications is mapped to at least one of the client units on graphics processing unit (GPU) hosts. The method then receives the morphed requests from the plurality of client units through a plurality of server units. Further, the method receives the morphed requests from the plurality of server units through one or more coprocessors.

Automated learning technology to partition computer applications for heterogeneous systems

Systems, apparatuses and methods may provide for technology that identifies a prioritization data structure associated with a function, wherein the prioritization data structure lists hardware resource types in priority order. The technology may also allocate a first type of hardware resource to the function if the first type of hardware resource is available, wherein the first type of hardware resource has a highest priority in the prioritization data structure. Additionally, the technology may allocate, in the priority order, a second type of hardware resource to the function if the first type of hardware resource is not available.
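
The fallback through the prioritization data structure can be sketched in a few lines of Python. The list-of-names representation of the priority order and the per-type counters in `available` are assumptions; the abstract does not fix a concrete data layout.

```python
def allocate_by_priority(priority_order, available):
    """Allocate the highest-priority hardware resource type that is available.

    priority_order -- resource type names, highest priority first
    available      -- resource type name -> number of free units
    """
    for resource_type in priority_order:
        if available.get(resource_type, 0) > 0:
            available[resource_type] -= 1  # claim one unit of this type
            return resource_type
    return None  # nothing in the list is currently available
```

A function whose priority list is `["fpga", "gpu", "cpu"]` would therefore receive a GPU only when no FPGA is free, and a CPU only when neither is.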

Systems and methods for customization of workflow design

Disclosed here are systems and methods that allow users, upon detecting errors within a running workflow, to either 1) pause the workflow and directly correct its design before resuming the workflow, or 2) pause the workflow, correct the erroneous action within the workflow, resume running the workflow, and afterwards apply the corrections to the design of the workflow. The disclosure comprises functionality that pauses a single workflow and other relevant workflows as soon as the error is detected and while it is corrected. The disclosed systems and methods improve communication technology between the networks and servers of separate parties relevant to and/or dependent on successful execution of other workflows.

Extensible schemes and scheme signaling for cloud based processing
11520630 · 2022-12-06

A method and system for processing media content in Moving Picture Experts Group (MPEG) Network Based Media Processing (NBMP) includes receiving, from an NBMP source, a first message including a workflow descriptor document corresponding to a workflow for processing the media content; obtaining, based on the workflow, a task having a task template; obtaining, based on the task, a function having a function template; and managing the processing of the media content by transmitting, to a media processing entity, a second message instructing the media processing entity to perform the function based on the task. The first message, the workflow descriptor document, the task template, the function template, and/or the second message may be used to signal a scheme for processing the media content.