Patent classifications
G06F9/5033
DATA SEQUENCE PREDICTION AND RESOURCE ALLOCATION
A method for data sequence prediction and resource allocation includes determining, by a memory system, a plurality of resource parameters associated with operation of the memory system and determining respective time intervals associated with usage patterns corresponding to the memory system, the respective time intervals being associated with one or more sets of the plurality of resource parameters. The method further includes determining, using the plurality of resource parameters, one or more weights for hidden layers of a neural network for the respective time intervals associated with the usage patterns and allocating computing resources within the memory system for use in execution of workloads based on the determined one or more weights for hidden layers of the neural network.
APPLICATION EXECUTION ENVIRONMENT SELECTION BASED ON RESOURCE USAGE PROFILES
A plurality of resource usage profiles is generated, wherein at least two of the resource usage profiles each include a corresponding plurality of resource values that quantify real-time computing resources used by an instance of an application that previously executed under two different corresponding operating conditions. It is determined that a new instance of the application is to be initiated. A particular resource usage profile from the plurality of resource usage profiles is selected. The new instance is initiated using the particular resource usage profile.
OPTIMIZER AGNOSTIC EXPLANATION SYSTEM FOR LARGE SCALE SCHEDULES
A computer-implemented method using an artificial intelligence (A.I.) module to explain large scale scheduling solutions includes receiving an original instance of a resource constrained scheduling problem. The instance includes a set of tasks, a variety of resource requirements, and a variety of constraints. An optimizer process determines a schedule for the set of tasks while minimizing a makespan of the schedule. A minimal set of resource links is generated based on resource dependencies between tasks. The resource links are added to the original instance of the scheduling problem as precedence constraints. All the resource constraints are removed from the original instance of the resource constrained scheduling problem. A set of critical tasks is computed using a non-resource constrained critical path. Schedules are provided with an explanation of an optimized order of the set of tasks based on the use of the non-resource constrained critical path.
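The steps in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes each task uses exactly one unit-capacity resource, it builds resource links by chaining tasks that run consecutively on the same resource in the optimizer's schedule, and the names `resource_links` and `critical_tasks` are hypothetical.

```python
from collections import defaultdict

def resource_links(start, resource):
    """Chain tasks that run back to back on the same resource.
    Assumption: every resource has capacity 1, so consecutive tasks suffice."""
    by_res = defaultdict(list)
    for task, res in resource.items():
        by_res[res].append(task)
    links = set()
    for tasks in by_res.values():
        tasks.sort(key=lambda t: start[t])
        links.update(zip(tasks, tasks[1:]))
    return links

def critical_tasks(duration, edges):
    """Forward/backward pass over a precedence-only graph.
    Tasks whose earliest and latest start coincide are critical."""
    succ, pred = defaultdict(list), defaultdict(list)
    for a, b in edges:
        succ[a].append(b)
        pred[b].append(a)
    # Topological order (Kahn's algorithm).
    indeg = {t: len(pred[t]) for t in duration}
    ready = [t for t in duration if indeg[t] == 0]
    order = []
    while ready:
        t = ready.pop()
        order.append(t)
        for s in succ[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    earliest = {}
    for t in order:
        earliest[t] = max((earliest[p] + duration[p] for p in pred[t]), default=0)
    makespan = max(earliest[t] + duration[t] for t in duration)
    latest = {}
    for t in reversed(order):
        latest[t] = min((latest[s] for s in succ[t]), default=makespan) - duration[t]
    return {t for t in duration if earliest[t] == latest[t]}, makespan

# Hypothetical instance: A must precede C; A and B share resource r1.
duration = {"A": 3, "B": 2, "C": 1}
precedence = {("A", "C")}
resource = {"A": "r1", "B": "r1", "C": "r2"}
start = {"A": 0, "B": 3, "C": 3}  # schedule returned by some optimizer

links = resource_links(start, resource)
critical, makespan = critical_tasks(duration, precedence | links)
```

In this example B is critical only because of its resource link to A, which is the kind of statement the explanation can surface ("B waits for A because both need r1").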
DATA PROCESSING SYSTEM, OPERATING METHOD THEREOF, AND COMPUTING SYSTEM USING THE SAME
A data processing system includes a controller configured to receive a neural network operation processing request from a host device; and an in-memory computing device including a plurality of processing elements. The in-memory computing device is configured to receive an input feature map and a weight filter from the controller, and perform a neural network operation in the plurality of processing elements based on the weight filter and a plurality of division maps generated from the input feature map, wherein the in-memory computing device performs the neural network operation without moving, between the processing elements, a reused element, which is an element of the division maps that is operated on at least twice during the neural network operation.
ESTIMATING FUTURE CLOUD RESOURCE REQUESTS
An approach for estimating future cloud resource requests is provided. The approach receives information defining a present budget interval. The approach analyzes a cloud resource request database based on entry times in respective past budget intervals occurring within a remaining time in the present budget interval. The approach creates estimates of future expected cloud resource requests based on the analysis.
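One possible reading of this analysis is sketched below. The abstract does not say how the estimate is formed, so averaging the per-interval counts is an assumption, as are the function name `estimate_remaining_requests` and the representation of each past interval as a list of request offsets measured from the start of that interval.

```python
from statistics import mean

def estimate_remaining_requests(past_intervals, elapsed, interval_length):
    """Estimate how many requests to expect in the rest of the present interval.

    past_intervals: one list per past budget interval, holding the entry time
    of each request as an offset from that interval's start.
    elapsed: time already used up in the present interval.
    Assumption: the estimate is the mean count over past intervals.
    """
    counts = [
        sum(1 for offset in offsets if elapsed <= offset < interval_length)
        for offsets in past_intervals
    ]
    return mean(counts) if counts else 0

# Hypothetical 30-day budget intervals, 15 days already elapsed.
history = [[1, 10, 20, 25], [5, 18, 29], [16, 17, 22, 28, 29]]
expected = estimate_remaining_requests(history, elapsed=15, interval_length=30)
```

Only requests that arrived in the matching remainder of each past interval are counted, which mirrors the "remaining time in the present budget interval" wording.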
FRAMEWORK FOR ESTIMATION OF RESOURCE USAGE AND EXECUTION TIME OF WORKLOADS IN A HETEROGENEOUS INFRASTRUCTURE
One example method includes performing operations concerning a model that is operable to predict resource usage and execution time of computing workloads. The operations include: extracting a fingerprint associated with telemetry data, where the telemetry data was generated based on performance of one of the computing workloads in a constrained infrastructure; checking a fingerprint catalog to determine whether it contains a same or similar fingerprint; when the same or similar fingerprint is found in the fingerprint catalog, inferring that the model includes information about the computing workload and is able to predict telemetry data and execution time for the computing workload in a target infrastructure; and when the same or similar fingerprint is not found, inserting the extracted fingerprint into the fingerprint catalog and generating a retrained model by retraining the model using the telemetry data associated with the extracted fingerprint.
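The catalog check described here might look like the following sketch. The abstract does not define a fingerprint or a similarity test, so representing fingerprints as numeric tuples, comparing them by Euclidean distance against a `tolerance`, and the function names are all assumptions made for illustration.

```python
import math

def find_similar(catalog, fingerprint, tolerance):
    """Return the first catalogued fingerprint within `tolerance`, else None.
    Assumption: fingerprints are equal-length numeric tuples."""
    for known in catalog:
        if math.dist(known, fingerprint) <= tolerance:
            return known
    return None

def handle_workload(catalog, fingerprint, tolerance=0.1):
    """Decide whether the existing model can be used or must be retrained."""
    if find_similar(catalog, fingerprint, tolerance) is not None:
        # The model is inferred to already cover this workload.
        return "predict"
    # Unknown workload: record it and signal that retraining is needed.
    catalog.append(fingerprint)
    return "retrain"

catalog = [(1.0, 2.0)]
first = handle_workload(catalog, (1.0, 2.05))   # close to a known fingerprint
second = handle_workload(catalog, (5.0, 5.0))   # new, gets catalogued
third = handle_workload(catalog, (5.0, 5.0))    # now known
```

The retraining step itself is left out; the sketch only returns a label indicating which branch of the method applies.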
Graphics processor with non-blocking concurrent architecture
In some aspects, systems and methods provide for forming groupings of a plurality of independently-specified computation workloads, such as graphics processing workloads, and in a specific example, ray tracing workloads. The workloads include a scheduling key, which is one basis on which the groupings can be formed. Workloads grouped together can all execute from the same source of instructions, on one or more different private data elements. Such workloads can recursively instantiate other workloads that reference the same private data elements. In some examples, the scheduling key can be used to identify a data element to be used by all the workloads of a grouping. Memory conflicts to private data elements are handled through scheduling of non-conflicted workloads or specific instructions and/or deferring conflicted workloads instead of locking memory locations.
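A rough software analogy for the grouping and conflict handling is sketched below. The abstract describes hardware scheduling, so this only illustrates the idea: workloads are modelled as hypothetical `(scheduling_key, data_id)` pairs, and a conflict is assumed to mean two workloads in one group touching the same private data element.

```python
from collections import defaultdict

def group_by_key(workloads):
    """Collect workloads that share a scheduling key into one grouping."""
    groups = defaultdict(list)
    for key, data_id in workloads:
        groups[key].append(data_id)
    return dict(groups)

def split_conflicts(data_ids):
    """Run at most one workload per private data element; defer the rest.
    Deferring replaces locking the memory location."""
    seen, run, deferred = set(), [], []
    for data_id in data_ids:
        if data_id in seen:
            deferred.append(data_id)
        else:
            seen.add(data_id)
            run.append(data_id)
    return run, deferred

workloads = [("shaderA", 1), ("shaderB", 1), ("shaderA", 2), ("shaderA", 1)]
groups = group_by_key(workloads)
run, deferred = split_conflicts(groups["shaderA"])
```

Deferred workloads would be resubmitted in a later pass, once the conflicting workload has finished with the data element.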
NEURAL NETWORK PROCESSOR USING COMPRESSION AND DECOMPRESSION OF ACTIVATION DATA TO REDUCE MEMORY BANDWIDTH UTILIZATION
A deep neural network (“DNN”) module can compress and decompress neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit can receive an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit can receive a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion. This can reduce memory bus utilization, allow a DNN module to complete processing operations more quickly, and reduce power consumption.
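The mask-and-data layout can be shown with a short sketch. It assumes one mask bit per input byte (1 for non-zero) and stores the non-zero bytes unchanged; the truncation of non-zero bytes mentioned in the abstract is deliberately omitted, so this variant is lossless. The function names are hypothetical.

```python
from typing import Tuple

def compress_chunk(chunk: bytes) -> Tuple[bytes, bytes]:
    """Split a chunk into a bit mask and the packed non-zero bytes.
    Bit i of the mask is 1 when chunk[i] is non-zero."""
    mask = bytearray((len(chunk) + 7) // 8)
    data = bytearray()
    for i, value in enumerate(chunk):
        if value != 0:
            mask[i // 8] |= 1 << (i % 8)
            data.append(value)
    return bytes(mask), bytes(data)

def decompress_chunk(mask: bytes, data: bytes, length: int) -> bytes:
    """Rebuild the chunk: zeros where the mask bit is 0, next data byte otherwise."""
    out = bytearray(length)
    values = iter(data)
    for i in range(length):
        if mask[i // 8] & (1 << (i % 8)):
            out[i] = next(values)
    return bytes(out)

chunk = bytes([0, 5, 0, 0, 7, 0, 0, 0, 9])
mask, data = compress_chunk(chunk)
restored = decompress_chunk(mask, data, len(chunk))
```

The saving grows with sparsity: here nine bytes become a two-byte mask plus three data bytes, while a chunk with no zero bytes would grow by the size of the mask.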
TENANT RESOURCE OPTIMIZATION (TRO) IN CLOUDS
A method of using a multi-cluster network is provided. The multi-cluster network has at least a plurality of clusters, where each cluster of the plurality of clusters has at least a node and a pod. The method includes collecting cluster and application information of the multi-cluster network. The cluster and application information includes at least a cluster capacity and an application performance metric. The application performance metric corresponds to at least an application. The method further includes analyzing the cluster and application information for a current pod arrangement on each cluster and adjusting at least the application performance metric based at least on the analyzed cluster and application information by taking at least a pod-based action achieving a new pod arrangement on a target cluster. An apparatus for using the multi-cluster network is also provided.
SYSTEM AND METHOD FOR PROCESSING AND TRANSFORMING INCOMING RESOURCES TO AUTO-CODE REPORTING PARAMETERS
Embodiments of the present invention provide a system for processing and transforming incoming resources to auto-code reporting parameters. In particular, the system may be configured to: receive one or more resources from one or more data sources; pre-process the one or more resources to extract one or more input parameters; process the one or more input parameters via a decisioning engine; auto-code, via the decisioning engine, one or more reporting parameters based on processing the one or more input parameters; check for predetermined conditions based on one or more predetermined rules; and prepare an output file comprising the auto-coded one or more reporting parameters and the predetermined conditions.