G06F11/3433

MANAGING HIGH PERFORMANCE STORAGE SYSTEMS WITH HYBRID STORAGE TECHNOLOGIES

There is provided a method for managing a solid state storage system with hybrid storage technologies. The method includes monitoring one or more storage request streams to identify operating mode characteristics therein from among a set of possible operating mode characteristics. The set of possible operating mode characteristics corresponds to a set of available operating modes of the hybrid storage technologies. The method further includes identifying a current operating mode from among the set of available operating modes responsive to the identified operating mode characteristics. The method also includes predicting a likely future operating mode responsive to variations in workload requirements to generate at least one future operating mode prediction. The method additionally includes controlling at least one of data placement, wear leveling, and garbage collection, responsive to the at least one future operating mode prediction.

MOVEMENT DATA FOR FAILURE IDENTIFICATION

Configurations for data center component monitoring are disclosed. In at least one embodiment, movement of a server component is determined based on sensor data and the movement is used to diagnose a root cause for a server component failure.

SYSTEMS AND METHODS FOR CONTROLLING ACCESS TO A DATABASE
20230050525 · 2023-02-16

Systems and methods for throttling requests submitted to a database are designed to maximize the rate at which information can be obtained from the database. In the throttling methods, the time required for the database to perform a certain operation is monitored. If the time required to perform the operation exceeds a threshold time period, a request limit is imposed on the database, the request limit limiting the number of data read and/or write requests that can be submitted to the database per unit of time.
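
A minimal Python sketch of the throttling idea described above, under stated assumptions: the class name `LatencyThrottle`, its parameters, the sliding-window accounting, and the rule that lifts the limit once latency falls back under the threshold are all illustrative and are not taken from the abstract.

```python
import time
from collections import deque


class LatencyThrottle:
    """Impose a request limit once a monitored operation runs too slowly."""

    def __init__(self, latency_threshold_s, limited_rate, window_s=1.0):
        self.latency_threshold_s = latency_threshold_s  # hypothetical threshold
        self.limited_rate = limited_rate  # max requests per window while limited
        self.window_s = window_s
        self.limit_active = False
        self._sent = deque()  # timestamps of recently admitted requests

    def record_latency(self, elapsed_s):
        # Activate the limit when the monitored operation exceeds the threshold.
        # Releasing it when latency recovers is an assumption of this sketch.
        self.limit_active = elapsed_s > self.latency_threshold_s

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop admitted requests that have left the sliding window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if self.limit_active and len(self._sent) >= self.limited_rate:
            return False
        self._sent.append(now)
        return True
```

A caller would time the monitored database operation, pass the elapsed time to `record_latency`, and check `allow()` before submitting each read or write request.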

RUNTIME ENTROPY-BASED SOFTWARE OPERATORS
20230048137 · 2023-02-16

A system may include a historical managed software system data store that contains electronic records associated with controllers and deployed workloads (each electronic record may include time series data representing performance metrics). An entropy calculation system, coupled to the historical managed software system data store, may calculate at least one historical entropy value based on information in the historical managed software system data store. A detection engine, coupled to a monitored system currently executing a deployed workload in the cloud computing environment, may collect time series data representing current performance metrics associated with the monitored system. The detection engine may then calculate a current monitored entropy value (based on the collected time series data representing current performance metrics) and compare the current monitored entropy value with a threshold value (based on the historical entropy value). Based on the comparison, a corrective action for the monitored system may be triggered.
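
The comparison described above can be sketched in Python as follows. The abstract does not say how entropy is computed or how the threshold is derived from the historical value, so this sketch assumes Shannon entropy over equal-width bins and a threshold equal to a hypothetical `margin` times the historical entropy; the function names are illustrative.

```python
import math
from collections import Counter


def shannon_entropy(samples, bins=10, lo=0.0, hi=1.0):
    """Shannon entropy in bits of samples bucketed into equal-width bins."""
    if not samples:
        return 0.0
    width = (hi - lo) / bins
    # Clamp each sample into a valid bin index in [0, bins - 1].
    counts = Counter(
        min(bins - 1, max(0, int((x - lo) / width))) for x in samples
    )
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def needs_corrective_action(history, current, margin=1.5, **binning):
    # Assumed rule: flag the monitored system when the current entropy
    # exceeds `margin` times the historical entropy.
    threshold = margin * shannon_entropy(history, **binning)
    return shannon_entropy(current, **binning) > threshold
```

Here `history` stands for a normalized metric series drawn from the historical data store and `current` for the series collected from the monitored system.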

Dynamic graphical processing unit register allocation

Systems, apparatuses, and methods for dynamic graphics processing unit (GPU) register allocation are disclosed. A GPU includes at least a plurality of compute units (CUs), a control unit, and a plurality of registers for each CU. If a new wavefront requests more registers than are currently available on the CU, the control unit spills registers associated with stack frames at the bottom of a stack since they will not likely be used in the near future. The control unit has complete flexibility determining how many registers to spill based on dynamic demands and can prefetch the upcoming necessary fills without software involvement. Effectively, the control unit manages the physical register file as a cache. This allows younger workgroups to be dynamically descheduled so that older workgroups can allocate additional registers when needed to ensure improved fairness and better forward progress guarantees.
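
As a rough software illustration of the spill policy described above (the disclosure concerns a hardware control unit), the following Python sketch spills stack frames starting from the bottom of the stack until a new request fits. The function name, the frame representation, and the greedy "spill only as much as needed" choice are assumptions; the abstract states only that the control unit has flexibility in how many registers to spill.

```python
def spill_for_request(frames, total_regs, requested):
    """Pick bottom-of-stack frames to spill so `requested` registers fit.

    frames: list of (frame_id, reg_count), bottom of the stack first.
    Returns (spilled_ids, remaining_frames).
    """
    if requested > total_regs:
        raise ValueError("request exceeds the physical register file")
    free = total_regs - sum(count for _, count in frames)
    spilled = []
    remaining = list(frames)
    # Frames at the bottom of the stack are the least likely to be used soon,
    # so they are spilled first.
    while free < requested and remaining:
        frame_id, count = remaining.pop(0)
        spilled.append(frame_id)
        free += count
    return spilled, remaining
```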

Methods and systems for rapid failure recovery for a distributed storage system
11579992 · 2023-02-14

Methods and systems are provided for rapid recovery of a distributed storage system from failures of one or more nodes.

Prioritizing internet-accessible workloads for cyber security
11582257 · 2023-02-14

Methods and systems for assessing internet exposure of a cloud-based workload are disclosed. A method comprises accessing at least one cloud provider API to determine a plurality of entities capable of routing traffic in a virtual cloud environment associated with a target account containing the workload, querying the at least one cloud provider API to determine at least one networking configuration of the entities, building a graph connecting the plurality of entities based on the networking configuration, accessing a data structure identifying services publicly accessible via the Internet and capable of serving as an internet proxy, integrating the identified services into the graph, traversing the graph to identify at least one source originating via the Internet and reaching the workload, and outputting a risk notification associated with the workload. Systems and computer-readable media implementing the above method are also disclosed.
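
The graph-building and traversal steps above can be sketched in Python as follows. No cloud provider API is called here: the routing edges and the list of publicly accessible proxy services are assumed to be already collected, and the `INTERNET` node label, the function names, and the choice of breadth-first search are illustrative assumptions.

```python
from collections import deque

INTERNET = "internet"  # hypothetical label for the public-Internet source node


def build_graph(routes, public_proxies):
    """Adjacency map from routing edges plus Internet-facing proxy services."""
    graph = {}
    for src, dst in routes:
        graph.setdefault(src, set()).add(dst)
    # Integrate the publicly accessible services as entry points.
    for service in public_proxies:
        graph.setdefault(INTERNET, set()).add(service)
    return graph


def internet_path(graph, workload):
    """Breadth-first search; returns a path from INTERNET to workload, or None."""
    parents = {INTERNET: None}
    queue = deque([INTERNET])
    while queue:
        node = queue.popleft()
        if node == workload:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in sorted(graph.get(node, ())):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None
```

A non-`None` result would correspond to the condition under which the method outputs a risk notification for the workload.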

Control cluster for multi-cluster container environments

The disclosure herein describes managing multiple clusters within a container environment using a control cluster. The control cluster includes a single deployment model that manages deployment of cluster components to a plurality of clusters at the cluster level. Changes or updates made to one cluster are automatically propagated to other clusters in the same environment, reducing system update time across clusters. The control cluster aggregates and/or stores monitoring data for the plurality of clusters, creating a centralized data store for metrics data, log data, and other systems data. The monitoring data and/or alerts are displayed on a unified dashboard via a user interface. The unified dashboard creates a single representation of clusters and monitoring data in a single location, providing system health data and unified alerts notifying a user of issues detected across multiple clusters.

Method and apparatus for stress management in a searchable data service

Method and apparatus for stress management in a searchable data service. The searchable data service may provide a searchable index to a backend data store, and an interface to build and query the searchable index, that enables client applications to search for and retrieve locators for stored entities in the backend data store. Embodiments of the searchable data service may implement a distributed stress management mechanism that may provide functionality including, but not limited to, the automated monitoring of critical resources, analysis of resource usage, and decisions on and performance of actions to keep resource usage within comfort zones. In one embodiment, in response to usage of a particular resource being detected as out of the comfort zone on a node, an action may be performed to transfer at least part of the resource usage for the local resource to another node that provides a similar resource.
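
One way to picture the comfort-zone action described above is the Python sketch below. The function name, the single absolute `comfort_limit`, and the choice of the least-loaded peer as the transfer target are assumptions made for illustration; the abstract does not specify how the target node is chosen or how much usage is moved.

```python
def rebalance(usage, comfort_limit):
    """Shift load from nodes above the comfort limit to the least-loaded peer.

    usage: mapping of node name to current usage of one resource.
    Returns (new_usage, moves), where moves is a list of (src, dst, amount).
    """
    usage = dict(usage)
    moves = []
    for node in sorted(usage):
        excess = usage[node] - comfort_limit
        if excess <= 0:
            continue  # already inside the comfort zone
        peers = [n for n in usage if n != node]
        if not peers:
            continue
        target = min(peers, key=lambda n: usage[n])
        # Never push the target itself out of its comfort zone.
        amount = min(excess, comfort_limit - usage[target])
        if amount <= 0:
            continue
        usage[node] -= amount
        usage[target] += amount
        moves.append((node, target, amount))
    return usage, moves
```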

SYSTEM FOR MONITORING AND OPTIMIZING COMPUTING RESOURCE USAGE OF CLOUD BASED COMPUTING APPLICATION
20230043579 · 2023-02-09

A system for monitoring and optimizing the computing resource usage of a computing application may include predicting a first performance metric for job load capacity of the computing application for optimal job concurrency and optimal resource utilization. The system may include generating an alerting threshold based on the first performance metric. The system may further include, in response to a difference between the alerting threshold and a job load of the computing application within an interval exceeding a threshold, predicting a second performance metric for job load capacity of the computing application for optimal job concurrency and optimal resource utilization. The system may further include, in response to a difference between the first performance metric and the second performance metric exceeding a difference threshold, updating the alerting threshold with a job load capacity having the optimal resource utilization rate corresponding to the second performance metric.
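
The two-stage check described above might be sketched in Python as follows. The prediction model itself is not described in the abstract, so it is represented by a hypothetical `predict_capacity` callable; the function name, the parameter names, and the use of absolute differences are likewise assumptions.

```python
def update_alert_threshold(alert_threshold, first_capacity, job_load,
                           predict_capacity, load_gap_limit, capacity_gap_limit):
    """Return (alert_threshold, capacity) after one monitoring interval."""
    # Stage 1: re-predict only when the observed job load has drifted
    # far enough from the current alerting threshold.
    if abs(alert_threshold - job_load) <= load_gap_limit:
        return alert_threshold, first_capacity
    second_capacity = predict_capacity()  # hypothetical predictor
    # Stage 2: adopt the new prediction only when it differs enough
    # from the first one.
    if abs(second_capacity - first_capacity) > capacity_gap_limit:
        return second_capacity, second_capacity
    return alert_threshold, first_capacity
```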