G06F11/3006

Predicting and managing requests for computing resources or other resources

Requests for computing resources and other resources can be predicted and managed. For example, a system can determine a baseline prediction indicating a number of requests for an object over a future time-period. The system can then execute a first model to generate a first set of values based on seasonality in the baseline prediction, a second model to generate a second set of values based on short-term trends in the baseline prediction, and a third model to generate a third set of values based on the baseline prediction. The system can select a most accurate model from among the three models and generate an output prediction by applying the set of values output by the most accurate model to the baseline prediction. Based on the output prediction, the system can cause an adjustment to be made to a provisioning process for the object.
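
The select-and-apply step described above can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say how accuracy is measured or how values are applied, so the sketch assumes a backtest scored by mean absolute error and element-wise multiplication. The three model functions and all names are hypothetical.

```python
from statistics import mean

def apply_factors(baseline, factors):
    # Assumption: "applying" a set of values means element-wise multiplication.
    return [b * f for b, f in zip(baseline, factors)]

def mean_abs_error(predicted, actual):
    return mean(abs(p - a) for p, a in zip(predicted, actual))

def pick_most_accurate(models, past_baseline, past_actual):
    # Assumption: accuracy is judged by backtesting each model on a past window.
    errors = {
        name: mean_abs_error(apply_factors(past_baseline, model(past_baseline)), past_actual)
        for name, model in models.items()
    }
    return min(errors, key=errors.get)

# Hypothetical stand-ins for the seasonality, short-term trend, and baseline-only models.
models = {
    "seasonal": lambda b: [1.2 if i % 2 == 0 else 0.8 for i in range(len(b))],
    "trend": lambda b: [1.0 + 0.1 * i for i in range(len(b))],
    "baseline_only": lambda b: [1.0] * len(b),
}

past_baseline = [100, 100, 100, 100]
past_actual = [120, 80, 120, 80]
best = pick_most_accurate(models, past_baseline, past_actual)

future_baseline = [200, 200]
output_prediction = apply_factors(future_baseline, models[best](future_baseline))
```

With the sample data, the seasonal stand-in reproduces the past request counts most closely, so its values are the ones applied to the future baseline.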

Variable replication levels for an object of a snapshot of a block storage volume
11550816 · 2023-01-10

Systems and methods are provided to manage a replication service of a block storage volume to increase dependability and/or decrease data loss. Each snapshot of a block storage volume can include a point-in-time representation of the volume. Each snapshot may include multiple objects that correspond to one or more blocks of the volume. One or more objects of a snapshot may reference a parent snapshot instead of a block of the volume. Each object of a snapshot may be replicated a number of times based on the number of references by other snapshots. The number of replicas may be based on the number of snapshots referencing the object or the number of unique clients referencing the object. The replication service can manage the replicas of the object and increase or decrease the number of replicas as needed.
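
A reference-driven replication policy of this kind might look like the sketch below. The abstract gives no formula, so the thresholds, the clamping, and the function names are all assumptions chosen for illustration.

```python
def target_replicas(reference_count, min_replicas=2, max_replicas=6, refs_per_extra_replica=3):
    """Hypothetical policy: one extra replica per `refs_per_extra_replica` references, clamped."""
    extra = reference_count // refs_per_extra_replica
    return max(min_replicas, min(max_replicas, min_replicas + extra))

def replica_delta(current_replicas, reference_count):
    """Positive result: replicas to add. Negative result: replicas to remove."""
    return target_replicas(reference_count) - current_replicas

# An object referenced by 7 snapshots that currently has 2 replicas.
to_add = replica_delta(current_replicas=2, reference_count=7)
```

The `reference_count` argument could equally be the number of unique clients referencing the object, which the abstract names as an alternative basis.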

ENHANCED PERFORMANCE DIAGNOSIS IN A NETWORK COMPUTING ENVIRONMENT

Embodiments provide enhanced performance diagnosis in a network computing environment. In response to an occurrence of a performance issue for a node while under operating conditions, common logs for applications on the node are analyzed. The applications are respectively registered in advance for diagnosis services. The applications each register rules in advance for the diagnosis services. At a time of the performance issue, debug programs are automatically issued to generate debug level logs respectively for the applications. Debug level logs are analyzed according to the rules to determine a root cause of the performance issue. A potential solution to the root cause of the performance issue is determined using the rules, without having to recreate the operating conditions occurring during the performance issue. The potential solution to rectify the root cause of the performance issue is executed without having to recreate the operating conditions occurring during the performance issue.
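
One way to picture the pre-registered rules is as pattern-to-cause mappings that are matched against the debug level logs. The sketch below assumes regular-expression rules and an in-memory registry; the rule format, the application name, and the log lines are hypothetical.

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    pattern: str      # regular expression matched against debug log lines
    root_cause: str
    solution: str

# Hypothetical registry: application name -> rules registered in advance.
registry = {}

def register(app, rules):
    registry[app] = list(rules)

def diagnose(app, debug_log_lines):
    """Return (root_cause, solution) for the first matching rule, or None."""
    for rule in registry.get(app, []):
        if any(re.search(rule.pattern, line) for line in debug_log_lines):
            return rule.root_cause, rule.solution
    return None

register("billing", [
    Rule(r"pool exhausted", "connection pool too small", "raise pool size"),
    Rule(r"GC pause \d+ms", "heap pressure", "increase heap"),
])

result = diagnose("billing", ["DEBUG worker-3 GC pause 950ms", "DEBUG worker-3 retry"])
```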

Integrated remediation system for network-based services

This disclosure describes automatically collecting, analyzing, and remediating operational issues with respect to systems executing within a network. For example, a service provider network may include a monitoring service that may generate notifications upon detection of operational issues within a system executing within the service provider network. The monitoring service may provide one or more notifications related to the operational issues to an aggregation service that may aggregate the one or more notifications into a standardized format. Contextual information related to the operational issues may be automatically gathered by an analytics service, which may analyze the contextual information to determine a potential cause of the operational issues. Based on the potential cause, a remediation service may automatically remediate the operational issues.

Correlation across non-logging components

Systems are provided for logging transactions in heterogeneous networks that include a combination of one or more instrumented components and one or more non-instrumented components. The instrumented components are configured to generate impersonated log records for the non-instrumented components involved in the transaction processing hand-offs with the instrumented components. The impersonated log records are persisted with other log records that are generated by the instrumented components in a transaction log that is maintained by a central logging system to reflect a complete flow of the transaction processing performed on the object, including the flow through the non-instrumented component(s).
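
The impersonation idea can be illustrated with a small sketch in which an instrumented sender writes a record on behalf of a non-instrumented receiver at hand-off. The record fields, class names, and component names are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class LogRecord:
    transaction_id: str
    component: str    # component the record describes
    written_by: str   # component that actually emitted the record
    event: str

    @property
    def impersonated(self):
        return self.component != self.written_by

class CentralLog:
    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)

    def flow(self, transaction_id):
        """Ordered list of components the transaction passed through."""
        return [r.component for r in self.records if r.transaction_id == transaction_id]

def hand_off(log, transaction_id, sender, receiver, receiver_instrumented):
    log.append(LogRecord(transaction_id, sender, sender, f"sent to {receiver}"))
    if not receiver_instrumented:
        # The instrumented sender logs on behalf of the silent receiver.
        log.append(LogRecord(transaction_id, receiver, sender, "received"))

log = CentralLog()
hand_off(log, "t1", "gateway", "legacy-queue", receiver_instrumented=False)
flow = log.flow("t1")
```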

Intra-footprint computing cluster bring-up

Methods, systems and computer program products for intra-footprint computing cluster bring-up within a virtual private cloud. A network connection is established between an initiating module and a virtual private cloud (VPC). The initiating module allocates resources of the VPC, including a plurality of nodes that correspond to members of a to-be-configured computing cluster. A cluster management module having coded therein an intended computing cluster configuration is configured into at least one of the plurality of nodes. The members of the to-be-configured computing cluster interoperate from within the VPC to accomplish a set of computing cluster bring-up operations that configure the plurality of members into the intended computing cluster configuration. Execution of bring-up instructions of the cluster management module serves to allocate networking IP addresses of the VPC. The allocated networking IP addresses are assigned to networking interfaces of the plurality of nodes.

MONITORING AND ALERTING SYSTEM BACKED BY A MACHINE LEARNING ENGINE
20230038164 · 2023-02-09

A monitoring and alerting system backed by a machine learning engine for anomaly detection and prediction of time series data indicative of the health of an application, a system, an environment, or a person. The system uses any data of interest modeled as a time series of times and values; compares input data against previously learned patterns; predicts data; identifies anomalies; generates notifications or an alert identifying the deviation; communicates the alert to users, applications, or devices; and applies action or health-function logic, based on the significance of the issue, to modify, start, or stop components of the system or application. The data is received via a metrics server, cleaned into a unified format, and passed through via streaming or push/pull mechanisms. Planned deviations are configured to prevent false positives. A variety of machine learning methods is used, and the system has dual function components and disaster recovery.
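
The compare-against-learned-patterns step, together with planned deviations, can be sketched with a simple statistical stand-in. The abstract does not name its machine learning methods, so the rolling z-score below is a substitute chosen for brevity, and the window, threshold, and parameter names are assumptions.

```python
from statistics import mean, stdev

def find_anomalies(times, values, window=5, threshold=3.0, planned=()):
    """Flag times whose value deviates strongly from the preceding window.

    `planned` lists times of planned deviations, which are never flagged.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            deviates = values[i] != mu
        else:
            deviates = abs(values[i] - mu) / sigma > threshold
        if deviates and times[i] not in planned:
            anomalies.append(times[i])
    return anomalies

times = list(range(8))
values = [10, 11, 10, 9, 10, 11, 50, 10]
alerts = find_anomalies(times, values)
suppressed = find_anomalies(times, values, planned={6})
```

With the sample series, only the spike at time 6 is flagged, and declaring time 6 a planned deviation suppresses the alert.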

Methods and systems for exchange of equipment performance data

A method for exchange of equipment performance data includes the steps of: obtaining performance data of a communicatively-insulated device; converting the performance data into a scannable code; capturing an image of the scannable code; decoding the scannable code using a communicatively-enabled device to extract an address string encoded in the scannable code, the address string comprising an address of a remote server and the performance data; initiating, by the communicatively-enabled device, a communications link with the remote server using the address string thereby to provide the performance data to the remote server; performing, by the remote server, analytics on the performance data; and sending historic device performance data and/or analytical results to a remote computing device and/or sending a link to the historic device performance data and/or analytical results to the remote computing device; wherein the communicatively-insulated device is packaging equipment and wherein obtaining the performance data comprises: running a calibration phantom through the packaging equipment; scanning the calibration phantom with a calibration unit; and using the calibration unit to generate a system status report identifying one or more operational parameters of the packaging equipment.
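
The address string that carries both the server address and the performance data could be built and decoded as in the sketch below. Encoding the data as URL query parameters is an assumption, as are the server URL and field names. Rendering the string as a scannable code is omitted because it would need a third-party library.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_address_string(server_url, performance_data):
    """Combine the remote server address and the performance data in one string."""
    return f"{server_url}?{urlencode(performance_data)}"

def parse_address_string(address):
    """Split a decoded address string back into server address and data."""
    parts = urlsplit(address)
    server = f"{parts.scheme}://{parts.netloc}{parts.path}"
    data = {key: vals[0] for key, vals in parse_qs(parts.query).items()}
    return server, data

# Hypothetical server URL and status-report fields.
address = build_address_string(
    "https://example.invalid/report",
    {"device": "sealer-7", "seal_temp_c": 182},
)
server, data = parse_address_string(address)
```

Note that values come back as strings after parsing, so the remote server would need to convert numeric fields before running analytics.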

Role-based failure response training for distributed systems

Methods, systems, and computer-readable media for role-based failure response training for distributed systems are disclosed. A failure response training system determines a failure mode associated with an architecture for a distributed system comprising a plurality of components. The training system generates a scenario based at least in part on the failure mode. The scenario comprises an initial state of the distributed system which is associated with one or more metrics indicative of a failure. The training system provides, to a plurality of users, data describing the initial state. The training system solicits user input representing modification of a configuration of the components. The training system determines a modified state of the distributed system based at least in part on the input. The performance of the distributed system in the modified state is indicated by one or more modified metrics differing from the one or more initial metrics.

Anti-pattern detection in extraction and deployment of a microservice

Disclosed are various embodiments for anti-pattern detection in extraction and deployment of a microservice. A software modernization service is executed to analyze a computing application to identify various application components. When one or more of the application components are specified to be extracted as an independently deployable subunit, anti-patterns associated with deployment of the independently deployable subunit are determined prior to extraction. Anti-patterns may include increases in execution time, bandwidth, network latency, central processing unit (CPU) usage, and memory usage, among other anti-patterns. The independently deployable subunit is selectively deployed separate from the computing application based on the identified anti-patterns.
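
A pre-extraction check of this kind might compare estimated metrics before and after extraction. The sketch below assumes a single relative-increase threshold and a deploy-only-if-clean policy; the abstract specifies neither, and the metric names and numbers are hypothetical.

```python
def detect_anti_patterns(before, after, max_increase=0.2):
    """Return metrics whose relative increase after extraction exceeds `max_increase`."""
    flagged = {}
    for metric, old in before.items():
        new = after.get(metric, old)
        if old > 0 and (new - old) / old > max_increase:
            flagged[metric] = (new - old) / old
    return flagged

def should_deploy(flagged):
    # Assumption: deploy the subunit separately only when nothing is flagged.
    return not flagged

before = {"latency_ms": 100, "cpu_pct": 40, "memory_mb": 512}
after = {"latency_ms": 150, "cpu_pct": 42, "memory_mb": 512}
flagged = detect_anti_patterns(before, after)
deploy = should_deploy(flagged)
```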