G06F11/0793

PERFORMING MULTIPLE POINT TABLE LOOKUPS IN A SINGLE CYCLE IN A SYSTEM ON CHIP

In various examples, a VPU and associated components may be optimized to improve VPU performance and throughput. For example, the VPU may include: a min/max collector; automatic store predication functionality; a SIMD data path organization that allows for inter-lane sharing; transposed load/store with stride parameter functionality; load with permute and zero insertion functionality; hardware, logic, and memory layout functionality to allow for two-point and two-by-two-point lookups; and per-memory-bank load caching capabilities. In addition, decoupled accelerators may be used to offload VPU processing tasks to increase throughput and performance, and a hardware sequencer may be included in a DMA system to reduce programming complexity of the VPU and the DMA system. The DMA and VPU may execute a VPU configuration mode that allows the VPU and DMA to operate without a processing controller for performing dynamic-region-based data movement operations.

NOISY-NEIGHBOR DETECTION AND REMEDIATION

Noisy-neighbor detection and remediation is provided by: performing real-time monitoring of workload processing and associated resource consumption of application components that use shared resource(s) of a computing environment; determining workload and shared resource consumption patterns for each of the application components; for each application, of a plurality of applications, that includes at least one of the application components, correlating the determined workload and shared resource consumption patterns of that application's component(s) and determining a correlated shared resource usage pattern for that application; performing impact analysis to determine impact of the applications on each other; and identifying noisy-neighbor(s) that use the shared resource(s) and automatically raising an alert indicating those noisy-neighbor(s).
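
The flow in this abstract can be illustrated with a minimal sketch. It is not the claimed method: the per-application `throughput` series, the use of Pearson correlation as the impact analysis, and the `-0.8` threshold are all assumptions chosen for illustration, as are the function names.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def app_usage(component_usage, app_components):
    """Combine per-component usage samples into one usage series per application."""
    return {
        app: [sum(samples) for samples in zip(*(component_usage[c] for c in comps))]
        for app, comps in app_components.items()
    }

def find_noisy_neighbors(usage, throughput, threshold=-0.8):
    """Return (noisy_app, affected_app) pairs.

    Assumption: an application is 'noisy' toward another when its shared-resource
    usage is strongly negatively correlated with the other's throughput.
    """
    return [
        (noisy, victim)
        for noisy, u in usage.items()
        for victim, t in throughput.items()
        if noisy != victim and pearson(u, t) <= threshold
    ]

# Hypothetical monitoring samples for three components of two applications.
component_usage = {"a1": [1, 2, 6, 9], "a2": [1, 1, 2, 3], "b1": [2, 2, 2, 3]}
app_components = {"A": ["a1", "a2"], "B": ["b1"]}
throughput = {"A": [10, 10, 11, 10], "B": [9, 8, 4, 2]}

usage = app_usage(component_usage, app_components)
alerts = find_noisy_neighbors(usage, throughput)
```

Here application A's combined usage rises while B's throughput falls, so the pair `("A", "B")` is flagged; a real system would raise an alert from such pairs.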

Pacing in a storage sub-system

One embodiment includes a data communication apparatus including a storage sub-system to be connected to storage devices, and processing circuitry to: manage transfer of content with the storage devices over the storage sub-system responsively to content transfer requests, while pacing commencement of serving of respective ones of the content transfer requests responsively to availability of spare data capacity of the storage sub-system; find a malfunctioning storage device that is currently assigned a given data capacity of the storage sub-system and currently assigned to serve at least one content transfer request; and reallocate the given data capacity currently assigned to the malfunctioning storage device for use by at least another one of the storage devices while the at least one content transfer request assigned to the malfunctioning storage device is still awaiting completion.
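
As a rough sketch of the pacing and reallocation idea, the following models data capacity as a simple credit counter. The `Pacer` class, its method names, and the first-in-first-out start order are assumptions made for illustration; the abstract does not specify how capacity is tracked.

```python
from collections import deque

class Pacer:
    """Paces the start of transfer requests against spare sub-system capacity."""

    def __init__(self, capacity):
        self.spare = capacity
        self.assigned = {}      # device -> capacity currently held by that device
        self.waiting = deque()  # (device, size) requests that have not started yet

    def submit(self, device, size):
        self.waiting.append((device, size))
        self._start_waiting()

    def _start_waiting(self):
        # Start queued requests in arrival order while spare capacity allows.
        while self.waiting and self.waiting[0][1] <= self.spare:
            device, size = self.waiting.popleft()
            self.spare -= size
            self.assigned[device] = self.assigned.get(device, 0) + size

    def complete(self, device, size):
        if device not in self.assigned:
            return  # capacity was already reclaimed from this device
        released = min(size, self.assigned[device])
        self.assigned[device] -= released
        self.spare += released
        self._start_waiting()

    def reclaim_from(self, device):
        """Free a malfunctioning device's capacity without waiting for its requests."""
        self.spare += self.assigned.pop(device, 0)
        self._start_waiting()

pacer = Pacer(capacity=10)
pacer.submit("d0", 8)      # starts at once; 2 units of capacity stay spare
pacer.submit("d1", 6)      # must wait: only 2 units are spare
pacer.reclaim_from("d0")   # d0 is found to be malfunctioning; d1 can now start
```

The point of the sketch is the last call: the request on `d1` starts even though the request on `d0` never completed.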

DYNAMIC ERROR CONTROL CONFIGURATION FOR MEMORY SYSTEMS
20230052044 · 2023-02-16 ·

Methods, systems, and devices for a dynamic error control configuration for memory systems are described. The memory system may receive a read command and retrieve a set of data from a location of the memory system based on the read command. The memory system may perform a first type of error control operation on the set of data to determine whether the set of data includes one or more errors. If the set of data includes the one or more errors, the memory system may retrieve a second set of data from the location of the memory system and determine whether a syndrome weight satisfies a threshold. The memory system may perform a second type of error control operation on the second set of data based on determining that the syndrome weight satisfies the threshold.
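
A small sketch of the decision flow, under stated assumptions: the syndrome weight is taken to be the number of failed parity checks, a Hamming(7,4) parity-check matrix stands in for the real code, the first-type check is modeled as "syndrome is all zero", and "satisfies the threshold" is assumed to mean "greater than or equal to". The names `handle_read` and `read_location` are hypothetical.

```python
# Hamming(7,4) parity-check matrix, used here only as a stand-in code.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def syndrome_weight(parity_checks, word):
    """Count the parity checks (rows) that the word fails, working over GF(2)."""
    return sum(
        sum(h & b for h, b in zip(row, word)) % 2
        for row in parity_checks
    )

def handle_read(read_location, threshold=2):
    """Return (data, action) for one read command.

    read_location is a hypothetical callable that returns the bits currently
    read from the addressed location.
    """
    first = read_location()
    if syndrome_weight(H, first) == 0:
        return first, "no_errors"        # first-type check found nothing
    second = read_location()             # retrieve a second set of data
    if syndrome_weight(H, second) >= threshold:
        return second, "second_type"     # assumed: high weight selects the second type
    return second, "first_type"

def make_reader(word):
    return lambda: list(word)

clean = handle_read(make_reader([1, 1, 1, 0, 0, 0, 0]))
heavy = handle_read(make_reader([1, 1, 1, 0, 0, 0, 1]))
light = handle_read(make_reader([0, 1, 1, 0, 0, 0, 0]))
```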

SELF-MANAGING DATABASE SYSTEM USING MACHINE LEARNING

A self-managing database system includes a metrics collector to collect metrics data from one or more databases of a computing system and an anomaly detector to analyze the metrics data and detect one or more anomalies. The system includes a causal inference engine to mark one or more nodes in a knowledge representation corresponding to the metrics data for the one or more anomalies and to determine a root cause with a highest probability of causing the one or more anomalies using the knowledge representation. The system includes a self-healing engine to take at least one remedial action for the one or more databases in response to determination of the root cause.

INTELLIGENT CLOUD SERVICE HEALTH COMMUNICATION TO CUSTOMERS

Example aspects include techniques for accurate and expeditious cloud service health communication to customers. These techniques may include determining that a service health incident has customer impact, the service health incident corresponding to an outage of one or more services of a cloud computing platform, identifying a plurality of customers impacted by the service health incident, and predicting, based on the service health incident and one or more other service health incidents, aggregated incident information identifying a plurality of service health incidents associated with the outage of the one or more services. In addition, the techniques may include identifying the one or more services associated with the service health incident, and transmitting, based at least in part on the aggregated incident information and the one or more services, a health notification to the plurality of customers.

TECHNIQUES FOR MANAGING TEMPORARILY RETIRED BLOCKS OF A MEMORY SYSTEM
20230045990 · 2023-02-16 ·

Methods, systems, and devices for techniques for managing temporarily retired blocks of a memory system are described. In some examples, aspects of a memory system or memory device may be configured to determine an error for a block of memory cells. For example, a controller may determine an existence of the error and may temporarily retire the block. A media management operation may be performed on the temporarily retired block and, depending on one or more characteristics of the error, the temporarily retired block may be enabled or retired.
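
The block life cycle described here can be sketched as a three-state machine. The abstract does not say which characteristic of the error decides the outcome; this sketch assumes it is the number of error bits still observed after the media management operation, and `media_management_op` and `max_error_bits` are hypothetical names.

```python
from enum import Enum

class BlockState(Enum):
    ENABLED = "enabled"
    TEMPORARILY_RETIRED = "temporarily_retired"
    RETIRED = "retired"

def handle_block_error(block, media_management_op, max_error_bits=4):
    """Temporarily retire a block, run a media management operation, then decide.

    media_management_op is a hypothetical callable that operates on the block
    and returns the number of error bits still seen afterwards.
    """
    block["state"] = BlockState.TEMPORARILY_RETIRED
    remaining_errors = media_management_op(block)
    if remaining_errors <= max_error_bits:
        block["state"] = BlockState.ENABLED   # error judged recoverable
    else:
        block["state"] = BlockState.RETIRED   # error judged permanent
    return block["state"]

recoverable = {"id": 3, "state": BlockState.ENABLED}
worn_out = {"id": 4, "state": BlockState.ENABLED}
handle_block_error(recoverable, lambda b: 1)
handle_block_error(worn_out, lambda b: 9)
```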

METHOD AND SYSTEM FOR DATA SYNCHRONIZATION

A method for facilitating data synchronization across a plurality of platforms is provided. The method includes retrieving a change event, the change event corresponding to an event stream from a first platform; parsing the change event to identify a record and a data operation; examining a synchronization database to determine whether a corresponding record is persisted in a database of a second platform; inserting the record into the synchronization database when the corresponding record is not persisted in the database of the second platform, the inserted record including a change indicator; and updating, by using the synchronization database, the database of the second platform to include the record.
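
The steps of the method can be sketched with plain dictionaries standing in for the synchronization database and the second platform's database. The event layout (`record`, `operation`, an `id` key) and the `changed` flag used as the change indicator are assumptions for illustration only.

```python
def apply_change_event(event, sync_db, target_db):
    """Process one change event from the first platform's event stream."""
    # Parse the event to identify the record and the data operation.
    record, operation = event["record"], event["operation"]
    key = record["id"]

    # The synchronization database tracks what the second platform already holds.
    if key not in sync_db:
        sync_db[key] = {"record": dict(record), "operation": operation, "changed": True}
    # The abstract does not describe the already-persisted case; this sketch leaves it alone.

    # Update the second platform from the synchronization database.
    for row_key, row in sync_db.items():
        if row["changed"]:
            target_db[row_key] = dict(row["record"])
            row["changed"] = False

sync_db, target_db = {}, {}
event = {"record": {"id": 7, "name": "x"}, "operation": "INSERT"}
apply_change_event(event, sync_db, target_db)
```

After the call, the second platform holds the record and the change indicator has been cleared, so replaying the same event does not write it again.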

Artificial Intelligence Engine Providing Automated Error Resolution

Aspects of the disclosure relate to automated error processing. A computing platform may receive historical error/solution information. The computing platform may train, using the historical error/solution information, an artificial intelligence engine to automatically identify solutions for current errors for a plurality of users. The computing platform may identify current errors for a user of the plurality of users. The computing platform may notify the user of the current errors. The computing platform may receive a request to correct an error of the one or more current errors. The computing platform may identify, using the artificial intelligence engine, a solution to the error. The computing platform may automatically perform actions to achieve the solution. The computing platform may send, after performing the actions, commands directing an event processing system to process an event with which the error was associated, which may cause the event processing system to process the event.

Distributed watchdog timer and active token exchange

A system includes a plurality of watchdog components. Each watchdog component is configured to receive a kick signal from its monitored function to determine whether the monitored function is active. Each watchdog component is further configured to receive a respective token from every watchdog component to which it is connected. Each respective token indicates whether the watchdog component that sent it has timed out. Each watchdog component is further configured to generate a token responsive to the kick signal and to the respective tokens received from every watchdog component to which it is connected, and to transmit the generated token to every watchdog component to which it is connected.
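
A round-based simulation gives a feel for the token exchange. Several choices here are assumptions rather than details from the abstract: a token is a boolean where `True` means "not timed out", a watchdog's token is healthy only if it was kicked and every token it last received was healthy, and neighbors are presumed healthy before the first exchange.

```python
class Watchdog:
    def __init__(self, name):
        self.name = name
        self.neighbors = []   # watchdogs this one is connected to
        self.kicked = False   # set by the monitored function each round
        self.inbox = {}       # neighbor name -> last token received
        self.token = True

    def kick(self):
        self.kicked = True

    def generate_token(self):
        # Assumption: neighbors count as healthy until a token says otherwise.
        neighbors_ok = all(self.inbox.get(n.name, True) for n in self.neighbors)
        self.token = self.kicked and neighbors_ok
        self.kicked = False   # a fresh kick is required every round
        return self.token

    def transmit(self):
        for neighbor in self.neighbors:
            neighbor.inbox[self.name] = self.token

def run_round(watchdogs, active):
    """Kick the watchdogs named in `active`, then generate and exchange tokens."""
    for w in watchdogs:
        if w.name in active:
            w.kick()
    for w in watchdogs:
        w.generate_token()
    for w in watchdogs:
        w.transmit()
    return {w.name: w.token for w in watchdogs}

a, b, c = Watchdog("a"), Watchdog("b"), Watchdog("c")
a.neighbors, b.neighbors, c.neighbors = [b, c], [a, c], [a, b]
dogs = [a, b, c]

round1 = run_round(dogs, active={"a", "b", "c"})
round2 = run_round(dogs, active={"a", "b"})       # c's monitored function stalls
round3 = run_round(dogs, active={"a", "b", "c"})  # a and b now see c's bad token
```

Under these assumptions a single stalled function is visible to its neighbors one round later, without any central timer.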