Patent classifications
G06F11/0724
VERIFYING PROCESSING LOGIC OF A GRAPHICS PROCESSING UNIT
A method of verifying processing logic of a graphics processing unit receives a test task including a predefined set of instructions for execution on the graphics processing unit, the predefined set of instructions being configured to perform a predetermined set of operations on the graphics processing unit when executed for predefined input data. In a test phase, the test task is processed by executing the predefined set of instructions for the predefined input data first and second times at the graphics processing unit so as to, respectively, generate first and second outputs. A fault signal is raised if the first and second outputs do not match.
SYSTEM-ON-CHIP FOR SHARING GRAPHICS PROCESSING UNIT THAT SUPPORTS MULTIMASTER, AND METHOD FOR OPERATING GRAPHICS PROCESSING UNIT
A system-on-a-chip sharing a graphics processing unit supporting multi-master is provided. A system on chip (SoC) comprises a plurality of central processing units (CPUs) for executing at least one operating system, a graphics processing unit (GPU) that is connected to each of the plurality of CPUs via a bus interface and communicates with each of the plurality of CPUs, and at least one state monitoring device that is selectively connected to at least one CPU among the plurality of CPUs and transmits execution state information of at least one operating system executed in the connected CPU to the GPU. The GPU is shared by at least one operating system and controls a sharing operation by the at least one operating system based on the execution state information of the at least one operating system.
ASSURING FAILSAFE RADIO TRANSMIT POWER LEVEL CONTROL IN HARDWARE
An information handling system includes a failsafe circuit connected to a first power control interconnect conductor, to a second power control interconnect conductor, and to a processor status interconnect conductor. A processor may provide a first level indicating an operational status of the processor to the processor status interconnect conductor when the processor is operational, and provide a second level indicating a non-operational status of the processor to the processor status interconnect conductor when the processor is non-operational. The failsafe circuit may assure, upon provision of the second level to the processor status interconnect conductor, that the first power control interconnect conductor will be in a first failsafe state and the second power control interconnect conductor will be in a second failsafe state.
METHOD FOR SEARCHING FOR INTERRUPTED DEVICE, SLAVE DEVICE, MASTER DEVICE, AND STORAGE MEDIUM
A method for searching for an interrupted device, a slave device, a master device, and a storage medium. The method includes: connecting slave devices to a master device through a connection device; receiving, by the slave devices, task process information sent by the master device, where the task process information includes: task process information recorded by the master device and the corresponding slave devices when executing tasks; and when the receiving of task process information sent by the master device is interrupted, finding an interrupted slave device according to the interrupted task process information.
Method of verifying access of multi-core interconnect to level-2 cache
The present disclosure provides a method and a system of verifying access by a multi-core interconnect to an L2 cache in order to solve problems of delays and difficulties in locating errors and generating check expectation results. A consistency transmission monitoring circuitry detects, in real time, interactions among a multi-core interconnects system, all single-core processors, an L2 cache and a primary memory, and sends collected transmission information to an L2 cache expectation generator and a check circuitry. The L2 cache expectation generator obtains information from a global memory precise control circuitry according to a multi-core consistency protocol and generates an expected result. The check circuitry is responsible for comparing the expected result with an actual result, thus implementing determination of multi-core interconnect's access accuracy to the L2 cache without delay.
Systems and methods to identify production incidents and provide automated preventive and corrective measures
Various methods, apparatuses/systems, and media for identifying production incidents and implementing automated preventive and corrective measures are disclosed. A processor automatically triggers, in response to a generated incident of a job/process/host failure, a self-healing service. The processor identifies an application to which the event generated belongs to by accessing a database that stores the application and host details; fetches functional identification (ID) of the application from the database, identifies the type of job failure or service degradation; automatically executes, by utilizing predefined micro services, the steps required for mitigation; records, in response to executing, outcome of the mitigation in the database along with output at each stage of execution; and evaluates the outcome of the mitigation by executing health checks using micro services to determine whether the failed job or process or host is healthy; and closes the incident based on healthy determination.
METHOD FOR ENCODED DIAGNOSTICS IN A FUNCTIONAL SAFETY SYSTEM
A method includes, storing a set of valid codewords including: a first valid functional codeword representing a functional state of a controller subsystem; a first valid fault codeword representing a fault state of the controller subsystem and characterized by a minimum hamming distance from the first valid functional codeword; a second valid functional codeword representing a functional state of a controller; and a second valid fault codeword representing a fault state of the controller; in response to detecting functional operation of the controller subsystem, storing the first valid functional codeword in a first memory; in response to detecting a match between contents of the first memory and the first valid functional codeword, outputting the second valid functional codeword; in response to detecting a mismatch between contents of the first memory and every codeword in the first set of valid codewords, outputting the second valid fault codeword.
Cross-component health monitoring and improved repair for self-healing platforms
Systems, apparatuses and methods may provide for technology that detects a successful boot of a first firmware component in a computing system, receives a signal from a second firmware component in the computing system, and detects an incompatibility of the first firmware component with respect to the second firmware component based on the signal. In one example, only the first firmware component is repaired in response to the incompatibility.
Selective endpoint isolation for self-healing in a cache and memory coherent system
A cache and memory coherent system includes multiple processing chips each hosting a different subset of a shared memory space and one or more routing tables defining access routes between logical addresses of the shared memory space and endpoints that each correspond to a select one of the multiple processing chips. The system further includes a coherent mesh fabric that physically couples together each pair of the multiple processing chips, the coherent mesh fabric being configured to execute routing logic for updating the one or more routing tables responsive to identification of a first processing chip of the multiple processing chips hosting a defective hardware component, the update to the routing tables being effective to remove all access routes having endpoints corresponding to the first processing chip.
Parallel processing system runtime state reload
A parallel processing system includes at least three processors operating in parallel, state monitoring circuitry, and state reload circuitry. The state monitoring circuitry couples to the at least three parallel processors and is configured to monitor runtime states of the at least three parallel processors and identify a first processor of the at least three parallel processors having at least one runtime state error. The state reload circuitry couples to the at least three parallel processors and is configured to select a second processor of the at least three parallel processors for state reload, access a runtime state of the second processor, and load the runtime state of the second processor into the first processor. Monitoring and reload may be performed only on sub-systems of the at least three parallel processors. During reload, clocks and supply voltages of the processors may be altered. The state reload may relate to sub-systems.