G06F9/544

Parameterized launch acceleration for compute instances
11755357 · 2023-09-12

A request to initiate a launch procedure of a compute instance at a virtualization host configured to access a remote storage device over a network is received. A memory buffer of the host is allocated as a write-back cache for use during a portion of the launch procedure. In response to a write request directed to remote storage during the portion of the launch procedure, the write payload is stored in the buffer and an indication of fulfillment of the write is provided independently of obtaining an acknowledgement that the payload has been propagated to the remote storage. Subsequent to the portion of the launch procedure, payloads of other write requests are transmitted to the remote storage device.
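
The buffering behavior described above can be illustrated with a minimal sketch. The class names, the `capacity_bytes` parameter, and the `FakeRemoteStorage` stand-in are assumptions made for illustration only; the abstract does not specify an interface.

```python
class FakeRemoteStorage:
    """Hypothetical stand-in for the remote storage device; records writes in order."""

    def __init__(self):
        self.writes = []

    def write(self, offset, payload):
        self.writes.append((offset, bytes(payload)))


class LaunchWriteBackCache:
    """Buffers writes during the early launch phase, then passes writes through."""

    def __init__(self, remote, capacity_bytes):
        self.remote = remote
        self.capacity = capacity_bytes
        self.used = 0
        self.pending = []          # (offset, payload) pairs in arrival order
        self.in_launch_phase = True

    def write(self, offset, payload):
        if self.in_launch_phase and self.used + len(payload) <= self.capacity:
            # Acknowledge immediately, before the remote device has the data.
            self.pending.append((offset, bytes(payload)))
            self.used += len(payload)
            return True
        # Buffer full or launch phase over: drain older writes first so that
        # ordering is preserved, then send this payload to remote storage.
        self._flush()
        self.remote.write(offset, payload)
        return True

    def _flush(self):
        for offset, payload in self.pending:
            self.remote.write(offset, payload)
        self.pending.clear()
        self.used = 0

    def end_launch_phase(self):
        """Propagate buffered payloads and stop buffering new writes."""
        self._flush()
        self.in_launch_phase = False
```

In this sketch a write issued during the launch phase returns before anything reaches `FakeRemoteStorage`, while a write issued after `end_launch_phase()` is transmitted directly.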

Computation and storage of object identity hash values

Techniques for computing and storing object identity hash values are disclosed. In some embodiments, a runtime system generates a value, such as a nonce, that is unique to a particular allocation region within memory. The runtime system may mix the value with one or more seed values that are associated with one or more respective objects stored in the allocation region. The runtime system may obtain object identifiers for the respective objects by applying a hash function to the result of mixing the seed value with at least the value associated with the allocation region. Conditioning operations may also be applied before, during or after the mixing operations to make the values appear more random. The nonce value may be changed from time to time, such as when memory is recycled in the allocation region, to reduce the risk of hash collisions.
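
One way to picture the mixing step is the sketch below. It assumes a 64-bit nonce per region, an XOR followed by an odd-constant multiply as the mixing and conditioning operations, and BLAKE2b truncated to 32 bits as the hash function; none of these choices come from the abstract, and the per-object seed (for example, an offset within the region) is likewise hypothetical.

```python
import hashlib
import secrets

MASK64 = (1 << 64) - 1
CONDITION_MULTIPLIER = 0x9E3779B97F4A7C15  # odd 64-bit constant, chosen for illustration


class AllocationRegion:
    """Holds the nonce that is unique to one allocation region (assumed 64 bits)."""

    def __init__(self, nonce=None):
        self.nonce = secrets.randbits(64) if nonce is None else nonce

    def recycle(self):
        """Pick a fresh nonce when the region's memory is recycled."""
        new_nonce = secrets.randbits(64)
        while new_nonce == self.nonce:
            new_nonce = secrets.randbits(64)
        self.nonce = new_nonce


def identity_hash(region, seed):
    """Mix the object's seed with the region nonce, condition, then hash to 32 bits."""
    mixed = (region.nonce ^ seed) & MASK64
    conditioned = (mixed * CONDITION_MULTIPLIER) & MASK64
    digest = hashlib.blake2b(conditioned.to_bytes(8, "little"), digest_size=4).digest()
    return int.from_bytes(digest, "little")
```

Because the nonce participates in every hash, two objects that reuse the same seed after the region is recycled are very likely to receive different identity hash values.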

System and method for maximizing processor and server use

A system and method for operating fewer servers near maximum capacity as opposed to operating more servers at low capacity is disclosed. Computational tasks are made as small as possible to be completed within the available capacity of the servers. Computational tasks that are similar may be distributed to the same computing node (including a processor) to improve RAM utilization. Additionally, workloads may be scheduled onto multicore processors to maximize the average number of processing cores utilized per clock cycle.

High speed mainframe application tool

Computer-implemented systems and methods for analyzing applications include: obtaining user data records from a server; constructing an instruction template that includes main streams and that adds the user data records as user parameters corresponding to the main streams; transmitting the user data records to a file transfer connection; inputting the instruction template and a first command into the file transfer connection, which executes the first command; inputting the file transfer connection and a second command into a script file, which executes the second command; opening each of the main streams through a pre-defined driver program by using variable records to retrieve a plurality of in-streams of each of the main streams; aggregating the main streams and the in-streams associated with the user parameters into a final output; and transmitting the final output to the server.

RIC and RIC framework communication

To provide a low latency near RT RIC, some embodiments separate the RIC's functions into several different components that operate on different machines (e.g., execute on VMs or Pods) operating on the same host computer or different host computers. Some embodiments also provide high speed interfaces between these machines. Some or all of these interfaces operate in a non-blocking, lockless manner in order to ensure that critical near RT RIC operations (e.g., datapath processes) are not delayed due to multiple requests causing one or more components to stall. In addition, each of these RIC components also has an internal architecture that is designed to operate in a non-blocking manner so that no single process of a component can block the operation of another process of the component. All of these low latency features allow the near RT RIC to serve as a high speed IO between the E2 nodes and the xApps.

Distributed geometry

Systems, apparatuses, and methods for performing geometry work in parallel on multiple chiplets are disclosed. A system includes a chiplet processor with multiple chiplets for performing graphics work in parallel. Instead of having a central distributor to distribute work to the individual chiplets, each chiplet determines on its own the work to be performed. For example, during a draw call, each chiplet calculates which portions of one or more index buffer(s), corresponding to one or more graphics object(s) of the draw call, it will fetch and process. Once the portions are calculated, each chiplet fetches the corresponding indices and processes the indices. The chiplets perform these tasks in parallel and independently of each other. When the index buffer(s) are processed, one or more subsequent step(s) in the graphics rendering process are performed in parallel by the chiplets.
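
As a rough illustration of a chiplet working out its own share without a central distributor, the sketch below assumes a contiguous, primitive-aligned split of a single index buffer. The abstract does not say how the portions are calculated, so the partitioning rule, the function name, and the `verts_per_prim` parameter are all assumptions.

```python
def chiplet_index_range(chiplet_id, num_chiplets, num_indices, verts_per_prim=3):
    """Return the [start, end) slice of the index buffer owned by one chiplet.

    Every chiplet evaluates this with its own chiplet_id, so no chiplet needs
    to be told by a central distributor which indices to fetch.
    """
    num_prims = num_indices // verts_per_prim
    base, extra = divmod(num_prims, num_chiplets)
    # The first `extra` chiplets take one additional primitive each.
    first_prim = chiplet_id * base + min(chiplet_id, extra)
    prim_count = base + (1 if chiplet_id < extra else 0)
    start = first_prim * verts_per_prim
    return start, start + prim_count * verts_per_prim


def process_draw_call(index_buffer, num_chiplets):
    """Simulate each chiplet fetching only the indices in its own range."""
    return [
        index_buffer[slice(*chiplet_index_range(c, num_chiplets, len(index_buffer)))]
        for c in range(num_chiplets)
    ]
```

For a draw call with 30 indices (10 triangles) and 4 chiplets, the ranges under this rule are (0, 9), (9, 18), (18, 24) and (24, 30), which together cover the buffer exactly once.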

TECHNIQUES TO CONTROL SYSTEM UPDATES AND CONFIGURATION CHANGES VIA THE CLOUD

Embodiments are generally directed to apparatuses, methods, and techniques to determine an access level of operation based on an indication received via one or more network links from a pod management controller, and to enable or disable a firmware update capability for a firmware device based on the access level of operation, the firmware update capability to change firmware for the firmware device. Embodiments may also include determining, based on the access level of operation, one or more configuration settings of a plurality of configuration settings to enable for configuration, and enabling configuration of the one or more configuration settings.
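
The gating logic can be sketched as a lookup from access level to permitted operations. The three level names, the setting names, and the rule that only the highest level enables firmware updates are hypothetical; the abstract does not enumerate levels or settings.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    # Hypothetical levels; the indication from the pod management controller
    # is assumed to map onto one of these values.
    LOCKED = 0
    CONFIG_ONLY = 1
    FULL = 2


# Hypothetical mapping from access level to the settings that may be changed.
CONFIGURABLE_SETTINGS = {
    AccessLevel.LOCKED: frozenset(),
    AccessLevel.CONFIG_ONLY: frozenset({"boot_order", "fan_profile"}),
    AccessLevel.FULL: frozenset({"boot_order", "fan_profile", "secure_boot"}),
}


def apply_access_level(indication):
    """Translate a controller indication into enabled capabilities."""
    level = AccessLevel(indication)
    return {
        "firmware_update_enabled": level == AccessLevel.FULL,
        "configurable_settings": CONFIGURABLE_SETTINGS[level],
    }
```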

TECHNOLOGIES FOR COORDINATING DISAGGREGATED ACCELERATOR DEVICE RESOURCES

A compute device to manage workflow to disaggregated computing resources is provided. The compute device comprises a compute engine to receive a workload processing request, the workload processing request defined by at least one request parameter, determine at least one accelerator device capable of processing a workload in accordance with the at least one request parameter, transmit a workload to the at least one accelerator device, receive a work product produced by the at least one accelerator device from the workload, and provide the work product to an application.
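
The request flow above might look like the following sketch. The two request parameters used here (a required capability and a latency budget), the `Accelerator` record, and the lowest-latency tie-break are assumptions for illustration; the abstract only requires that selection follow at least one request parameter.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Accelerator:
    # Hypothetical description of one disaggregated accelerator device.
    name: str
    capabilities: FrozenSet[str]
    latency_ms: int
    process: Callable[[bytes], bytes]


def select_accelerators(accelerators: List[Accelerator], capability: str,
                        latency_budget_ms: int) -> List[Accelerator]:
    """Keep only the devices that satisfy the request parameters."""
    return [a for a in accelerators
            if capability in a.capabilities and a.latency_ms <= latency_budget_ms]


def handle_request(accelerators: List[Accelerator], workload: bytes,
                   capability: str, latency_budget_ms: int) -> Tuple[str, bytes]:
    """Pick a capable device, send it the workload, and return the work product."""
    candidates = select_accelerators(accelerators, capability, latency_budget_ms)
    if not candidates:
        raise LookupError("no accelerator satisfies the request parameters")
    chosen = min(candidates, key=lambda a: a.latency_ms)
    return chosen.name, chosen.process(workload)
```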

PUBLISH-SUBSCRIBE FRAMEWORK FOR APPLICATION EXECUTION
20230153183 · 2023-05-18

The described technology relates to a publish-subscribe message framework in which an application, decomposed into a plurality of processing stages, is run by executing the respective processing stages of the application asynchronously and simultaneously with each other. Communications between the respective processing stages may exclusively be in accordance with the publish-subscribe execution model. The described publish-subscribe framework provides for processing stages to be executed in a multi-process and/or multi-threaded manner while also enabling the distribution of the processing stages to respective processing resources in a multi-processor/multi-core processing environment. An example electronic exchange application and a corresponding example exchange gateway application are described.
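
A minimal in-process sketch of stages that communicate only by publishing and subscribing is shown below. The `Broker` class, the topic names, and the two toy stages (validation and matching of orders) are hypothetical and stand in for the framework and exchange application described in the abstract; one thread per stage is an assumption.

```python
import queue
import threading

STOP = object()  # sentinel published to shut the pipeline down


class Broker:
    """Tiny topic-based broker: each subscriber gets its own FIFO queue."""

    def __init__(self):
        self._subscribers = {}
        self._lock = threading.Lock()

    def subscribe(self, topic):
        inbox = queue.Queue()
        with self._lock:
            self._subscribers.setdefault(topic, []).append(inbox)
        return inbox

    def publish(self, topic, message):
        with self._lock:
            inboxes = list(self._subscribers.get(topic, []))
        for inbox in inboxes:
            inbox.put(message)


def start_stage(broker, in_topic, out_topic, fn):
    """Run one processing stage in its own thread, wired only through topics."""
    inbox = broker.subscribe(in_topic)  # subscribe before any message is published

    def loop():
        while True:
            message = inbox.get()
            if message is STOP:
                broker.publish(out_topic, STOP)
                return
            broker.publish(out_topic, fn(message))

    thread = threading.Thread(target=loop)
    thread.start()
    return thread


def run_pipeline(orders):
    broker = Broker()
    results = broker.subscribe("matched")
    stages = [
        start_stage(broker, "orders", "validated",
                    lambda o: {**o, "valid": o["qty"] > 0}),
        start_stage(broker, "validated", "matched",
                    lambda o: {**o, "matched": o["valid"]}),
    ]
    for order in orders:
        broker.publish("orders", order)
    broker.publish("orders", STOP)
    output = []
    while (message := results.get()) is not STOP:
        output.append(message)
    for stage in stages:
        stage.join()
    return output
```

Both stages run concurrently: the second stage can process the first order while the first stage is still validating later ones, and neither stage calls the other directly.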

Technologies for accelerator interface

Technologies for an accelerator interface over Ethernet are disclosed. In the illustrative embodiment, a network interface controller of a compute device may receive a data packet. If the network interface controller determines that the data packet should be pre-processed (e.g., decrypted) with a remote accelerator device, the network interface controller may encapsulate the data packet in an encapsulating network packet and send the encapsulating network packet to a remote accelerator device on a remote compute device. The remote accelerator device may pre-process the data packet (e.g., decrypt the data packet) and send it back to the network interface controller. The network interface controller may then send the pre-processed packet to a processor of the compute device.
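
The round trip between the network interface controller and the remote accelerator can be sketched as follows. The six-byte encapsulation header, the `OP_DECRYPT` code, and the XOR "decryption" are placeholders invented for illustration; the abstract does not define a wire format or a cipher, and a real system would send the frame over Ethernet rather than call a function.

```python
import struct

# Hypothetical encapsulation header: 16-bit operation code, 32-bit payload length.
HEADER = struct.Struct("!HI")
OP_DECRYPT = 1
TOY_KEY = 0x5A  # placeholder key for the stand-in XOR transform


def encapsulate(op, packet):
    """Wrap the original data packet in an encapsulating frame."""
    return HEADER.pack(op, len(packet)) + packet


def decapsulate(frame):
    op, length = HEADER.unpack_from(frame)
    return op, frame[HEADER.size:HEADER.size + length]


def remote_accelerator(frame):
    """Stand-in for the remote accelerator device: pre-process and return the packet."""
    op, payload = decapsulate(frame)
    if op != OP_DECRYPT:
        raise ValueError("unsupported operation")
    return bytes(b ^ TOY_KEY for b in payload)


def nic_receive(packet, needs_preprocessing):
    """NIC path: offload to the remote accelerator only when the packet needs it."""
    if needs_preprocessing(packet):
        return remote_accelerator(encapsulate(OP_DECRYPT, packet))
    return packet  # forwarded to the host processor unchanged
```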