G06F9/544

MULTI-PATH APPLICATION OUTPUT
20230315602 · 2023-10-05

The described techniques provide a convenient, reliable, and straightforward way to enable multi-path application outputs. A single application may be configured to output two or more data sets to two or more output destinations within a mainframe environment, without requiring copying or forwarding by an intermediate application utility.

ALLOCATION OF MEMORY RESOURCES TO SIMD WORKGROUPS

A resource allocator receives a memory resource request for first memory resources in respect of a first-received task of a workgroup having a plurality of tasks. In response to receiving the memory resource request, the resource allocator allocates to the entire workgroup a block of memory portions of a shared memory that is sufficient in size for each task of the workgroup to receive memory resources in the block equivalent to the first memory resources.
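
The allocation step in this abstract can be illustrated with a minimal sketch. The class name `WorkgroupAllocator`, the bump-pointer layout, and the unit of a "portion" are assumptions made for illustration; the abstract does not specify them.

```python
class WorkgroupAllocator:
    """Toy model of a shared memory divided into equal portions (hypothetical)."""

    def __init__(self, total_portions):
        self.total_portions = total_portions
        self.next_free = 0
        self.blocks = {}  # workgroup id -> (base portion, portions per task)

    def request(self, workgroup_id, task_index, tasks_in_workgroup, portions_needed):
        # The first request received from any task of a workgroup reserves a
        # block big enough for every task to get the same amount of memory.
        if workgroup_id not in self.blocks:
            block_size = tasks_in_workgroup * portions_needed
            if self.next_free + block_size > self.total_portions:
                raise MemoryError("shared memory exhausted")
            self.blocks[workgroup_id] = (self.next_free, portions_needed)
            self.next_free += block_size
        # Later requests from the same workgroup reuse the reserved block.
        base, per_task = self.blocks[workgroup_id]
        start = base + task_index * per_task
        return range(start, start + per_task)
```

In this sketch a second request from the same workgroup consumes no additional shared memory, because the block was sized for all tasks when the first request arrived.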

CACHE COHERENCE SHARED STATE SUPPRESSION

A method includes receiving, by a level two (L2) controller, a first request for a cache line in a shared cache coherence state; mapping, by the L2 controller, the first request to a second request for a cache line in an exclusive cache coherence state; and responding, by the L2 controller, to the second request.
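
As a rough illustration of the mapping described above, the following sketch models only the request rewriting. The `State` enum, the `L2Controller` class, and its bookkeeping dictionary are hypothetical names; real coherence protocols involve snoops and more states than are shown here.

```python
from enum import Enum


class State(Enum):
    SHARED = "S"
    EXCLUSIVE = "E"


class L2Controller:
    """Hypothetical controller that never grants the shared state."""

    def __init__(self):
        self.granted = {}  # line address -> (requester, granted state)

    def map_request(self, requested_state):
        # A request for a shared line is rewritten as a request for an
        # exclusive line; other requests pass through unchanged.
        if requested_state is State.SHARED:
            return State.EXCLUSIVE
        return requested_state

    def handle(self, requester, address, requested_state):
        effective = self.map_request(requested_state)
        self.granted[address] = (requester, effective)
        return effective
```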

Memory protection circuit and memory protection method
11775450 · 2023-10-03

To provide a memory protection circuit and a memory protection method suitable for quick data transfer between a plurality of virtual machines via a common memory, according to an embodiment, a memory protection circuit includes a first ID storing register that stores therein an ID of any of a plurality of virtual machines managed by a hypervisor, an access determination circuit that permits the virtual machine having the ID stored in the first ID storing register to access a memory, a second ID storing register that stores therein an ID of any of the virtual machines, and an ID update control circuit that permits the virtual machine having the ID stored in the second ID storing register to rewrite the ID stored in the first ID storing register.
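
The interplay of the two ID registers can be modeled in software as a minimal sketch. The class and method names are hypothetical, and the assumption that both registers are initialized at construction (for example by the hypervisor) is not stated in the abstract.

```python
class MemoryProtectionCircuit:
    """Software model of the two ID registers (illustrative only)."""

    def __init__(self, owner_id, updater_id):
        self.first_id = owner_id      # VM currently permitted to access the memory
        self.second_id = updater_id   # VM permitted to rewrite first_id

    def may_access(self, vm_id):
        # Access determination: only the VM named in the first register passes.
        return vm_id == self.first_id

    def rewrite_first_id(self, vm_id, new_owner_id):
        # ID update control: only the VM named in the second register may
        # hand the memory over to another VM.
        if vm_id != self.second_id:
            raise PermissionError("VM %r may not rewrite the first ID register" % vm_id)
        self.first_id = new_owner_id
```

Under these assumptions, a producer VM named in both registers can fill the common memory and then rewrite the first register to pass access to a consumer VM, without copying the data.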

EFFICIENTLY ALLOCATING MEMORY ON NEURAL NETWORK COMPUTE TILES
20230297504 · 2023-09-21

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training giant neural networks. One of the methods includes obtaining data indicating a neural network comprising a plurality of layers; for each layer in a subset of the plurality of layers: assigning a subset of the plurality of computing units to at least partially perform inference computations associated with the layer; determining a memory size and a common memory address for the respective addressable memory unit of each computing unit assigned for the layer; and generating a shared instruction comprising a memory allocation instruction that, when executed by each of the subset of the plurality of computing units, causes the computing unit to store a result of performing inference computations associated with the layer in the determined common memory address with the determined memory size in the addressable memory of the computing unit.
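
One way to picture the common-address step is the sketch below. The function name, the tuple layout of `layers`, and the policy of choosing the lowest address that is free on every assigned tile are assumptions for illustration, not the method claimed in the abstract.

```python
def allocate_layers(layers, num_tiles):
    """Pick one address per layer that is free on every assigned tile.

    layers: list of (layer name, tile indices, result size) tuples (hypothetical format).
    """
    next_free = [0] * num_tiles  # per-tile bump pointer
    instructions = []
    for name, tiles, size in layers:
        # The common address must not overlap data already placed on any
        # of the assigned tiles, so take the highest bump pointer among them.
        address = max(next_free[t] for t in tiles)
        for t in tiles:
            next_free[t] = address + size
        # One shared instruction serves every tile assigned to the layer.
        instructions.append(
            {"layer": name, "tiles": list(tiles), "address": address, "size": size}
        )
    return instructions
```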

Publish-subscribe framework for application execution
11775361 · 2023-10-03

The described technology relates to a publish-subscribe message framework in which an application, decomposed to a plurality of processing stages, is run by executing respective processing stages of the application asynchronously and simultaneously with each other. Communications between the respective processing stages may exclusively be in accordance with the publish-subscribe execution model. The described publish-subscribe framework provides for processing stages to be executed in a multi-process and/or multi-threaded manner while also enabling the distribution of the processing stages to respective processing resources in a multi-processor/multi-core processing environment. An example electronic exchange application and a corresponding example exchange gateway application are described.
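
The staged execution model can be sketched with threads and queues as follows. The `Broker` class, the topic names, the `STOP` sentinel, and the two arithmetic stages are all hypothetical; they stand in for the processing stages of a real application such as the exchange example.

```python
import queue
import threading
from collections import defaultdict

STOP = object()  # sentinel telling a stage to shut down


class Broker:
    """Minimal in-process publish-subscribe broker (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic):
        inbox = queue.Queue()
        self._subscribers[topic].append(inbox)
        return inbox

    def publish(self, topic, message):
        for inbox in self._subscribers[topic]:
            inbox.put(message)


def run_stage(broker, in_topic, out_topic, transform):
    inbox = broker.subscribe(in_topic)  # subscribe before the thread starts

    def loop():
        while True:
            message = inbox.get()
            if message is STOP:
                broker.publish(out_topic, STOP)  # propagate shutdown downstream
                return
            broker.publish(out_topic, transform(message))

    thread = threading.Thread(target=loop)
    thread.start()
    return thread


def run_pipeline(values):
    broker = Broker()
    results = broker.subscribe("done")
    threads = [
        run_stage(broker, "raw", "validated", lambda x: x + 1),
        run_stage(broker, "validated", "done", lambda x: x * 2),
    ]
    for value in values:
        broker.publish("raw", value)
    broker.publish("raw", STOP)
    for thread in threads:
        thread.join()
    output = []
    while True:
        item = results.get()
        if item is STOP:
            return output
        output.append(item)
```

The stages never call each other directly; each one only reads from its subscribed topic and publishes to the next, which is what allows them to be placed on separate threads, processes, or cores.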

Runtime extension for neural network training with heterogeneous memory

Systems, apparatuses, and methods for managing buffers in a neural network implementation with heterogeneous memory are disclosed. A system includes a neural network coupled to a first memory and a second memory. The first memory is a relatively low-capacity, high-bandwidth memory while the second memory is a relatively high-capacity, low-bandwidth memory. During a forward propagation pass of the neural network, a run-time manager monitors the usage of the buffers for the various layers of the neural network. During a backward propagation pass of the neural network, the run-time manager determines how to move the buffers between the first and second memories based on the monitored buffer usage during the forward propagation pass. As a result, the run-time manager is able to reduce memory access latency for the layers of the neural network during the backward propagation pass.
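
A minimal sketch of the monitor-then-place idea follows. The `RuntimeManager` class and its greedy policy (most frequently accessed buffers go to the fast memory first) are assumptions; the abstract says only that placement is based on usage observed during the forward pass.

```python
class RuntimeManager:
    """Tracks buffer usage and plans placement across two memories (hypothetical)."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity
        self.usage = {}  # buffer name -> [size, access count]

    def record_access(self, name, size):
        # Called during the forward pass each time a layer touches a buffer.
        entry = self.usage.setdefault(name, [size, 0])
        entry[1] += 1

    def plan_placement(self):
        # Called before the backward pass: fill the fast memory with the
        # most frequently used buffers and leave the rest in the slow memory.
        placement = {}
        free = self.fast_capacity
        by_use = sorted(self.usage.items(), key=lambda kv: kv[1][1], reverse=True)
        for name, (size, _count) in by_use:
            if size <= free:
                placement[name] = "fast"
                free -= size
            else:
                placement[name] = "slow"
        return placement
```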

Computing system for reducing latency between serially connected electronic devices

A computing system includes a host, a first electronic device connected to the host, and a second electronic device that communicates with the host through the first electronic device. The first electronic device requests a command written in a submission queue of the host based on a doorbell transmitted from the host, stores the command transmitted from the host, requests write data stored in a data buffer of the host, and stores the write data of the data buffer transmitted from the host.

Apparatus and method for performance state matching between source and target processors based on interprocessor interrupts
11775336 · 2023-10-03

Apparatus, method, and machine-readable medium to provide performance state matching between source and target processors based on inter-processor interrupts. An exemplary apparatus includes a target processor to execute a receiving task at a first performance level and a source processor to execute a sending task at a second performance level higher than the first performance level. The sending task is to store interrupt routing data indicating a pairing between the sending task and the receiving task into a memory location and to dispatch work to be processed by the receiving task. The apparatus further includes a performance management unit to detect the pairing between the sending task and the receiving task based on the interrupt routing data and responsively adjust the performance level of the target processor from the first performance level to the second performance level based, at least in part, on the pairing.
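
The pairing detection and level adjustment can be modeled with a short sketch. The class name, the integer performance levels, and the dictionary standing in for the memory location that holds the interrupt routing data are assumptions made for illustration.

```python
class PerformanceManagementUnit:
    """Toy model of performance state matching (illustrative only)."""

    def __init__(self, levels):
        self.levels = dict(levels)  # processor id -> performance level
        self.routing = {}           # stands in for the interrupt routing data

    def store_routing(self, source_cpu, target_cpu):
        # The sending task records which processor its interrupts target.
        self.routing[source_cpu] = target_cpu

    def on_interprocessor_interrupt(self, source_cpu):
        # Detect the pairing from the routing data, then raise the target
        # processor to the source processor's level if it is running lower.
        target_cpu = self.routing.get(source_cpu)
        if target_cpu is None:
            return
        if self.levels[target_cpu] < self.levels[source_cpu]:
            self.levels[target_cpu] = self.levels[source_cpu]
```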

COMPUTER-BASED SYSTEMS CONFIGURED FOR AUTOMATED COMPUTER SCRIPT ANALYSIS AND MALWARE DETECTION AND METHODS THEREOF
20230289412 · 2023-09-14

Systems and methods enable automated and scalable obfuscation detection in programming scripts, including processing devices that receive software programming scripts and a symbol set. The processing devices determine a frequency of each symbol and an average frequency of the symbols in the script text. The processing devices determine a normal score of each symbol based on the frequency of each symbol and the average frequency to create a symbol feature for each symbol including the normal score. The processing devices utilize an obfuscation machine learning model including a classifier for binary obfuscation classification to detect obfuscation in the script based on the symbol features. The processing devices cause to display an alert indicating an obfuscated software programming script on a screen of a computing device associated with an administrative user to recommend security analysis of the software programming script based on the binary obfuscation classification.
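
The feature-extraction step can be sketched as below. The abstract does not define the normal score, so this example assumes it is the symbol's frequency divided by the average frequency across the symbol set; the function name is also hypothetical, and the classifier itself is not shown.

```python
from collections import Counter


def symbol_features(script_text, symbol_set):
    """Return an assumed normal score (frequency / average frequency) per symbol."""
    symbols = list(symbol_set)
    if not symbols:
        return {}
    counts = Counter(ch for ch in script_text if ch in symbols)
    frequencies = {s: counts.get(s, 0) for s in symbols}
    average = sum(frequencies.values()) / len(symbols)
    if average == 0:
        # None of the symbols occur in the script.
        return {s: 0.0 for s in symbols}
    return {s: frequencies[s] / average for s in symbols}
```

A score well above 1.0 marks a symbol that is over-represented relative to the others, which is the kind of signal a downstream binary classifier could use.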