G06F9/544

Operation method of an accelerator and system including the same

An accelerator, an operation method of the accelerator, and an accelerator system including the accelerator are disclosed. The operation method includes receiving one or more workloads assigned by a host controller, determining reuse data of the workloads based on hardware resource information and/or a memory access cost of the accelerator when a plurality of processing units included in the accelerator perform the workloads, and providing a result of performing the workloads.

Systems and methods for message tunneling

According to one general aspect, a device may include a host interface circuit configured to communicate with a host device via a data protocol that employs data messages. The device may include a storage element configured to store data in response to a data message. The host interface circuit may be configured to detect when a tunneling command is embedded within the data message; extract a tunneled message address information from the data message; retrieve, via the tunneled message address information, a tunneled message stored in a memory of the host device; and route the tunneled message to an on-board processor and/or data processing logic. The on-board processor and/or data processing logic may be configured to execute one or more instructions in response to the tunneled message.

Data locality enhancement for graphics processing units

Embodiments described herein provide an apparatus comprising a plurality of processing resources including a first processing resource and a second processing resource, a memory communicatively coupled to the first processing resource and the second processing resource, and a processor to receive data dependencies for one or more tasks comprising one or more producer tasks executing on the first processing resource and one or more consumer tasks executing on the second processing resource and move a data output from one or more producer tasks executing on the first processing resource to a cache memory communicatively coupled to the second processing resource. Other embodiments may be described and claimed.

Tuple checkout with notify in coordination namespace system

A system and method for notifying a process about a creation or removal event of a named data element (NDE) in a coordination namespace distributed memory system. A controller runs methods to: generate a tuple corresponding to data generated by a requesting process, the tuple having a tuple name and data value; and generate a notification indicator in a pending notification list to indicate to one or more processes a notification of the creation or removal event associated with the corresponding tuple. Upon detecting the event performed on the tuple by a second process, the method further searches for NDEs in the distributed memory system having the same tuple name and, in response to determining that an associated pending notification record exists in the pending notification list, notifies each corresponding process indicated in the list of the creation or removal event.
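The notify-on-event flow described above can be illustrated with a minimal single-process sketch. Everything here is hypothetical and not taken from the abstract: the class name `CoordinationNamespace`, its method names, the use of a dictionary as the "distributed" store, and the assumption that a pending notification record is consumed once it fires.

```python
from collections import defaultdict


class CoordinationNamespace:
    """Toy in-memory stand-in for the controller (all names are hypothetical)."""

    def __init__(self):
        self.tuples = {}                   # tuple name -> data value
        self.pending = defaultdict(list)   # tuple name -> [(process_id, callback)]

    def request_notify(self, name, process_id, callback):
        # Record a pending notification for creation/removal events on `name`.
        self.pending[name].append((process_id, callback))

    def _fire(self, name, event):
        # Look up pending records for this tuple name and notify each process.
        # Assumption: records are one-shot and are dropped after firing.
        notified = []
        for process_id, callback in self.pending.pop(name, []):
            callback(name, event)
            notified.append(process_id)
        return notified

    def create(self, name, value):
        self.tuples[name] = value
        return self._fire(name, "created")

    def remove(self, name):
        self.tuples.pop(name, None)
        return self._fire(name, "removed")
```

A process would call `request_notify` first; a later `create` or `remove` on the same tuple name by another process then returns the list of process identifiers that were notified.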

COMPUTER-BASED SYSTEMS CONFIGURED FOR AUTOMATED COMPUTER SCRIPT ANALYSIS AND MALWARE DETECTION AND METHODS THEREOF
20220138288 · 2022-05-05

Systems and methods enable automated and scalable obfuscation detection in programming scripts, including processing devices that receive software programming scripts and a symbol set. The processing devices determine a frequency of each symbol and an average frequency of the symbols in the script text. The processing devices determine a normal score of each symbol based on the frequency of each symbol and the average frequency to create a symbol feature for each symbol including the normal score. The processing devices utilize an obfuscation machine learning model including a classifier for binary obfuscation classification to detect obfuscation in the script based on the symbol features. The processing devices cause an alert indicating an obfuscated software programming script to be displayed on a screen of a computing device associated with an administrative user to recommend security analysis of the software programming script based on the binary obfuscation classification.
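The feature-extraction step can be sketched as follows. The abstract does not say how the normal score is computed, so this sketch assumes it is the symbol's count divided by the average count over the symbol set; the function name `symbol_features` is also hypothetical. The classifier itself is omitted.

```python
from collections import Counter


def symbol_features(script_text, symbol_set):
    """Return {symbol: normal score} for every symbol in symbol_set.

    Assumption (not from the abstract): normal score = count / average count.
    """
    if not symbol_set:
        return {}
    # Count only the characters that belong to the symbol set.
    counts = Counter(ch for ch in script_text if ch in symbol_set)
    average = sum(counts.values()) / len(symbol_set)
    if average == 0:
        # No symbol from the set occurs in the script.
        return {symbol: 0.0 for symbol in symbol_set}
    return {symbol: counts[symbol] / average for symbol in symbol_set}
```

The resulting dictionary would be turned into a feature vector for whatever binary classifier is used downstream.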

METHODS AND SYSTEMS FOR OPTIMIZING FILE SYSTEM USAGE
20220137964 · 2022-05-05

A method for generating a thread queue includes obtaining, by a user space file system, CPU socket data; based on the CPU socket data, generating a plurality of thread handles for a plurality of cores; ordering the plurality of thread handles, in the thread queue, for a first core of the plurality of cores; and saving the thread queue to a region of shared memory.

QUIESCENT STATE-BASED RECLAIMING STRATEGY FOR PROGRESSIVE CHUNKED QUEUE
20220138010 · 2022-05-05

A system includes a memory for storing a plurality of memory chunks and a processor for executing a plurality of producer threads. A producer thread increases a producer sequence and determines (i) a first chunk identifier associated with the producer sequence of an identified memory chunk and (ii) a position from the producer sequence to offer an item. The producer thread determines a second chunk identifier of a last created/appended memory chunk and determines whether the second chunk identifier is valid (e.g., matches the first chunk identifier). The producer thread reads a current memory chunk and determines whether a third chunk identifier associated with the current memory chunk is valid (e.g., matches the first chunk identifier). The producer thread writes the item into the identified memory chunk at the position.
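The mapping from producer sequence to chunk and position can be sketched in a simplified, single-threaded form. This sketch assumes fixed-size chunks, so that the chunk identifier and position are the quotient and remainder of the sequence divided by the chunk size; the abstract does not state this. The class name `ChunkedQueue` and its fields are hypothetical, and a real implementation would use an atomic fetch-and-add plus the validity checks and reclamation the abstract mentions.

```python
class ChunkedQueue:
    """Single-threaded toy model of a chunked queue's producer side."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.producer_sequence = 0
        self.chunks = {0: [None] * chunk_size}  # chunk identifier -> slots
        self.last_chunk_id = 0                  # last created/appended chunk

    def offer(self, item):
        # Claim a sequence number (an atomic increment in a concurrent version).
        sequence = self.producer_sequence
        self.producer_sequence += 1
        # Assumption: fixed-size chunks, so divmod yields (chunk id, position).
        chunk_id, position = divmod(sequence, self.chunk_size)
        # Append chunks until the last created chunk matches the target chunk.
        while self.last_chunk_id < chunk_id:
            self.last_chunk_id += 1
            self.chunks[self.last_chunk_id] = [None] * self.chunk_size
        self.chunks[chunk_id][position] = item
        return chunk_id, position
```

With a chunk size of 2, the third offered item lands in chunk 1 at position 0, which triggers the append of a new chunk.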

Method, apparatus, device and medium for processing topological relation of tasks

The embodiments of the present disclosure provide a method, an apparatus, a device and a medium for processing topological relation of tasks. The method includes: extracting at least one execution element from each of processing tasks based on a topological relation recognition rule; determining a dependency relation among the processing tasks according to content of the execution element of each processing task; and determining a topological relation among the processing tasks according to the dependency relation among the processing tasks.
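The last two steps, deriving dependencies from execution elements and then a topological relation, can be sketched as below. The sketch assumes the execution elements have already been extracted and that each one lists the data a task reads and writes; that shape, the keys `reads` and `writes`, and both function names are hypothetical. Ordering uses the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def build_dependencies(tasks):
    """Map each task to the set of tasks that produce data it reads."""
    producers = {}
    for name, element in tasks.items():
        for output in element["writes"]:
            producers[output] = name
    dependencies = {name: set() for name in tasks}
    for name, element in tasks.items():
        for source in element["reads"]:
            producer = producers.get(source)
            # Ignore inputs with no known producer and self-references.
            if producer is not None and producer != name:
                dependencies[name].add(producer)
    return dependencies


def topological_order(tasks):
    """Return task names ordered so that producers precede consumers."""
    return list(TopologicalSorter(build_dependencies(tasks)).static_order())
```

If the dependencies contain a cycle, `static_order` raises `graphlib.CycleError`, which a caller could surface as an invalid task configuration.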

HIGH BANDWIDTH MEMORY SYSTEM WITH DYNAMICALLY PROGRAMMABLE DISTRIBUTION SCHEME

A system comprises a processor coupled to a plurality of memory units. Each of the plurality of memory units includes a request processing unit and a plurality of memory banks. The processor includes a plurality of processing elements and a communication network communicatively connecting the plurality of processing elements to the plurality of memory units. At least a first processing element of the plurality of processing elements includes a control logic unit and a matrix compute engine. The control logic unit is configured to access data from the plurality of memory units using a dynamically programmable distribution scheme.

Ordering execution of an interrupt handler

A processing unit for a multiprocessor data processing system includes a processor core having an upper level cache and a lower level cache coupled to the processor core. The processor core is configured to, based on receipt of an interrupt, generate and issue a synchronization request prior to executing an interrupt handler and is configured to, based on receipt of a synchronization acknowledgment for the synchronization request, execute the interrupt handler. The lower level cache is configured to, based on receipt of the synchronization request, record which of its state machines are active processing a prior snooped request that can invalidate a cache line in the upper level cache, and is configured to, based on determining that each such state machine has completed processing of its respective prior snooped request, issue the synchronization acknowledgment to the processor core.