G06F9/3879

Systems and methods for simulation of dynamic systems

A highly parallelized parallel tempering technique for simulating dynamic systems, such as quantum processors, is provided. Replica exchange is facilitated by synchronizing grid-level memory. Particular implementations for simulating quantum processors by representing cells of qubits and couplers in grid-, block-, and thread-level memory are discussed. Parallel tempering of such dynamic systems can be assisted by modifying replicas based on isoenergetic cluster moves (ICMs). ICMs are generated via secondary replicas which are maintained alongside primary replicas and exchanged between blocks and/or generated dynamically by blocks without necessarily being exchanged. Certain refinements, such as exchanging energies and temperatures through grid-level memory, are also discussed.

LOCATION AGNOSTIC DATA ACCESS

Apparatuses, systems, and techniques to enable a program to access data regardless of where said data is stored. In at least one embodiment, a system enables a program to access data regardless of where said data is stored, based on, for example, one or more locations encoding one or more addresses of said data.

CROSS PLATFORM AND PLATFORM AGNOSTIC ACCELERATOR REMOTING SERVICE

Disclosed are various examples of providing cross platform accelerator remoting between complex instruction set computer (CISC) components and reduced instruction set computer (RISC) components of a computing environment. An accelerator remoting server receives accelerator instructions executable by a locally installed accelerator device and provides the accelerator instructions to the accelerator device. The accelerator remoting server transmits accelerator results to an accelerator remoting client to complete the cross platform or platform agnostic accelerator remoting.

CI/CD PIPELINE TO CONTAINER CONVERSION
20220391215 · 2022-12-08 ·

A method includes receiving, by a processing device, a definition of a CI/CD pipeline for executing a set of stages of the CI/CD pipeline. The CI/CD pipeline is associated with a first computer system. The method further includes converting, by the processing device, the definition into a container image file, and causing, by the processing device using the container image file, a second computer system to implement a container executing the CI/CD pipeline.

PROCESSOR HAVING MULTIPLE CORES, SHARED CORE EXTENSION LOGIC, AND SHARED CORE EXTENSION UTILIZATION INSTRUCTIONS
20230052630 · 2023-02-16 ·

An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.

Architecture for table-based mathematical operations for inference acceleration in machine learning

A processing unit to support inference acceleration for machine learning (ML) comprises an inline post processing unit configured to accept and maintain one or more lookup tables for performing each of one or more non-linear mathematical operations. The inline post processing unit is further configured to accept data from a set of registers maintaining output from a processing block instead of streaming the data from an on-chip memory (OCM), perform the one or more non-linear mathematical operations on elements of the data from the processing block via their corresponding lookup tables, and stream post processing result of the one or more non-linear mathematical operations back to the OCM after the one or more non-linear mathematical operations are complete.

Unified register file for supporting speculative architectural states
11467839 · 2022-10-11 · ·

A method for supporting architecture speculation in an out of order processor is disclosed. The method comprises fetching two threads into the processor, wherein a first thread executes in a speculative state and a second thread executes in a non-speculative state. The method also comprises enabling a speculative scope for an execution of the first thread and a non-speculative scope for an execution of the second thread in an architecture of the processor, wherein the speculative scope and the non-speculative scope can both be fetched into the architecture and be present concurrently.

Architecture to support synchronization between core and inference engine for machine learning
11687837 · 2023-06-27 · ·

A system to support a machine learning (ML) operation comprises a core configured to receive and interpret commands into a set of instructions for the ML operation and a memory unit configured to maintain data for the ML operation. The system further comprises an inference engine having a plurality of processing tiles, each comprising an on-chip memory (OCM) configured to maintain data for local access by components in the processing tile and one or more processing units configured to perform tasks of the ML operation on the data in the OCM. The system also comprises an instruction streaming engine configured to distribute the instructions to the processing tiles to control their operations and to synchronize data communication between the core and the inference engine so that data transmitted between them correctly reaches the corresponding processing tiles while ensuring coherence of data shared and distributed among the core and the OCMs.

Methods of Hardware and Software-Coordinated Opt-In to Advanced Features on Hetero ISA Platforms

The present disclosure relates to a processor that includes one or more processing elements associated with one or more instruction set architectures. The processor is configured to receive a request from an application executed by a first processing element of the one or more processing elements to enable a feature associated with an instruction set architecture. Additionally, the processor is configured to enable the application to utilize the feature without a system call occurring when the feature is associated with an instruction set architecture associated with the first processing element.

ARCHITECTURE TO SUPPORT SYNCHRONIZATION BETWEEN CORE AND INFERENCE ENGINE FOR MACHINE LEARNING
20220374774 · 2022-11-24 ·

A system to support a machine learning (ML) operation comprises a core configured to receive and interpret commands into a set of instructions for the ML operation and a memory unit configured to maintain data for the ML operation. The system further comprises an inference engine having a plurality of processing tiles, each comprising an on-chip memory (OCM) configured to maintain data for local access by components in the processing tile and one or more processing units configured to perform tasks of the ML operation on the data in the OCM. The system also comprises an instruction streaming engine configured to distribute the instructions to the processing tiles to control their operations and to synchronize data communication between the core and the inference engine so that data transmitted between them correctly reaches the corresponding processing tiles while ensuring coherence of data shared and distributed among the core and the OCMs.