G06F9/3879

SYSTEM AND METHODS FOR RECONSTITUTING AN OBJECT IN A RUNTIME ENVIRONMENT USING HEAP MEMORY
20210191745 · 2021-06-24 ·

Methods, systems, and computer-readable media are disclosed herein for reconstituting an object from raw data in a heap dump. Embodiments herein provide for retrieving raw data from a managed heap dump, where the raw data corresponds to an object. The raw data is retrieved from the heap dump and loaded into a running computer application within a runtime environment, in embodiments. The raw data is transformed, in some embodiments, by regenerating the actual object, through the running computer application, from the object content represented in the raw data. Various utilities may then be used on the regenerated object to reveal its internal structure for program logic diagnosis, in embodiments.
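
A minimal Python sketch of the reconstitution flow described above, assuming a hypothetical heap-dump record layout (a 4-byte class-name length, the class name, then two 8-byte integer fields); the actual record format and regeneration mechanism belong to the patent's runtime environment and are not shown here:

    import struct

    # Hypothetical record layout for an object in a heap dump: a 4-byte
    # class-name length, the class name, then two 8-byte integer fields.
    def reconstitute(raw: bytes):
        (name_len,) = struct.unpack_from(">I", raw, 0)
        cls_name = raw[4:4 + name_len].decode("utf-8")
        x, y = struct.unpack_from(">qq", raw, 4 + name_len)
        # Regenerate a live object from the raw content so utilities
        # (repr, vars, debuggers) can inspect its internal structure.
        obj = type(cls_name, (), {})()
        obj.x, obj.y = x, y
        return obj

    raw = struct.pack(">I", 5) + b"Point" + struct.pack(">qq", 3, 4)
    p = reconstitute(raw)
    print(type(p).__name__, vars(p))   # Point {'x': 3, 'y': 4}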

Architecture to support color scheme-based synchronization for machine learning

A system to support a machine learning (ML) operation comprises an array-based inference engine comprising a plurality of processing tiles, each comprising at least an on-chip memory (OCM) configured to maintain data for local access by components in the processing tile and one or more processing units configured to perform one or more computation tasks on the data in the OCM by executing a set of task instructions. The system also comprises a data streaming engine configured to stream data between a memory and the OCMs, and an instruction streaming engine configured to distribute said set of task instructions to the corresponding processing tiles to control their operations and to synchronize said set of task instructions to be executed by each processing tile, respectively, so that the current task at each processing tile finishes before a new one starts.
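
A toy Python model of the tile-level synchronization: a thread barrier ensures no tile starts a new task before the current one has finished on every tile. Tile count and task contents are illustrative assumptions:

    import threading

    # Toy model of the synchronization described above: each tile executes
    # the same ordered task list, and a barrier makes every tile wait for
    # the current task to finish on all tiles before the next one starts.
    NUM_TILES = 4
    barrier = threading.Barrier(NUM_TILES)

    def tile(tile_id, tasks):
        for task in tasks:
            task(tile_id)        # execute the current task on this tile
            barrier.wait()       # no tile starts the next task early

    tasks = [lambda t: print(f"tile {t}: matmul"),
             lambda t: print(f"tile {t}: relu")]
    threads = [threading.Thread(target=tile, args=(i, tasks)) for i in range(NUM_TILES)]
    for th in threads: th.start()
    for th in threads: th.join()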

PROCESSING SYSTEM
20210109760 · 2021-04-15 ·

A system comprising a data memory, a first processor with a first execution pipeline, and a co-processor with a second execution pipeline branching from the first pipeline via an inter-processor interface. The first pipeline can decode instructions from an instruction set comprising first and second instruction subsets. The first subset comprises a load instruction which loads data from the memory into a register file, and a compute instruction of a first type which performs a compute operation on such loaded data. The second subset includes a compute instruction of a second type which does not require a separate load instruction to first load data from memory into a register file, but instead reads data from the memory directly and performs a compute operation on that data, this reading being performed in a pipeline stage of the second pipeline that is aligned with the memory access stage of the first pipeline.
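
A small Python interpreter sketch contrasting the two instruction subsets: a LOAD-then-ADD pair from the first subset versus a single direct-memory ADD_MEM from the second subset. The opcode names and operand encodings are invented for illustration:

    # Minimal model of the two instruction subsets: the first subset loads
    # memory into a register file before computing; the second subset's
    # compute instruction reads memory directly, with no separate load.
    memory = {0x10: 7, 0x14: 5}
    regs = [0] * 4

    def execute(instr):
        op = instr[0]
        if op == "LOAD":                   # first subset: load into register file
            _, rd, addr = instr
            regs[rd] = memory[addr]
        elif op == "ADD":                  # first subset: compute on loaded data
            _, rd, ra, rb = instr
            regs[rd] = regs[ra] + regs[rb]
        elif op == "ADD_MEM":              # second subset: read memory directly
            _, rd, addr, rb = instr
            regs[rd] = memory[addr] + regs[rb]

    for i in [("LOAD", 0, 0x10), ("LOAD", 1, 0x14), ("ADD", 2, 0, 1)]:
        execute(i)
    execute(("ADD_MEM", 3, 0x10, 1))       # co-processor-style direct-memory compute
    print(regs)                            # [7, 5, 12, 12]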

Methods of Hardware and Software Coordinated Opt-In to Advanced Features on Hetero ISA Platforms

The present disclosure relates to a processor that includes one or more processing elements associated with one or more instruction set architectures. The processor is configured to receive a request from an application executed by a first processing element of the one or more processing elements to enable a feature associated with an instruction set architecture. Additionally, the processor is configured to enable the application to utilize the feature without a system call when the feature is associated with the instruction set architecture of the first processing element.
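
A hedged Python sketch of the opt-in decision: the feature is enabled directly when it belongs to the ISA of the core the application is already running on, and otherwise falls back to OS involvement. Core and feature names are hypothetical:

    # Sketch of the opt-in decision, with hypothetical core and feature
    # names: the feature is enabled directly (no system call) only when it
    # belongs to the ISA of the core the application is already running on.
    CORE_ISA_FEATURES = {
        "big_core": {"avx512", "amx"},
        "little_core": {"avx2"},
    }

    def request_feature(current_core, feature):
        if feature in CORE_ISA_FEATURES[current_core]:
            return f"{feature}: enabled without a system call"
        return f"{feature}: requires OS involvement (system call)"

    print(request_feature("big_core", "amx"))      # enabled without a system call
    print(request_feature("little_core", "amx"))   # requires OS involvement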

Architecture to support tanh and sigmoid operations for inference acceleration in machine learning

A processing unit to support inference acceleration for machine learning (ML) comprises an inline post-processing unit configured to accept and maintain one or more lookup tables for performing a tanh and/or sigmoid operation/function. The inline post-processing unit is further configured to accept data from a set of registers configured to maintain output from a processing block, instead of streaming the data from an on-chip memory (OCM), to perform the tanh and/or sigmoid operation on each element of the data from the processing block on a per-element basis via the one or more lookup tables, and to stream the post-processing result of the per-element tanh and/or sigmoid operation back to the OCM after the operation is complete.
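
An illustrative Python version of per-element tanh via a lookup table; the table size, input range, clamping, and linear interpolation scheme are assumptions rather than details from the patent:

    import math

    # Illustrative per-element tanh via a lookup table. The resolution,
    # input range, and interpolation below are assumptions for the sketch.
    SIZE, LO, HI = 256, -4.0, 4.0
    STEP = (HI - LO) / (SIZE - 1)
    TANH_LUT = [math.tanh(LO + i * STEP) for i in range(SIZE)]

    def tanh_lut(x):
        x = max(LO, min(HI, x))                  # clamp to the table range
        pos = (x - LO) / STEP
        i = min(int(pos), SIZE - 2)
        frac = pos - i
        return TANH_LUT[i] * (1 - frac) + TANH_LUT[i + 1] * frac

    block_output = [-2.0, -0.5, 0.0, 0.5, 2.0]   # data from the processing block
    result = [tanh_lut(v) for v in block_output] # per-element tanh, then back to OCM
    print([round(v, 4) for v in result])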

ELECTRONIC DEVICES GENERATING VERIFICATION VECTOR FOR VERIFYING SEMICONDUCTOR CIRCUIT AND METHODS OF OPERATING THE SAME

An electronic device configured to generate a verification vector for verifying a semiconductor circuit including a first circuit block and a second circuit block includes: a duplicate command eliminator configured to receive a first input vector including a plurality of commands and to provide a first converted vector, in which commands that generate the same state transition are changed into idle commands, based on a state transition of the first circuit block obtained by performing a simulation operation on the first input vector; a reduced vector generator configured to provide a first reduced vector in which the number of repetitions of the idle commands included in the first converted vector is reduced; and a verification vector generator configured to output, as a first verification vector, the first reduced vector whose coverage coincides with a target coverage among a plurality of first reduced vectors.
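
A toy Python rendering of the first two stages (duplicate elimination and idle reduction); the state-transition table stands in for the simulated first circuit block, and the coverage-based selection of the final verification vector is omitted:

    # Commands whose simulated transition leaves the state unchanged are
    # replaced with IDLE; runs of IDLE are then compressed. The transition
    # table and command names are stand-ins invented for this sketch.
    TRANSITIONS = {"RD": 1, "WR": 2, "PRE": 0}    # stand-in next-state table

    def eliminate_duplicates(vector):
        out, state = [], 0
        for cmd in vector:
            nxt = TRANSITIONS[cmd]
            out.append("IDLE" if nxt == state else cmd)  # same transition -> IDLE
            state = nxt
        return out

    def reduce_idles(vector, keep=1):
        out, run = [], 0
        for cmd in vector:
            run = run + 1 if cmd == "IDLE" else 0
            if cmd != "IDLE" or run <= keep:             # cap repeated IDLEs
                out.append(cmd)
        return out

    converted = eliminate_duplicates(["RD", "RD", "WR", "WR", "WR", "RD"])
    print(converted)                # ['RD', 'IDLE', 'WR', 'IDLE', 'IDLE', 'RD']
    print(reduce_idles(converted))  # ['RD', 'IDLE', 'WR', 'IDLE', 'RD']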

Processor having multiple cores, shared core extension logic, and shared core extension utilization instructions

An apparatus of an aspect includes a plurality of cores and shared core extension logic coupled with each of the plurality of cores. The shared core extension logic has shared data processing logic that is shared by each of the plurality of cores. Instruction execution logic, for each of the cores, in response to a shared core extension call instruction, is to call the shared core extension logic. The call is to have data processing performed by the shared data processing logic on behalf of a corresponding core. Other apparatus, methods, and systems are also disclosed.
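
A thread-based Python sketch of the call pattern: several cores invoke one shared extension that performs data processing on behalf of the calling core. The serialized queueing discipline and the doubling workload are assumptions of this sketch:

    import threading

    # Toy model of a shared core extension: cores submit work via a "call"
    # and the shared data processing logic returns a result on behalf of
    # the calling core. One request is served at a time in this sketch.
    class SharedCoreExtension:
        def __init__(self):
            self._lock = threading.Lock()     # serializes access by the cores

        def call(self, core_id, data):
            with self._lock:                  # one core's request at a time
                return [x * 2 for x in data]  # stand-in shared data processing

    ext = SharedCoreExtension()

    def core(core_id):
        result = ext.call(core_id, [core_id, core_id + 1])   # SCE call
        print(f"core {core_id} got {result}")

    threads = [threading.Thread(target=core, args=(i,)) for i in range(4)]
    for t in threads: t.start()
    for t in threads: t.join()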

ARRAY-BASED INFERENCE ENGINE FOR MACHINE LEARNING

An array-based inference engine includes a plurality of processing tiles arranged in a two-dimensional array of a plurality of rows and a plurality of columns. Each processing tile comprises at least an on-chip memory (OCM) configured to load and maintain data from an input data stream for local access by components in the processing tile, and further configured to maintain and output the result of the machine learning (ML) operation performed by the processing tile as an output data stream. The array includes a first processing unit (POD) configured to perform a dense and/or regular computation task of the ML operation on the data in the OCM. The array also includes a second processing unit/element (PE) configured to perform a sparse and/or irregular computation task of the ML operation on the data in the OCM and/or from the POD.
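
A Python skeleton of the tile array, with an OCM per tile plus stand-in POD (dense) and PE (sparse) task methods; the grid size and method bodies are illustrative only:

    from dataclasses import dataclass, field

    # Skeleton of the tile array: a 2-D grid of processing tiles, each with
    # an OCM, a dense-task method (POD) and a sparse-task method (PE).
    @dataclass
    class ProcessingTile:
        ocm: dict = field(default_factory=dict)          # on-chip memory

        def pod_dense(self, key):                        # POD: dense/regular task
            return sum(self.ocm.get(key, []))

        def pe_sparse(self, key):                        # PE: sparse/irregular task
            return [v for v in self.ocm.get(key, []) if v != 0]

    ROWS, COLS = 2, 4
    array = [[ProcessingTile() for _ in range(COLS)] for _ in range(ROWS)]
    array[0][0].ocm["act"] = [1, 0, 3, 0]                # loaded from the input stream
    print(array[0][0].pod_dense("act"), array[0][0].pe_sparse("act"))  # 4 [1, 3]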

ARCHITECTURE TO SUPPORT SYNCHRONIZATION BETWEEN CORE AND INFERENCE ENGINE FOR MACHINE LEARNING
20210081846 · 2021-03-18 ·

A system to support a machine learning (ML) operation comprises a core configured to receive and interpret commands into a set of instructions for the ML operation and a memory unit configured to maintain data for the ML operation. The system further comprises an inference engine having a plurality of processing tiles, each comprising an on-chip memory (OCM) configured to maintain data for local access by components in the processing tile and one or more processing units configured to perform tasks of the ML operation on the data in the OCM. The system also comprises an instruction streaming engine configured to distribute the instructions to the processing tiles to control their operations and to synchronize data communication between the core and the inference engine so that data transmitted between them correctly reaches the corresponding processing tiles while ensuring coherence of data shared and distributed among the core and the OCMs.
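
A minimal Python sketch of the instruction distribution side: instructions tagged with a destination tile are routed, in order, to per-tile queues so each one reaches its corresponding tile. The queue-per-tile delivery model is an assumption of this sketch, not a detail from the patent:

    import queue

    # Minimal routing model: instructions tagged with a destination tile
    # are delivered, in order, to per-tile queues, so each instruction
    # reaches its corresponding processing tile.
    NUM_TILES = 4
    tile_queues = [queue.Queue() for _ in range(NUM_TILES)]

    def stream(instructions):
        for tile_id, instr in instructions:
            tile_queues[tile_id].put(instr)   # route to the corresponding tile

    stream([(0, "DMA_IN"), (0, "MATMUL"), (3, "DMA_IN")])
    print(tile_queues[0].get(), tile_queues[0].get())   # DMA_IN MATMUL, in order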

ELECTRONIC DEVICE FOR EXECUTING INSTRUCTIONS USING PROCESSOR CORES AND VARIOUS VERSIONS OF INSTRUCTION SET ARCHITECTURES

In various embodiments, an electronic device may include a processor including a plurality of cores, and a memory connected to the processor. The memory may store instructions which, when executed, cause the processor to, based on an abort of execution of an instruction in a first core among the plurality of cores, determine whether a second core capable of executing the instruction exists among the plurality of cores, and to transfer execution of the instruction to the second core based at least on determining that the second core exists among the plurality of cores.
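
A hedged Python sketch of the transfer path: when an instruction aborts on a core lacking the needed ISA extension, execution moves to a core whose ISA version supports it. Core IDs, extension names, and the linear search are hypothetical:

    # Fallback path: an instruction that aborts on one core (missing ISA
    # extension) is transferred to another core that supports it, if any.
    CORES = {0: {"base"}, 1: {"base", "vector"}, 2: {"base", "vector", "crypto"}}

    def execute(core_id, instr, needs):
        if needs not in CORES[core_id]:                       # abort on this core
            target = next((c for c, isa in CORES.items() if needs in isa), None)
            if target is not None:
                return f"{instr}: aborted on core {core_id}, transferred to core {target}"
            return f"{instr}: no core supports '{needs}'"
        return f"{instr}: executed on core {core_id}"

    print(execute(0, "AES_ENC", "crypto"))   # aborted on core 0, transferred to core 2
    print(execute(2, "AES_ENC", "crypto"))   # executed on core 2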