Patent classifications
G06F11/3404
Computer system and data analysis method
A computer system includes a first computer and a second computer. The second computer includes a minimum analysis dataset, in which a data item serving as an analysis target and a repetition unit are defined in advance for each analysis target, and an agent. The agent receives an analysis-target data fetching designation including the minimum analysis dataset, a repetition range over which data acquisition is repeated, and a repetition unit. On the basis of the repetition range and the repetition unit, the agent generates a first process that acquires data from the first computer and a first instance that executes processing within the first process, and activates the first instance to acquire the accumulated data from the first computer. When the processing of the first instance is completed, the agent generates a second process that executes analysis processing and a second instance that executes processing within the second process.
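The fetch-designation step above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `FetchDesignation` fields and the day-based repetition unit are hypothetical names chosen to mirror the abstract, with each sub-range standing in for one first-process instance.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class FetchDesignation:
    # Hypothetical fields mirroring the abstract: the data items to
    # fetch, the overall repetition range, and the repetition unit
    # each first-process instance covers.
    data_items: list
    range_start: date
    range_end: date
    unit_days: int


def split_into_instances(d: FetchDesignation):
    """Split the repetition range into per-unit sub-ranges,
    one per first-process instance."""
    spans = []
    cur = d.range_start
    step = timedelta(days=d.unit_days)
    while cur < d.range_end:
        spans.append((cur, min(cur + step, d.range_end)))
        cur += step
    return spans
```

Each returned span would be handed to one instance of the data-acquiring first process; the second (analysis) process would be generated only once all spans have been fetched.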
PARALLEL PROGRAM SCALABILITY BOTTLENECK DETECTION METHOD AND COMPUTING DEVICE
A computer-executed parallel program scalability bottleneck detection method is provided, which includes: building a program structure graph for a program source code; collecting performance data based on a sampling technique during runtime, the performance data including performance data of each vertex of the program structure graph and inter-process communication dependence of communication vertices; building a program performance graph by filling the program structure graph with the collected performance data, the program performance graph recording data and control dependence of each process as well as inter-process communication dependence; and detecting problematic vertices from the program performance graph and, starting from some or all of the problematic vertices, backtracking through data/control dependence edges within a process and communication dependence edges between different processes to detect scalability bottleneck vertices.
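The backtracking step can be sketched as a reverse reachability search over the dependence edges. This is a simplified sketch, not the patented detection logic: it treats intra-process data/control edges and inter-process communication edges uniformly and returns every vertex reachable backwards from the problematic set.

```python
from collections import deque


def backtrack_bottlenecks(edges, problematic):
    """edges: (u, v) pairs meaning v depends on u (data/control
    dependence within a process or communication dependence between
    processes). Walk backwards from the problematic vertices and
    return the candidate scalability bottleneck vertices reached."""
    preds = {}
    for u, v in edges:
        preds.setdefault(v, []).append(u)
    seen, queue = set(problematic), deque(problematic)
    while queue:
        v = queue.popleft()
        for u in preds.get(v, []):
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return seen
```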
GRAPH-BASED DATA MULTI-OPERATION SYSTEM
A graph-based data multi-operation system includes a data multi-operation management subsystem coupled to an application and accelerator subsystems. The data multi-operation management subsystem receives a data multi-operation graph from the application that identifies first data and defines operations for performance on the first data to transform the first data into second data. The data multi-operation management subsystem assigns each of the operations to at least one of the accelerator subsystems, and configures the accelerator subsystems to perform the operations in a sequence that transforms the first data into the second data. When the data multi-operation management subsystem determines a completion status for the performance of the operations by the accelerator subsystems, it transmits a completion status communication to the application that indicates that completion status.
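Sequencing the operation graph can be sketched with a topological sort; the assignment policy here is a placeholder, not the patented assignment logic, and the operation names are invented for illustration.

```python
from graphlib import TopologicalSorter


def schedule_operations(op_deps, assign):
    """op_deps: operation -> set of operations it depends on.
    assign: operation -> accelerator id (a hypothetical policy).
    Returns the operations in a dependency-respecting sequence,
    each paired with the accelerator subsystem chosen for it."""
    order = list(TopologicalSorter(op_deps).static_order())
    return [(op, assign(op)) for op in order]
```

A usage sketch: `schedule_operations({"decrypt": {"decompress"}, "decompress": set()}, lambda op: "accel-0")` yields a sequence in which `decompress` precedes `decrypt`.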
Information processing apparatus, computer-readable recording medium storing program, and information processing method
An information processing apparatus includes: a memory; and a processor coupled to the memory, the processor being configured to calculate shortening rates by comparing execution times for each of a plurality of functions in a case where an evaluation target program is executed in an execution environment with execution times for each of the plurality of functions in a case where the evaluation target program is executed in a simulation environment, and to generate a simulation program to be used in the simulation environment based on the calculated shortening rates and the evaluation target program.
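The per-function shortening rate could be computed as sketched below. The formula (relative reduction of simulation time versus execution time) is an assumption read from the abstract, and the dictionary shapes are illustrative.

```python
def shortening_rates(exec_times, sim_times):
    """Per-function shortening rate: the fraction by which the
    simulation-environment time undercuts the real execution time.
    exec_times / sim_times: function name -> seconds."""
    return {f: (exec_times[f] - sim_times[f]) / exec_times[f]
            for f in exec_times}
```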
Self-adjustable end-to-end stack programming
Systems and methods are provided for optimizing parameters of a system across an entire stack, including algorithms layer, toolchain layer, execution or runtime layer, and hardware layer. Results from the layer-specific optimization functions of each domain can be consolidated using one or more consolidation optimization functions to consolidate the layer-specific optimization results, capturing the relationship between the different layers of the stack. Continuous monitoring of the programming model during execution may be implemented and can enable the programming model to self-adjust based on real-time performance metrics. In this way, programmers and system administrators are relieved of the need for domain knowledge and are offered a systematic way for continuous optimization (rather than an ad hoc approach).
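The consolidation of layer-specific results might look like the sketch below. This is a minimal illustration, not the patented method: the per-layer candidate scores, the layer weights, and the weighted-sum consolidation objective are all assumptions.

```python
def consolidate(layer_results, weights):
    """layer_results: layer name -> {candidate config: score}.
    weights: layer name -> weight in the consolidation objective.
    Picks each layer's best-scoring candidate and returns the
    choices plus a combined score a cross-layer consolidation
    function could compare across monitoring iterations."""
    choices = {layer: max(cands, key=cands.get)
               for layer, cands in layer_results.items()}
    combined = sum(weights[layer] * layer_results[layer][choices[layer]]
                   for layer in layer_results)
    return choices, combined
```

A continuous-monitoring loop would re-run this whenever real-time performance metrics shift the per-layer scores, letting the stack self-adjust.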
IN-CORE PARALLELISATION IN A DATA PROCESSING APPARATUS AND METHOD
A data processing apparatus and a method for processing data are disclosed. The data processing apparatus comprises: multithreaded processing circuitry to perform processing operations of a plurality of micro-threads, each micro-thread operating in a corresponding execution context defining an architectural state. Thread control circuitry collects runtime data indicative of a performance metric relating to the processing operations. Decoder circuitry is responsive to a detach instruction in a first micro-thread of instructions executed in a first execution context defining a first architectural state, the detach instruction specifying an address, to provide detach control signals to the thread control circuitry. When the runtime data meet a parallelisation criterion, the thread control circuitry is responsive to the detach control signals to spawn a second micro-thread of instructions executed in a second execution context defining a second architectural state based on the first architectural state, the second micro-thread of instructions comprising a subset of instructions of the first micro-thread of instructions starting at the address.
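The detach behaviour is a hardware mechanism, but its control decision can be sketched in software: spawn a second micro-thread only when the collected runtime metric meets the parallelisation criterion, otherwise execute inline. The threshold semantics and function names below are assumptions for illustration only.

```python
import threading


def maybe_detach(task, runtime_metric, threshold):
    """Software analogue of the detach instruction: when the runtime
    data meet the parallelisation criterion, spawn a second thread
    for `task` (standing in for the second micro-thread); otherwise
    run it inline in the first thread of control."""
    if runtime_metric >= threshold:
        t = threading.Thread(target=task)
        t.start()
        return t
    task()
    return None
```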
Progress visualization of computational job
The visualization of progress of a distributed computational job at multiple points of execution. After a computational job is compiled into multiple vertices, and those vertices are scheduled on multiple processing nodes in a distributed environment, a processing gathering module gathers processing information regarding the processing of multiple vertices of the computational job at multiple instances in time during its execution. A user interface module graphically presents a representation of an execution structure representing multiple nodes of the computational job and the dependencies between those nodes, where each node may be a single vertex or a group of vertices (such as a stage).
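Aggregating per-vertex information into per-node (e.g. per-stage) progress could be sketched as follows; the mapping and flag dictionaries are illustrative shapes, not the patented data model.

```python
def stage_progress(vertex_stage, vertex_done):
    """vertex_stage: vertex id -> stage name (the group it belongs to).
    vertex_done: vertex id -> True once that vertex has completed.
    Returns the completion fraction per stage, ready for display."""
    totals, done = {}, {}
    for v, stage in vertex_stage.items():
        totals[stage] = totals.get(stage, 0) + 1
        done[stage] = done.get(stage, 0) + (1 if vertex_done.get(v) else 0)
    return {stage: done[stage] / totals[stage] for stage in totals}
```

Sampling this at multiple instants in time yields the progress-over-time view the abstract describes.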
ENHANCED CONFIGURATION MANAGEMENT OF DATA PROCESSING CLUSTERS
Described herein are systems, methods, and software to enhance the management and deployment of data processing clusters in a computing environment. In one example, a management system may monitor data processing efficiency information for a cluster and determine when the efficiency meets efficiency criteria. When the efficiency criteria are met, the management system may identify a new configuration for the cluster and initiate an operation to implement the new configuration for the cluster.
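The monitor-and-reconfigure loop might be sketched as below. The threshold interpretation (criteria met when efficiency drops to or below a bound) and the `propose` policy are assumptions, not the described management system.

```python
def manage_cluster(efficiency, threshold, config, propose):
    """When measured efficiency falls to or below the threshold, the
    efficiency criteria are considered met: ask the (hypothetical)
    `propose` policy for a new cluster configuration and flag that a
    reconfiguration operation should be initiated."""
    if efficiency <= threshold:
        return propose(config), True
    return config, False
```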
METHOD FOR MEASURING PERFORMANCE OF NEURAL PROCESSING DEVICE AND DEVICE FOR MEASURING PERFORMANCE
A method for measuring performance of neural processing devices and devices for measuring performance are provided. The method comprises: receiving hardware information of a neural processing device; modeling hardware components according to the hardware information as agents; dividing a calculation task by events for the agents and modeling the calculation task, thereby generating an event model which includes nodes corresponding to the agents and edges corresponding to the events; and measuring a total duration of the calculation task through simulation of the event model.
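A greatly simplified version of the event-model simulation can be sketched as below: agents become busy-until timestamps and each event occupies its source and destination agents for its duration. This is an illustrative toy, not the patented simulator, and the agent names are invented.

```python
def total_duration(events):
    """events: list of (src_agent, dst_agent, duration), processed in
    order. An event starts once both agents it touches are free and
    occupies them for `duration`. Returns the makespan, i.e. the
    measured total duration of the calculation task."""
    ready = {}      # agent -> time at which it becomes free
    finish = 0.0
    for src, dst, dur in events:
        start = max(ready.get(src, 0.0), ready.get(dst, 0.0))
        end = start + dur
        ready[src] = ready[dst] = end
        finish = max(finish, end)
    return finish
```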