G06F8/453

METHOD OF DISTRIBUTED GRAPH LOADING FOR MINIMAL COMMUNICATION AND GOOD BALANCE VIA LAZY MATERIALIZATION AND DIRECTORY INDIRECTION USING INDEXED TABULAR REPRESENTATION

Techniques herein minimally communicate between computers to repartition a graph. In embodiments, each computer receives a partition of edges and vertices of the graph. For each of its edges or vertices, each computer stores an intermediate representation into an edge table (ET) or vertex table. Different edges of a vertex may be loaded by different computers, which may cause a conflict. Each computer announces that a vertex resides on the computer to a respective tracking computer. Each tracking computer makes assignments of vertices to computers and publicizes those assignments. Each computer that loaded conflicted vertices transfers those vertices to computers of the respective assignments. Each computer stores a materialized representation of a partition based on: the ET and vertex table of the computer, and the vertices and edges that were transferred to the computer. Edges stored in the materialized representation are stored differently than edges stored in the ET.
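The load-announce-assign-materialize flow described in this abstract can be illustrated with a single-process sketch. Worker and tracker "computers" below are plain dicts, and the hash-based tracker choice and lowest-id ownership rule are illustrative assumptions, not the patented method itself.

```python
from collections import defaultdict

NUM_TRACKERS = 2

def load_partitions(edge_partitions):
    """Each worker stores its edges in an intermediate edge table (ET)
    and records every vertex endpoint it has seen in a vertex table."""
    edge_tables = {w: list(part) for w, part in enumerate(edge_partitions)}
    vertex_tables = {w: {v for e in part for v in e}
                     for w, part in enumerate(edge_partitions)}
    return edge_tables, vertex_tables

def announce(vertex_tables):
    """Each worker announces its vertices to a tracker chosen by hashing."""
    trackers = defaultdict(lambda: defaultdict(list))  # tracker -> vertex -> workers
    for w, verts in vertex_tables.items():
        for v in verts:
            trackers[v % NUM_TRACKERS][v].append(w)
    return trackers

def assign(trackers):
    """Each tracker assigns every announced vertex to exactly one worker
    (here: the lowest worker id, a stand-in for a balance-aware policy)."""
    assignment = {}
    for directory in trackers.values():
        for v, workers in directory.items():
            assignment[v] = min(workers)
    return assignment

def materialize(edge_tables, assignment):
    """Workers transfer edges of conflicted vertices to the assigned owner
    and build a final adjacency-list form, unlike the flat ET rows."""
    final = defaultdict(lambda: defaultdict(list))
    for w, edges in edge_tables.items():
        for (src, dst) in edges:
            owner = assignment[src]        # edge lives with its source vertex
            final[owner][src].append(dst)
    return final

# Three workers load overlapping partitions: vertex 1 is "conflicted"
# because all three workers loaded an edge touching it.
parts = [[(1, 2), (2, 3)], [(1, 4)], [(5, 1)]]
ets, vts = load_partitions(parts)
owners = assign(announce(vts))
graph = materialize(ets, owners)
```

Only the announcements and the conflicted edges cross "computer" boundaries here, which is the communication-minimizing point of the directory indirection.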

Cognitive automation-based engine to propagate data across systems

Aspects of the disclosure relate to cognitive automation-based engine processing to propagate data across multiple systems via a private network to overcome technical system, resource consumption, and architecture limitations. Data to be propagated can be manually input or extracted from a digital file. The data can be parsed by analyzing for correct syntax, normalized into first through sixth normal forms, segmented into packets for efficient data transmission, validated to ensure that the data satisfies defined formats and input criteria, and distributed into a plurality of data stores coupled to the private network, thereby propagating data without repetitive manual entry. The data may also be enriched by, for example, correcting for any errors or linking with other potentially related data. Based on data enrichment, recommendations of additional target(s) for propagation of data can be identified. Reports may also be generated. The cognitive automation may be performed in real-time to expedite processing.
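The parse, validate, enrich, and distribute stages described above can be reduced to a toy pipeline. The record format, validation rules, and in-memory "data stores" below are invented for illustration, and the cognitive/enrichment stages are simplified to one normalization step.

```python
def parse(raw):
    """Parse 'key=value;key=value' input into a dict (syntax check)."""
    record = {}
    for field in raw.strip().split(";"):
        key, _, value = field.partition("=")
        if not key or not value:
            raise ValueError(f"bad field: {field!r}")
        record[key.strip()] = value.strip()
    return record

def validate(record, required=("id", "name")):
    """Ensure the record satisfies the defined formats and input criteria."""
    missing = [k for k in required if k not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return record

def enrich(record):
    """Example enrichment: correct the name's capitalization."""
    record["name"] = record["name"].title()
    return record

def distribute(record, stores):
    """Write the same validated record into every coupled data store,
    avoiding repetitive manual entry per store."""
    for store in stores:
        store[record["id"]] = dict(record)

stores = [dict(), dict(), dict()]          # stand-ins for networked stores
rec = enrich(validate(parse("id=42; name=ada lovelace")))
distribute(rec, stores)
```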

Code conversion apparatus and method for improving performance in computer operations
10908899 · 2021-02-02

A code conversion apparatus includes a memory and a processor coupled to the memory. The memory is configured to store therein a first code including a first data definition of a plurality of arrays, a first operation for the plurality of arrays, and a second data definition of an array indicating a result of the first operation. The processor is configured to convert the first data definition and the second data definition included in the first code into a data definition of an array of structures. The processor is configured to convert the first operation included in the first code into a second operation for the array of structures. The processor is configured to generate a second code including a predetermined instruction to perform the second operation on different pieces of data of the plurality of arrays in parallel with one another.
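The source-to-source transform in this abstract can be shown by hand: separate input arrays plus an elementwise operation (the "first code") are rewritten as an array of structures with one fused pass (the "second code"). The struct layout is an assumption; real generated code would use SIMD instructions rather than a Python loop.

```python
from dataclasses import dataclass

# --- first code: separate arrays -------------------------------------------
a = [1.0, 2.0, 3.0]           # first data definition: plural arrays
b = [10.0, 20.0, 30.0]
c = [0.0] * 3                 # second data definition: result array
for i in range(3):            # first operation
    c[i] = a[i] + b[i]

# --- second code: the same data as an array of structures ------------------
@dataclass
class Elem:
    a: float
    b: float
    c: float = 0.0

aos = [Elem(x, y) for x, y in zip(a, b)]
for e in aos:                 # second operation over the AoS, one fused pass
    e.c = e.a + e.b
```

Keeping each element's fields adjacent is what lets the generated instruction operate on different pieces of the original arrays in parallel.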

SHARED LOCAL MEMORY TILING MECHANISM

An apparatus to facilitate memory tiling is disclosed. The apparatus includes a memory, one or more execution units (EUs) to execute a plurality of processing threads via access to the memory and tiling logic to apply a tiling pattern to memory addresses for data stored in the memory.
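One concrete reading of "applying a tiling pattern to memory addresses" is an address swizzle in which each TILE×TILE block of a 2-D surface occupies a contiguous address range. The tile size and row-major ordering inside and among tiles below are assumptions for illustration.

```python
TILE = 4        # 4x4 tiles
WIDTH = 16      # surface width in elements (a multiple of TILE)

def linear_addr(row, col):
    """Conventional row-major address."""
    return row * WIDTH + col

def tiled_addr(row, col):
    """Tiled address: which tile, then position within that tile."""
    tiles_per_row = WIDTH // TILE
    tile_idx = (row // TILE) * tiles_per_row + (col // TILE)
    within = (row % TILE) * TILE + (col % TILE)
    return tile_idx * TILE * TILE + within
```

With this mapping, threads that touch a small 2-D neighborhood hit nearby addresses, which is the locality benefit tiling logic provides to the execution units.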

THREAD ASSOCIATED MEMORY ALLOCATION AND MEMORY ARCHITECTURE AWARE ALLOCATION
20210011768 · 2021-01-14

A method and system for thread aware, class aware, and topology aware memory allocations. Embodiments include a compiler configured to generate compiled code (e.g., for a runtime) that when executed allocates memory on a per class per thread basis that is system topology (e.g., for non-uniform memory architecture (NUMA)) aware. Embodiments can further include an executable configured to allocate a respective memory pool during runtime for each instance of a class for each thread. The memory pools are local to a respective processor, core, etc., where each thread executes.
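The per-class, per-thread pooling idea can be sketched as follows: each (thread, class) pair gets its own free list, so allocations never contend across threads. NUMA placement is not reachable from pure Python, so locality is only modeled by the keying scheme; all names here are illustrative.

```python
import threading
from collections import defaultdict

class PoolAllocator:
    """Memory pools keyed on a per-class, per-thread basis."""
    def __init__(self):
        self._pools = defaultdict(list)   # (thread_id, cls) -> free list

    def alloc(self, cls):
        key = (threading.get_ident(), cls)
        pool = self._pools[key]
        return pool.pop() if pool else cls()   # reuse if possible

    def free(self, obj):
        key = (threading.get_ident(), type(obj))
        self._pools[key].append(obj)

class Node:
    pass

allocator = PoolAllocator()
n1 = allocator.alloc(Node)
allocator.free(n1)
n2 = allocator.alloc(Node)    # same thread, same class: object is recycled
```

Because a pool is only ever touched by its owning thread, no locking is needed on the fast path; a NUMA-aware runtime would additionally back each pool with memory local to the processor running that thread.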

SYSTEMS AND METHODS FOR ENERGY PROPORTIONAL SCHEDULING

A compilation system using an energy model based on a set of generic and practical hardware and software parameters is presented. The model can represent the major trends in energy consumption spanning potential hardware configurations using only parameters available at compilation time. Experimental verification indicates that the model is nimble yet sufficiently precise, allowing efficient selection of one or more parameters of a target computing system so as to minimize power/energy consumption of a program while achieving other performance related goals. A voltage and/or frequency optimization and selection is presented which can determine an efficient dynamic hardware configuration schedule at compilation time. In various embodiments, the configuration schedule is chosen based on its predicted effect on energy consumption. A concurrency throttling technique based on the energy model can exploit the power-gating features exposed by the target computing system to increase the energy efficiency of programs.
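A back-of-the-envelope version of compile-time operating-point selection might minimize a simple energy model E(f) = (P_static + k·V(f)²·f) · cycles / f over candidate voltage/frequency pairs. The constants and candidate points below are invented; the patent's model spans far more hardware and software parameters.

```python
CYCLES = 1e9          # work the program must execute
P_STATIC = 0.5        # watts of leakage (static power)
K = 1e-10             # effective switched-capacitance factor

# candidate (frequency in Hz, supply voltage in V) operating points
CANDIDATES = [(1.0e9, 0.9), (2.0e9, 1.1), (3.0e9, 1.3)]

def energy(freq, volt):
    """Energy = (static + dynamic power) * runtime."""
    runtime = CYCLES / freq
    power = P_STATIC + K * volt**2 * freq
    return power * runtime

best_f, best_v = min(CANDIDATES, key=lambda fv: energy(*fv))
```

With these numbers the high static power makes racing to finish fastest the cheapest option; with lower leakage the balance would tip toward a slower, lower-voltage point, which is exactly the trade-off such a model lets the compiler evaluate.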

PROCESS FOR THE AUTOMATIC GENERATION OF PARALLEL CODE
20200356373 · 2020-11-12

Process for the automatic generation of parallel code, at a high level of abstraction, executable on electronic computers having heterogeneous multi-core or many-core architectures.

Compiling a Program from a Graph
20200319861 · 2020-10-08 ·

A method for generating an executable program to run on one or more processor modules. The method comprises: receiving a graph comprising a plurality of data nodes, compute vertices and edges; and compiling the graph into an executable program including one or more types of multi-access instruction each of which performs at least two memory access (load and/or store) operations in a single instruction. The memory on each processor module comprises multiple memory banks whereby the same bank cannot be accessed by different load or store operations in the same instruction. The compilation comprises assigning instances of the multi-access instructions to implement at least some of the graph edges, and allocating the data to memory addresses within different ones of the banks. The allocating is performed subject to one or more constraints, including at least that different load/store operations should not access the same memory bank in the same instruction.
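The bank-allocation constraint above can be sketched as graph coloring: two variables read by the same multi-access (double-load) instruction must land in different banks. A greedy coloring over the conflict graph stands in for the compiler's allocator; the instruction pairs are invented.

```python
from collections import defaultdict

NUM_BANKS = 4

# each tuple = the two operands of one double-load instruction
double_loads = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]

# build the conflict graph: an edge means "must be in different banks"
conflicts = defaultdict(set)
for x, y in double_loads:
    conflicts[x].add(y)
    conflicts[y].add(x)

# greedily assign each variable the lowest bank unused by its neighbors
banks = {}
for var in sorted(conflicts):              # deterministic order
    used = {banks[n] for n in conflicts[var] if n in banks}
    banks[var] = next(b for b in range(NUM_BANKS) if b not in used)
```

If no bank remains for some variable, a real allocator would either spill or split the offending double-load into two single-access instructions.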

Software service execution apparatus, system, and method
10776170 · 2020-09-15

A software service execution apparatus comprising a registry of software services, each service executing a data processing function; and a controller to receive a processing request defining requested data processing functions, fulfill the request by identifying a software service that matches each requested function, include each identified service in an execution schedule, compose the execution schedule from the identified services, and control execution of the schedule. The apparatus further comprises a machine learning mechanism configured to maintain a record of the composing. In an automated mode, if more than one software service is identified for a function, the composing includes requesting selection of one software service as an automated selection candidate; the mechanism provides that selection, basing it on analysis of prior composing and execution of execution schedules in which an automated selection candidate was identified.
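The registry/controller flow in this abstract can be reduced to a compact sketch: services are matched to the requested functions, composed into an execution schedule, and run in order. The machine-learning selection among multiple matches is simplified here to a usage-count heuristic, and all names are illustrative.

```python
registry = {
    "uppercase": [str.upper],
    "reverse":   [lambda s: s[::-1]],
}
usage = {}                                  # stand-in for the learned record

def compose(requested):
    """Build an execution schedule from services matching each function."""
    schedule = []
    for fn_name in requested:
        candidates = registry[fn_name]      # match service to function
        # if several services match, pick the historically most-used one
        best = max(candidates, key=lambda c: usage.get(c, 0))
        usage[best] = usage.get(best, 0) + 1   # record the composing
        schedule.append(best)
    return schedule

def execute(schedule, data):
    """Control execution of the schedule, piping data between services."""
    for service in schedule:
        data = service(data)
    return data

result = execute(compose(["uppercase", "reverse"]), "pipeline")
```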