G06F16/2237

Fast in-memory technique to build a reverse CSR graph index in an RDBMS

In an embodiment, a computer obtains a mapping of a relational schema of a database to a graph data model. The relational schema identifies vertex table(s) that correspond to vertex type(s) in the graph data model and edge table(s) that correspond to edge type(s) in the graph data model. Each edge type is associated with a source vertex type and a target vertex type. Based on that mapping, a forward compressed sparse row (CSR) representation is populated for forward traversal of edges of a same edge type. Each edge originates at a source vertex and terminates at a target vertex. Based on the forward CSR representation, a reverse CSR representation of the edge type is populated for reverse traversal of the edges of the edge type. Acceleration occurs in two ways. Values calculated for the forward CSR are reused for the reverse CSR. Elastic and inelastic scaling may occur.
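
The derivation of a reverse CSR from an already populated forward CSR can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the plain-list layout, and the dense integer vertex ids are all assumptions, and "reuse" is modeled simply as reading the forward arrays instead of rescanning the edge table.

```python
from typing import List, Tuple


def build_forward_csr(num_src: int, edges: List[Tuple[int, int]]):
    """Return (offsets, targets) for (source, target) pairs of one edge type."""
    offsets = [0] * (num_src + 1)
    for src, _ in edges:
        offsets[src + 1] += 1          # out-degree histogram
    for i in range(num_src):
        offsets[i + 1] += offsets[i]   # prefix sum gives row start positions
    targets = [0] * len(edges)
    cursor = list(offsets[:-1])        # next free slot per source vertex
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def build_reverse_csr(num_dst: int, fwd_offsets: List[int], fwd_targets: List[int]):
    """Return (offsets, sources) built only from the forward CSR arrays."""
    rev_offsets = [0] * (num_dst + 1)
    for dst in fwd_targets:
        rev_offsets[dst + 1] += 1      # in-degree histogram
    for i in range(num_dst):
        rev_offsets[i + 1] += rev_offsets[i]
    rev_sources = [0] * len(fwd_targets)
    cursor = list(rev_offsets[:-1])    # next free slot per target vertex
    for src in range(len(fwd_offsets) - 1):
        for e in range(fwd_offsets[src], fwd_offsets[src + 1]):
            dst = fwd_targets[e]
            rev_sources[cursor[dst]] = src
            cursor[dst] += 1
    return rev_offsets, rev_sources
```

With the forward arrays in hand, the in-neighbors of vertex `v` are `rev_sources[rev_offsets[v]:rev_offsets[v + 1]]`, so reverse traversal needs no further access to the edge table.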

Multiple dimension layers for an analysis data system and method

A system and method are presented that analyze evaluation data concerning a subject using attributes that are logically arranged in a geometric structure such as a rectangular array. A plurality of dimension layers is laid on top of the logical arrangement of data. Each dimension layer assigns values to a plurality of dimensions based on the value of neighboring attribute groups. Each dimension layer can be associated with one or more reporting configurations that contain descriptors for the defined dimensions as well as formatting instructions for report-like output.

MECHANISM FOR MANAGING A MIGRATION OF DATA WITH MAPPED PAGE AND DIRTY PAGE BITMAP SECTIONS

A method for managing a live migration operation includes partitioning a first data structure into N sections of the first data structure, the first data structure indicating a location, associated with a source storage, having data to be copied to a target storage, and transferring less than all of the N sections of the first data structure to a migration server.
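
The partitioning step can be illustrated with a short sketch. The abstract does not say which sections are transferred, so the rule used here (send only sections that contain at least one set bit) is an assumption made for illustration, as are the function names and the list-of-ints bitmap.

```python
from typing import Dict, List


def partition(bitmap: List[int], n: int) -> List[List[int]]:
    """Split a bitmap into n contiguous sections of near-equal size."""
    if n < 1:
        raise ValueError("n must be at least 1")
    size = -(-len(bitmap) // n)  # ceiling division
    return [bitmap[i * size:(i + 1) * size] for i in range(n)]


def sections_to_transfer(bitmap: List[int], n: int) -> Dict[int, List[int]]:
    """Assumed policy: keep only sections in which some page is marked."""
    return {i: sec for i, sec in enumerate(partition(bitmap, n)) if any(sec)}
```

For a 12-page bitmap with pages 4 and 11 marked and `n = 3`, only sections 1 and 2 would be sent, which is fewer than all N sections.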

OBJECT DATA STORED OUT OF LINE VECTOR ENGINE
20220405257 · 2022-12-22 ·

Examples described herein generally relate to database systems for storing and processing both small values that are smaller than the size of a database column and large objects that exceed the size of the database column. A database management system (DBMS) determines that a value to be stored in a database is a large object having a size larger than a column of the database. The DBMS stores the value as a large object in an external storage associated with a token stored in the column of the database. The token includes information for processing the large object. A vector processing engine associated with the external storage processes the large object based on the information in the token in response to a database command from the DBMS on multiple records represented as a vector.

Systems and methods for term prevalence-volume based relevance
11526672 · 2022-12-13 ·

Techniques for prevalence-volume based relevance are provided. Corresponding systems and methods may include ingesting a corpus of documents; receiving a search operator; segmenting the corpus of documents into (i) a first set of documents that matches the search operator, and (ii) a second set of documents that do not match the search operator; extracting a first and second token list of tokens; calculating a prevalence-volume value for tokens included in the first and second token lists; generating a prevalence-volume ratio (PVR) matrix that associates tokens included in the first and/or second token lists with a PVR value, wherein the PVR value for a particular token is a ratio between the prevalence-volume value of the particular token for the first set of documents and the prevalence-volume value of the particular token for the second set of documents; and associating the search operator with the generated PVR matrix.
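
The segmentation and ratio steps can be sketched as below. The abstract does not define the prevalence-volume value, so this example assumes it is the share of documents containing a token multiplied by the token's total occurrence count. The smoothing constant `eps`, the function names, and the dictionary standing in for the PVR matrix are likewise hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List

Doc = List[str]


def prevalence_volume(docs: List[Doc]) -> Dict[str, float]:
    """Assumed definition: (share of docs containing token) * (total occurrences)."""
    if not docs:
        return {}
    doc_freq: Counter = Counter()
    volume: Counter = Counter()
    for tokens in docs:
        volume.update(tokens)          # every occurrence counts toward volume
        doc_freq.update(set(tokens))   # each document counts once toward prevalence
    return {tok: (doc_freq[tok] / len(docs)) * volume[tok] for tok in volume}


def pvr_table(corpus: List[Doc], matches: Callable[[Doc], bool],
              eps: float = 1e-9) -> Dict[str, float]:
    """Ratio of matching-set to non-matching-set prevalence-volume per token."""
    first = [d for d in corpus if matches(d)]
    second = [d for d in corpus if not matches(d)]
    pv1, pv2 = prevalence_volume(first), prevalence_volume(second)
    # eps avoids division by zero for tokens absent from the second set (assumption).
    return {tok: pv1.get(tok, 0.0) / (pv2.get(tok, 0.0) + eps)
            for tok in set(pv1) | set(pv2)}
```

Tokens with a high ratio are characteristic of the documents matching the search operator, while a ratio near zero marks tokens that occur mainly in the non-matching set.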

METHOD AND APPARATUS FOR RETRIEVING A DATA PACKAGE

This invention provides a method for ranking contextual data of a plurality of data packages in a Knowledge Management System (KMS) as a computer system configured to store information on the maintenance of an apparatus, wherein each data package of the plurality of data packages includes a data part relating to maintenance of the apparatus, a first metadata part for the data part having a first metadata value and representing a first contextual information type and a second metadata part for the data part having a second metadata value and representing a second contextual information type, the method comprising the steps of: receiving a first and second relevance value for the first and second contextual information type respectively for a first data package, wherein the first and second relevance values represent the relevance of the first and second contextual information types of the first data part to the maintenance of the apparatus; receiving a third and fourth relevance value for the first and second contextual information type respectively for a second data package, wherein the third and fourth relevance values represent the relevance of the first and second contextual information types of the second data part to the maintenance of the apparatus; determining a first ranking value for the first contextual information type based on the first and third relevance values; determining a second ranking value for the second contextual information type based on the second and fourth relevance values; and storing the first and second ranking values in the KMS. 
The present invention also provides a method for retrieving data packages, including the steps of: receiving a data package request including a first metadata value for the first contextual information type and a second metadata value for the second contextual information type; and retrieving a first data package of the plurality of data packages based on at least one of the first and second metadata values of the data package request, the first and second ranking values, and at least one of the first and second metadata values of each data package of the plurality of data packages.

OPTIMIZATION VIA DYNAMICALLY CONFIGURABLE OBJECTIVE FUNCTION
20220391393 · 2022-12-08 ·

Provided is a system and method for dynamic configuration of a multi-objective optimization function and identifying an optimal set of records based thereon. In one example, the method may include receiving a set of data records and priority values to be applied to the set of data records, generating an objective function from an objective function template stored in a memory device, wherein the generating comprises dynamically configuring parameter values of the objective function based on the priority values, executing the objective function on the set of data records and identifying an optimal subset of data records from among the set of data records based on the dynamically configured parameter values of the executing objective function, and displaying identifiers of the identified optimal subset of data records.
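
One way to picture a dynamically configured objective function is a weighted-sum template whose weights are filled in from the priority values. The abstract does not specify the template, so the weighted sum, the top-k selection, and all names below are assumptions made for illustration.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def make_objective(priorities: Dict[str, float]) -> Callable[[Record], float]:
    """Configure a weighted-sum template: weights are normalized priorities."""
    total = sum(priorities.values())
    if total <= 0:
        raise ValueError("priorities must sum to a positive value")
    weights = {field: p / total for field, p in priorities.items()}

    def objective(record: Record) -> float:
        return sum(w * float(record[field]) for field, w in weights.items())

    return objective


def select_optimal(records: List[Record], priorities: Dict[str, float],
                   k: int) -> List[Record]:
    """Score every record and keep the k highest-scoring ones."""
    objective = make_objective(priorities)
    return sorted(records, key=objective, reverse=True)[:k]
```

Changing only the priority values changes the selected subset without touching the template, which is the sense in which the function is configured dynamically in this sketch.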

Computer-readable recording medium recording index generation program, information processing apparatus and search method

A non-transitory computer-readable recording medium records an index generation program for causing a computer to execute processing of: inputting data which is described by a combination of an item and a value; and generating index information regarding an appearance position of each of the item and the value for each of the item and the value which are included in the data.
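
A minimal sketch of such an index, assuming the input is an ordered sequence of (item, value) pairs and that "appearance position" means the pair's ordinal position in that sequence. The nested-dictionary layout and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_index(records: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, List[int]]]:
    """Map each distinct item and each distinct value to its positions."""
    index = {"item": defaultdict(list), "value": defaultdict(list)}
    for pos, (item, value) in enumerate(records):
        index["item"][item].append(pos)    # where this item name appears
        index["value"][value].append(pos)  # where this value appears
    return index
```

Intersecting the position lists of an item and a value then answers a query such as "where does item `color` carry value `red`" without scanning the data.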

Performing fine-grained question type classification

A computer-implemented method according to one embodiment includes converting an input question into a vector form using trained word embeddings; constructing a type similarity matrix using a predetermined ontology; and determining a score for all possible types for the input question, based on the input question in vector form and the type similarity matrix.

INDEX-BASED, ADAPTIVE JOIN SIZE ESTIMATION

Systems, methods, and computer media are described for index-based join size estimation. For a join operation between two tables, a filter is applied to the first table, resulting in a filter output. The filter output is then sampled. For each sample, an index for a second table is accessed and counts of records in the second table that match the sample are retrieved. Using the sample size and the retrieved counts from the index of the second table, a data size for the join operation can be efficiently and accurately estimated. Statistical confidence in the estimate can also be assessed using variance-based calculations.
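
The estimation procedure described above can be sketched as follows. The scaling rule (filtered row count times the mean matched count) and the standard-error formula are standard sampling estimators chosen for illustration, not necessarily the ones claimed. The function name, parameters, and the dictionary standing in for the second table's index are assumptions.

```python
import math
import random
import statistics
from typing import Callable, Dict, Hashable, List, Tuple

Row = Dict[str, object]


def estimate_join_size(left_rows: List[Row],
                       predicate: Callable[[Row], bool],
                       key: Callable[[Row], Hashable],
                       right_index: Dict[Hashable, int],
                       sample_size: int,
                       seed: int = 0) -> Tuple[float, float]:
    """Return (estimated join size, standard error of the estimate)."""
    filtered = [r for r in left_rows if predicate(r)]   # filter output
    if not filtered:
        return 0.0, 0.0
    n = min(sample_size, len(filtered))
    sample = random.Random(seed).sample(filtered, n)    # sample the filter output
    # Index lookup: number of matching records in the second table per sample.
    counts = [right_index.get(key(r), 0) for r in sample]
    estimate = len(filtered) * statistics.mean(counts)
    if n > 1:
        std_err = len(filtered) * statistics.stdev(counts) / math.sqrt(n)
    else:
        std_err = float("inf")  # a single sample gives no variance information
    return float(estimate), std_err
```

A small standard error relative to the estimate suggests the sample is large enough; a large one suggests drawing more samples before committing to a join plan.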