G06F16/2433

Principal Component Analysis
20230045139 · 2023-02-09 · ·

A method for principal component analysis includes receiving a principal component analysis (PCA) request from a user requesting data processing hardware to perform PCA on a dataset, the dataset including a plurality of input features. The method further includes training a PCA model on the plurality of input features of the dataset. The method includes determining, using the trained PCA model, one or more principal components of the dataset. The method also includes generating, based on the plurality of input features and the one or more principal components, one or more embedded features of the dataset. The method includes returning the one or more embedded features to the user.
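The train, determine, and embed steps in this abstract can be illustrated with a minimal sketch. It assumes a small in-memory dataset and uses plain power iteration on the covariance matrix to find only the first principal component; the abstract does not say how the PCA model is trained, so this technique and the name `first_principal_component` are illustrative assumptions.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (component, embedded) for a list of equal-length feature rows.

    Assumes at least two rows, and that the start vector below is not
    orthogonal to the dominant direction (a limitation of power iteration).
    """
    n, d = len(rows), len(rows[0])
    # "Training": center each input feature and build the covariance matrix.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly apply the covariance matrix and normalize.
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Embedded feature: projection of each centered row onto the component.
    embedded = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, embedded

data = [[2.0, 0.0], [0.0, 2.0], [4.0, -2.0], [-2.0, 4.0]]
component, embedded = first_principal_component(data)
```

Here the two input features collapse to a single embedded feature per row, which is what would be returned to the user in this simplified setting.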

SYSTEMS AND METHODS FOR MANAGING STRUCTURED QUERY LANGUAGE ON DYNAMIC SCHEMA DATABASES

In various aspects of the present disclosure, systems and methods are described to identify and resolve structured queries so they execute consistently and accurately against any data architecture, including, for example, dynamic or unstructured database stores. According to one embodiment, a dynamic schema data system implements a query dialect that is configured to expose the underlying flexible schemas of the dynamic schema data system, any structured, unstructured, or partially structured data, and expressive querying native to the dynamic schema system in a language that is compatible with structured queries, for example, with SQL-92. In further embodiments, the query dialect is configured to enable consistency with existing dynamic schema database query semantics (e.g., the known MongoDB database and associated query semantics).

SYSTEMS AND METHODS FOR UNIFIED GRAPH DATABASE QUERYING
20230045347 · 2023-02-09 · ·

A unified graph query system provides an abstraction layer that increases the interoperability of different graph technologies by exposing graphs stored in graph databases using a unified query language. The abstraction layer generates graph models for each of the available graph databases and extracts a graph component and other source data used to identify the source of the data requested by a query. The unified graph query system executes the query across the multiple graphs included in different graph databases by using the graph models to locate the graph component in each of the multiple graphs and extract the feature data associated with the graph component. The feature data is used to generate features that are used by a machine learning service to train machine learning models and is also used to make predictions in real time.

Query plan migration in database systems
11556538 · 2023-01-17 · ·

Methods, systems, and computer-readable storage media for receiving, by a current database system, a query plan file representative of a captured query plan from a source database system, receiving, by the current database system, a set of definitions including one or more definitions, each definition in the set of definitions corresponding to an object that is implicated by the query plan, the object being included in a set of objects, and determining, by the current database system, that each definition in the set of definitions is identical to a respective definition of a corresponding object within the current database system, and in response: executing the captured query plan in the current database system to provide a query result.
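The check described here, comparing every received definition with the local one before running the captured plan, might look like the following sketch. The dictionary shapes, the `execute` callback, and the choice to return `None` when a definition differs are all hypothetical; the abstract does not say what happens in the mismatch case.

```python
def run_migrated_plan(plan, source_defs, local_defs, execute):
    """Execute a captured plan only if all implicated objects match locally.

    plan:        dict with an "objects" list naming the implicated objects
    source_defs: object name -> definition received from the source system
    local_defs:  object name -> definition in the current system
    execute:     callable that runs the plan and returns a query result
    """
    for name in plan["objects"]:
        # Every implicated object needs a received definition that is
        # identical to the definition held by the current system.
        if name not in source_defs or source_defs[name] != local_defs.get(name):
            return None  # assumption: the captured plan is simply not reused
    return execute(plan)

plan = {"id": "q1", "objects": ["orders", "orders_idx"]}
source = {"orders": "TABLE(id INT, total REAL)", "orders_idx": "INDEX(id)"}
same = dict(source)
changed = {"orders": "TABLE(id INT)", "orders_idx": "INDEX(id)"}

ok = run_migrated_plan(plan, source, same, lambda p: "result of " + p["id"])
skipped = run_migrated_plan(plan, source, changed, lambda p: "result of " + p["id"])
```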

Efficient semantic analysis of program code
11550556 · 2023-01-10 · ·

Provided are systems and methods of a compiler that efficiently processes semantic analysis. For example, the compiler may perform semantic analysis on as much of the source code as possible during compile time. For any instructions, such as dynamic expressions, that are not known at compile time, the compiler may encode semantic bytecode for performing the semantic checks on such dynamic expressions, and their dependent expressions, during execution/runtime of the program. In one example, the method may include compiling source code of a program into bytecode, identifying, during the compiling, a dynamic expression that includes one or more dependent static expressions within the source code, generating semantic bytecode for semantic analysis of the one or more dependent static expressions of the dynamic expression, and adding the semantic bytecode to the bytecode of the program.

METHOD AND SYSTEM FOR COMPRESSING GENOME SEQUENCES USING GRAPHIC PROCESSING UNITS
20180011870 · 2018-01-11 ·

The present invention provides a method for compressing genome sequence readers using a graphics processing unit (GPU). The method comprises the steps of: identifying the position of each given genome reader character string in the sequence of a reference genome; determining the alignment of each reader string within the reference genome; comparing each reader character string to the corresponding reference genome sequence based on the determined alignment; filtering the characters in each reader, by the GPU, by eliminating similar characters and extracting only the character differences in association with their positions in the genome sequence; and recording the filtered data of each reader, in association with its alignment in the reference genome, in the compressed genome database.
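The compare-and-filter step can be sketched on the CPU as follows. This is only an illustration of reference-based difference encoding, not the GPU implementation the abstract refers to; the record layout and function names are assumptions, and the sketch assumes the alignment position is already known and the reader fits entirely inside the reference.

```python
def compress_read(reference, read, position):
    """Keep only the characters of `read` that differ from the reference."""
    window = reference[position:position + len(read)]
    # Matching characters are dropped; differences keep their offset.
    diffs = [(i, c) for i, (r, c) in enumerate(zip(window, read)) if r != c]
    return {"pos": position, "len": len(read), "diffs": diffs}

def decompress_read(reference, record):
    """Rebuild the original read from the reference and the stored diffs."""
    start = record["pos"]
    chars = list(reference[start:start + record["len"]])
    for offset, char in record["diffs"]:
        chars[offset] = char
    return "".join(chars)

reference = "ACGTACGTAC"
record = compress_read(reference, "GTTCG", position=2)
restored = decompress_read(reference, record)
```

On a GPU, each reader (or each character position) could be handled by a separate thread, since the comparisons are independent of one another.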

Optimized tenant schema generation
11709807 · 2023-07-25 · ·

A system includes a memory and a processor, where the processor is in communication with the memory. The processor is configured to receive a request to create a tenant schema within a database, where the database includes one or more tenant schemas associated with one or more tenants. The tenant schema associated with a tenant of the one or more tenants is created, where the tenant schema is empty. It is determined whether the database includes a template schema. Upon determining that the template schema exists, a command is sent to the database to copy the template schema to the tenant schema associated with the tenant.

EXTREME VALUE COMPUTATION

The method may include providing a plurality of synopsis techniques for determining a plurality of attribute value information indicative of the at least one attribute. The method may include determining a data characteristic describing the plurality of data rows of the current data block. The method may include selecting, based on the determined data characteristic, at least one synopsis technique of the provided plurality of synopsis techniques suitable for generating the plurality of attribute value information for the at least one attribute of the current data block. The method may include determining the plurality of attribute value information for the at least one attribute of the plurality of data rows of the current data block using the at least one selected synopsis technique. The method may include storing the determined plurality of attribute value information for the current data block to be used for query processing against the data table.
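As a rough illustration of selecting a synopsis technique from a data characteristic, the sketch below uses the number of distinct values in a block as the characteristic and chooses between two hypothetical techniques: an exact value set for low-cardinality blocks and a min/max range otherwise. The abstract names neither the characteristic nor the techniques, so both are assumptions.

```python
def build_synopsis(values, max_distinct=4):
    """Pick a synopsis for one data block based on its distinct-value count."""
    distinct = set(values)
    if len(distinct) <= max_distinct:
        # Few distinct values: storing them exactly is cheap and precise.
        return {"kind": "value_set", "values": distinct}
    # Many distinct values: fall back to the extreme values of the block.
    return {"kind": "min_max", "min": min(values), "max": max(values)}

def block_may_match(synopsis, target):
    """Use the stored synopsis to decide whether a block can be skipped."""
    if synopsis["kind"] == "value_set":
        return target in synopsis["values"]
    return synopsis["min"] <= target <= synopsis["max"]

sparse = build_synopsis([1, 1, 9])
dense = build_synopsis(list(range(10)))
```

With these two synopses, an equality query for the value 5 can skip the first block, since 5 is not in its value set, whereas a min/max range of 1 to 9 alone would not have ruled it out.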

SORTING TABLES IN ANALYTICAL DATABASES

A method for sorting a data table is provided. The method may include providing a plurality of attribute value information for each data block of the data table. The method may also include receiving a query requiring sorting on a first attribute of the data table. The method may further include determining a plurality of sequences of a plurality of data blocks having disjoint value ranges of the first attribute based on the provided plurality of attribute value information. The method may also include, for each determined sequence of the plurality of data blocks, reading a plurality of data, sorting the read plurality of data from each data block, and concatenating the sorted plurality of data from the plurality of data blocks within the determined sequence, thereby providing a sorted plurality of sequences. The method may further include merging the sorted plurality of sequences.
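The steps above can be sketched as follows, assuming the attribute value information is a (min, max) pair per block and that sequences are formed greedily; the abstract does not specify either, so both are illustrative choices.

```python
import heapq

def sort_table(blocks):
    """Sort all values in `blocks` (a list of lists) using block ranges.

    Returns (sorted_values, number_of_sequences).
    """
    # Attribute value information: (min, max) for every non-empty block.
    info = sorted(((min(b), max(b), b) for b in blocks if b),
                  key=lambda t: t[:2])
    # Greedily chain blocks whose value ranges are disjoint and ascending.
    sequences = []
    for lo, hi, block in info:
        for seq in sequences:
            if seq[-1][1] < lo:
                seq.append((lo, hi, block))
                break
        else:
            sequences.append([(lo, hi, block)])
    # Within a sequence, sort each block and concatenate: because the ranges
    # are disjoint and ascending, the concatenation is already sorted.
    runs = [[v for _, _, b in seq for v in sorted(b)] for seq in sequences]
    # Merge the sorted sequences into the final result.
    return list(heapq.merge(*runs)), len(sequences)

blocks = [[5, 3, 4], [9, 8], [1, 2], [4, 7, 6]]
ordered, n_sequences = sort_table(blocks)
```

Only small per-block sorts and one merge over the sequences are needed, which is cheaper than a full sort when most blocks already cover disjoint ranges.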

ACCESSING ELECTRONIC DATABASES

Examples disclosed herein relate to accessing electronic databases. Some examples disclosed herein may include partitioning a computation task into subtasks. A processing node of a computation engine may generate a database query for retrieving an electronic data segment associated with at least one of the subtasks from a database. The database query may include pre-processing instructions for a database management system (DBMS) associated with the database to pre-process the electronic data segment before providing the electronic data segment to the processing node. The pre-processing instructions may include at least one of: filtering, projection, join, aggregation, count, and user-defined instructions. The generated query may be provided to the DBMS.
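A minimal sketch of the idea, using Python's built-in sqlite3 module as a stand-in DBMS: each subtask builds a query whose filtering, projection, and aggregation are executed by the database before the data segment reaches the processing node. The `sales` table, its columns, and the id-range partitioning are hypothetical.

```python
import sqlite3

def subtask_query(lo, hi):
    """Build the query for one subtask covering ids in [lo, hi)."""
    # Filtering (WHERE), projection (selected columns), and aggregation
    # (COUNT, SUM, GROUP BY) are pushed into the query for the DBMS to run.
    sql = ("SELECT region, COUNT(*), SUM(amount) FROM sales "
           "WHERE id >= ? AND id < ? GROUP BY region ORDER BY region")
    return sql, (lo, hi)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(0, "east", 10.0), (1, "west", 5.0), (2, "east", 2.5), (3, "west", 1.0)],
)

# Partition the computation task into two subtasks by id range.
results = [conn.execute(*subtask_query(lo, hi)).fetchall()
           for lo, hi in [(0, 2), (2, 4)]]
```

Each processing node then receives a few pre-aggregated rows rather than the raw segment, which reduces the data moved between the DBMS and the computation engine.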