Patent classifications
G06F16/211
RESOURCE PROVISIONING SYSTEMS AND METHODS
A method for a first set of processors and a second set of processors comprises: the first set of processors processing a set of queries; and, as a result of a change in utilization of the first set of processors, processing the set of queries using the second set of processors. The change in processors is independent of a change in storage resources, which are shared by the first set of processors and the second set of processors.
RELATIONSHIP ANALYSIS USING VECTOR REPRESENTATIONS OF DATABASE TABLES
A computer-implemented method includes representing a plurality of database tables as respective vectors in a multi-dimensional vector space, receiving an indication that a first database table represented by a first vector and a second database table represented by a second vector are related to each other, moving the respective vectors representing the plurality of database tables in the multi-dimensional vector space in response to the indication, and grouping the plurality of database tables into one or more table clusters based on positions of the respective vectors representing the plurality of database tables in the multi-dimensional vector space.
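The steps in this abstract can be illustrated with a minimal sketch. It is not the patented method: the table names, the midpoint update rule, the `rate` parameter, and the distance-threshold clustering are all assumptions chosen for brevity, and only the two related tables are moved.

```python
import math

def move_together(vectors, a, b, rate=0.5):
    """Move the vectors for tables a and b toward their midpoint (assumed update rule)."""
    va, vb = vectors[a], vectors[b]
    mid = [(x + y) / 2 for x, y in zip(va, vb)]
    vectors[a] = [x + rate * (m - x) for x, m in zip(va, mid)]
    vectors[b] = [y + rate * (m - y) for y, m in zip(vb, mid)]

def cluster(vectors, radius):
    """Group tables whose vectors lie within `radius` of each other (single-link)."""
    names = sorted(vectors)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster representative.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(vectors[a], vectors[b]) <= radius:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())

# Hypothetical tables embedded in a 2-dimensional space.
vectors = {"orders": [0.0, 0.0], "customers": [4.0, 0.0], "logs": [10.0, 10.0]}
before = cluster(vectors, radius=1.5)
move_together(vectors, "orders", "customers", rate=1.0)  # "these two are related"
after = cluster(vectors, radius=1.5)
```

Before the indication, each table forms its own cluster; after it, `orders` and `customers` share one.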
SEPARATION OF LOGICAL AND PHYSICAL STORAGE IN A DISTRIBUTED DATABASE SYSTEM
Distributed database systems including compute nodes and page servers are described herein that enable separating logical and physical storage of database files in a distributed database system. A distributed database system includes a plurality of page servers and a compute node and is configured to store a logical database file that includes data and is associated with a file identifier. Each page server is configurable to store slices (i.e., subportions) of the logical database file. The compute node is coupled to the plurality of page servers and configured to store the logical database file responsive to a received command. In an aspect, such storage may comprise slicing the data comprising the logical database file into a set of slices, each associated with a respective page server, maintaining an endpoint mapping for each slice of the set of slices, and transmitting each slice to the associated page server for storage thereby.
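The slicing and endpoint mapping described here might look like the following sketch. The round-robin assignment, the fixed slice size, the endpoint names, and the in-memory `stores` dictionary standing in for remote page servers are assumptions for illustration only.

```python
def slice_file(file_id, data, slice_size, page_servers):
    """Split `data` into slices and assign each to a page server round-robin."""
    endpoint_map = {}                         # (file_id, slice_index) -> endpoint
    stores = {ep: {} for ep in page_servers}  # stand-in for remote page servers
    for index, offset in enumerate(range(0, len(data), slice_size)):
        endpoint = page_servers[index % len(page_servers)]
        endpoint_map[(file_id, index)] = endpoint
        # "Transmit" the slice to its page server.
        stores[endpoint][(file_id, index)] = data[offset:offset + slice_size]
    return endpoint_map, stores

def read_file(file_id, endpoint_map, stores):
    """Reassemble the logical file by following the endpoint mapping in slice order."""
    keys = sorted(k for k in endpoint_map if k[0] == file_id)
    return b"".join(stores[endpoint_map[k]][k] for k in keys)

# Hypothetical file identifier 7 stored across two page servers.
endpoint_map, stores = slice_file(7, b"abcdefghij", 4, ["ps-a", "ps-b"])
restored = read_file(7, endpoint_map, stores)
```

The compute node only needs the file identifier and the endpoint mapping to locate every slice, which is what keeps the logical file separate from its physical placement.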
DATABASE, MATERIAL DATA PROCESSING SYSTEM, AND METHOD OF CREATING DATABASE
A database storing data associated with an identifier unique to each sample, the data including first data representative of at least one of composition data, processing data, and property data for each sample, and second data representative of microstructure data for each sample. The microstructure data includes a feature determined based on a temperature dependence of magnetization for each sample.
SYSTEMS AND METHODS FOR MANAGING STRUCTURED QUERY LANGUAGE ON DYNAMIC SCHEMA DATABASES
In various aspects of the present disclosure, systems and methods are described to identify and resolve structured queries so they execute consistently and accurately against any data architecture, including, for example, dynamic or unstructured database stores. According to one embodiment, a dynamic schema data system implements a query dialect that is configured to expose underlying flexible schemas of the dynamic schema data system, any structured data, unstructured or partially structured data, and expressive querying native to the dynamic schema system in a language that is compatible with structured queries, for example, compatible with SQL-92. In further embodiments, the query dialect is configured to enable consistency with existing dynamic schema database query semantics (e.g., the known MongoDB database and associated query semantics).
SEMANTICS BASED DATA AND METADATA MAPPING
The present disclosure involves a computer-implemented method, medium, and system for automatically correlating semantically connected data and metadata. One example method includes identifying a document that is to be analyzed using a semantics based mapping (SBM) infrastructure. A matching process is performed for the identified document using the SBM infrastructure, where the matching process identifies a plurality of matching terms within the document, the plurality of matching terms are assigned to a plurality of semantics identifiers (IDs), and each semantics ID corresponds to one or more terms in the plurality of matching terms. Each of the plurality of matching terms is replaced with a respective term ID to generate an updated document. A request to search for a target term in the document is received. The target term is translated to a target term ID based on the SBM infrastructure. The updated document is searched for one or more matching terms.
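A rough sketch of the replace-then-search flow follows. The `TERM_TO_ID` table and its IDs are hypothetical, and the sketch collapses the abstract's term IDs and semantics IDs into a single identifier per group of synonyms, which is a simplifying assumption.

```python
import re

# Hypothetical mapping: synonymous terms share one ID.
TERM_TO_ID = {"customer": "S1", "client": "S1", "invoice": "S2", "bill": "S2"}

def to_ids(text):
    """Replace every known term in `text` with its ID to build the updated document."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, TERM_TO_ID)) + r")\b", re.IGNORECASE
    )
    return pattern.sub(lambda m: TERM_TO_ID[m.group(0).lower()], text)

def search(updated_document, target_term):
    """Translate the target term to its ID, then count matches in the updated document."""
    target_id = TERM_TO_ID.get(target_term.lower())
    if target_id is None:
        return 0
    return len(re.findall(r"\b" + re.escape(target_id) + r"\b", updated_document))

updated = to_ids("The client paid the invoice. Each customer gets a bill.")
hits = search(updated, "customer")
```

Because "client" and "customer" map to the same ID, a search for either term finds both occurrences.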
Consistent schema-less scalable storage system for orders
In various example embodiments, a system and method for consistent schema-less and scalable database storage are described herein. A data object is generated. The data object corresponds to a column of a table from a database. The data object includes information regarding an order that is placed over a network publication system. The data object is stored in the column of the table in the database. A request to access the data object is received from a device of a first user. The data object is transmitted to the device of the first user. The data is kept coherent during concurrent updates by using optimistic locks. The data is kept backward and forward compatible utilizing intermediate data structures common to both versions of the software. The data is kept searchable by using lookup indexes. The storage system is kept scalable by sharding data across many databases.
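The optimistic locking mentioned above can be sketched as a version check on write. The `OrderStore` class, its method names, and the order fields are illustrative assumptions, not the system described in the abstract.

```python
class VersionConflict(Exception):
    """Raised when a writer's expected version is stale."""

class OrderStore:
    def __init__(self):
        self._rows = {}  # order_id -> (version, data)

    def read(self, order_id):
        # Version 0 means the order does not exist yet.
        return self._rows.get(order_id, (0, None))

    def write(self, order_id, expected_version, data):
        current_version, _ = self.read(order_id)
        if current_version != expected_version:
            raise VersionConflict(
                f"expected v{expected_version}, found v{current_version}"
            )
        self._rows[order_id] = (current_version + 1, data)
        return current_version + 1

store = OrderStore()
store.write("o-1", 0, {"status": "placed"})

# Two clients read the same version, then both try to update.
v_a, _ = store.read("o-1")
v_b, _ = store.read("o-1")
store.write("o-1", v_a, {"status": "paid"})
try:
    store.write("o-1", v_b, {"status": "cancelled"})
    conflict = False
except VersionConflict:
    conflict = True  # the second writer must re-read and retry
```

No lock is held between read and write; the stale writer is simply rejected and is expected to retry against the new version.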
Data stream processing
Techniques for partitioning data from a data stream into batches and inferring schema for individual batches based on the field values of each batch are disclosed. The system may infer different schemas corresponding to different batches of data records even though the batches are received from a common data stream or a common data source. The system may infer a schema by determining whether a field contains single values or multiple values. Then the system determines the field type(s) associated with the values. These determinations are then stored in a dictionary generated for each batch.
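As a minimal sketch of per-batch schema inference, the code below records for each field whether it held multiple values and which value types appeared. The fixed batch size, the dictionary layout, and the sample records are assumptions made for the example.

```python
def infer_schema(batch):
    """Build a per-batch dictionary: field -> single/multi flag and observed types."""
    schema = {}
    for record in batch:
        for field, value in record.items():
            entry = schema.setdefault(field, {"multi": False, "types": set()})
            if isinstance(value, (list, tuple)):
                entry["multi"] = True
                entry["types"].update(type(v).__name__ for v in value)
            else:
                entry["types"].add(type(value).__name__)
    return schema

def batches(stream, size):
    """Partition a list of records into consecutive batches of `size`."""
    for i in range(0, len(stream), size):
        yield stream[i:i + size]

# Hypothetical records from one stream whose shape drifts over time.
stream = [
    {"id": 1, "tags": "a"},
    {"id": 2, "tags": "b"},
    {"id": "3", "tags": ["c", "d"]},
    {"id": 4, "tags": ["e"]},
]
schemas = [infer_schema(b) for b in batches(stream, 2)]
```

The two batches come from the same stream yet yield different schemas: `tags` is single-valued in the first and multi-valued in the second.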
MULTI-DIMENSIONAL DATA LABELING
Methods and systems for multi-dimensional data labeling. A structured data set having a plurality of rows is obtained, the structured data set comprising a set of data attributes, each data attribute having a data value for each of the plurality of rows of the structured data set. The structured data set is decomposed into a plurality of dimensions, each dimension defining a proper subset of the data attributes based on a coherence criterion. A dimension label is obtained for each dimension of at least a portion of the plurality of rows of the structured data set and the dimension labels for a given one of the rows of the structured data set are consolidated into at least one row label for the given one of the rows.
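One way to picture the decompose, label, and consolidate flow is the sketch below. The dimensions are fixed by hand rather than derived from a coherence criterion, and the attribute names, labeling rules, and "any dimension suspicious" consolidation rule are all hypothetical.

```python
# Hypothetical decomposition: each dimension is a proper subset of the attributes.
DIMENSIONS = {"payment": ["amount", "currency"], "location": ["country"]}

def label_payment(values):
    return "suspicious" if values["amount"] > 10_000 else "ok"

def label_location(values):
    return "suspicious" if values["country"] not in {"US", "DE"} else "ok"

LABELERS = {"payment": label_payment, "location": label_location}

def label_row(row):
    """Label each dimension separately, then consolidate into one row label."""
    dim_labels = {
        dim: LABELERS[dim]({attr: row[attr] for attr in attrs})
        for dim, attrs in DIMENSIONS.items()
    }
    # Assumed consolidation rule: any suspicious dimension marks the whole row.
    row_label = "suspicious" if "suspicious" in dim_labels.values() else "ok"
    return dim_labels, row_label

rows = [
    {"amount": 50, "currency": "USD", "country": "US"},
    {"amount": 20_000, "currency": "EUR", "country": "DE"},
]
results = [label_row(r) for r in rows]
```

Each labeler sees only the attributes of its own dimension, so the labeling task per dimension stays small even when the row has many attributes.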
DETERMINING DATA SUITABILITY FOR TRAINING MACHINE LEARNING MODELS
Technologies are provided for determining a suitability of data payloads for training a machine learning model. A schema can be generated based on sample data payloads that have different data formats. The sample data payloads (and/or additional data payloads) can be converted to a format that conforms to the schema. Feature vectors can then be generated based on the converted data payloads, and used to determine a suitability of the data payloads for training a machine learning model. If the data payloads are sufficiently suitable, the converted data payloads can be used to train the machine learning model. Otherwise, the schema may be annotated and new converted payloads may be generated based on the annotated schema. The feature vector generation and suitability analysis can then be repeated.