G06F16/2264

Index and storage management for multi-tiered databases
11494359 · 2022-11-08

Disclosed herein are system, method, and computer program product embodiments for providing index and storage management for multi-tiered databases. An embodiment operates by receiving a request to create an index on a multi-tiered database including both an in-memory store and a disk store. A multi-store table associated with the index is determined, wherein the multi-store table includes both a first set of data stored on the memory store and a second set of data stored on the disk store. Either the first set of data or the second set of data on which to create the index is selected based on the request. The index for the selected set of data of the multi-store table is generated. The index is stored on either the disk store or the memory store as corresponding to the selected set of data for which the index was generated.
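The flow in this abstract (pick one tier of a multi-store table, build the index over that tier only, and keep the index associated with that tier) can be illustrated with a minimal sketch. The class `MultiStoreTable`, the function `create_index`, and the tier names `"memory"` and `"disk"` are hypothetical and are not taken from the patent; both tiers are modeled as in-process lists purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MultiStoreTable:
    """Hypothetical table whose rows are split across two tiers."""
    memory_rows: list = field(default_factory=list)
    disk_rows: list = field(default_factory=list)
    indexes: dict = field(default_factory=dict)  # (tier, column) -> index


def create_index(table, column, tier):
    """Build a value -> row-positions index over one tier only."""
    if tier not in ("memory", "disk"):
        raise ValueError(f"unknown tier: {tier!r}")
    # Select the set of data named in the request.
    rows = table.memory_rows if tier == "memory" else table.disk_rows
    index = {}
    for position, row in enumerate(rows):
        index.setdefault(row[column], []).append(position)
    # Record the index under the tier it was generated for.
    table.indexes[(tier, column)] = index
    return index
```

A request that names the memory tier leaves the disk-resident rows unindexed, which is the selective behavior the abstract describes.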

System and method for determining exact location results using hash encoding of multi-dimensioned data
11573942 · 2023-02-07

Aspects of the present invention are directed to systems and methods for optimizing identification of locations within a search area using hash values. A hash value represents location information in a single-dimension format. Computing points around some location includes calculating an identification boundary that surrounds the location of interest based on the location's hash value. The identification boundary is expanded until it exceeds a search area defined by the location and a distance. Points around the location can be identified based on having associated hash values that fall within the identification boundary. Hashing operations let a system reduce the geometric work (i.e., searching inside boundaries) and processing required by computing straightforward operations on hash quantities (e.g., searching a linear range of geohashes) instead of, for example, point-to-point comparisons.
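The idea of growing a boundary around a location's hash until it covers the search area, then scanning one linear hash range, can be sketched as follows. This is an illustration under stated assumptions, not the patented method: it substitutes a simple Z-order (Morton) bit interleaving on an integer grid for a real geohash, and the names `BITS`, `morton`, `covering_range`, and `points_near` are hypothetical.

```python
import bisect

BITS = 8  # assumed grid resolution: coordinates are integers in [0, 256)


def morton(x, y):
    """Interleave the bits of x and y into one integer (a Z-order hash)."""
    h = 0
    for i in range(BITS):
        h |= ((x >> i) & 1) << (2 * i)
        h |= ((y >> i) & 1) << (2 * i + 1)
    return h


def covering_range(x, y, dist):
    """Grow the aligned cell around (x, y) until it contains the search square."""
    top = (1 << BITS) - 1
    lo_x, hi_x = max(x - dist, 0), min(x + dist, top)
    lo_y, hi_y = max(y - dist, 0), min(y + dist, top)
    for level in range(BITS + 1):
        size = 1 << level
        cx, cy = (x >> level) << level, (y >> level) << level
        if cx <= lo_x and hi_x < cx + size and cy <= lo_y and hi_y < cy + size:
            start = morton(cx, cy)
            # An aligned cell maps to one contiguous, half-open hash range.
            return start, start + size * size
    raise AssertionError("unreachable: the top-level cell covers the grid")


def points_near(points, x, y, dist):
    """Return points within dist of (x, y) using one linear hash-range scan."""
    hashed = sorted((morton(px, py), (px, py)) for px, py in points)
    keys = [h for h, _ in hashed]
    start, end = covering_range(x, y, dist)
    lo, hi = bisect.bisect_left(keys, start), bisect.bisect_left(keys, end)
    # Exact filter: the covering cell is larger than the search circle.
    return [p for _, p in hashed[lo:hi]
            if (p[0] - x) ** 2 + (p[1] - y) ** 2 <= dist ** 2]
```

A single covering cell can grow large when the location sits near a cell edge, so a production design would more likely combine several smaller neighboring cells; the single-cell version is kept here only because it makes the linear range scan easy to see.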

Object Scriptability

Object scriptability includes receiving a high-level language script describing at least one data-analysis object, including a node representing the data-analysis object in a graph-based data structure including a plurality of nodes, where each node from the plurality of nodes represents a respective data-analysis object in a data analysis system, where each node from the plurality of nodes is connected to at least one other node from the plurality of nodes by an edge, and where the edge represents a relationship between the respective objects in the data analysis system.

Systems and methods for removing identifiable information

Systems and methods for censoring text characters in text-based data are provided. In some embodiments, an artificial intelligence system may be configured to receive text-based data and store the text-based data in a database. The artificial intelligence system may be configured to receive a list of target pattern types identifying sensitive data and receive censorship rules for the target pattern types determining target pattern types requiring censorship. The artificial intelligence system may be configured to assemble a computer-based model related to a received target pattern type in the list of target pattern types. The artificial intelligence system may be configured to use a computer-based model to identify a target data pattern corresponding to the received target pattern type within the text-based data, identify target characters within the target data pattern, and to assign an identification token to the target characters.
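The rule-driven replacement of sensitive patterns with identification tokens can be sketched as below. The abstract refers to a computer-based model per pattern type; this sketch substitutes simplified regular expressions for such a model, and the names `PATTERNS` and `censor`, the two pattern types, and the token format are all assumptions made for illustration.

```python
import re

# Hypothetical pattern types; these regexes are simplified stand-ins
# for the per-pattern models described in the abstract.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}


def censor(text, rules):
    """Replace each match of a pattern type enabled in rules with a token."""
    tokens = {}  # token -> original characters
    for kind, pattern in PATTERNS.items():
        if not rules.get(kind, False):
            continue  # this pattern type does not require censorship

        def replace(match, kind=kind):
            token = f"<{kind.upper()}_{len(tokens) + 1}>"
            tokens[token] = match.group(0)
            return token

        text = pattern.sub(replace, text)
    return text, tokens
```

Returning the token map alongside the censored text keeps the replacement reversible for authorized callers; whether the patented system retains such a mapping is not stated in the abstract.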

Efficient and scalable storage of sparse tensors

In a system for storing in memory a tensor that includes at least three modes, elements of the tensor are stored in a mode-based order for improving locality of references when the elements are accessed during an operation on the tensor. To facilitate efficient data reuse in a tensor transform that includes several iterations, on a tensor that includes at least three modes, a system performs a first iteration that includes a first operation on the tensor to obtain a first intermediate result, and the first intermediate result includes a first intermediate-tensor. The first intermediate result is stored in memory, and a second iteration is performed in which a second operation on the first intermediate result accessed from the memory is performed, so as to avoid a third operation, that would be required if the first intermediate result were not accessed from the memory.
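Both ideas in this abstract, storing nonzero elements in a mode-based order and reusing a stored intermediate result across iterations, can be shown with a small coordinate-format sketch. The representation (a list of `(index_tuple, value)` pairs) and the helper names `sort_by_modes` and `contract_mode` are assumptions chosen for illustration, not the storage format claimed in the patent.

```python
def sort_by_modes(entries, mode_order):
    """Sort (index_tuple, value) pairs so mode_order[0] varies slowest."""
    return sorted(entries, key=lambda e: tuple(e[0][m] for m in mode_order))


def contract_mode(entries, mode, vector):
    """Multiply the tensor by a vector along one mode, dropping that mode."""
    out = {}
    for idx, val in entries:
        rest = idx[:mode] + idx[mode + 1:]
        out[rest] = out.get(rest, 0.0) + val * vector[idx[mode]]
    return list(out.items())


# Three nonzeros of a 3-mode tensor, stored in mode order (0, 1, 2).
tensor = sort_by_modes(
    [((1, 0, 2), 3.0), ((0, 1, 0), 2.0), ((0, 0, 1), 1.0)], (0, 1, 2)
)

# First operation: contract mode 2 once and keep the intermediate tensor.
intermediate = contract_mode(tensor, 2, [1.0, 10.0, 100.0])

# Later operations reuse the stored intermediate instead of recomputing it.
result_a = dict(contract_mode(intermediate, 1, [1.0, 2.0]))
result_b = dict(contract_mode(intermediate, 1, [0.0, 1.0]))
```

Here `result_a` and `result_b` both start from `intermediate`, so the mode-2 contraction over the original tensor is performed once rather than once per downstream operation.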

SYSTEMS AND METHODS FOR PERFORMING DATA PROCESSING OPERATIONS USING VARIABLE LEVEL PARALLELISM
20230093911 · 2023-03-30

Techniques for determining processing layouts for nodes of a dataflow graph. The techniques include: obtaining information specifying a dataflow graph, the dataflow graph comprising a plurality of nodes and a plurality of edges connecting the plurality of nodes, the plurality of edges representing flows of data among nodes in the plurality of nodes, the plurality of nodes comprising: a first set of one or more nodes; and a second set of one or more nodes disjoint from the first set of nodes; obtaining a first set of one or more processing layouts for the first set of nodes; and determining a processing layout for each node in the second set of nodes based on the first set of processing layouts and one or more layout determination rules, the one or more layout determination rules including at least one rule for selecting among processing layouts having different degrees of parallelism, and information indicating that data generated by at least one node in the first and/or third set of nodes is not used by any nodes in the dataflow graph downstream from the at least one node.
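A minimal sketch of propagating layouts from nodes with known layouts to the remaining nodes is given below. It models a processing layout as a single integer degree of parallelism and assumes one rule, prefer the higher degree when inputs disagree; that rule, the single forward pass, and the name `assign_layouts` are illustrative assumptions, not the rules claimed in the application.

```python
from graphlib import TopologicalSorter


def assign_layouts(predecessors, known):
    """Fill in a degree of parallelism for every node not in known.

    predecessors maps each node to the nodes that feed it; known maps the
    first set of nodes to their already-obtained layouts.
    """
    layouts = dict(known)
    for node in TopologicalSorter(predecessors).static_order():
        if node in layouts:
            continue
        candidates = [layouts[p] for p in predecessors.get(node, ()) if p in layouts]
        if not candidates:
            raise ValueError(f"no layout can be inferred for {node!r}")
        # Assumed rule for choosing among different degrees of parallelism.
        layouts[node] = max(candidates)
    return layouts
```

For a graph in which `read_a` (parallelism 4) feeds `filter`, and `filter` and `read_b` (parallelism 1) feed `join`, this assigns 4 to `filter` and, by the assumed rule, 4 to `join`.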

SYSTEMS AND METHODS FOR DETERMINING THE SHAREABILITY OF VALUES OF NODE PROFILES
20230031801 · 2023-02-02

The present disclosure relates to determining the shareability of values of node profiles. Record objects and electronic activities of a system of record corresponding to a data source provider may be accessed. Each record object may correspond to a record object type and have one or more object field-value pairs. Node profiles may be maintained. Values of fields corresponding to a predetermined type of field including fewer than a predetermined threshold number of data source providers may be identified. A restriction tag used to restrict populating other node profiles may be generated. Provision of the value with a second data source provider may be restricted.

RULE EVALUATION FOR REAL-TIME DATA STREAM

Methods, systems, apparatuses, and computer program products are described. A computing device may receive a user input indicating a data stream, a data metric configured for a tenant of a multi-tenant database system, a rule associated with the data metric, a trigger based on the data metric, or some combination thereof. The computing device may receive, from the data stream, a real-time data stream including information corresponding to the data metric configured for the tenant, where the real-time data stream may be associated with a first user profile stored at the multi-tenant database system. The computing device may evaluate the rule, the trigger, or both based on ingesting the data stream and may perform an action based on the evaluation. Performing the action may involve sending a message to a user device associated with the first user profile in response to at least a portion of the data stream.
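The evaluate-then-act loop described here can be sketched with a simple threshold rule applied to each event as it arrives. The event shape (a dict with a `"user"` key and metric fields), the threshold rule, and the names `evaluate_stream` and `notify` are hypothetical; the abstract does not specify the form of the rule or the messaging mechanism.

```python
def evaluate_stream(events, metric, threshold, notify):
    """Call notify for each event whose metric value exceeds threshold.

    events is any iterable of dicts, so it can be a live generator;
    notify(user, message) stands in for sending a message to a device.
    """
    fired = 0
    for event in events:
        value = event.get(metric)
        if value is not None and value > threshold:
            notify(event["user"], f"{metric}={value} exceeded {threshold}")
            fired += 1
    return fired


sent = []
count = evaluate_stream(
    [{"user": "u1", "latency_ms": 120}, {"user": "u1", "latency_ms": 480}],
    metric="latency_ms",
    threshold=300,
    notify=lambda user, message: sent.append((user, message)),
)
```

Because events are consumed one at a time, the rule is evaluated on ingestion rather than after the stream has been stored.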

Enumeration of trees from finite number of nodes

Embodiments of methods, apparatuses, devices and/or systems for manipulating hierarchical sets of data are disclosed.

Effective user modeling with time-aware based binary hashing

In one embodiment, a computer-implemented method includes acquiring sequential user behavior data including one-dimensional data. The user behavior data is associated with a user. The method includes abstracting features from the sequential user behavior data to cover short-term and long-term timeframes. The method includes determining one or more properties of the user based on the features.