Patent classifications
G06F16/2264
High-dimensional data nearest-neighbor query method based on variable-length hash codes
A high-dimensional data nearest-neighbor query method based on variable-length hash codes is disclosed. Specifically, in this method, hash codes with the same code frequency are taken as a sub-data set, all the sub-data sets are ranked, a compression ratio is set for each sub-data set, the sub-data sets are compressed and trained according to the compression ratios, and the hash codes and original codes corresponding to the trained sub-data sets are obtained; the hash code of each trained sub-data set is copied to obtain multiple replicas, and the original codes and the corresponding replicas are concatenated to obtain concatenated hash codes, which are integrated to form a final nearest-neighbor query table; and a query code is obtained, and the nearest-neighbor query table is searched for a nearest-neighbor data set to complete the query. According to the invention, query efficiency and accuracy are greatly improved.
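The first step of the abstract above (grouping hash codes by code frequency and ranking the resulting sub-data sets) can be sketched as follows. This is a minimal illustration only: the function name `group_by_frequency`, the representation of hash codes as bit strings, and the descending ranking order are assumptions, and the compression, training, and replication steps are not reproduced.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def group_by_frequency(codes: List[str]) -> List[Tuple[int, List[str]]]:
    """Group hash codes that occur equally often, then rank the groups.

    Each group plays the role of one "sub-data set". Ranking by descending
    frequency is an assumption made for this sketch.
    """
    frequency = Counter(codes)  # how often each hash code occurs
    groups: Dict[int, List[str]] = defaultdict(list)
    for code, count in frequency.items():
        groups[count].append(code)
    # Sort groups by frequency, most frequent first; sort members for stable output.
    return sorted(
        ((count, sorted(members)) for count, members in groups.items()),
        reverse=True,
    )


example_codes = ["0101", "0101", "0101", "1100", "1100", "0011", "0011", "1111"]
ranked_sub_data_sets = group_by_frequency(example_codes)
```

In this example, `0101` forms its own sub-data set (frequency 3), while `0011` and `1100` share one (frequency 2).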
ENHANCED PLATFORM AND PROCESSES FOR SCALABILITY
A computer-implemented method of receiving and incrementally processing hierarchical data in a computing environment. The method includes receiving hierarchical data within a computing environment that includes a plurality of interrelated components, and incrementally processing the hierarchical data to obtain processed portions. The incremental processing of the portions of hierarchical data can be initiated without requiring receipt of the entirety of the hierarchical data. An indexed representation of previously processed portions of the hierarchical data is maintained to prevent unnecessarily processing the same portion of the hierarchical data again.
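One way to picture incremental processing with an index of already-processed portions is the sketch below. It assumes each portion arrives as a hypothetical `(path, payload)` pair and uses upper-casing as a stand-in for the real processing step; neither detail comes from the abstract.

```python
from typing import Dict, Iterable, Iterator, Tuple


def process_incrementally(
    portions: Iterable[Tuple[str, str]],
    index: Dict[str, str],
) -> Iterator[Tuple[str, str]]:
    """Process portions one at a time as they arrive.

    `index` maps the path of each processed portion to its result, so a
    portion seen before is skipped instead of being processed again.
    """
    for path, payload in portions:
        if path in index:
            continue  # already processed; avoid redundant work
        result = payload.upper()  # placeholder for the real processing step
        index[path] = result
        yield path, result


processed_index: Dict[str, str] = {}
incoming = iter([("/root", "a"), ("/root/child", "b"), ("/root", "a")])
results = list(process_incrementally(incoming, processed_index))
```

Because the function is a generator over an iterator, work starts on the first portion before later portions have been received.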
Generating multidimensional database queries
Techniques for generating a multidimensional database query are disclosed. A system receives a user-supplied natural language query and performs natural language processing to extract a literal from the natural language query. The system performs a lookup of the literal in one or more dictionary data structures associated with a multidimensional database, to determine that the literal is associated with a particular dimension of multiple dimensions in the multidimensional database. The system performs a lookup of the literal and the dimension in the one or more dictionary data structures, to determine that the literal is associated with a particular member of the dimension. The system generates a multidimensional database query to satisfy the user-supplied natural language query. The multidimensional database query includes a query clause that references the particular member of the dimension.
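The two dictionary lookups described above (literal to dimension, then literal plus dimension to member) can be sketched as follows. The dictionaries, the cube and measure names, and the MDX-style query text are all hypothetical, and the natural language processing step that extracts the literal is assumed to have already run.

```python
from typing import Dict, Optional, Tuple

# Hypothetical dictionary data structures for an example sales cube.
LITERAL_TO_DIMENSION: Dict[str, str] = {"california": "Region", "q1": "Time"}
MEMBERS: Dict[Tuple[str, str], str] = {
    ("california", "Region"): "[Region].[US].[California]",
    ("q1", "Time"): "[Time].[2023].[Q1]",
}


def resolve(literal: str) -> Optional[Tuple[str, str]]:
    """Look up the dimension for a literal, then the member within it."""
    key = literal.lower()
    dimension = LITERAL_TO_DIMENSION.get(key)
    if dimension is None:
        return None
    member = MEMBERS.get((key, dimension))
    if member is None:
        return None
    return dimension, member


def build_query(
    literal: str,
    measure: str = "[Measures].[Sales]",
    cube: str = "[SalesCube]",
) -> str:
    """Generate an MDX-style query whose WHERE clause references the member."""
    resolved = resolve(literal)
    if resolved is None:
        raise KeyError(f"no dimension member found for literal {literal!r}")
    _, member = resolved
    return f"SELECT {{{measure}}} ON COLUMNS FROM {cube} WHERE ({member})"


query = build_query("California")
```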
Query processing using hybrid inverted index of predicates
A query processing system generates and employs a hybrid inverted index of predicates for predicate statement evaluation. The query processing system converts a collection of predicate statements into two parts: a matrix and a set of reduced predicate statements. The query processing system then generates a hybrid inverted index that maps values for variables to predicates from the matrix and the reduced predicate statements that evaluate to true for corresponding values. When querying data, the query processing system performs a lookup on the hybrid inverted index to identify predicates from the matrix and reduced predicate statements that evaluate to true for values of variables for the data. The query processing system identifies predicate statements that evaluate to true by evaluating the matrix and reduced predicate statements while treating the predicates identified from the hybrid inverted index as true.
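The core idea of an inverted index over predicates can be sketched as below. This is a simplification under stated assumptions: it handles only equality predicates, represents each statement as an OR of AND-clauses, and does not reproduce the split into a matrix and reduced predicate statements. All names and example data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical equality predicates: id -> (variable, value).
PREDICATES: Dict[str, Tuple[str, str]] = {
    "p1": ("country", "US"),
    "p2": ("device", "mobile"),
    "p3": ("country", "CA"),
}

# Each statement is an OR of AND-clauses over predicate ids.
STATEMENTS: Dict[str, List[Set[str]]] = {
    "s1": [{"p1", "p2"}],    # p1 AND p2
    "s2": [{"p3"}, {"p2"}],  # p3 OR p2
}


def build_index(predicates: Dict[str, Tuple[str, str]]) -> Dict[Tuple[str, str], Set[str]]:
    """Map each (variable, value) pair to the predicates it makes true."""
    index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for predicate_id, (variable, value) in predicates.items():
        index[(variable, value)].add(predicate_id)
    return index


def matching_statements(
    record: Dict[str, str],
    index: Dict[Tuple[str, str], Set[str]],
    statements: Dict[str, List[Set[str]]],
) -> List[str]:
    """Return ids of statements that are true for the record."""
    true_predicates: Set[str] = set()
    for item in record.items():
        true_predicates |= index.get(item, set())  # one lookup per variable
    # A statement holds if every predicate of at least one clause is true.
    return sorted(
        statement_id
        for statement_id, clauses in statements.items()
        if any(clause <= true_predicates for clause in clauses)
    )


INDEX = build_index(PREDICATES)
hits = matching_statements({"country": "US", "device": "mobile"}, INDEX, STATEMENTS)
```

The lookup cost depends on the number of variables in the record rather than on the number of predicates, which is the usual motivation for indexing predicates this way.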
Visualizing sparse multi-dimensional data
A computer-implemented method, system and computer program product for visualizing sparse multi-dimensional data. A multi-dimensional dataset (“dataset”) is converted into a three-dimensional architecture and the remaining dimensions, if any, are arranged into one or more planes. The sparse numeric data of the dataset is converted into multiple planes by partitioning the three-dimensional architecture along the sparsest dimension and aligning the remaining two dimensions as two-dimensional planes. Colors or shades of colors are assigned to these planes based on the density quantum of the data present in the planes. Furthermore, planes of the dataset are constructed using the assigned colors or shades of color and the defined opacity values of the planes. The constructed planes are mapped to the dataset in the form of one or more cubes and possibly two-dimensional planes, where the darkest and least translucent sections of the dataset are positioned in the center of the cube or cubes.
Systems and methods for dynamic computer aided innovation via multidimensional complementary difference recommendation and exploration
Systems and methods for dynamic computer aided innovation via multidimensional complementary difference recommendation and exploration are disclosed. They include categorizing a first and second data element in a database with a first attribute and second attribute, respectively, of a first dimension, a dimension being an aspect of a situation, problem, or thing. The first and second data elements are also categorized with a first attribute and a second attribute of a second dimension, the second dimension being different from the first dimension. The systems and methods further include analyzing the first and second attribute of the first dimension and the first and second attribute of the second dimension to determine a ratio of similarity and dissimilarity; calculating a composite score of the ratio of the first dimension and the ratio of the second dimension; and generating and storing a link between the first and second data element when the composite score is within numerical limits.
MACHINE LEARNING TECHNIQUES FOR PREDICTIVE STRUCTURAL ANALYSIS
Various embodiments of the present invention provide methods, apparatus, systems, computing devices, computing entities, and/or the like for performing predictive structural analysis. Certain embodiments of the present invention utilize systems, methods, and computer program products that perform predictive structural analysis using at least one of table column classification machine learning models, table column clustering machine learning models, structural variance generation machine learning models, and emergence report generation machine learning models.
Systems and Methods for Natural Language Querying
Systems and methods for natural language querying in accordance with embodiments of the invention are illustrated. One embodiment includes a data visualization system, including a processor, and a memory, the memory including a core grammar library, comprising a list of regular expression-system function pairs, and a natural language query (NLQ) application, where the NLQ application configures the processor to obtain a database from a user, obtain an NLQ directed at the database, parse the NLQ using the core grammar library to identify a system function and a set of one or more parameters, and perform the system function using the set of one or more parameters to visualize at least a portion of the database.
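A core grammar library made of regular expression and system function pairs can be sketched as follows. The patterns, the function names `top_n` and `total`, and the row format are hypothetical, and the functions return data instead of rendering a visualization.

```python
import re
from typing import Any, Callable, Dict, List, Pattern, Tuple

Row = Dict[str, float]


def top_n(rows: List[Row], column: str, n: str) -> List[Row]:
    """Return the n rows with the largest value in the given column."""
    return sorted(rows, key=lambda row: row[column], reverse=True)[: int(n)]


def total(rows: List[Row], column: str) -> float:
    """Return the sum of the given column."""
    return sum(row[column] for row in rows)


# The "core grammar library": each regular expression is paired with a function.
# Named groups become the parameters passed to that function.
CORE_GRAMMAR: List[Tuple[Pattern[str], Callable[..., Any]]] = [
    (re.compile(r"top (?P<n>\d+) by (?P<column>\w+)", re.IGNORECASE), top_n),
    (re.compile(r"total (?P<column>\w+)", re.IGNORECASE), total),
]


def run_nlq(nlq: str, rows: List[Row]) -> Any:
    """Parse the query with the grammar and run the first matching function."""
    for pattern, function in CORE_GRAMMAR:
        match = pattern.search(nlq)
        if match:
            return function(rows, **match.groupdict())
    raise ValueError(f"no grammar rule matches query: {nlq!r}")


sales_rows = [{"sales": 3.0}, {"sales": 10.0}, {"sales": 7.0}]
best = run_nlq("Show the top 2 by sales", sales_rows)
```

Keeping the grammar as data means new query shapes can be supported by appending a pair to the list without changing the parser.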
SYSTEMS AND METHODS FOR GENERATING A FILTERED DATA SET
The present disclosure relates to generating a filtered data set. Data from a plurality of systems of record of a plurality of data source providers may be accessed. A master data set generated using the data accessed from the plurality of systems of record may be maintained. Restriction policies including one or more rules for restricting sharing of data may be maintained. A filtered data set may be generated for a data source provider responsive to an application of restriction policies of other data source providers to the master data set. The filtered data set may be provisioned.
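A minimal sketch of applying other providers' restriction policies to a master data set is shown below. It assumes each record carries a hypothetical `provider` field naming its owner and that a policy is simply a set of field names the owner does not share; real restriction rules could be far richer.

```python
from typing import Any, Dict, List, Set

Record = Dict[str, Any]


def filtered_data_set(
    master: List[Record],
    policies: Dict[str, Set[str]],
    recipient: str,
) -> List[Record]:
    """Build the view of the master data set for one data source provider.

    Records owned by the recipient are passed through unchanged. For every
    other record, the fields restricted by the owning provider are removed.
    """
    filtered: List[Record] = []
    for record in master:
        owner = record["provider"]
        hidden = set() if owner == recipient else policies.get(owner, set())
        filtered.append({k: v for k, v in record.items() if k not in hidden})
    return filtered


master_data = [
    {"provider": "acme", "contact": "Ann", "email": "ann@example.com"},
    {"provider": "globex", "contact": "Bob", "email": "bob@example.com"},
]
restriction_policies = {"acme": {"email"}, "globex": {"contact", "email"}}
view_for_acme = filtered_data_set(master_data, restriction_policies, "acme")
```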
SYSTEMS AND METHODS FOR UPDATING RECORD OBJECTS OF A SYSTEM OF RECORD
The present disclosure relates to generating performance profiles of member nodes. A plurality of electronic activities can be accessed. A subset of electronic activities from the plurality of electronic activities can be identified. The subset of electronic activities can be parsed to identify participants of the electronic activities. A second node profile can be accessed for each participant. Participant types can be identified from each second node profile. A distribution of the subset of electronic activities can be determined. A performance profile can be generated.