Patent classifications
G06F16/2458
DATA MODEL FOR MINING
This disclosure relates to managing data by an agent located within a mining operation. The data is stored as voxel data on a voxel net server. The server processes user input from a user controlling the agent within the mining operation and receives from the agent a request for voxel data associated with one or more voxels. The one or more voxels are a subset of voxels stored on the voxel net server and each of the one or more voxels is identified based on connections with voxels of previous requests. The server then queries a database representing the voxel net for the one or more voxels to retrieve associated voxel data based on the connections and returns the voxel data to the agent. Finally, the voxel data is displayed on a user device to the user.
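The connection-based lookup described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `VoxelNet` class, its in-memory dictionaries, and the payload fields are all hypothetical stand-ins for the voxel net server and its database.

```python
from typing import Dict, Iterable, List, Set


class VoxelNet:
    """Toy in-memory stand-in for the voxel net database (hypothetical)."""

    def __init__(self) -> None:
        self.data: Dict[str, dict] = {}        # voxel id -> voxel data
        self.links: Dict[str, Set[str]] = {}   # voxel id -> connected voxel ids

    def add(self, vid: str, payload: dict, neighbours: Iterable[str] = ()) -> None:
        """Store a voxel and record its (undirected) connections."""
        self.data[vid] = payload
        self.links.setdefault(vid, set())
        for n in neighbours:
            self.links[vid].add(n)
            self.links.setdefault(n, set()).add(vid)

    def query_connected(self, previous: List[str]) -> Dict[str, dict]:
        """Return data for voxels connected to previously requested voxels,
        skipping voxels the agent has already received."""
        seen = set(previous)
        wanted: Set[str] = set()
        for vid in previous:
            wanted |= self.links.get(vid, set()) - seen
        return {v: self.data[v] for v in sorted(wanted) if v in self.data}


net = VoxelNet()
net.add("a", {"grade": 1.2})
net.add("b", {"grade": 0.7}, neighbours=["a"])
net.add("c", {"grade": 2.1}, neighbours=["b"])
net.add("d", {"grade": 0.1})  # not connected to anything

print(net.query_connected(["a"]))
print(net.query_connected(["a", "b"]))
```

Only the subset of voxels reachable from earlier requests is returned, so the unconnected voxel `d` is never sent to the agent in this sketch.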
FUZZY LOGIC MODELING FOR DETECTION AND PRESENTMENT OF ANOMALOUS MESSAGING
Disclosed is an approach that applies a fuzzy logic model that may involve fuzzy-matching a plurality of address fields to determine a common physical address, and determining a number of communiques directed to that address with reference to a threshold that may determine an excessive number of communiques. The plurality of address fields may also be fuzzy-matched to information in a fraud-risk database which may comprise a fraud-risk address. One or more matches may be presented to a user who may adjust the views of the various matches, track various trends within the data, and harmonize the various address fields relating to a physical address.
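As a rough illustration of the fuzzy-matching and threshold steps, the sketch below groups address strings that look like the same physical address and flags groups that exceed a count threshold. The similarity measure (`difflib.SequenceMatcher`), the normalization rules, and the cutoff of 0.85 are assumptions chosen for the example; the abstract does not specify them.

```python
from difflib import SequenceMatcher


def normalize(addr: str) -> str:
    """Lowercase and strip punctuation so trivial variants compare equal."""
    cleaned = addr.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def similar(a: str, b: str, cutoff: float = 0.85) -> bool:
    """Fuzzy-match two address fields (cutoff is an assumed value)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= cutoff


def group_addresses(addresses, cutoff: float = 0.85):
    """Greedily group addresses; the first member of each group is its representative."""
    groups = []
    for addr in addresses:
        for group in groups:
            if similar(addr, group[0], cutoff):
                group.append(addr)
                break
        else:
            groups.append([addr])
    return groups


def excessive(groups, threshold: int):
    """Return groups whose number of communiques exceeds the threshold."""
    return [g for g in groups if len(g) > threshold]


communiques = ["12 Main St.", "12 main st", "12 Main Str", "99 Oak Ave"]
groups = group_addresses(communiques)
print(excessive(groups, threshold=2))
```

The same `similar` check could be run against entries of a fraud-risk address list; that lookup is omitted here.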
Service monitoring interface with an aggregate key performance indicator of a service and aspect key performance indicators of aspects of the service
A method is disclosed that includes receiving a request to display a service-monitoring user interface that illustrates performance of one or more services that are each provided by one or more entities. Each service is associated with a stored service definition that identifies the one or more entities, and each entity is associated with stored entity definition information that identifies machine data produced by or about the entity from one or more sources. The method further includes causing display of the service-monitoring user interface illustrating performance of each service via an aggregate key performance indicator (KPI) that characterizes a respective service as a whole, and a plurality of aspect KPIs that each characterize an aspect of an associated service. Each KPI is defined by a search query that produces a value derived from the machine data identified by the entity definition information, the value indicative of a measure of the service at a point in time or during a period of time. The machine data is produced by one or more components within an information technology environment and reflects activity within the information technology environment.
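The relationship between service definitions, entity machine data, and KPIs defined by search queries might be sketched as below. Everything here is hypothetical: the event fields, the service definition layout, and the three example queries are invented for illustration, and each "search query" is modeled as a plain Python function over the matching events.

```python
from statistics import mean

# Hypothetical machine data produced by entities in an IT environment.
EVENTS = [
    {"entity": "web-1", "latency_ms": 100, "status": 200},
    {"entity": "web-2", "latency_ms": 300, "status": 500},
    {"entity": "db-1", "latency_ms": 20, "status": 200},
    {"entity": "mail-1", "latency_ms": 50, "status": 200},
]

# Hypothetical stored service definition identifying the service's entities.
SERVICE = {"name": "checkout", "entities": {"web-1", "web-2", "db-1"}}

# Each KPI is defined by a query that reduces the service's events to one value.
KPI_QUERIES = {
    "aggregate_success_rate": lambda evs: sum(e["status"] < 500 for e in evs) / len(evs),
    "aspect_avg_latency_ms": lambda evs: mean(e["latency_ms"] for e in evs),
    "aspect_error_count": lambda evs: sum(e["status"] >= 500 for e in evs),
}


def evaluate_kpis(service, events, queries=KPI_QUERIES):
    """Run every KPI query over the machine data of the service's entities."""
    scoped = [e for e in events if e["entity"] in service["entities"]]
    return {name: query(scoped) for name, query in queries.items()}


print(evaluate_kpis(SERVICE, EVENTS))
```

In this sketch the aggregate KPI is just another query over the whole service; how an aggregate is actually derived is not stated in the abstract.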
System and method of selecting events or locations based on content
Systems and methods of returning location and/or event results using information mined from non-textual information are provided. Non-textual information is captured using a hardware component of a user device. Text-based social media content input on the user device is then retrieved. A location of the user device is determined using a global positioning system module in the user device. The non-textual information is converted to a machine-analyzable format, and the converted non-textual information is compared to a database of converted non-textual information samples to analyze and classify the converted non-textual information. The classification is sent to a server for storage in a database in a manner that ties the classification to the geographical location of the user device.
Background format optimization for enhanced queries in a distributed computing cluster
A format conversion engine for Apache Hadoop converts data from its original format to a database-like format at certain time points for use by a low latency (LL) query engine. The format conversion engine comprises a daemon that is installed on each data node in a Hadoop cluster. The daemon comprises a scheduler and a converter. The scheduler determines when to perform the format conversion and notifies the converter when the time comes. The converter converts data on the data node from its original format to the database-like format for use by the LL query engine.
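The scheduler and converter roles can be sketched in a few lines. This is only an illustration under stated assumptions: CSV text stands in for the original format, a column-oriented dictionary stands in for the database-like format, the scheduler uses a fixed interval, and none of the class or function names come from Hadoop or from the disclosure.

```python
import csv
import io


def to_columnar(csv_text: str) -> dict:
    """Converter: turn row-oriented CSV text into a column-oriented dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            columns[name].append(row[name])
    return columns


class Scheduler:
    """Decides when a conversion pass should run (fixed interval, assumed)."""

    def __init__(self, interval_s: float) -> None:
        self.interval_s = interval_s
        self.last_run = None

    def due(self, now: float) -> bool:
        return self.last_run is None or now - self.last_run >= self.interval_s

    def mark_run(self, now: float) -> None:
        self.last_run = now


class ConversionDaemon:
    """Per-node daemon: asks the scheduler, then converts unconverted files."""

    def __init__(self, scheduler: Scheduler) -> None:
        self.scheduler = scheduler
        self.converted = {}

    def tick(self, now: float, raw_files: dict) -> list:
        if not self.scheduler.due(now):
            return []
        done = []
        for name, text in raw_files.items():
            if name not in self.converted:
                self.converted[name] = to_columnar(text)
                done.append(name)
        self.scheduler.mark_run(now)
        return done


daemon = ConversionDaemon(Scheduler(interval_s=60))
print(daemon.tick(0, {"a.csv": "id,name\n1,x\n2,y\n"}))
print(daemon.converted["a.csv"])
```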
Distributed sequential pattern mining (SPM) using static task distribution strategy
Seed patterns are derived from a sequence database. Execution costs for types of seed patterns are computed. Each seed pattern is iteratively distributed to distributed nodes along with that seed pattern's assigned execution cost. The distributed nodes process the seed patterns in parallel to mine the sequence database for super patterns. When a distributed node exhausts its execution budget, any remaining mining needed for the seed pattern being mined is reallocated to another distributed node having remaining execution budget.
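The distribution and reallocation logic can be simulated without any actual mining. The sketch below is an assumption-laden toy: costs and budgets are abstract integer units, seeds are assigned round-robin, and leftover work goes to the node with the most remaining budget. The abstract does not specify any of these policies.

```python
def distribute(seed_costs: dict, budgets: dict) -> dict:
    """Plan which node spends how many cost units on which seed pattern.

    seed_costs: {seed_pattern: execution_cost}
    budgets:    {node: execution_budget}
    Returns     {node: [(seed_pattern, units_worked), ...]}
    """
    remaining = dict(budgets)
    plan = {node: [] for node in budgets}
    nodes = list(budgets)
    for i, (pattern, cost) in enumerate(seed_costs.items()):
        node = nodes[i % len(nodes)]  # static round-robin assignment (assumed)
        todo = cost
        while todo > 0:
            if remaining[node] == 0:
                # Budget exhausted: reallocate the rest to the node with
                # the most remaining budget (assumed policy).
                node = max(remaining, key=remaining.get)
                if remaining[node] == 0:
                    raise RuntimeError("total execution budget exhausted")
            units = min(todo, remaining[node])
            plan[node].append((pattern, units))
            remaining[node] -= units
            todo -= units
    return plan


print(distribute({"a": 3, "b": 1, "c": 4}, {"n1": 5, "n2": 4}))
```

Here seed `c` starts on `n1`, which runs out of budget after two units, and the remaining two units are finished on `n2`.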
Processing of sequencing data streams
This disclosure relates to methods and systems for processing of sequencing data streams. The system receives sequences from a sequencer and stores them as data records in a database. Each sequence is associated with a counter indicative of the number of times that sequence has been sequenced. The system progressively receives a further sequence as streaming data from the sequencer. While receiving the further sequence, the system matches the streaming data against the stored sequences to determine a matching score. Upon the matching score exceeding a matching threshold for one of the multiple sequences in the database, the system selects that sequence based on the matching score and stores the further sequence on non-volatile memory where the counter value associated with the selected sequence is below a saturation threshold. The system also terminates the receiving where the counter value is above the saturation threshold.
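The store-or-terminate decision can be sketched as follows. This is a simplified illustration: the matching score is a per-position identity fraction over the received prefix, `min_len` delays matching until enough data has arrived, a counter equal to the saturation threshold is treated as saturated, and a Python list stands in for non-volatile memory. All of these choices are assumptions, as are the function names.

```python
def prefix_score(read: str, ref: str) -> float:
    """Fraction of identical positions over the overlapping prefix."""
    n = min(len(read), len(ref))
    if n == 0:
        return 0.0
    return sum(a == b for a, b in zip(read, ref)) / n


def process_stream(chunks, counters, stored,
                   match_threshold=0.9, saturation=3, min_len=8):
    """Consume a read chunk by chunk and decide whether to keep it.

    counters: {stored_sequence: times_sequenced}
    stored:   list standing in for non-volatile storage
    """
    read = ""
    selected = None
    for chunk in chunks:
        read += chunk
        if selected is None and len(read) >= min_len:
            best = max(counters, key=lambda ref: prefix_score(read, ref), default=None)
            if best is not None and prefix_score(read, best) > match_threshold:
                selected = best
        if selected is not None and counters[selected] >= saturation:
            return "terminated", selected, read  # stop receiving early
    if selected is not None:
        stored.append(read)
        counters[selected] += 1
        return "stored", selected, read
    return "unmatched", None, read


counters = {"ACGTACGTAAAA": 3, "TTTTGGGGCCCC": 1}
stored = []
print(process_stream(["ACGT", "ACGT", "AAAA"], counters, stored))
print(process_stream(["TTTT", "GGGG", "CCCC"], counters, stored))
```

The first read matches a saturated sequence and is cut off after two chunks; the second matches an unsaturated sequence, is read to completion, and is stored.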
Adaptive data retrieval with runtime authorization
Methods and systems are disclosed for data retrieval, from databases to clients, in an environment requiring runtime authorization. In response to a request for T data records, a learning module provides a prediction R of a suitable number of data records to retrieve from a database. Following retrieval of R records or record identifiers, authorization is sought from an authorization service, resulting in A of the records being authorized. The A authorized records are returned to the requesting client, and, if more records are needed, T is decremented and the cycle is repeated. A performance notification is provided to the learning module for training, with respect to providing values of prediction R. The performance notification can be based on a measure of authorization service performance, the number A of authorized records, latency, communication or resource costs, a measure of resource congestion, or other parameters. Variants are disclosed.
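The request, predict, retrieve, authorize, and repeat cycle can be sketched as below. The `RatioLearner` is a deliberately simple stand-in for the learning module: it tracks a smoothed authorization rate and over-fetches accordingly. Its update rule, the 0.05 floor, and the use of A/R as the only performance signal are assumptions made for the example.

```python
import math


class RatioLearner:
    """Hypothetical learning module: predicts R from a smoothed authorization rate."""

    def __init__(self, rate: float = 1.0, alpha: float = 0.5) -> None:
        self.rate = rate
        self.alpha = alpha

    def predict(self, t: int) -> int:
        """Number of records R to retrieve in order to obtain about t authorized ones."""
        return max(1, math.ceil(t / max(self.rate, 0.05)))

    def notify(self, retrieved: int, authorized: int) -> None:
        """Performance notification: update the rate from the observed A / R."""
        observed = authorized / retrieved
        self.rate = (1 - self.alpha) * self.rate + self.alpha * observed


def fetch_authorized(t, records, is_authorized, learner):
    """Return up to t authorized records, retrieving in learner-sized batches."""
    out = []
    cursor = 0
    while t > 0 and cursor < len(records):
        r = learner.predict(t)
        batch = records[cursor:cursor + r]
        cursor += len(batch)
        authorized = [rec for rec in batch if is_authorized(rec)]
        learner.notify(len(batch), len(authorized))
        authorized = authorized[:t]   # never return more than requested
        out.extend(authorized)
        t -= len(authorized)          # decrement T and repeat if needed
    return out


learner = RatioLearner()
result = fetch_authorized(5, list(range(20)), lambda rec: rec % 2 == 0, learner)
print(result, round(learner.rate, 3))
```

With half of the records authorized, the learner starts by fetching exactly T records and then over-fetches on later rounds as its rate estimate drops.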
Systems and methods for accelerating exploratory statistical analysis
Embodiments of the invention utilize a “data canopy” that breaks statistical measures down to basic primitives for various data portions and stores the basic aggregates in a library within an in-memory data structure. When a queried statistical measure involves a basic aggregate stored in the library over a data portion that at least partially overlaps the data portion associated with the basic aggregate, the basic aggregate may be reused in the statistical computation of the queried measure.
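The reuse of basic aggregates can be illustrated with a small sketch. It assumes fixed-size chunks, chunk-aligned query ranges, and only two basic aggregates (sum and sum of squares); the class and method names are hypothetical and the structure is far simpler than what an actual data canopy would need.

```python
class DataCanopy:
    """Toy cache of per-chunk basic aggregates (sum and sum of squares)."""

    def __init__(self, data, chunk: int = 4) -> None:
        self.data = data
        self.chunk = chunk
        self.library = {}   # (chunk_index, aggregate_name) -> value
        self.computed = 0   # how many basic aggregates were actually computed

    def _aggregate(self, idx: int, name: str) -> float:
        key = (idx, name)
        if key not in self.library:
            part = self.data[idx * self.chunk:(idx + 1) * self.chunk]
            self.library[key] = sum(x * x for x in part) if name == "sumsq" else sum(part)
            self.computed += 1
        return self.library[key]

    def _total(self, lo: int, hi: int, name: str) -> float:
        if lo % self.chunk or hi % self.chunk:
            raise ValueError("this sketch only supports chunk-aligned ranges")
        return sum(self._aggregate(i, name)
                   for i in range(lo // self.chunk, hi // self.chunk))

    def mean(self, lo: int, hi: int) -> float:
        return self._total(lo, hi, "sum") / (hi - lo)

    def variance(self, lo: int, hi: int) -> float:
        n = hi - lo
        m = self._total(lo, hi, "sum") / n
        return self._total(lo, hi, "sumsq") / n - m * m


canopy = DataCanopy([1, 2, 3, 4, 5, 6, 7, 8], chunk=4)
print(canopy.mean(0, 8), canopy.computed)       # computes two chunk sums
print(canopy.variance(0, 8), canopy.computed)   # reuses sums, adds sums of squares
print(canopy.mean(0, 4), canopy.computed)       # overlapping range, nothing recomputed
```

The variance query reuses the chunk sums cached by the earlier mean query, and the final query over an overlapping range triggers no new computation.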
Framework and method for the automated determination of classes and anomaly detection methods for time series
Disclosed are a framework and method for selecting an anomaly detection method for each of a plurality of classes of time series based on characteristics of a time series example that represents an expected form of data. The method classifies a given time series into one of the known classes based on expected properties of the time series, filters the set of possible detection methods based on the time series class, evaluates the remaining detection methods on the given time series using a specific evaluation metric, and selects and returns a recommended anomaly detection method based on that metric.
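The classify, filter, evaluate, and select pipeline might look like the sketch below. The two classes, the classification rule, the two detectors, and the choice of F1 against known anomaly positions as the evaluation metric are all assumptions introduced for illustration; the abstract names none of them.

```python
from statistics import mean, pstdev


def classify(series):
    """Assumed rule: 'trending' if the two halves differ by more than one std dev."""
    half = len(series) // 2
    shift = abs(mean(series[half:]) - mean(series[:half]))
    return "stationary" if shift <= pstdev(series) else "trending"


def zscore_detector(series, k=3.0):
    """Flag points more than k population standard deviations from the mean."""
    mu, sigma = mean(series), pstdev(series)
    return {i for i, x in enumerate(series) if sigma > 0 and abs(x - mu) > k * sigma}


def jump_detector(series, k=10.0):
    """Flag points that differ from their predecessor by more than k."""
    return {i for i in range(1, len(series)) if abs(series[i] - series[i - 1]) > k}


# Detection method -> (classes it is suitable for, detector function).
METHODS = {
    "zscore": ({"stationary"}, zscore_detector),
    "jump": ({"stationary", "trending"}, jump_detector),
}


def f1(predicted, truth):
    """Evaluation metric: F1 score of predicted vs. known anomaly indices."""
    tp = len(predicted & truth)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(predicted), tp / len(truth)
    return 2 * precision * recall / (precision + recall)


def recommend(series, known_anomalies):
    """Classify, filter methods by class, score each, and return the best."""
    cls = classify(series)
    candidates = {n: fn for n, (classes, fn) in METHODS.items() if cls in classes}
    scores = {n: f1(fn(series), known_anomalies) for n, fn in candidates.items()}
    return cls, max(scores, key=scores.get), scores


example = [5, 5, 5, 5, 5, 20, 5, 5, 5, 5, 5, 5]
print(recommend(example, known_anomalies={5}))
```

On this example the jump detector also flags the point after the spike, so its F1 is lower and the z-score method is recommended.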