Patent classifications
G06F16/2462
System and method for question answering with derived glossary clusters
A method, system, and computer-usable medium are disclosed for answering general background questions on a topic from documents with glossary sections. A set of documents with glossaries is received, from which a set of terms and associated glossary entries is extracted, where each term has a corresponding glossary entry. Related glossary entries are associated with one another. The association is based on a similarity algorithm and forms glossary clusters, where each glossary cluster refers to one or more glossary entries. A query with query terms tailored to general information is received. The glossary clusters are ranked by relevance to the query terms to form a ranked set. A set of glossary clusters meeting a high-rank threshold is selected and provided.
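The cluster-then-rank flow described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not name the similarity algorithm, so Jaccard overlap of word tokens is assumed here, and the function names, the greedy clustering strategy, and both thresholds are hypothetical.

```python
import re

def tokens(text):
    """Lowercase alphanumeric word tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_entries(glossary, threshold=0.2):
    """Greedy one-pass clustering (an assumption): an entry joins the first
    cluster whose seed entry is at least `threshold` similar, else starts one."""
    clusters = []  # each cluster is a list of terms; cluster[0] is its seed
    for term, entry in glossary.items():
        entry_tokens = tokens(term + " " + entry)
        for cluster in clusters:
            seed = cluster[0]
            seed_tokens = tokens(seed + " " + glossary[seed])
            if jaccard(entry_tokens, seed_tokens) >= threshold:
                cluster.append(term)
                break
        else:
            clusters.append([term])
    return clusters

def rank_clusters(clusters, glossary, query, min_score=0.1):
    """Score each cluster against the query, sort by score (highest first),
    and keep only clusters that meet the `min_score` threshold."""
    query_tokens = tokens(query)
    scored = []
    for cluster in clusters:
        text = " ".join(t + " " + glossary[t] for t in cluster)
        scored.append((jaccard(query_tokens, tokens(text)), cluster))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(score, cluster) for score, cluster in scored if score >= min_score]
```

A real system would likely remove stop words or use a weighted measure such as TF-IDF; plain token overlap is used here only to keep the sketch self-contained.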
DETERMINATION OF CANDIDATE FEATURES FOR DEVIATION ANALYSIS
Systems and methods include determination, for each of a plurality of discrete features, of statistics for each discrete value of the discrete feature based on values of a continuous feature associated with the discrete value; determination, for each discrete feature, of first summary statistics based on the statistics determined for each discrete value of the discrete feature; determination, for each discrete feature, of a dissimilarity based on the first summary statistics determined for the discrete feature and on the statistics determined for each discrete value of the discrete feature; determination of candidate discrete features of the discrete features based on the determined dissimilarities, the candidate discrete features comprising less than all of the discrete features; determination, for each of the candidate discrete features, of second summary statistics based on values of the continuous feature associated with each discrete value of the candidate discrete feature; determination of a deviation score for each of the candidate discrete features based on the second summary statistics; and presentation of the candidate discrete features based on the determined deviation scores.
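The two-stage selection in this abstract (a cheap dissimilarity to pick candidates, then a deviation score on the candidates only) can be sketched as below. The abstract does not say which statistics are used, so every choice here is an assumption: per-value means as the statistics, the mean of those means as the first summary statistic, the largest absolute gap as the dissimilarity, and a gap scaled by the population standard deviation as the deviation score. All function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

def value_means(rows, feature, target):
    """Mean of the continuous `target` for each discrete value of `feature`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row[target])
    return {value: mean(vals) for value, vals in groups.items()}

def dissimilarity(means):
    """Largest gap between a per-value mean and the mean of all per-value means."""
    center = mean(list(means.values()))
    return max(abs(m - center) for m in means.values())

def candidate_features(rows, features, target, keep=1):
    """Keep only the `keep` features with the highest dissimilarity."""
    scored = {f: dissimilarity(value_means(rows, f, target)) for f in features}
    return sorted(scored, key=scored.get, reverse=True)[:keep]

def deviation_score(rows, feature, target):
    """Largest per-value gap from the overall mean, in units of the
    overall population standard deviation (0.0 if the target is constant)."""
    values = [row[target] for row in rows]
    spread = pstdev(values)
    if spread == 0:
        return 0.0
    overall = mean(values)
    means = value_means(rows, feature, target)
    return max(abs(m - overall) for m in means.values()) / spread
```

Running the cheaper dissimilarity first means the second pass only touches a subset of the features, which matches the abstract's statement that the candidates comprise less than all of the discrete features.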
SYSTEMS, METHODS, AND COMPUTER READABLE MEDIA FOR DATA AUGMENTATION
Systems, methods, and computer readable media for data augmentation are described. The system comprises a network device, a memory comprising a data augmentation model and a plurality of seed entries, and a processor in communication with the network device and the memory. The processor is configured to receive a candidate data item in a second data set, generate a candidate seed corresponding to the candidate data item, and determine a data feature, based on the data augmentation model, for the candidate seed. Additionally, the processor is configured to generate at least one matching seed in the plurality of seed entries, the at least one matching seed based on the data feature. The processor is further configured to augment the candidate data item with data corresponding to the at least one matching seed.
Real-time server capacity optimization tool using maximum predicted value of resource utilization determined based on historical data and confidence interval
A system includes a server associated with a resource utilization, a database storing historical data including resource utilization values over a first time period, and a processor. The processor identifies, from the historical data, a maximum resource utilization value and determines a duration of time for which the resource utilization exceeds a percentage of the maximum. The processor predicts, based on the historical data, a maximum predicted resource utilization value over a second time period, later than the first. The processor also determines, based on the historical data, an upper bound of a resource utilization confidence interval. The processor generates, based on the maximum value over the first time period, the duration of time, the maximum predicted value over the second time period, and the upper bound, a recommendation to consolidate the server with a second server and/or to release computational resources. The processor transmits the recommendation to an administrator.
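The four quantities named in this abstract can be computed from a utilization history as sketched below. The abstract specifies neither the prediction model nor the decision rule, so this sketch assumes a least-squares linear trend for the predicted maximum, a normal-approximation bound (mean plus `z` standard deviations) for the upper bound, and a hypothetical `low_water` threshold for the recommendation. The duration is reported as a count of samples and is not used by the illustrative rule.

```python
from statistics import mean, stdev

def summarize(history, pct=0.8, horizon=4, z=1.96):
    """Summarize utilization samples (percent) taken at equal intervals."""
    peak = max(history)
    # Duration, in samples, for which utilization exceeds pct of the peak.
    samples_above = sum(1 for u in history if u > pct * peak)
    # Least-squares linear trend over sample index 0..n-1 (an assumption).
    n = len(history)
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(history)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    # Maximum predicted value over the next `horizon` samples.
    predicted_peak = max(intercept + slope * x for x in range(n, n + horizon))
    # Upper bound of a normal-approximation interval around the mean.
    upper_bound = y_bar + z * stdev(history)
    return {
        "peak": peak,
        "samples_above": samples_above,
        "predicted_peak": predicted_peak,
        "upper_bound": upper_bound,
    }

def recommend(summary, low_water=50.0):
    """Hypothetical rule: consolidate only if every peak estimate is low."""
    estimates = (summary["peak"], summary["predicted_peak"], summary["upper_bound"])
    return "consolidate" if max(estimates) < low_water else "keep"
```

`summarize` needs at least two samples, since both the trend fit and `stdev` are undefined for a single point.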
Machine learned chart recommendation system
Systems and methods are disclosed to implement a chart recommendation system that recommends charts to users during a chart building process. In embodiments, when a new chart is being created, specified features of the chart are provided to a machine learned model such as a self-organizing map. The model will determine a previous chart that is the most similar to the new chart and recommend the previous chart to the user for recreation. In embodiments, newly created charts are added to a library and used to update the model. Charts that are highly popular or authored by expert users may be weighed more heavily during model updates, so that the model will be more influenced by these charts. Advantageously, the disclosed system allows novice users to easily find similar charts created by other users. Additionally, the disclosed system is able to automatically group similar charts without using human-defined classification rules.
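The "find the most similar previous chart" step can be illustrated with a plain nearest-neighbor lookup. This is a deliberately simpler stand-in, not a self-organizing map: charts are encoded as binary feature vectors and compared by cosine similarity. The chart dictionaries, the feature vocabulary, and the function names are all hypothetical.

```python
import math

def encode(chart, vocabulary):
    """Binary vector: 1.0 where the chart has the vocabulary feature."""
    return [1.0 if feature in chart["features"] else 0.0 for feature in vocabulary]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend_chart(new_chart, library, vocabulary):
    """Return the library chart whose feature vector is closest to the new chart."""
    target = encode(new_chart, vocabulary)
    return max(library, key=lambda chart: cosine(target, encode(chart, vocabulary)))
```

The weighting of popular or expert-authored charts mentioned in the abstract applies to model updates and is not represented in this lookup.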
System and method for unifying heterogenous datasets using primitives
In one embodiment, example systems and methods related to a manner of unifying heterogeneous datasets are provided. Multiple heterogeneous datasets containing traffic or driving data are collected. The records of the datasets are combined, and the records in the combined dataset are ordered into a plurality of time series based on timestamps associated with each record. A Bayesian learning method, such as hidden Markov models, is used to identify traffic primitives in the datasets. Each traffic primitive may include several consecutive records in the combined dataset and may correspond to particular driving actions such as turning left or right, stopping, accelerating, etc. The traffic primitives are used to create a traffic primitive index that can be queried by users or researchers for specific records. These records can be used to train or test one or more learning-based algorithms. In addition, the combined dataset can be further divided into tables corresponding to particular sensors, allowing the users or researchers to query for specific traffic primitive and sensor combinations.
Document Search Support Device
A device to support work of searching document data for interpreting an information analysis result of analysis data obtained by analyzing a sample containing an analyte, includes: an acquisition unit to acquire first information for identifying the analyte from the analysis data; a reception unit to receive input of second information for searching data of a document for interpreting the information analysis result of the analysis data; an extraction unit to extract, based on the first and second information, terms relevant to the information analysis result, from among terms in data of documents in a database; a calculation unit to calculate, for each relevant term, relevance scores indicating a relevance degree between the relevant term and the first information, and a relevance degree between the relevant term and the second information; and a processing unit to obtain an index value of statistical likelihood from the relevance scores.
Automated plan upgrade system for backing services
Embodiments allow automated provisioning of a plan upgrade for databases hosted in storage environments. A database is hosted in a shared storage environment according to an existing plan, based upon consumption of available system resources (e.g., processing, I/O, memory, disk). An agent periodically issues requests for information relevant to database behavior (e.g., performance metrics, query logs, and/or knob settings). The agent collects the received information (e.g., via a domain socket), performing analysis thereon to predict whether future database activity is expected to remain within the existing plan. Such analysis can include but is not limited to compiling statistics, and calculating values such as entropy, information divergence, and/or adjusted settings for database knobs. Based upon this analysis, the agent communicates a recommendation including a plan update and supporting statistics. Embodiments can reduce the effort/cost of the database administrator in having to manually predict future estimated database resource consumption and generate a plan update.
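Two of the values this abstract mentions, entropy and information divergence, can be computed over a query log as sketched below. The abstract does not say what they are computed over, so this sketch assumes a distribution of query types, Shannon entropy in bits, and Kullback-Leibler divergence between a recent window and a baseline. The function names and the `eps` floor for unseen query types are assumptions.

```python
import math
from collections import Counter

def distribution(events):
    """Relative frequency of each event label (e.g. query type) in a non-empty log."""
    counts = Counter(events)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def kl_divergence(p, q, eps=1e-9):
    """Kullback-Leibler divergence D(p || q) in bits.
    Labels missing from q are floored at `eps` to avoid division by zero."""
    return sum(
        pv * math.log2(pv / max(q.get(label, 0.0), eps))
        for label, pv in p.items()
        if pv > 0
    )
```

A large divergence between the recent and baseline distributions would suggest the workload has shifted, which is one signal an agent could weigh when predicting whether activity will stay within the existing plan.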
Duplicate concurrent transaction detection
Techniques are disclosed relating to transaction authorization. In some embodiments, a server computer system receives and caches browsing information for a device of a user, where the browsing information relates to a transaction service. The server computer system may then receive a request to authorize one or more transactions via the transaction service. The server computer system may evaluate the cached browsing information to determine whether the user is attempting to perform multiple concurrent transactions via the transaction service. Based on the evaluating, the server computer system may determine whether to authorize the one or more transactions. In some embodiments, the disclosed techniques may advantageously prevent or reduce authorization of duplicate transactions that are concurrently attempted by a user.
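The evaluate-cached-browsing-information step can be sketched as follows. The abstract does not describe what the browsing information contains or how concurrency is judged, so this sketch assumes the cache records page visits per device with a tab identifier and a timestamp, and treats more than one tab on a checkout page within a time window as a concurrent attempt. The class, its methods, the `"checkout"` page label, and the window length are all hypothetical.

```python
from collections import defaultdict

class BrowsingCache:
    """In-memory cache of page visits, keyed by device."""

    def __init__(self, window_seconds=60.0):
        self.window = window_seconds
        self.events = defaultdict(list)  # device_id -> [(timestamp, tab_id, page)]

    def record(self, device_id, tab_id, page, timestamp):
        """Cache one page visit for a device."""
        self.events[device_id].append((timestamp, tab_id, page))

    def concurrent_checkouts(self, device_id, now):
        """Tabs that visited the checkout page within the window ending at `now`."""
        return {
            tab_id
            for timestamp, tab_id, page in self.events.get(device_id, [])
            if page == "checkout" and 0 <= now - timestamp <= self.window
        }

    def authorize(self, device_id, now):
        """Authorize unless more than one tab is in checkout at the same time."""
        return len(self.concurrent_checkouts(device_id, now)) <= 1
```

A production service would also expire old events and would likely hold or flag the transaction rather than reject it outright; both are left out to keep the sketch short.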
STATISTICAL ANALYSIS METHOD FOR RESEARCH CONDUCTED AFTER PRODUCT LAUNCH
The present invention provides a data statistical analysis method for research done after a product launch, comprising: a, collecting research data after the product launch through a plurality of research terminals, wherein the research terminals are terminals where the product is applied, and the research data comprise at least product life cycle information, intra-cycle usage information, and application feedback information; b, extracting a characteristic value set X in the research data from the research terminals; and c, extracting sales data corresponding to a time point when the research data are generated, and constructing a function S=f(X) by taking the characteristic value set as an independent variable and the sales data as a dependent variable, wherein S represents the sales data, and calculating an extremum of the function and taking the extremum as an index value for predicting the market trend.
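Step c, constructing S=f(X) and taking its extremum, can be illustrated for the simplest case. The abstract does not give the form of f, so this sketch assumes a single characteristic value x instead of the full set X, and a quadratic f fitted by least squares, whose extremum is the vertex of the parabola. The function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_quadratic(xs, sales):
    """Least-squares fit of S = a*x^2 + b*x + c; returns (a, b, c).
    Solves the 3x3 normal equations with Cramer's rule, so it needs
    at least three distinct x values."""
    power_sums = [sum(x ** k for x in xs) for k in range(5)]
    moments = [sum(s * x ** k for x, s in zip(xs, sales)) for k in range(3)]
    # Normal equations: matrix @ [c, b, a] = moments.
    matrix = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    d = det3(matrix)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = moments[r]
        solution.append(det3(replaced) / d)
    c, b, a = solution
    return a, b, c

def extremum(a, b, c):
    """Vertex of the fitted parabola: (x*, S(x*)). Requires a != 0."""
    x_star = -b / (2 * a)
    return x_star, a * x_star ** 2 + b * x_star + c
```

With several characteristic values, the same idea would need a multivariate model and a numerical optimizer; the one-variable quadratic is used here only because its extremum has a closed form.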