G06F16/358

Systems and methods for keyword categorization
11487803 · 2022-11-01

Systems and methods including one or more processors and one or more non-transitory storage devices storing computing instructions configured to run on the one or more processors and perform: receiving a set of keywords from a graphical user interface of an electronic device of a user; pre-processing at least one keyword of the set of keywords; receiving a hierarchical categorization; pre-processing at least one category of the hierarchical categorization; determining a respective similarity between each of the at least one keyword of the set of keywords and each of the at least one category of the hierarchical categorization; determining a respective confidence level of a most likely category in the hierarchical categorization for each of the at least one keyword of the set of keywords using the respective similarity between each of the at least one keyword of the set of keywords and each of the at least one category of the hierarchical categorization; ranking each of the at least one keyword of the set of keywords using a respective importance of each keyword of the set of keywords and the respective confidence level of the most likely category in the hierarchical categorization; and altering a respective arrangement of at least one of the at least one keyword on the graphical user interface on the electronic device of the user based on the ranking. Other embodiments are disclosed herein.
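The similarity, confidence, and ranking steps in this abstract can be illustrated with a minimal sketch. The abstract does not say how similarity, confidence, or importance are computed, so everything concrete here is an assumption made for illustration: token-set Jaccard similarity, confidence taken as the best category's share of the total similarity, a ranking score of importance times confidence, and the hypothetical names `preprocess`, `most_likely_category`, and `rank_keywords`.

```python
def preprocess(text):
    """Lowercase and tokenize; '>' is assumed to separate hierarchy levels."""
    return set(text.lower().replace(">", " ").split())


def similarity(a, b):
    """Jaccard similarity between two token sets (an illustrative choice)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_likely_category(keyword, categories):
    """Return (category, confidence) for a keyword.

    Confidence is assumed to be the best category's share of the summed
    similarities; the abstract does not define it.
    """
    kw = preprocess(keyword)
    sims = {c: similarity(kw, preprocess(c)) for c in categories}
    total = sum(sims.values())
    if total == 0:
        return None, 0.0  # no category shares a token with the keyword
    best = max(sims, key=sims.get)
    return best, sims[best] / total


def rank_keywords(importance, categories):
    """Rank keywords by importance * confidence (assumed combination)."""
    rows = []
    for keyword, weight in importance.items():
        category, confidence = most_likely_category(keyword, categories)
        rows.append((keyword, category, weight * confidence))
    return sorted(rows, key=lambda row: row[2], reverse=True)


categories = ["Electronics > Laptops", "Electronics > Phones", "Home > Kitchen"]
importance = {"gaming laptops": 0.5, "kitchen phones": 0.9, "garden hose": 1.0}
for keyword, category, score in rank_keywords(importance, categories):
    print(keyword, category, round(score, 3))
```

The sorted order returned by `rank_keywords` is what a user interface could use to rearrange the keywords; an ambiguous keyword drops in the ranking because its confidence is split across categories.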

DATABASE SYSTEMS AND METHODS OF NAMING RECORD GROUPS

Database systems and methods are provided for assigning structural metadata to records and creating automations using the structural metadata. One method of assigning structural metadata to a group of records involves: determining, based on one or more fields of metadata associated with the records, a plurality of candidate names, wherein each candidate name corresponds to semantic content of the one or more fields of a respective record of the group of records; for each candidate name, assigning a name relevance score based on respective word relevance scores assigned to the words of that candidate name based on usage; selecting a candidate name in a manner that is influenced by the name relevance scores assigned to the candidate names; and automatically assigning a name to the group of records using the selected candidate name.
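A minimal sketch of the name-scoring step follows. The abstract only says that word relevance is "based on usage", so this sketch assumes that a word's relevance is the fraction of candidate names in which it appears and that a name's relevance is the mean of its word scores. The function name `pick_group_name` is hypothetical.

```python
from collections import Counter


def pick_group_name(candidate_names):
    """Select the candidate name with the highest name relevance score.

    Assumptions (not from the abstract): word relevance is the share of
    candidate names containing the word; name relevance is the mean of
    the word relevance scores of the name's words.
    """
    if not candidate_names:
        raise ValueError("at least one candidate name is required")
    tokenized = [name.lower().split() for name in candidate_names]
    # Count each word once per candidate name.
    usage = Counter(word for words in tokenized for word in set(words))
    total = len(tokenized)

    def name_score(words):
        if not words:
            return 0.0
        return sum(usage[w] / total for w in words) / len(words)

    scores = [name_score(words) for words in tokenized]
    best = max(range(total), key=scores.__getitem__)
    return candidate_names[best], scores[best]


names = ["Quarterly Sales Report", "Sales Report", "Sales Report Draft", "Team Lunch"]
print(pick_group_name(names))
```

Averaging rather than summing the word scores keeps long candidate names from winning purely on length; a real system could weight the two differently.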

Methods and systems for dynamically rearranging search results into hierarchically organized concept clusters

Methods of and systems for dynamically rearranging search results into hierarchically organized concept clusters are provided. A method of searching for and presenting content items as an arrangement of conceptual clusters to facilitate further search and navigation on a display-constrained device includes providing a set of content items and receiving incremental input to incrementally identify search terms for content items. Content items are selected and grouped into sets based on how the incremental input matches various metadata associated with the content items. The selected content items are grouped into explicit conceptual clusters and user-implied conceptual clusters based on metadata in common to the selected content items. The clustered content items are presented according to the conceptual clusters into which they are grouped.

RECURSIVE AGGLOMERATIVE CLUSTERING OF TIME-STRUCTURED COMMUNICATIONS
20230078263 · 2023-03-16

An example method of document cluster labeling comprises: selecting a current document cluster of a plurality of document clusters (e.g., the current document cluster can have documents organized using a DBSCAN or OPTICS algorithm); initializing a label associated with the current document cluster; selecting a term from a list of terms comprised by the document cluster; appending the term to the label associated with the current document cluster; responsive to determining that the label is found in a label dictionary, iteratively selecting a next term from the list of terms comprised by the document cluster and appending the next term to the label associated with the current document cluster; responsive to failing to locate the label in the label dictionary, inserting the label into the label dictionary; and associating the label with the current document cluster.
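The labeling loop described in this abstract can be sketched as follows. The sketch assumes the label dictionary is a set of label strings, that terms are joined with a space, and that the cluster's term list is already ordered by preference; the function name `label_cluster` is hypothetical. The abstract does not say what happens when the terms run out before a unique label is found, so this sketch simply returns the last label built.

```python
def label_cluster(terms, label_dictionary, separator=" "):
    """Build a cluster label by appending terms until the label is unused.

    terms: ordered list of terms from the cluster (ordering is assumed).
    label_dictionary: set of labels already assigned; updated in place.
    """
    if not terms:
        raise ValueError("cluster has no terms to build a label from")
    label = terms[0]
    next_index = 1
    # While the label is already in the dictionary, append the next term.
    while label in label_dictionary and next_index < len(terms):
        label = label + separator + terms[next_index]
        next_index += 1
    # Assumption: if the terms are exhausted, the colliding label is kept.
    label_dictionary.add(label)
    return label


labels = set()
print(label_cluster(["budget", "review", "q3"], labels))
print(label_cluster(["budget", "meeting"], labels))
print(label_cluster(["budget", "meeting", "notes"], labels))
```

Each call extends the label only as far as needed to avoid a collision, so earlier clusters keep short labels and later clusters with the same leading terms get longer ones.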

SYSTEMS AND METHODS FOR GENERATING CAUSAL INSIGHT SUMMARY

Conventionally, text summarization has been rule-based or neural-network-based; such approaches require large datasets for training, and the user has to assess the resulting summary for relevancy. The present disclosure provides a system and method that generate causal insight summaries, in which an event of importance is detected and it is determined why the event is relevant to a user. A text description is processed for named entity recognition, identification of sentence polarities, extraction of causal effects sentences (CES), and identification of causal relationships in text segments that correspond to impacting events. Named entities are then role labeled. A score is computed for the named entities, polarities of sentences, causal effects sentences, causal relationships, and impacting events. A causal insight summary is generated, with an overall polarity being computed. A customized causal insight summary is delivered to target users based on user preferences associated with specific named entities and impacting events.

Information processing device, information processing system, and computer program product for converting a causal relationship into a generalized expression
11599569 · 2023-03-07

An information processing device includes one or more hardware processors. The hardware processors acquire a causal relationship included in a target document, which is a specific document, from causal relationship management information in which one or more causal relationships are registered, the causal relationships being extracted from one or more documents and each including a set of a first element and a second element having a relationship; acquire a similar expression of the causal relationship included in the target document based on feature management information in which features of a plurality of words included in one or more documents are registered; and acquire a generalized expression of the causal relationship included in the target document based on the causal relationship included in the target document and the similar expression.

CLUSTER ANALYSIS METHOD, CLUSTER ANALYSIS SYSTEM, AND CLUSTER ANALYSIS PROGRAM
20230119422 · 2023-04-20

A server 4 executes a similarity calculation step (S2) of calculating similarity between content of one document and content of another document, a cluster classification step (S3) of generating a network in which a document is set as a node based on calculated similarity and similar nodes are connected by an edge, and performing classification based on similar documents, a first index calculation step (S4) of calculating a first index indicating centrality of a document in the network, a second index calculation step (S5) of calculating a second index that is different from the first index in the network and indicates importance of a document, and a display data generation step (S6) of generating, regarding a document, first display data indicating the network by an expression of a size of an object of a node according to the first index, an expression of a gauge having a shape corresponding to a shape of the object according to the second index and a length of the gauge, an expression according to a type of the cluster, and an expression according to magnitude of similarity between documents.
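Steps S2 through S4 can be illustrated with a small sketch. The abstract does not name the similarity measure, the clustering rule, or the centrality index, so this sketch assumes token-set Jaccard similarity, an edge threshold of 0.3, clusters taken as connected components, and degree centrality as the first index. The second index (S5) and the display data (S6) are not sketched. All function names are hypothetical.

```python
from itertools import combinations


def build_network(docs, threshold=0.3):
    """S2/S3: connect documents whose Jaccard similarity meets the threshold."""
    tokens = {doc_id: set(text.lower().split()) for doc_id, text in docs.items()}
    edges = {}
    for a, b in combinations(docs, 2):
        union = tokens[a] | tokens[b]
        sim = len(tokens[a] & tokens[b]) / len(union) if union else 0.0
        if sim >= threshold:
            edges[(a, b)] = sim
    return edges


def find_clusters(nodes, edges):
    """S3: treat each connected component as one cluster (an assumption)."""
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, result = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        result.append(component)
    return result


def degree_centrality(nodes, edges):
    """S4: degree centrality, normalized by the number of other nodes."""
    degree = {n: 0 for n in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    denom = max(len(degree) - 1, 1)
    return {n: d / denom for n, d in degree.items()}


docs = {
    "d1": "solar panel efficiency",
    "d2": "solar panel cost",
    "d3": "solar cell efficiency",
    "d4": "battery storage",
}
edges = build_network(docs)
print(find_clusters(list(docs), edges))
print(degree_centrality(list(docs), edges))
```

In a display like the one the abstract describes, the centrality value could drive the size of each node's object, with the edge similarity driving how the connection is drawn.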

METHOD AND SYSTEM FOR GENERATING AND ASSIGNING SOFT LABELS FOR DATA NODE DATA

Techniques described herein relate to a method for managing data of data nodes. The method includes obtaining, by a data node manager, a soft labeling request; in response to obtaining the soft labeling request: sending, by the data node manager, requests for processed data to data nodes associated with the data node manager; obtaining, by the data node manager, processed data from the data nodes; merging, by the data node manager, the processed data to obtain processed data; performing, by the data node manager, clustering on the processed data to obtain soft label metadata; associating, by the data node manager, the soft label metadata with live data associated with the data nodes; and performing, by the data node manager, labeling actions using the live data and the soft label metadata.

System and engine for seeded clustering of news events

The present invention provides a seeded news event clustering and retrieval system configured to first create a candidate data set of documents, second create a set of initial clusters based on nearness or duplicate similarity status, and third create an aggregate cluster by merging initial clusters with seed documents. The invention generates top-level clusters for news events based on an editorially supplied topical label or “seed” component and generates sub-topic-focused clusters based on an algorithm. The system uses an agglomerative clustering algorithm to gather and structure documents into distinct result sets. Decisions on whether to merge related documents or clusters are made according to similarity of evidence derived from two distinct sources: one relies on a digital signature based on the unstructured text in the document, and the other is based on the presence of named entity tags that have been assigned to the document by an event or named entity tagger such as the Thomson Reuters Calais engine/web service.
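The two-source merge decision can be illustrated with a simplified sketch. It is not the agglomerative algorithm itself: it only shows how evidence from the text and from named entity tags might be combined and then used to attach documents to the closest seed. The token-set Jaccard measure standing in for the digital signature, the equal weighting, the 0.25 threshold, and the names `evidence` and `attach_to_seeds` are all assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def evidence(doc_a, doc_b, text_weight=0.5):
    """Combine text similarity and named-entity overlap into one score.

    A token-set comparison stands in for the text-based digital signature.
    """
    text_sim = jaccard(
        set(doc_a["text"].lower().split()), set(doc_b["text"].lower().split())
    )
    entity_sim = jaccard(set(doc_a["entities"]), set(doc_b["entities"]))
    return text_weight * text_sim + (1 - text_weight) * entity_sim


def attach_to_seeds(seeds, documents, threshold=0.25):
    """Attach each document to its best-matching seed, if the evidence suffices.

    seeds: non-empty dict mapping an editorial label to a seed document.
    """
    clusters = {label: [] for label in seeds}
    unassigned = []
    for doc in documents:
        label, score = max(
            ((lbl, evidence(seed, doc)) for lbl, seed in seeds.items()),
            key=lambda pair: pair[1],
        )
        if score >= threshold:
            clusters[label].append(doc["id"])
        else:
            unassigned.append(doc["id"])
    return clusters, unassigned


seeds = {
    "Acme merger": {
        "text": "acme announces merger with globex",
        "entities": ["Acme", "Globex"],
    }
}
documents = [
    {"id": "a", "text": "globex shareholders approve acme merger",
     "entities": ["Acme", "Globex"]},
    {"id": "b", "text": "city council votes on park budget",
     "entities": ["City Council"]},
]
print(attach_to_seeds(seeds, documents))
```

Documents left in `unassigned` would be candidates for the algorithmically generated sub-topic clusters rather than the seeded top-level ones.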

Techniques for mixed-initiative visualization of data

In various embodiments, a visualization engine generates graphs that facilitate sense making operations on data sets. A graph includes nodes that are associated with a data set and edges that represent relationships between the nodes. In operation, the visualization engine computes pairwise similarities between the nodes. Subsequently, the visualization engine computes a layout for the graph based on the pairwise similarities and user-specified constraints. Finally, the visualization engine renders a graph for display based on the layout, the nodes, and the edges. Advantageously, by interactively specifying constraints and then inspecting the topology of the automatically generated graph, the user may efficiently explore salient aspects of the data set.