G06F16/316

Financial documents examination methods and systems

A user is able to extract financial data, particularly tables, from a document. The table is stored and the user can compare the data in this table with data from similar tables from previous documents. The user can see how financial data has changed historically by looking only at financial tables from the same type of document, for example, only balance sheet tables from annual reports for a specific public company, over many years, and see how the values have changed or whether any new categories or types of data have been added or deleted. From the time series of financial data, the user can gain real intelligence into an entity's financial health.
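The year-over-year comparison described above could be sketched as follows. This is only an illustration, assuming each extracted table has already been reduced to a mapping from line-item name to numeric value; the function name `compare_tables` and the sample figures are hypothetical and are not taken from the patent.

```python
def compare_tables(tables):
    """Compare consecutive tables of the same type.

    tables: dict mapping a period (e.g. a year) to a dict of
    line-item name -> numeric value. This layout is an assumption.
    """
    periods = sorted(tables)
    changes = []
    for prev, curr in zip(periods, periods[1:]):
        old, new = tables[prev], tables[curr]
        changes.append({
            "from": prev,
            "to": curr,
            # Categories that appear only in the newer table.
            "added": sorted(new.keys() - old.keys()),
            # Categories that were dropped from the newer table.
            "removed": sorted(old.keys() - new.keys()),
            # Value change for categories present in both tables.
            "delta": {k: new[k] - old[k] for k in sorted(old.keys() & new.keys())},
        })
    return changes


# Hypothetical balance-sheet tables from two annual reports.
balance_sheets = {
    2021: {"Cash": 120.0, "Inventory": 80.0, "Goodwill": 15.0},
    2022: {"Cash": 150.0, "Inventory": 70.0, "Lease assets": 30.0},
}
report = compare_tables(balance_sheets)
print(report[0])
```

Restricting the input to tables of one type from one kind of document, as the abstract describes, is what makes the per-category differences meaningful.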

SEARCH TOOL FOR IDENTIFYING AND SIZING CUSTOMER ISSUES THROUGH INTERACTION SUMMARIES AND CALL TRANSCRIPTS

The exemplary embodiments may provide a search tool that can locate customer issues in call transcripts and agent notes and can provide an accurate count of how often such issues appear in the call transcripts and agent notes. The exemplary embodiments may improve the speed with which the search of documents is performed. The exemplary embodiments rely upon a document matrix that is computed once for a given corpus of documents and a given vocabulary of the documents. The document matrix may be used across multiple queries. The exemplary embodiments also account for similar terms in processing a query. The exemplary embodiments may use a word coverage factor to improve the relevance of the search results returned by the search tool. The word coverage factor acts as a multiplicative factor equal to the fraction of the query terms that are present in a document.
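A minimal sketch of the ideas in this abstract follows: a document matrix built once and reused across queries, similar terms folded into each query term, and a word coverage factor applied as a multiplier. The raw term-count weighting, the function names, the `similar` mapping, and the sample texts are all assumptions for illustration; the abstract does not specify them.

```python
from collections import Counter


def build_document_matrix(docs, vocabulary):
    """Count each vocabulary term per document; computed once per corpus."""
    index = {term: i for i, term in enumerate(vocabulary)}
    matrix = []
    for text in docs:
        row = [0] * len(vocabulary)
        for token, n in Counter(text.lower().split()).items():
            if token in index:
                row[index[token]] = n
        matrix.append(row)
    return matrix, index


def score(query_terms, matrix, index, similar=None):
    """Score every document: summed term counts times the coverage factor."""
    similar = similar or {}
    if not query_terms:
        return [0.0] * len(matrix)
    scores = []
    for row in matrix:
        total = 0
        covered = 0
        for term in query_terms:
            # A query term also matches its similar terms (assumed mapping).
            variants = [term] + similar.get(term, [])
            count = sum(row[index[v]] for v in variants if v in index)
            total += count
            covered += 1 if count > 0 else 0
        # Word coverage factor: fraction of query terms found in the document.
        coverage = covered / len(query_terms)
        scores.append(total * coverage)
    return scores


docs = [
    "card declined at checkout",
    "card lost card stolen",
    "payment rejected refund requested",
]
vocabulary = ["card", "declined", "rejected", "refund"]
matrix, index = build_document_matrix(docs, vocabulary)
scores = score(["card", "declined"], matrix, index, similar={"declined": ["rejected"]})
print(scores)
```

In this sketch the second document mentions "card" twice but covers only half of the query, so the coverage factor pushes it below the first document, which contains both terms.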

METHODS AND ARRANGEMENTS TO ADJUST COMMUNICATIONS

Logic may adjust communications between customers. Logic may cluster customers into a first group associated with a first subset of synonyms and a second group associated with a second subset of the synonyms. Logic may associate a first tag with the first group and with each of the synonyms of the first subset. Logic may associate a second tag with the second group and with each of the synonyms of the second subset. Logic may associate one or more models with pairs of the groups. A first pair may comprise the first group and the second group. The first model associated with the first pair may adjust words in communications between the first group and the second group, based on the synonyms associated with the first pair, by replacement of words in a communication between customers of the first subset and customers of the second subset.
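The word replacement between two groups could be illustrated roughly as below. Here a plain lookup table stands in for the "model" associated with a pair of groups, which is a simplifying assumption; the group tags, the synonym sets, and the function names are hypothetical.

```python
# Each concept lists the synonym preferred by each tagged group (hypothetical data).
CONCEPTS = [
    {"group_a": "soda", "group_b": "pop"},
    {"group_a": "sneakers", "group_b": "trainers"},
]


def build_pair_model(concepts, sender_tag, receiver_tag):
    """Map the sender group's synonyms to the receiver group's synonyms."""
    return {
        c[sender_tag]: c[receiver_tag]
        for c in concepts
        if sender_tag in c and receiver_tag in c
    }


def adjust(message, model):
    """Replace each word that has a counterpart in the receiver's subset."""
    return " ".join(model.get(word, word) for word in message.split())


model_a_to_b = build_pair_model(CONCEPTS, "group_a", "group_b")
adjusted = adjust("I spilled soda on my sneakers", model_a_to_b)
print(adjusted)
```

A separate model per ordered pair of groups keeps the replacement direction explicit: the reverse mapping is built by swapping the two tags.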

Efficient Indexing for Querying Arrays in Databases

A database system performs queries on fields storing arrays of a database (i.e., array fields) using de-duplication indexes. The system generates de-duplication indexes for array fields. The de-duplication indexes include unique entries for corresponding distinct values stored by the array fields. The system uses the de-duplication indexes to perform efficient queries specifying corresponding array fields. The system may further generate de-duplication indexes corresponding to one or more fields storing various types of values. In various embodiments, the system selects an optimal index from various indexes usable to execute a query, such as a de-duplication index and a conventional index.
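One way to picture a de-duplication index for an array field is a mapping with a single entry per distinct value, each pointing to the rows whose array contains that value. The sketch below assumes in-memory rows and a hypothetical `tags` field; it does not reflect the patented system's storage layout or its index-selection logic.

```python
from collections import defaultdict


def build_dedup_index(rows, field):
    """Build one index entry per distinct value stored in an array field."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        # set() collapses repeated values inside a single array.
        for value in set(row[field]):
            index[value].add(row_id)
    return index


def rows_containing(index, value):
    """Return ids of rows whose array field contains the value."""
    return sorted(index.get(value, ()))


rows = [
    {"tags": ["a", "b", "a"]},
    {"tags": ["b"]},
    {"tags": []},
    {"tags": ["c", "a"]},
]
index = build_dedup_index(rows, "tags")
print(rows_containing(index, "a"))
```

A containment query then becomes a single lookup instead of a scan over every array, and repeated values within one array add no extra index entries.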

TRAINING AND APPLYING STRUCTURED DATA EXTRACTION MODELS

A computer system for extracting structured data from unstructured or semi-structured text in an electronic document, the system comprising: a graphical user interface configured to present to a user a graphical view of a document for use in training multiple data extraction models for the document, each data extraction model associated with a user defined question; a user input component configured to enable the user to highlight portions of the document; the system configured to present in association with each highlighted portion an interactive user entry object which presents a menu of question types to a user in a manner to enable the user to select one of the question types, and a field for receiving from the user a question identifier in the form of human readable text, wherein the question identifier and question type selected by the user are used for selecting a data extraction model, and wherein the highlighted portion of the document associated with the question identifier is used to train the selected data extraction model.

Method and system for high performance integration, processing and searching of structured and unstructured data

Disclosed herein are methods and systems for integrating an enterprise's structured and unstructured data to provide users and enterprise applications with efficient and intelligent access to that data. In accordance with exemplary embodiments, the generation of feature vectors about unstructured data can be hardware-accelerated by processing streaming unstructured data through a reconfigurable logic device, a graphics processor unit (GPU), or chip multi-processor (CMP) to determine features that can aid clustering of similar data objects.

METHODS, SYSTEMS, AND COMPUTER-READABLE MEDIA FOR SEMANTICALLY ENRICHING CONTENT AND FOR SEMANTIC NAVIGATION

Content of different formats may be sourced from various data sources such as content servers and ingested into a data integration server by an ingestion broker embodied on a non-transitory computer readable medium. The ingestion broker may normalize the content of different formats into a uniform representation that can be indexed and delivered across multiple digital channels for a variety of applications. The normalized content may be analyzed and semantic metadata may be determined from the normalized content. The normalized content can be semantically enriched by associating the semantic metadata and the like with the content. The semantic metadata can be stored in a semantic index that can be used for searching via the data integration server. During search, the semantic metadata can be instantiated as facets for user navigation and refinement of search criteria and additional semantic relationships can be assigned to the words in the normalized content.

SYSTEM AND METHOD FOR MULTIVARIATE TESTING OF MESSAGES TO SUBGROUP IN A ONE-TO-MANY MESSAGING PLATFORM

A system and method for multivariate testing of messages to a subgroup in a one-to-many messaging platform. A client text message is generated for transmission to a number of users via one or more messaging services. A subgroup of users is defined according to one or more attributes of the text message or the users, and the client text message is transmitted only to users in the subgroup. The transmission is analyzed for performance metrics, such as actions or reactions by users in the subgroup, and based on the performance metrics, the message is optimized for transmission to the larger group of users. Optimization happens rapidly.
