Patent classifications
G06F16/316
Identifying pattern relationships in machine data
Methods and apparatus consistent with the invention provide the ability to organize and build understandings of machine data generated by a variety of information-processing environments. Machine data is a product of information-processing systems (e.g., activity logs, configuration files, messages, database records) and represents the evidence of particular events that have taken place and been recorded in raw data format. In one embodiment, machine data is turned into a machine data web by organizing machine data into events and then linking events together.
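The two steps named in this abstract, organizing machine data into events and then linking events, can be illustrated with a minimal sketch. The log format (a timestamp followed by `key=value` pairs), the function names `parse_event` and `link_events`, and the rule of linking consecutive events that share a field value are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict

def parse_event(line):
    """Turn one raw log line into an event (assumed format: timestamp key=value ...)."""
    timestamp, *pairs = line.split()
    fields = dict(pair.split("=", 1) for pair in pairs)
    return {"timestamp": timestamp, "fields": fields}

def link_events(events, key):
    """Link consecutive events that share the same value for `key` (hypothetical rule)."""
    groups = defaultdict(list)
    for position, event in enumerate(events):
        if key in event["fields"]:
            groups[event["fields"][key]].append(position)
    links = []
    for positions in groups.values():
        # Chain each event to the next one in the same group.
        links.extend(zip(positions, positions[1:]))
    return links

raw = [
    "2024-01-01T10:00:00 user=ann action=login",
    "2024-01-01T10:00:05 user=bob action=login",
    "2024-01-01T10:01:00 user=ann action=logout",
]
events = [parse_event(line) for line in raw]
links = link_events(events, "user")
```

Here `links` holds pairs of event positions, so the first and third events (both for `user=ann`) end up connected.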
Preliminary ranker for scoring matching documents
The technology described herein provides for preliminary ranking of matching documents for a search query. A preliminary ranker uses score tables for scoring each matching document based on its relevance to a search query. The score table for a document stores pre-computed data used to derive a frequency of terms and other information in the document. The preliminary ranker uses the score table for each matching document and the terms from the search query to determine a score for each matching document. The lowest scoring documents are removed from further consideration by a final ranker.
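A rough sketch of this two-stage idea follows. The score-table layout (term counts plus document length), the length-normalized term-frequency score, and the names `build_score_table`, `preliminary_score`, and `preliminary_rank` are assumptions chosen for illustration, not the scoring described in the patent.

```python
from collections import Counter

def build_score_table(text):
    """Pre-compute per-document data: term counts and document length (assumed layout)."""
    terms = text.lower().split()
    return {"counts": Counter(terms), "length": len(terms)}

def preliminary_score(score_table, query_terms):
    """Sum of length-normalized term frequencies for the query terms."""
    length = score_table["length"]
    if length == 0:
        return 0.0
    return sum(score_table["counts"][term] / length for term in query_terms)

def preliminary_rank(score_tables, query, keep):
    """Score every matching document and pass only the top `keep` to a final ranker."""
    query_terms = query.lower().split()
    scored = [
        (preliminary_score(table, query_terms), doc_id)
        for doc_id, table in score_tables.items()
    ]
    # Highest score first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:keep]]

docs = {
    "a": "index segments store index data",
    "b": "garbage collection frees memory",
    "c": "search index ranking",
}
tables = {doc_id: build_score_table(text) for doc_id, text in docs.items()}
survivors = preliminary_rank(tables, "search index", keep=2)
```

With these sample documents, document `b` contains neither query term and is the one dropped before any final ranking.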
Recyclable private memory heaps for dynamic search indexes
In one embodiment, a search engine may generate and store a plurality of search index segments such that each of the search index segments is stored in a corresponding one of a plurality of heaps of memory. The plurality of search index segments may include inverted index segments mapping content to documents containing the content. A garbage collection module may release one or more heaps of the memory.
Identifying relevant page content
A computer-implemented method according to one embodiment includes identifying a plurality of related web pages, extracting textual data within the identified plurality of related web pages, determining a plurality of groupings of the extracted textual data, calculating a frequency of each of the determined plurality of groupings within the identified plurality of related web pages, creating a subset of the determined plurality of groupings, based on the calculated frequency of each of the plurality of groupings, and returning the subset of the determined plurality of groupings.
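The pipeline in this abstract (extract text, form groupings, count their frequency, return a subset) might look like the sketch below. Treating a "grouping" as a word bigram, measuring frequency as the number of pages containing it, and the names `word_ngrams` and `frequent_groupings` are assumptions; the abstract leaves these choices open.

```python
from collections import Counter

def word_ngrams(text, n):
    """Return the set of word n-grams in a page's text (assumed form of a 'grouping')."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def frequent_groupings(pages, n=2, min_pages=2):
    """Keep groupings that occur on at least `min_pages` of the related pages."""
    page_counts = Counter()
    for text in pages:
        # A set is used so each page contributes at most once per grouping.
        page_counts.update(word_ngrams(text, n))
    return sorted(g for g, count in page_counts.items() if count >= min_pages)

pages = [
    "contact us about machine data",
    "machine data search tips",
    "contact us for search tips",
]
common = frequent_groupings(pages)
```

Groupings that appear on only one page, such as "us about", are filtered out of `common`.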
ON-DEMAND, DYNAMIC AND OPTIMIZED INDEXING IN NATURAL LANGUAGE PROCESSING
In indexing for natural language processing, a request is received from a user to access a document at a server, and the server routes the request to an indexing server. A validation service checks whether the CUID of the document is available in the indexing server repository or in a file system associated with the indexing server. If the CUID of the dataset exists, the service determines whether a timestamp of the new document matches the timestamp of the previously indexed document. Upon determining that these conditions are fulfilled, the previously indexed data is returned to the server. If the conditions are not met, a transformation service is invoked at the indexing server. The transformation service compares a hash value of a dataset. If the transformation service determines that the hash value of a dataset in the document is not available, an indexing service is invoked to index the document.
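The decision flow above (validate by CUID and timestamp, fall back to a hash comparison, index only when needed) can be sketched as follows. The `IndexingServer` class, its in-memory repository, the use of SHA-256, and the toy `_index` step are hypothetical stand-ins; the abstract does not name a hash function or a storage layout.

```python
import hashlib

class IndexingServer:
    """Hypothetical in-memory stand-in for the indexing server repository."""

    def __init__(self):
        self.repository = {}  # cuid -> {"timestamp", "hash", "index"}
        self.index_calls = 0

    def _index(self, content):
        # Toy indexing service: the sorted unique terms of the document.
        self.index_calls += 1
        return sorted(set(content.lower().split()))

    def get_index(self, cuid, timestamp, content):
        entry = self.repository.get(cuid)
        # Validation service: known CUID with a matching timestamp.
        if entry is not None and entry["timestamp"] == timestamp:
            return entry["index"]
        # Transformation service: compare content hashes (SHA-256 is an assumption).
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if entry is not None and entry["hash"] == digest:
            entry["timestamp"] = timestamp
            return entry["index"]
        # Hash not available: invoke the indexing service and store the result.
        index = self._index(content)
        self.repository[cuid] = {"timestamp": timestamp, "hash": digest, "index": index}
        return index

server = IndexingServer()
first = server.get_index("doc-1", 100, "alpha beta")    # indexed
second = server.get_index("doc-1", 100, "alpha beta")   # served from the repository
third = server.get_index("doc-1", 200, "alpha beta")    # new timestamp, same hash
fourth = server.get_index("doc-1", 300, "alpha gamma")  # content changed, re-indexed
```

Only the first and fourth requests reach the indexing step; the other two are answered from previously indexed data.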
METHODS AND SYSTEMS FOR A COMPLIANCE FRAMEWORK DATABASE SCHEMA
Generating a compliance framework. The compliance framework facilitates an organization's compliance with multiple authority documents by providing efficient methodologies and refinements to existing technologies, such as providing hierarchical fidelity to the original authority document; separating auditable citations from their context (e.g., prepositions and/or informational citations); asset-focused citations; SNED and Live values, among others.
Social media driven information interface
One or more techniques and/or systems are provided for populating an information interface based upon social media data. For example, users may post, share, and/or discuss various information through social media sources. Accordingly, social media data may be obtained from such social media sources. The social media data may be grouped into sets of social media data based upon temporal information. Within the sets of social media data, social media entries may be clustered into topic clusters (e.g., a royal wedding topic cluster, a plane crash topic cluster, etc.). Event summaries may be generated for respective topic clusters. The event summaries may be used to populate timeslots of an information interface, such as a calendar or timeline, to create annotated timeslots. In this way, the information interface may provide users with an interactive view of events over a time period, such as a year-in-review, based upon social media data.
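As an illustration of the grouping, clustering, and summarizing steps, the sketch below buckets posts by month and picks the largest topic cluster per bucket. It assumes each post already carries a topic label, which sidesteps the actual clustering; the tuple layout, the monthly timeslot, and the name `annotate_timeslots` are likewise assumptions.

```python
from collections import Counter, defaultdict
from datetime import date

def annotate_timeslots(posts):
    """Map each (year, month) timeslot to its largest topic cluster and its size."""
    by_month = defaultdict(list)
    for posted_on, topic, _text in posts:
        by_month[(posted_on.year, posted_on.month)].append(topic)
    timeline = {}
    for slot, topics in sorted(by_month.items()):
        clusters = Counter(topics)
        # Largest cluster wins; ties are broken alphabetically by topic.
        top_topic, size = max(clusters.items(), key=lambda item: (item[1], item[0]))
        timeline[slot] = (top_topic, size)
    return timeline

# Sample posts: (date, pre-assigned topic label, text). All values are made up.
posts = [
    (date(2020, 1, 5), "royal wedding", "watching the ceremony"),
    (date(2020, 1, 6), "royal wedding", "what a dress"),
    (date(2020, 1, 9), "plane crash", "news coverage"),
    (date(2020, 2, 2), "plane crash", "investigation update"),
]
timeline = annotate_timeslots(posts)
```

A fuller version would replace the pre-assigned labels with a text clustering step and generate a prose summary per cluster.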
DETECTION OF A TOPIC
The present invention relates to a method for performing a detection of a topic of a message introduced in a real-time customer service messaging platform. In the method, a message comprising at least one word from which the topic is definable is received; a topic is extracted from the received message; a database is queried to determine whether the topic is determinable from a number of messages received chronologically earlier than the received message; and an indication is generated to an operator of the real-time customer service messaging platform in accordance with a detection result obtained through the query to the database. Some aspects of the present invention relate to a network node, to a computer program product and to a system.
Search apparatus and search method
A search method includes receiving a search request for encoded text data; generating, based on contents of the search request and on first index information produced by specifying, as a first axis, an occurrence position of a character or a word included in original data of the encoded text data, second index information having a second axis superordinate to the first axis; and searching the encoded text data in response to the search request using the second index information.
Comparing tables with semantic vectors
A data processing system identifies a first topic for a first table, identifies a second topic for a second table, collects at least one first table attribute comprising at least one row name for the first table, and collects at least one second table attribute comprising at least one row name for the second table. The at least one semantic vector for the first table is compared with the at least one semantic vector for the second table to identify as related at least one row of the first table and at least one row of the second table. The at least one row of the first table and the at least one row of the second table are provided to a communication device with an identification as related.
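The comparison step can be sketched with cosine similarity over vectors built from row names. The bag-of-words vectors below are a simple stand-in for semantic vectors (a real system would more likely use learned embeddings), and the names `vectorize`, `cosine`, and `related_rows` and the 0.5 threshold are assumptions.

```python
import math
from collections import Counter

def vectorize(row_name):
    """Bag-of-words vector for a row name (a stand-in for a semantic vector)."""
    return Counter(row_name.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(u[term] * v[term] for term in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def related_rows(first_rows, second_rows, threshold=0.5):
    """Pair up row names from two tables whose vectors are similar enough."""
    return [
        (a, b)
        for a in first_rows
        for b in second_rows
        if cosine(vectorize(a), vectorize(b)) >= threshold
    ]

matches = related_rows(
    ["total revenue", "net income"],
    ["revenue total", "operating costs"],
)
```

In this example only the two revenue rows are identified as related, since the other row names share no terms.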