Patent classifications
G06F16/316
Indexing dynamic hierarchical data
A system includes storage of data of a hierarchy, where each node of the hierarchy is represented by a row, and each row includes a level of its respective node, a pointer to a lower bound entry of an order index structure associated with the hierarchy, and a pointer to an upper bound entry of the order index structure associated with the hierarchy, reception of a pointer l, and determination of an entry e of the order index structure to which the received pointer l points.
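The row layout described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `Row`, `Entry`, and `OrderIndex` are hypothetical, the order index is assumed to be a plain Python list (a real system would presumably use a balanced structure so positions can be found without a linear scan), and the received pointer is modeled as a direct reference to its entry.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class Entry:
    """One entry of the order index; it refers back to the row that owns it."""
    row: "Row"
    is_lower: bool


@dataclass(eq=False)
class Row:
    """One hierarchy node: its level plus pointers to its two bound entries."""
    name: str
    level: int
    lower: Optional[Entry] = None
    upper: Optional[Entry] = None


class OrderIndex:
    """Assumption: a flat list stands in for the order index structure."""

    def __init__(self) -> None:
        self.entries: List[Entry] = []

    def position(self, entry: Entry) -> int:
        # Identity-based lookup (eq=False keeps default identity equality).
        return self.entries.index(entry)

    def add_node(self, name: str, parent: Optional[Row] = None) -> Row:
        """Insert a new node as the last child of parent (or as a root)."""
        row = Row(name, 0 if parent is None else parent.level + 1)
        row.lower, row.upper = Entry(row, True), Entry(row, False)
        at = len(self.entries) if parent is None else self.position(parent.upper)
        self.entries[at:at] = [row.lower, row.upper]
        return row

    def is_descendant(self, a: Row, b: Row) -> bool:
        """True if a lies strictly inside b's lower/upper bounds."""
        return (self.position(b.lower) < self.position(a.lower)
                and self.position(a.upper) < self.position(b.upper))


index = OrderIndex()
root = index.add_node("root")
a = index.add_node("a", root)
b = index.add_node("b", a)
c = index.add_node("c", root)
```

Because each row stores only pointers to its bound entries, inserting a node shifts positions inside the index without rewriting any other row.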
SEARCHING AGAINST ATTRIBUTE VALUES OF DOCUMENTS THAT ARE EXPLICITLY SPECIFIED AS PART OF THE PROCESS OF PUBLISHING THE DOCUMENTS
A facility for indexing documents is described. The facility accesses a number of document manifests, each (a) corresponding to a different published document among a set of published documents, and (b) identifying, for each of a plurality of document attributes, a value of the attribute explicitly specified for the published document to which the document manifest corresponds. The facility uses the accessed plurality of document manifests to construct a search index covering the set of published documents that is usable by a search engine to resolve queries each specifying a particular value for each of one or more of the plurality of document attributes.
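One way to picture a manifest-driven index is an inverted map from (attribute, value) pairs to document identifiers. The sketch below is illustrative only: the manifest shape (a dict of attribute values per document), the function names, and the sample attributes are all assumptions, not details taken from the abstract.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Manifest = Dict[str, str]  # hypothetical shape: attribute name -> explicit value


def build_attribute_index(manifests: Dict[str, Manifest]) -> Dict[Tuple[str, str], Set[str]]:
    """Map each (attribute, value) pair to the documents whose manifest states it."""
    index: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for doc_id, attributes in manifests.items():
        for pair in attributes.items():
            index[pair].add(doc_id)
    return index


def search(index: Dict[Tuple[str, str], Set[str]], criteria: Manifest) -> Set[str]:
    """Return documents matching every attribute value named in the query."""
    postings = [index.get(pair, set()) for pair in criteria.items()]
    return set.intersection(*postings) if postings else set()


# Illustrative manifests; the attribute names are made up for this example.
manifests = {
    "doc1": {"author": "kim", "language": "en"},
    "doc2": {"author": "kim", "language": "fr"},
    "doc3": {"author": "lee", "language": "en"},
}
attribute_index = build_attribute_index(manifests)
```

A query naming several attributes is resolved by intersecting the posting sets for each pair, so only documents whose manifests state all requested values are returned.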
SYSTEM AND METHOD OF CONTEXT-BASED PREDICTIVE CONTENT TAGGING FOR ENCRYPTED DATA
This disclosure relates to systems, methods, and computer readable media for performing multi-format, multi-protocol message threading in a way that is most beneficial for the individual user. Users desire a system that will provide for ease of message threading by stitching together related communications in a manner that is seamless from the user's perspective. Such stitching together of communications across multiple formats and protocols may occur, e.g., by: 1) direct user action in a centralized communications application (e.g., by a user clicking Reply on a particular message); 2) using semantic matching (or other search-style message association techniques); 3) element-matching (e.g., matching on subject lines or senders/recipients/similar quoted text, etc.); and 4) state-matching (e.g., associating messages if they are specifically tagged as being related to another message, sender, etc. by a third-party service, e.g., a webmail provider or Instant Messaging (IM) service).
SUGGESTING AND/OR PROVIDING TARGETING CRITERIA FOR ADVERTISEMENTS
Keyword suggestions that are category-aware (and field-proven) may be used to help advertisers better target the serving of their ads, and may reduce unused ad spot inventory. The advertiser can enter ad information, such as a creative, a landing Webpage, other keywords, etc. A keyword facility may use this entered ad information as seed information to infer one or more categories. It may then request that the advertiser confirm or deny some basic feedback information (e.g., categories, Webpage information, etc.). For example, an advertiser may be provided with candidate categories and may be asked to confirm (e.g., using checkboxes) which of the categories are relevant to their ad. Keywords may be determined using at least the categories. The determined keywords may be provided to the advertiser as suggested keywords, or may automatically populate ad serving constraint information as targeting keywords. The ad server system can run a trial on the determined keywords to qualify or disqualify them as targeting keywords.
Previewing raw data parsing
Embodiments are directed towards previewing results generated from indexing raw data before the corresponding index data is added to an index store. Raw data may be received from a preview data source. After an initial set of configuration information is established, the preview data may be submitted to an index processing pipeline. A previewing application may generate preview results based on the preview index data and the configuration information. The preview results may enable previewing how the data is being processed by the indexing application. If the preview results are not acceptable, the configuration information may be modified. The preview application enables modification of the configuration information until the generated preview results are acceptable. If the configuration information is acceptable, the preview data may be processed and indexed in one or more index stores.
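The preview-then-commit loop can be sketched in a few lines. Everything here is a simplified assumption: the configuration keys (`delimiter`, `fields`), the function names, and the sample lines are hypothetical, and the index store is modeled as a plain list.

```python
from typing import Dict, List

Config = Dict[str, object]  # hypothetical keys: "delimiter" (str), "fields" (list of str)


def index_pipeline(raw_lines: List[str], config: Config) -> List[Dict[str, str]]:
    """Parse raw lines into index records according to the configuration."""
    records = []
    for line in raw_lines:
        values = line.strip().split(config["delimiter"])
        records.append(dict(zip(config["fields"], values)))
    return records


def preview(raw_lines: List[str], config: Config, sample_size: int = 2) -> List[Dict[str, str]]:
    """Run the pipeline on a sample without touching the index store."""
    return index_pipeline(raw_lines[:sample_size], config)


def commit(raw_lines: List[str], config: Config, index_store: List[Dict[str, str]]) -> None:
    """Index all raw data once the configuration has been accepted."""
    index_store.extend(index_pipeline(raw_lines, config))


raw = ["INFO|boot|ok", "WARN|disk|slow", "INFO|net|ok"]
store: List[Dict[str, str]] = []

config = {"delimiter": ",", "fields": ["level", "source", "status"]}
first_try = preview(raw, config)    # wrong delimiter: each line lands in "level"

config["delimiter"] = "|"           # modify the configuration and preview again
second_try = preview(raw, config)

commit(raw, config, store)          # accepted: now write to the index store
```

The point of the design is that a bad configuration shows up in the preview results while the index store is still empty, so nothing has to be re-indexed after a mistake.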
System for optimizing content queries
An indexing scheme generates a token index associating token index values with keywords in queries and generates expression trees for the queries that use the token index values to represent the keywords. The indexing scheme generates a document index assigning document index values to uploaded documents. The indexing scheme generates a document-token index that associates the token index values with the document index values for the documents containing the keywords associated with the token index values. The indexing scheme applies the expression trees to the document-token index to quickly identify the documents satisfying the queries. For example, the indexing scheme may generate bit arrays for each of the token index values identifying the documents containing the keywords and apply logical operators from the queries to the bit arrays. The resulting data structure provides a list of documents satisfying the queries.
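The bit-array example in the abstract can be illustrated with a short sketch. The assumptions are labeled: Python integers serve as bit arrays, expression trees are nested tuples such as `("AND", 0, ("NOT", 2))`, keyword matching is a naive whitespace split, and all function names are hypothetical.

```python
from typing import Dict, List, Union

Expr = Union[int, tuple]  # a token index value, or (operator, operand, ...)


def build_token_index(keywords: List[str]) -> Dict[str, int]:
    """Assign a token index value to each distinct query keyword."""
    return {kw: i for i, kw in enumerate(dict.fromkeys(keywords))}


def build_bit_arrays(token_index: Dict[str, int], documents: List[str]) -> List[int]:
    """One bit array per token; bit d is set when document d contains the keyword."""
    bit_arrays = [0] * len(token_index)
    for doc_id, text in enumerate(documents):
        words = set(text.lower().split())
        for keyword, token in token_index.items():
            if keyword in words:
                bit_arrays[token] |= 1 << doc_id
    return bit_arrays


def evaluate(expr: Expr, bit_arrays: List[int], num_docs: int) -> int:
    """Apply the expression tree's logical operators to the bit arrays."""
    if isinstance(expr, int):
        return bit_arrays[expr]
    op, *operands = expr
    values = [evaluate(o, bit_arrays, num_docs) for o in operands]
    if op == "NOT":
        return ((1 << num_docs) - 1) & ~values[0]  # mask to existing documents
    if op not in ("AND", "OR"):
        raise ValueError(f"unknown operator: {op}")
    result = values[0]
    for value in values[1:]:
        result = (result & value) if op == "AND" else (result | value)
    return result


def matching_documents(bits: int, num_docs: int) -> List[int]:
    """Convert a result bit array back into a list of document index values."""
    return [d for d in range(num_docs) if (bits >> d) & 1]


documents = ["red apple pie", "green apple", "red wine"]
tokens = build_token_index(["red", "apple", "wine"])   # red=0, apple=1, wine=2
arrays = build_bit_arrays(tokens, documents)

red_not_wine = ("AND", tokens["red"], ("NOT", tokens["wine"]))
hits = matching_documents(evaluate(red_not_wine, arrays, len(documents)), len(documents))
```

Because each query is reduced to integer operations over precomputed bit arrays, evaluating it does not require rescanning document text.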
Exposing Annotations in a Document
A technique is described herein for effectively exposing annotation information in a document for use by various applications. The technique involves generating a tag tree data structure that identifies a collection of tag elements associated with a document. The technique also generates an overlay data structure that identifies a collection of annotations associated with the document. The overlay data structure also links the annotations to corresponding parts identified in the tag tree data structure. The technique then uses the tag tree data structure and the overlay data structure to provide information to a document-consuming component that conveys an order in which one or more annotations appear in the document relative to one or more parts in the document. According to one illustrative aspect, at least one annotation described by the overlay data structure is an active annotation, corresponding to a transient annotation that has not been saved.
Search engines and systems with handheld document data capture devices
Embodiments of the disclosed innovations provide systems and methods for locating data associated with rendered documents. Some embodiments support the use of a handheld document data capture device.
METHODS AND SYSTEMS FOR PROVIDING A SEARCH SERVICE APPLICATION
A system for providing a search service application is disclosed and includes an application builder component that provides a search model for a first object of a plurality of objects. The search model is based at least on an end-user input field corresponding to a first attribute of the first object and a search result output field corresponding to a second attribute of the first object. The search model is also associated with a backend data store that supports a storage structure that stores information relating to the first object. The system also includes a deployment engine that automatically configures a search engine system associated with the backend data store to place a portion of indexed data into a first partition and to place another portion of indexed data into at least another partition based on the search model.
EFFICIENT RESOLUTION OF SYNTACTIC PATTERNS IN QUESTION AND ANSWER (QA) PAIRS IN AN N-ARY FOCUS COGNITIVE QA SYSTEM
Embodiments are provided for processing questions based on equivalence classes in a cognitive question answering system. A plurality of syntactic representations of a plurality of questions asked of the cognitive question answering system are provided. A plurality of syntactic representations of a plurality of passages ingested by the cognitive question answering system are provided. Pairs of question focus and candidate passage are mapped to form an equivalence class mapping, and the equivalence class mapping is used to determine an answer to one of the plurality of questions asked of the cognitive question answering system.