Patent classifications
G06F16/316
Recording medium recording indexed data generation program, indexed data generation method and retrieval method
A non-transitory computer-readable recording medium recording an indexed data generation program causing a computer to execute processing of generating ledger sheet output format data from ledger sheet data including a ledger sheet having a plurality of columns; generating index information for words, characters, or numerical values, the index information including positional information capable of specifying attributes of the plurality of columns and a positional relationship in the ledger sheet data between pieces of data corresponding to the plurality of columns included in the ledger sheet output format data; and outputting an output file including the index information and the ledger sheet output format data.
Exposing annotations in a document
A technique is described herein for effectively exposing annotation information in a document for use by various applications. The technique involves generating a tag tree data structure that identifies a collection of tag elements associated with a document. The technique also generates an overlay data structure that identifies a collection of annotations associated with the document. The overlay data structure also links the annotations to corresponding parts identified in the tag tree data structure. The technique then uses the tag tree data structure and the overlay data structure to provide information to a document-consuming component that conveys an order in which one or more annotations appear in the document relative to one or more parts in the document. According to one illustrative aspect, at least one annotation described by the overlay data structure is an active annotation, corresponding to a transient annotation that has not been saved.
COMPUTER-READABLE RECORDING MEDIUM, INDEX CREATION DEVICE, INDEX CREATION METHOD, COMPUTER-READABLE RECORDING MEDIUM, SEARCH DEVICE, AND SEARCH METHOD
An index creation device reads target text data therein and creates a bitmap index in which, with regard to each of a character or a word and a tag that appear in the target text data, an appearance position of each of the character or the word and the tag in text data is represented as bitmap data.
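As a rough illustration of the bitmap index idea in this abstract, the sketch below records, for each tag and each character, the positions where it appears as bits of an integer. The tokenization rule, the function names (`build_bitmap_index`, `find_sequence`), and the use of a Python integer as the bitmap are assumptions made for illustration; the abstract does not specify them.

```python
import re
from collections import defaultdict

# Assumption: a token is either a whole tag such as <a> or a single character.
TOKEN = re.compile(r"<[^>]+>|.", re.DOTALL)

def build_bitmap_index(text):
    """Map each token to an int whose set bits mark the token's positions."""
    tokens = TOKEN.findall(text)
    index = defaultdict(int)
    for pos, tok in enumerate(tokens):
        index[tok] |= 1 << pos          # set bit `pos` in this token's bitmap
    return dict(index), len(tokens)

def positions(bitmap):
    """List the set bit positions of a bitmap in ascending order."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

def find_sequence(index, seq):
    """Return start positions where the tokens in `seq` appear consecutively."""
    if not seq:
        return []
    hits = index.get(seq[0], 0)
    for offset, tok in enumerate(seq[1:], start=1):
        # Shift the later token's bitmap back so its bits align with the start.
        hits &= index.get(tok, 0) >> offset
    return positions(hits)
```

Under these assumptions, a search for a character sequence reduces to shifting and AND-ing bitmaps, which is the usual appeal of this kind of index.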
METHOD AND SYSTEM FOR CLAIM SCOPE LABELING, RETRIEVAL AND INFORMATION LABELING OF GENE SEQUENCE
Embodiments of the present disclosure provide a method and a system for labeling and retrieving the protection scope of claims and for labeling information of a gene sequence, wherein the method includes: recognizing a gene sequence from the claims of the current patent application; extracting descriptive texts of the gene sequence from the claims based on a preset keyword; determining similarity information of the gene sequence based on the extracted descriptive texts; and labeling the scope of the claims of the gene sequence based on the similarity information. In the technical solutions provided in the embodiments of the present disclosure, a sequence retrieval can be performed in a patent library, and the accuracy of the gene sequence retrieval can be improved.
EFFICIENT INDEXING FOR QUERYING ARRAYS IN DATABASES
A database system performs queries on fields storing arrays of a database (i.e., array fields) using de-duplication indexes. The system generates de-duplication indexes for array fields. The de-duplication indexes include unique entries for corresponding distinct values stored by the array fields. The system uses the de-duplication indexes to perform efficient queries specifying corresponding array fields. The system may further generate de-duplication indexes corresponding to one or more fields storing various types of values. In various embodiments, the system selects an optimal index from various indexes usable to execute a query, such as a de-duplication index and a conventional index.
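One way to picture a de-duplication index over an array field is a mapping from each distinct value to the rows containing it, with repeated values inside one array contributing a single entry. The sketch below is a minimal illustration under that assumption; the row layout, the field name `tags`, and the helper names are hypothetical and not taken from the abstract.

```python
from collections import defaultdict

def build_dedup_index(rows, field):
    """Index an array field: one entry per distinct value per row."""
    index = defaultdict(set)
    for row_id, row in rows.items():
        # set() collapses duplicates inside a single array value.
        for value in set(row.get(field, [])):
            index[value].add(row_id)
    return index

def query_contains(index, value):
    """Return the ids of rows whose array field contains `value`."""
    return sorted(index.get(value, ()))

# Hypothetical sample data: row 1 repeats the value "a" in its array.
rows = {
    1: {"tags": ["a", "a", "b"]},
    2: {"tags": ["b"]},
    3: {"tags": []},
}
tag_index = build_dedup_index(rows, "tags")
```

Because each distinct value appears once per row, a containment query returns row 1 once for `"a"` even though the array stores it twice.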
System and method for pre-indexing filtering and correction of documents in search systems
Embodiments as disclosed herein provide a search system with a pre-indexing filter that provides both a sophisticated and contextually tailored approach to filtering documents and a corrector that is adapted to alter a document that has been designated to be filtered out from the indexing process and determine if the altered document should be indexed. The alteration of the document may be tied to the attributes, rules or thresholds used to initially filter the document from the indexing process. The filtering criteria can thus be tailored to a specific context such that both the initial filtering and the alteration process may be better suited for application in that context.
IDENTIFYING MATCHING EVENT DATA FROM DISPARATE DATA SOURCES
Methods and apparatus consistent with the invention provide the ability to organize and build understandings of machine data generated by a variety of information-processing environments. Machine data is a product of information-processing systems (e.g., activity logs, configuration files, messages, database records) and represents the evidence of particular events that have taken place and been recorded in raw data format. In one embodiment, machine data is turned into a machine data web by organizing machine data into events and then linking events together.
DIFFERENTIAL INDEXING FOR FAST DATABASE SEARCH
Methods, systems, and computer programs are presented for improving search speed and quality using differential indexing. One method includes an operation for building a first index for a database, the first index being for first tokens resulting from normalizing words in input data. Further, the method includes building a second index for the database, the second index being for second tokens comprising words of the input data eliminated from the first index during the normalizing. The method further includes operations for receiving a raw query for a search of the database, and for generating a search query based on tokens of the raw query. The search query comprises a combined search of the first index and the second index. A search is performed based on the search query, and results of the search are returned for presentation on a display.
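The two-index scheme in this abstract can be sketched as follows. This is an illustrative reading, not the patented method: it assumes that normalization is simple lowercasing, that the "eliminated" words are the original surface forms that differ from their normalized tokens, and that the combined search adds one point per match in either index. All function names are hypothetical.

```python
import re
from collections import defaultdict

def normalize(word):
    # Assumption: normalization is plain lowercasing.
    return word.lower()

def build_indexes(docs):
    """Build the first (normalized) and second (eliminated forms) indexes."""
    first = defaultdict(set)
    second = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"\w+", text):
            token = normalize(word)
            first[token].add(doc_id)
            if token != word:
                # The surface form was lost in normalization; keep it here.
                second[word].add(doc_id)
    return first, second

def search(first, second, raw_query):
    """Search both indexes and rank documents by the number of matches."""
    scores = defaultdict(int)
    for word in re.findall(r"\w+", raw_query):
        for doc_id in first.get(normalize(word), ()):
            scores[doc_id] += 1
        for doc_id in second.get(word, ()):
            scores[doc_id] += 1
    # Highest score first; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

With this scoring, a query that preserves the original capitalization ranks documents containing that exact form above documents that match only after normalization.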
Online ranking of queries for sponsored search
A system and method for ranking query-advertisement combinations is disclosed. Embodiments use an online component to enhance and rank query-ad combinations. The query-ad combinations are then reranked with a trained factorization machine. The subsequent list of ranked query-ad combinations is then output. The output may be to an auction for determining ad-query combinations having the greatest expected revenue.
Systems and methods for caching structural elements of electronic documents
Systems and methods are disclosed herein for caching structural elements of electronic documents. A plurality of indices is stored in a database. The plurality of indices corresponds to locations within an electronic document of portions of a structural element. A mutation to the electronic document is received. Based on the plurality of indices, it is determined that the mutation modifies the structural element. Based on the determination, the structural element is updated. The updated structural element is displayed at a user device.
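The steps in this abstract (store indices for portions of a structural element, receive a mutation, decide from the indices whether the element is affected, update it) can be sketched as below. The `Portion` record, the restriction to insert mutations, and the boundary rule are assumptions chosen for illustration; the abstract does not define them.

```python
from dataclasses import dataclass

@dataclass
class Portion:
    """Cached location and text of one portion of a structural element."""
    start: int
    end: int    # exclusive
    text: str

def apply_insert(doc, portions, pos, inserted):
    """Insert `inserted` at `pos`; return the new doc and whether the cache changed.

    Assumption: an insert exactly at a portion's start shifts the portion,
    and an insert exactly at its end leaves it untouched.
    """
    new_doc = doc[:pos] + inserted + doc[pos:]
    changed = False
    for p in portions:
        if p.start < pos < p.end:
            # The mutation lands inside the portion: grow it and refresh its text.
            p.end += len(inserted)
            p.text = new_doc[p.start:p.end]
            changed = True
        elif pos <= p.start:
            # The mutation precedes the portion: shift its indices only.
            p.start += len(inserted)
            p.end += len(inserted)
    return new_doc, changed
```

A caller would redraw the structural element only when `changed` is true; otherwise the cached text remains valid and only the stored offsets move.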