Patent classifications
G06F16/325
Enabling Electronic Loan Documents
The system prepares PDF documents to be digitally populated or signed. The method may comprise converting a document into an image; detecting words on the document; searching the words for keywords; searching for an object on the document; determining an object field based on the keywords and the object; creating a tag with metadata about the object field; and associating the tag with the object field. The method may also comprise determining, by a processor, metadata about a document; creating, by the processor, a hash from the metadata; storing, by the processor, an association of the hash, the metadata and the document in a knowledge database; creating, by the processor, a new hash for a new document; comparing, by the processor, the hash with the new hash; and determining, by the processor, that the new document has characteristics similar to those of the document based on the comparing.
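The hash-and-compare steps in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name a hash function or say how hashes are compared, so SHA-256 over canonical JSON and a plain equality test are stand-ins, and `metadata_hash` and `KnowledgeBase` are hypothetical names.

```python
import hashlib
import json


def metadata_hash(metadata: dict) -> str:
    """Hash metadata deterministically; SHA-256 is an assumed choice."""
    # Sorting keys makes the hash independent of dict insertion order.
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class KnowledgeBase:
    """Hypothetical in-memory stand-in for the knowledge database."""

    def __init__(self):
        self._by_hash = {}  # hash -> list of (metadata, document id)

    def store(self, doc_id: str, metadata: dict) -> str:
        """Store the association of hash, metadata and document."""
        h = metadata_hash(metadata)
        self._by_hash.setdefault(h, []).append((metadata, doc_id))
        return h

    def similar_to(self, metadata: dict) -> list:
        """Return ids of stored documents whose hash equals the new hash."""
        matches = self._by_hash.get(metadata_hash(metadata), [])
        return [doc_id for _, doc_id in matches]


kb = KnowledgeBase()
kb.store("loan-001", {"page_count": 3, "fields": ["borrower", "signature"]})
print(kb.similar_to({"fields": ["borrower", "signature"], "page_count": 3}))
```

With an exact hash, only documents with identical metadata match; a real system aiming at looser similarity would need a different comparison, which the abstract leaves open.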
System and method for storing and querying document collections
A system stores document collections in a manner that facilitates efficient querying. Each document vector is hashed by applying a suitable hash function to the components of the vector. The hash function maps the vector to a particular hash value, corresponding to a particular hyperbox in the multidimensional space to which the vectors belong. The vector, or a pointer to the vector, is then stored in a hash table in association with the vector's hash value. Subsequently, given a document of interest, documents similar to the document of interest may be found by hashing the vector of the document of interest, and then returning the vectors that are associated, in the hash table, with the resulting hash value.
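One way to picture the hyperbox hashing is a uniform grid: each vector component is divided by a cell size and floored, and the resulting integer tuple serves as the hash value. The abstract only says "a suitable hash function", so the grid quantization, the `cell_size` parameter, and the names below are assumptions made for this sketch.

```python
import math
from collections import defaultdict


def hyperbox_hash(vector, cell_size=0.5):
    """Map a vector to the integer coordinates of the grid cell containing it."""
    return tuple(math.floor(x / cell_size) for x in vector)


class VectorIndex:
    """Hash table from hyperbox coordinates to the vectors stored in that box."""

    def __init__(self, cell_size=0.5):
        self.cell_size = cell_size
        self.table = defaultdict(list)

    def add(self, vector):
        self.table[hyperbox_hash(vector, self.cell_size)].append(vector)

    def query(self, vector):
        """Return the stored vectors that share the query vector's hyperbox."""
        key = hyperbox_hash(vector, self.cell_size)
        return list(self.table.get(key, []))


index = VectorIndex(cell_size=0.5)
index.add((0.1, 0.2))
index.add((0.3, 0.4))
index.add((0.9, 0.1))
print(index.query((0.2, 0.2)))  # vectors in the same box as the query
```

A query costs one hash computation and one table lookup, at the price of missing near neighbours that fall just across a box boundary.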
Technologies for file sharing
This disclosure enables computing technologies for sharing files securely and selectively between predefined user groups based on predefined workflows. For each of the predefined workflows, the files are shared based on a data structure storing document identifiers and metadata tags, with the document identifiers mapping onto the metadata tags.
Explaining semantic search
The invention uses document retrieval to explain to a human user the properties of a query object that are revealed by a machine learning procedure, lending interpretability to the procedure. A query object is compared to reference objects by transforming the query object and reference objects into representative tokens. Reference objects with many tokens in common with the query object are returned as relevant result objects by a document retrieval system. The token representation furthermore admits comparison between features of the query object and matched features of the reference object or between the query object and groups of reference objects having common features, thus emphasising characteristics of the query and reference objects of semantic importance to the user based on the intention of their search. Embodiments include retrieval of 2-dimensional or 3-dimensional images, audio clips, and text.
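The token-overlap retrieval described here can be illustrated with a toy example. In the abstract the tokens come from a machine learning procedure; in this sketch lowercase whitespace-separated words stand in for them, and `tokenize` and `retrieve` are hypothetical names. Returning the shared tokens alongside each result is one simple way to show why a reference object matched.

```python
def tokenize(text):
    """Stand-in tokenizer: lowercase words (the source derives tokens from an ML model)."""
    return set(text.lower().split())


def retrieve(query, references, top_k=2):
    """Rank reference objects by the number of tokens shared with the query."""
    query_tokens = tokenize(query)
    scored = []
    for ref_id, text in references.items():
        shared = query_tokens & tokenize(text)
        if shared:
            scored.append((len(shared), ref_id, sorted(shared)))
    # Most shared tokens first; ties broken by reference id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(ref_id, shared) for _, ref_id, shared in scored[:top_k]]


references = {
    "a": "red round apple",
    "b": "green round pear",
    "c": "blue square box",
}
for ref_id, shared in retrieve("red round ball", references):
    print(ref_id, "matched on", shared)
```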
Data structures for efficient storage and updating of paragraph vectors
Systems and methods involving data structures for efficient management of paragraph vectors for textual searching are described. A database may contain records, each associated with an identifier and including a text string and timestamp. A look-up table may contain entries for text strings from the records, each entry associating: a paragraph vector for a respective unique text string, a hash of the respective unique text string, and a set of identifiers of records containing the respective unique text string. A server may receive from a client device an input string, compute a hash of the input string, and determine matching table entries, each containing a hash identical to that of the input string, or a paragraph vector similar to one calculated for the input string. A prioritized list of identifiers from the matching entries may be determined based on timestamps, and the prioritized list may be returned to the client.
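A minimal sketch of the look-up table might look like the following. It covers only the exact-hash path; matching by paragraph-vector similarity is omitted. The `embed` callable is a hypothetical placeholder for a paragraph-vector model, SHA-256 is an assumed hash, and ordering results newest-first is an assumed reading of "prioritized ... based on timestamps".

```python
import hashlib
from dataclasses import dataclass, field


def text_hash(text: str) -> str:
    """Hash of a text string; SHA-256 is an assumed choice."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Entry:
    paragraph_vector: list
    record_ids: set = field(default_factory=set)


class LookupTable:
    def __init__(self, embed):
        self.embed = embed    # hypothetical paragraph-vector function
        self.entries = {}     # text hash -> Entry (one per unique text string)
        self.timestamps = {}  # record id -> timestamp

    def add_record(self, record_id, text, timestamp):
        h = text_hash(text)
        if h not in self.entries:
            # The vector is computed once per unique string, not once per record.
            self.entries[h] = Entry(self.embed(text))
        self.entries[h].record_ids.add(record_id)
        self.timestamps[record_id] = timestamp

    def lookup(self, text):
        """Return ids of records with identical text, most recent first."""
        entry = self.entries.get(text_hash(text))
        if entry is None:
            return []
        return sorted(entry.record_ids,
                      key=lambda rid: self.timestamps[rid], reverse=True)


table = LookupTable(embed=lambda text: [float(len(text))])  # toy embedding
table.add_record(1, "disk full", timestamp=100)
table.add_record(2, "disk full", timestamp=200)
table.add_record(3, "network down", timestamp=150)
print(table.lookup("disk full"))
```

Keying entries by the hash of the unique string means repeated strings share one stored vector, which is the storage saving the abstract points to.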
Prediction model generation system, method, and program
A prediction model generation system is provided that is capable of generating a prediction model for accurately predicting a relationship between an ID of a record in first master data and an ID of a record in second master data. Co-clustering means 71 performs co-clustering processing for performing co-clustering on first IDs and second IDs in accordance with first master data, second master data, and fact data indicating a relationship between each of the first IDs and each of the second IDs. Prediction model generation means 72 performs prediction model generation processing for generating a prediction model for each combination of a first ID cluster and a second ID cluster. The prediction model uses the relationship between each of the first IDs and each of the second IDs as an objective variable. The first ID cluster serves as a cluster of the first IDs. The second ID cluster serves as a cluster of the second IDs. The prediction model generation processing and the co-clustering processing are repeated until it is determined that a prescribed condition is satisfied.
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND STORAGE MEDIUM
An information processing apparatus includes an acquisition unit configured to acquire data with a first schema and data with a second schema, the data with the first schema including case-insensitive first identification information in identification information for management of data, the data with the second schema including the first identification information corresponding to the identification information and case-sensitive second identification information in the identification information, and a management unit configured to manage the data with the first schema and the data with the second schema in a retrievable manner.
METHOD, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR GENERATING MAP UPDATE DATA USING SUBTREE DATA STRUCTURES
A method, a system, and a computer program product for updating a map database are disclosed herein. The method comprises receiving a map update request including a subtree data structure and a bounding box identifying a region of a map. The method may further comprise obtaining a plurality of second map area identifiers and the corresponding map area content. The method may further comprise computing a plurality of second digests corresponding to the plurality of second map area identifiers, based on the plurality of second map area identifiers and the second map area content, and generating the map update data for the region, based on the plurality of second digests and the subtree data structure.
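The digest comparison can be sketched as below. The abstract does not specify the subtree layout or the digest function, so this sketch assumes the subtree data structure carries digests of the client's current map areas, flattens it to a plain dictionary, and uses SHA-256 over the area identifier plus content. `area_digest` and `changed_areas` are hypothetical names, and the bounding-box filtering step is left out.

```python
import hashlib


def area_digest(area_id: str, content: bytes) -> str:
    """Digest over a map area identifier and its content (assumed SHA-256)."""
    h = hashlib.sha256()
    h.update(area_id.encode("utf-8"))
    h.update(b"\x00")  # separator so identifier and content cannot blur together
    h.update(content)
    return h.hexdigest()


def changed_areas(client_digests: dict, server_areas: dict) -> dict:
    """Return server areas whose digest differs from, or is absent in, the client's."""
    updates = {}
    for area_id, content in server_areas.items():
        if client_digests.get(area_id) != area_digest(area_id, content):
            updates[area_id] = content
    return updates


client_areas = {"t1": b"roads-v1", "t2": b"parks-v1"}
client_digests = {k: area_digest(k, v) for k, v in client_areas.items()}
server_areas = {"t1": b"roads-v1", "t2": b"parks-v2", "t3": b"lakes-v1"}
print(sorted(changed_areas(client_digests, server_areas)))
```

Only areas whose digests disagree need to be sent, so unchanged areas cost a digest comparison rather than a content transfer.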
SYSTEMS AND METHODS FOR INDEXING GEOLOGICAL FEATURES
Systems and methods for indexing geological features are disclosed. In one embodiment, a method for indexing geological features includes accessing a database storing a plurality of map objects that originate from documents. Each map object includes a map defined by a geographical boundary and a text caption. The method includes, for each map object, determining a plurality of geohashes within the geographical boundary, and includes, for each map object, comparing terms of the text caption with a list of geological keywords. For each map object, the method includes identifying one or more geological noun phrases within the text caption that match one or more geological noun phrases of the list. The method includes determining, for each geological noun phrase, one or more geohashes associated with the geological noun phrase and, for each geohash, determining a frequency that the geohash is associated with the geological noun phrase.
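The final counting step, how often each geohash is associated with each geological noun phrase, can be sketched as follows. The sketch assumes the geohashes inside each map boundary have already been computed, and it uses a case-insensitive substring match as a stand-in for noun-phrase identification; `index_phrases` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def index_phrases(map_objects, keywords):
    """Build phrase -> Counter of geohashes from captions and precomputed geohashes."""
    index = defaultdict(Counter)
    for obj in map_objects:
        caption = obj["caption"].lower()
        for phrase in keywords:
            # Substring match stands in for real noun-phrase extraction.
            if phrase in caption:
                index[phrase].update(obj["geohashes"])
    return index


map_objects = [
    {"caption": "Fault line near the Salt Dome", "geohashes": ["9v6k", "9v6m"]},
    {"caption": "Salt dome cross-section", "geohashes": ["9v6k"]},
]
index = index_phrases(map_objects, ["salt dome", "fault line"])
print(index["salt dome"].most_common())
```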