Patent classifications
G06V30/1983
MULTI-WORD PHRASE BASED ANALYSIS OF ELECTRONIC DOCUMENTS
A document processing system is configured to identify, for each accessed electronic document in a first set of multiple electronic documents, a set of identified multi-word phrases determined to be in ordered text information in the accessed electronic document, each multi-word phrase of the set including adjacent words in the ordered text information. The system is further configured to determine, for each accessed electronic document in the first set, a selected document type from a first set of document types based at least on an analysis of the set of identified multi-word phrases with respect to multi-word-phrase characteristics identified by a first definition and associated with each document type in the first set of document types, the first set of document types being associated with a first document-set type.
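As a rough illustration of the claimed analysis (not the patented implementation), the sketch below extracts adjacent-word bigrams from a document's ordered text and scores them against per-type characteristic phrase sets; the `definitions` dictionary is a hypothetical stand-in for the "first definition" of multi-word-phrase characteristics.

```python
from collections import Counter

def extract_phrases(text, n=2):
    """Collect multi-word phrases (n-grams of adjacent words) from ordered text."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def classify(text, type_definitions):
    """Pick the document type whose characteristic phrases best match the text."""
    phrases = extract_phrases(text)
    scores = {
        doc_type: sum(count for phrase, count in phrases.items() if phrase in characteristic)
        for doc_type, characteristic in type_definitions.items()
    }
    return max(scores, key=scores.get)

# Hypothetical per-type characteristic bigrams.
definitions = {
    "invoice": {("amount", "due"), ("invoice", "number")},
    "resume": {("work", "experience"), ("skills", "summary")},
}
doc_type = classify("Invoice number 42: amount due is $10", definitions)
```

In this toy setup the bigrams `("invoice", "number")` and `("amount", "due")` both occur, so the "invoice" type wins.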
Unsupported character code detection mechanism
An electronic device is described which comprises a memory storing a font comprising a mapping from character codes to glyphs. The memory also stores character information comprising at least information about one or more unsupported character codes. A processor of the device processes text content comprising character codes using the font to create text output by converting the character codes into glyphs for display at a display associated with the electronic device. The processor is configured to capture the text output and detect whether the text output comprises at least one unsupported character code; and, in the case that at least one unsupported character code is detected, to output to a user of the device information about the unsupported character code obtained from the character information.
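A minimal sketch of this detection mechanism, with a hypothetical `FONT` mapping (code points to glyph names) and `CHAR_INFO` table standing in for the stored font and character information:

```python
# Hypothetical font: a mapping from character codes (code points) to glyph names.
FONT = {ord(c): f"glyph_{c}" for c in "Hello word!"}  # covers only these characters

# Character information: notes about codes the font does not support.
CHAR_INFO = {0x20AC: "EURO SIGN (U+20AC) is not supported by the installed font"}

def render(text):
    """Convert character codes to glyphs, capturing any unsupported codes."""
    glyphs, unsupported = [], []
    for ch in text:
        code = ord(ch)
        if code in FONT:
            glyphs.append(FONT[code])
        else:
            glyphs.append(".notdef")  # typical fallback glyph for a missing mapping
            unsupported.append(code)
    # Information about unsupported codes to surface to the user.
    messages = [CHAR_INFO.get(code, f"U+{code:04X} unsupported") for code in unsupported]
    return glyphs, messages

glyphs, messages = render("Hello €")
```

Here the euro sign falls outside the font's mapping, so the output is captured with a fallback glyph and the user-facing message is looked up from `CHAR_INFO`.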
Textual representation of an image
At least a computer-implemented method and an apparatus for processing an image are described. In examples, numeric values for at least one property of the image are determined. These values are then converted into at least one corresponding text character, the conversion being independent of any text content within the image. This enables a text representation of the image to be generated that contains these text characters. The text representation may be used to index and search for the image.
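One way such a conversion could work (an assumption for illustration, not the patented method) is to compute a numeric property such as mean brightness per image row and quantize each value into a character alphabet:

```python
def image_to_text(pixels, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Map numeric image properties to text characters, independent of any
    text content in the image. Property used here: mean brightness per row."""
    chars = []
    for row in pixels:
        mean = sum(row) / len(row)                # property value in 0..255
        index = int(mean / 256 * len(alphabet))   # quantize into the alphabet
        chars.append(alphabet[min(index, len(alphabet) - 1)])
    return "".join(chars)

# A 3x4 grayscale image: dark, mid-gray, and bright rows.
image = [[0, 10, 5, 3], [120, 130, 125, 128], [250, 255, 252, 251]]
text_repr = image_to_text(image)
```

The resulting string can then be stored in an ordinary text index, so image search reduces to string search.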
GENERATING EVENT DEFINITIONS BASED ON SPATIAL AND RELATIONAL RELATIONSHIPS
Data from one or more sensors is input to a workflow and fragmented to produce HyperFragments. The HyperFragments of input data are processed by a plurality of Distributed Experts, who make decisions about what is included in the HyperFragments or add details relating to elements included therein, producing tagged HyperFragments, which are maintained as tuples in a Semantic Database. Algorithms are applied to process the HyperFragments to create an event definition corresponding to a specific activity. Based on related activity included in historical data and on ground truth data, the event definition is refined to produce a more accurate event definition. The resulting refined event definition can then be used with the current input data to more accurately detect when the specific activity is being carried out.
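A heavily simplified sketch of this workflow, under the assumption that "Distributed Experts" can be modeled as tagging functions, that tagged fragments are stored as plain tuples, and that an event definition is a set of tags required to co-occur (all names here are hypothetical):

```python
def fragment(stream, size):
    """Split a sensor stream into fixed-size fragments ("HyperFragments")."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]

# Two "expert" functions that each contribute a tag to a fragment.
def expert_motion(frag):
    return "motion" if max(frag) - min(frag) > 5 else "still"

def expert_level(frag):
    return "high" if sum(frag) / len(frag) > 50 else "low"

semantic_db = []  # tuples: (fragment_id, tag_from_expert_1, tag_from_expert_2)
stream = [1, 2, 60, 80, 90, 91, 3, 2]
for i, frag in enumerate(fragment(stream, 2)):
    semantic_db.append((i, expert_motion(frag), expert_level(frag)))

# Refine an event definition: keep only the tags shared by all fragments
# that ground truth says show the specific activity.
ground_truth_ids = {1, 2}  # hypothetical ground truth
event_definition = set.intersection(
    *({t for t in semantic_db[i][1:]} for i in ground_truth_ids)
)

def detect(entry, definition):
    """An entry matches when it carries every tag in the definition."""
    return definition <= set(entry[1:])

detections = [entry[0] for entry in semantic_db if detect(entry, event_definition)]
```

The intersection step is a stand-in for the abstract's refinement: tags that disagree across ground-truth examples are dropped, leaving a definition that detects both known occurrences.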
Deep-learning based text correction method and apparatus
A text correction method and apparatus can take advantage of a greatly reduced number of error-ground-truth pairs to train a deep learning model. To generate these pairs, different characters in a ground truth word are replaced with a symbol that does not appear in any ground truth word, producing error words that are each paired with that ground truth word to provide error-ground-truth word pairs. This process may be repeated for all ground truth words for which training is to be performed. In embodiments, pairs of characters in a ground truth word may instead be replaced with the symbol to generate the error words, which are again paired with that ground truth word. This process, too, may be repeated for all ground truth words for which training is to be performed.
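The pair-generation step described above can be sketched directly; `#` is a placeholder for the out-of-vocabulary symbol, and the two-character variant below is one possible reading in which both characters of each adjacent pair are replaced:

```python
SYMBOL = "#"  # a symbol assumed to appear in no ground truth word

def make_pairs(word):
    """Replace each character of a ground truth word with SYMBOL to create
    error words, pairing each with the original word."""
    return [(word[:i] + SYMBOL + word[i + 1:], word) for i in range(len(word))]

def make_pairs_two(word):
    """Variant: replace each adjacent pair of characters with the symbol."""
    return [(word[:i] + SYMBOL * 2 + word[i + 2:], word) for i in range(len(word) - 1)]

pairs = make_pairs("cat")
```

For a word of length n this yields n single-character pairs (and n-1 two-character pairs), so the training set grows only linearly with vocabulary size.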
METHOD FOR EXTRACTING ENTRIES FROM A DATABASE
The present teachings generally relate to a method for extracting one or more matched entries from a first database using a second database, including the steps of: identifying a plurality of second entities from the second database by filtering a plurality of entities of the second database according to one or more identification rules; inputting at least one keyword as a query to extract the one or more matched entries from the first database; linking the at least one keyword to one or more second entities according to one or more linking rules to define one or more linked second entities; matching the one or more linked second entities to one or more entries in the first database according to one or more matching rules to define the one or more matched entries; and extracting the one or more matched entries from the first database.
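The five steps of this pipeline can be sketched as a single function whose rules are passed in as predicates; the alias-to-name data below is a hypothetical example of a second database used to resolve a keyword:

```python
def extract_matched_entries(first_db, second_db, keyword,
                            identification_rule, linking_rule, matching_rule):
    """Sketch of the pipeline: identify, link, match, extract."""
    # 1. Identify second entities by filtering the second database.
    second_entities = [e for e in second_db if identification_rule(e)]
    # 2./3. Link the input keyword to second entities.
    linked = [e for e in second_entities if linking_rule(keyword, e)]
    # 4./5. Match linked entities to first-database entries and extract them.
    return [entry for entry in first_db
            if any(matching_rule(e, entry) for e in linked)]

# Hypothetical data: the second DB maps aliases to canonical names.
second_db = [{"alias": "NYC", "name": "New York City"},
             {"alias": "LA", "name": "Los Angeles"},
             {"alias": "", "name": "Unknown"}]
first_db = [{"city": "New York City", "population": 8_300_000},
            {"city": "Chicago", "population": 2_700_000}]

matches = extract_matched_entries(
    first_db, second_db, "NYC",
    identification_rule=lambda e: bool(e["alias"]),
    linking_rule=lambda kw, e: kw == e["alias"],
    matching_rule=lambda e, entry: e["name"] == entry["city"],
)
```

The query "NYC" never appears in the first database; it is linked through the second database's alias entry and then matched on the canonical name.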
Natural language processing via a two-dimensional symbol having multiple ideograms contained therein
A string of natural language text is received and formed into a multi-layer 2-D symbol in a first computing system. The 2-D symbol comprises a matrix of N×N pixels of data representing a super-character. The matrix is divided into M×M sub-matrices, each sub-matrix containing (N/M)×(N/M) pixels. N and M are positive integers, and N is preferably a multiple of M. Each sub-matrix represents one ideogram defined in an ideogram collection set. The super-character represents a meaning formed from a specific combination of a plurality of ideograms. The meaning of the super-character is learned in a second computing system by using an image processing technique to classify the 2-D symbol, which is formed in the first computing system and transmitted to the second computing system. The image processing technique includes predefining a set of categories and determining a probability for associating each of the predefined categories with the meaning of the super-character.
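The tiling geometry can be sketched as follows, with a hypothetical `render` callback standing in for the ideogram collection set (here it just fills each sub-matrix with the ideogram's index):

```python
def form_symbol(ideograms, N, M, render):
    """Form an N×N 2-D symbol from M×M ideograms, each drawn as an
    (N/M)×(N/M) sub-matrix by `render`."""
    assert N % M == 0 and len(ideograms) == M * M
    s = N // M  # side length of one sub-matrix
    symbol = [[0] * N for _ in range(N)]
    for k, ideo in enumerate(ideograms):
        r0, c0 = (k // M) * s, (k % M) * s  # top-left corner of sub-matrix k
        tile = render(ideo, s)
        for r in range(s):
            for c in range(s):
                symbol[r0 + r][c0 + c] = tile[r][c]
    return symbol

# Hypothetical renderer: fill the sub-matrix with the ideogram's index value.
render = lambda ideo, s: [[ideo] * s for _ in range(s)]
symbol = form_symbol([1, 2, 3, 4], N=4, M=2, render=render)
```

With N=4 and M=2, four ideograms become four 2×2 tiles of one 4×4 symbol, which could then be fed to an image classifier over the predefined categories.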
Method and system for providing assistance by multi-function device for document preparation
The disclosed embodiments illustrate a method and system for providing assistance for document preparation. The method includes processing, by a multi-function device, one or more portions for one or more field names in an electronic document. The electronic document corresponds to a hand-filled document that comprises a character string in a first format for a field name. Further, the one or more portions are processed to determine a second format and a location of each character string. A set of information is received in a pre-specified format for the one or more field names from a user-computing device. A field value for each of the processed one or more portions is determined based on a match between the character string and key strings associated with the field names. The electronic document is updated by replacing the processed one or more portions with the corresponding determined field value at the location.
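The match-and-replace step might look like the following sketch, where the key strings for each field name and the pre-specified values are hypothetical inputs and the "location" is found by a simple `key: value` line pattern:

```python
import re

def fill_fields(document, field_values, key_strings):
    """Replace hand-filled portions next to recognized field names with
    values supplied in a pre-specified format."""
    filled = document
    for field, keys in key_strings.items():
        for key in keys:
            # Match "<key>: <rest of line>" and swap in the determined value.
            pattern = re.compile(rf"({re.escape(key)}\s*:\s*)([^\n]*)")
            if pattern.search(filled) and field in field_values:
                filled = pattern.sub(rf"\g<1>{field_values[field]}", filled)
    return filled

# Hypothetical OCR output of a hand-filled form, with noisy values.
doc = "Name: J0hn D0e\nDOB: 1-1-90"
values = {"name": "John Doe", "date_of_birth": "1990-01-01"}
keys = {"name": ["Name", "Full Name"], "date_of_birth": ["DOB", "Date of Birth"]}
result = fill_fields(doc, values, keys)
```

Each field name is recognized through any of its key strings, so the same value fills a field whether the form says "DOB" or "Date of Birth".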
Identifying visual objects depicted in video data using video fingerprinting
Systems and methods are described for using video fingerprinting to detect depiction of one or more objects of interest in video data. An object of interest may first be identified in one or more frames of video data. A system may then create a digital video fingerprint representing the one or more frames in which the object is depicted. A potentially large amount of subsequent video data may then be received or retrieved, and the system may determine that the object appears in a portion of the subsequent video data based at least in part by identifying that the digital video fingerprint at least substantially matches a portion of the subsequent video data.
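A toy version of this matching, assuming frames are short pixel lists and a fingerprint is a sequence of coarsely quantized frames; "substantially matches" is modeled as a per-frame agreement threshold:

```python
def frame_fingerprint(frame):
    """Coarse fingerprint of one frame: quantize each pixel to 4 levels."""
    return tuple(p // 64 for p in frame)

def fingerprint(frames):
    """Fingerprint of a clip: the sequence of its frame fingerprints."""
    return [frame_fingerprint(f) for f in frames]

def find_match(object_fp, video, threshold=0.9):
    """Slide the object fingerprint over the video; report start positions
    where the fraction of matching frame fingerprints reaches the threshold."""
    video_fp = fingerprint(video)
    n = len(object_fp)
    hits = []
    for start in range(len(video_fp) - n + 1):
        window = video_fp[start:start + n]
        score = sum(a == b for a, b in zip(object_fp, window)) / n
        if score >= threshold:
            hits.append(start)
    return hits

# Tiny 4-pixel frames standing in for real video frames.
object_frames = [[10, 200, 30, 40], [15, 210, 35, 45]]
video = [[0, 0, 0, 0], [12, 205, 33, 41], [14, 208, 36, 44], [255, 255, 255, 255]]
hits = find_match(fingerprint(object_frames), video)
```

Quantizing before comparison is what makes the match "substantial" rather than exact: the subsequent video's pixel values differ from the original frames, yet the fingerprints still align at one position.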
Similarity processing method, apparatus, server and storage medium
The present application discloses a similarity processing method, an apparatus, a server and a storage medium, and relates to the fields of information processing and natural language processing. The specific implementation solution is as follows: acquiring a first character string and a second character string; determining a pronunciation pattern similarity and a character pattern similarity between the first character string and the second character string; and determining a comprehensive similarity between the first character string and the second character string, based on the pronunciation pattern similarity and the character pattern similarity.
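The combination step can be sketched with a weighted average; the pronunciation lookup table below is a hypothetical stand-in for a real phonetic dictionary (e.g. pinyin for Chinese), and `difflib.SequenceMatcher` stands in for whichever pattern-similarity measure the implementation uses:

```python
from difflib import SequenceMatcher

# Hypothetical pronunciation lookup; a real system would use a phonetic dictionary.
PRONUNCIATION = {"night": "nait", "knight": "nait", "nit": "nit"}

def similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()

def comprehensive_similarity(s1, s2, w_pron=0.5, w_char=0.5):
    """Combine pronunciation-pattern and character-pattern similarity
    into a single comprehensive score."""
    pron = similarity(PRONUNCIATION.get(s1, s1), PRONUNCIATION.get(s2, s2))
    char = similarity(s1, s2)
    return w_pron * pron + w_char * char

score = comprehensive_similarity("night", "knight")
```

"night" and "knight" share a pronunciation but differ in spelling, so the comprehensive score sits between the perfect pronunciation similarity and the lower character similarity; the weights let an application favor one pattern over the other.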