Patent classifications
G06V30/1914
INFORMATION PROCESSING APPARATUS AND METHOD OF SEARCHING FOR SIMILAR DATA
An information processing apparatus stores first and second registered feature data respectively expressing first and second features of registered data, generates first and second subject feature data respectively expressing the first and second features of subject data, calculates a first degree of dissimilarity between the first registered feature data and the first subject feature data using a first computational process that exhibits symmetry so that a computational result does not change when two input values are interchanged, calculates a second degree of dissimilarity between the second registered feature data and the second subject feature data using a second computational process that exhibits antisymmetry so that a computational result changes when the two input values are interchanged, and selects the registered data based on the first and second degrees of dissimilarity.
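The contrast between the two computational processes can be illustrated with a minimal sketch. The abstract does not name the actual measures, so everything here is an assumption: Euclidean distance stands in for the symmetric process, a one-sided excess sum stands in for the order-dependent process, and the function names, the `weight` parameter, and the additive combination are hypothetical.

```python
import math


def symmetric_dissimilarity(a, b):
    """Euclidean distance: swapping a and b gives the same value."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def order_dependent_dissimilarity(registered, subject):
    """One-sided excess (assumed measure): counts only how far the
    subject exceeds the registered value, so swapping the inputs
    generally changes the result."""
    return sum(max(s - r, 0.0) for r, s in zip(registered, subject))


def select_registered(registered_items, subject_first, subject_second, weight=1.0):
    """Return the key of the registered item with the lowest combined score.

    registered_items maps a key to (first_feature, second_feature).
    Adding the two dissimilarities is an illustrative choice only.
    """
    def score(item):
        first, second = item
        return (symmetric_dissimilarity(first, subject_first)
                + weight * order_dependent_dissimilarity(second, subject_second))

    return min(registered_items, key=lambda key: score(registered_items[key]))


# Hypothetical registered data: key -> (first feature, second feature)
registered = {
    "doc_a": ([0.0, 1.0], [2.0, 2.0]),
    "doc_b": ([3.0, 4.0], [0.0, 0.0]),
}
best = select_registered(registered, [0.0, 1.5], [1.0, 3.0])
print(best)
```

Here the first feature pair is compared without regard to argument order, while the second comparison treats the registered and subject data differently, which is the property the abstract attributes to the second process.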
FINGERPRINT AUTHENTICATION DEVICE, DISPLAY DEVICE INCLUDING THE SAME, AND METHOD OF AUTHENTICATING FINGERPRINT
A fingerprint authentication device includes: a sensor unit configured to output a sensing signal by sensing a fingerprint; an image processing unit configured to generate a fingerprint image based on the sensing signal; a storage unit configured to store a template including an enrolled image; and a learning unit configured to generate a first pseudo image and add the first pseudo image to the template.
Information processing apparatus and non-transitory computer readable medium for selecting a proper version of a recognition dictionary that is not necessarily a latest version
An information processing apparatus includes a selection unit that, when a target document is recognized, selects a first mode in which a latest version of a recognition dictionary is applied, or a second mode in which a version of the recognition dictionary is applied, the version of the recognition dictionary having a highest correct answer rate among plural versions different from the latest version, the correct answer rate being obtained from a recognition result and a confirmation or correction result of each of plural documents.
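The version selection described above might be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the history layout (version number mapped to pairs of recognized and confirmed values), the mode names, and both function names are hypothetical, and the abstract does not say how the mode itself is chosen.

```python
def correct_answer_rate(results):
    """Fraction of results where the recognized value matched the
    value after human confirmation or correction."""
    if not results:
        return 0.0
    matches = sum(1 for recognized, confirmed in results if recognized == confirmed)
    return matches / len(results)


def select_dictionary_version(history, latest, mode):
    """Pick a dictionary version.

    mode "latest": always use the latest version (first mode).
    mode "best_previous": use the non-latest version with the highest
    correct answer rate (second mode).
    """
    if mode == "latest":
        return latest
    older = {version: results for version, results in history.items()
             if version != latest}
    return max(older, key=lambda version: correct_answer_rate(older[version]))


# Hypothetical history: version -> [(recognized, confirmed), ...]
history = {
    1: [("A", "A"), ("B", "8")],
    2: [("A", "A"), ("B", "B")],
    3: [("A", "4"), ("B", "B")],
}
chosen = select_dictionary_version(history, latest=3, mode="best_previous")
print(chosen)
```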
Electronic document data extraction
Methods, systems, and computer storage media are provided for data extraction. A target document representation may be generated based on modified text of a target electronic document. A measure of similarity may be determined between the target document representation and a reference document representation, which may be based on modified text of a reference electronic document. Based on the measure of similarity, the reference document representation may be selected. An extraction model associated with the selected reference document representation can then be used to extract data from the target document.
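As a rough sketch of the selection step, the example below builds a representation from modified text and picks the most similar reference. The abstract does not specify the modification, the representation, or the similarity measure, so the choices here are assumptions: digits are replaced with a placeholder token, the representation is a bag of words, and cosine similarity is the measure. All names are hypothetical.

```python
import math
import re
from collections import Counter


def represent(text):
    """Modify the text (lowercase, digits replaced by a placeholder)
    and return a bag-of-words representation."""
    modified = re.sub(r"\d+", "<num>", text.lower())
    return Counter(modified.split())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count representations."""
    dot = sum(a[token] * b[token] for token in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def select_reference(target_text, references):
    """Return the name of the reference most similar to the target."""
    target = represent(target_text)
    return max(references,
               key=lambda name: cosine_similarity(target, represent(references[name])))


# Hypothetical reference documents, each of which would have its own
# extraction model in the described system.
references = {
    "invoice": "Invoice 1001 total due 250",
    "receipt": "Receipt thank you for shopping 42",
}
selected = select_reference("Invoice 2093 total due 980", references)
print(selected)
```

Replacing the variable parts of the text before comparison means two documents that share a layout but differ in their values still compare as close, so the extraction model tied to the selected reference is likely to fit the target.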
BOOK SCANNING USING MACHINE-TRAINED MODEL
This application discloses a technology for flattening a photographed page of a book and straightening texts therein. The technology uses one or more mathematical models to represent a curved shape of the photographed page with certain parameters. The technology also uses one or more photographic image processing techniques to dewarp the photographed page using the parameters of the curved shape. The technology uses one or more additional parameters that represent certain features of the photographed page to dewarp the photographed page.
METHOD OF GENERATING FONT DATABASE, AND METHOD OF TRAINING NEURAL NETWORK MODEL
A method of generating a font database and a method of training a neural network model are provided, which relate to the field of artificial intelligence, in particular to computer vision and deep learning technology. The method of generating the font database includes: determining, by using a trained similarity comparison model, the basic font database most similar to handwriting font data of a target user among a plurality of basic font databases as a candidate font database; and adjusting, by using a trained basic font database model for generating the candidate font database, the handwriting font data of the target user, so as to obtain a target font database for the target user.
Mixup image captioning
In an approach to augmenting caption datasets, one or more computer processors sample a ratio lambda from a probability distribution based on a pair of datapoints contained in a dataset, wherein each datapoint in the pair of datapoints comprises an image and an associated caption; extend the dataset by generating one or more new datapoints based on the sampled ratio lambda for each pair of datapoints in the dataset, wherein the sampled ratio lambda incorporates an interpolation of features associated with the pair of datapoints into the generated one or more new datapoints; identify one or more objects contained within a subsequent image utilizing an image model trained utilizing the extended dataset; generate a subsequent caption for one or more identified objects contained within the subsequent image utilizing a language generating model trained utilizing the extended dataset.
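The dataset extension step can be sketched as below. This is a minimal illustration, assuming that both the image and the caption are already encoded as equal-length feature vectors and that lambda is drawn from a Beta(alpha, alpha) distribution, which is the usual choice in mixup but is not stated in the abstract. The function names and the `alpha` default are hypothetical.

```python
import random


def mix(a, b, lam):
    """Linear interpolation of two feature vectors with ratio lam."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def mixup_extend(dataset, alpha=0.4, rng=None):
    """Return the dataset plus one interpolated datapoint per pair.

    Each datapoint is (image_features, caption_features). A fresh
    lambda is sampled for every pair and applied to both parts.
    """
    rng = rng or random.Random(0)
    new_points = []
    for i in range(len(dataset)):
        for j in range(i + 1, len(dataset)):
            lam = rng.betavariate(alpha, alpha)  # assumed distribution
            (img_a, cap_a), (img_b, cap_b) = dataset[i], dataset[j]
            new_points.append((mix(img_a, img_b, lam), mix(cap_a, cap_b, lam)))
    return dataset + new_points


# Hypothetical toy dataset of three (image, caption) feature pairs
dataset = [
    ([0.0, 0.0], [1.0, 0.0]),
    ([1.0, 1.0], [0.0, 1.0]),
    ([0.5, 0.2], [0.3, 0.7]),
]
extended = mixup_extend(dataset)
print(len(extended))
```

The image model and the language generating model mentioned in the abstract would then be trained on the extended dataset; that training is outside the scope of this sketch.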
METHOD AND APPARATUS FOR DATA EFFICIENT SEMANTIC SEGMENTATION
A method and system for training a neural network are provided. The method includes receiving an input image, selecting at least one data augmentation method from a pool of data augmentation methods, generating an augmented image by applying the selected at least one data augmentation method to the input image, and generating a mixed image from the input image and the augmented image.
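The augmentation and mixing steps might look like the following sketch. The abstract does not list the augmentation methods or say how the mixed image is formed, so the pool (horizontal flip and intensity inversion), the pixel-wise weighted blend, and all names are assumptions. Images are small grayscale grids stored as nested lists.

```python
import random


def flip_horizontal(image):
    """Mirror each row of the image."""
    return [row[::-1] for row in image]


def invert(image):
    """Invert 8-bit grayscale intensities."""
    return [[255 - pixel for pixel in row] for row in image]


# Hypothetical pool of augmentation methods
AUGMENTATION_POOL = [flip_horizontal, invert]


def make_mixed_image(image, rng, weight=0.5):
    """Select one augmentation from the pool, apply it, and blend the
    input image with the augmented image pixel by pixel."""
    augment = rng.choice(AUGMENTATION_POOL)
    augmented = augment(image)
    mixed = [
        [weight * p + (1.0 - weight) * q for p, q in zip(row_in, row_aug)]
        for row_in, row_aug in zip(image, augmented)
    ]
    return augmented, mixed


image = [[0, 100], [200, 255]]
augmented, mixed = make_mixed_image(image, random.Random(0))
print(mixed)
```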