Patent classifications
G06V30/19127
SEMANTIC TEMPLATE MATCHING
A system and method for field extraction including determining a key position of a key in an electronic file, isolating candidate key values based on a distance from the key position, selecting a key value from the candidate key values based on an output of a trained neural network, and extracting the key and the key value from the electronic file, regardless of a key-value structure.
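The distance-based candidate isolation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: `tokens`, `max_dist`, and the stand-in `score_fn` (which a trained neural network would replace) are all assumptions for the example.

```python
import math

def extract_key_value(tokens, key, max_dist=100.0, score_fn=None):
    """tokens: list of (text, x, y) OCR tokens. Locate the key,
    isolate candidate values within max_dist of it, and pick one."""
    # Determine the key position in the electronic file.
    _, kx, ky = next(t for t in tokens if t[0] == key)
    # Isolate candidate key values based on distance from the key position.
    candidates = [
        t for t in tokens
        if t[0] != key and math.hypot(t[1] - kx, t[2] - ky) <= max_dist
    ]
    # A trained network would score candidates; this stand-in scorer
    # simply prefers the closest candidate.
    if score_fn is None:
        score_fn = lambda t: -math.hypot(t[1] - kx, t[2] - ky)
    best = max(candidates, key=score_fn)
    return key, best[0]
```

Because selection is purely position- and score-driven, the sketch works regardless of how key and value are laid out (table, form, or free text).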
Tyre sidewall imaging method
A computer-implemented method is proposed for classifying one or more embossed and/or engraved markings on a sidewall of a tyre into one or more classes, based on digital image data of the sidewall of the tyre. The method comprises generating a first image channel from a first portion of the digital image data relating to a corresponding first portion of the sidewall of the tyre, where generating the first image channel comprises performing histogram equalisation on the first portion of the digital image data. The method further comprises generating a first feature map using the first image channel and applying a first classifier to the first feature map to classify said embossed and/or engraved markings into one or more first classes.
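The histogram-equalisation step that produces the first image channel can be sketched with the classic CDF-based mapping; this is a generic equalisation routine for an 8-bit portion, not the patent's specific pipeline.

```python
import numpy as np

def equalised_channel(portion):
    """Histogram-equalise an 8-bit image portion to form one channel."""
    portion = np.asarray(portion, dtype=np.uint8)
    hist = np.bincount(portion.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    if cdf[-1] == cdf_min:          # flat image: nothing to equalise
        return portion.copy()
    # Standard equalisation look-up table: stretch the CDF to [0, 255].
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255)
    return lut.astype(np.uint8)[portion]
```

Equalisation is a natural choice here because embossed markings produce low-contrast shading on dark rubber, and stretching the intensity distribution makes the relief visible to a downstream feature extractor.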
HYPERSPECTRAL IMAGE COMPRESSION USING A FEATURE EXTRACTION MODEL
Disclosed are techniques for obtaining tensor data representing a hyperspectral image, the image including a first portion depicting an object and a second portion depicting at least a portion of the surrounding environment where the object is located; identifying, by one or more computers, a portion of the tensor data that corresponds to the first portion of the hyperspectral image; providing, by the one or more computers, the identified portion of the tensor data as an input to a feature extraction model; obtaining, by the one or more computers, one or more matrix structures output by the feature extraction model based on the model processing the identified portion of the tensor data, the one or more matrix structures representing a subset of features extracted from the identified portion of the tensor data; and storing, by the one or more computers, the one or more matrix structures in a memory device.
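The flow of selecting only the object portion of the tensor and compressing it into small matrix structures can be illustrated with a truncated SVD standing in for the feature extraction model (the patent does not specify the model; the mask, `n_features`, and SVD choice are assumptions of this sketch).

```python
import numpy as np

def compress_object_region(cube, mask, n_features=4):
    """cube: (H, W, bands) hyperspectral tensor; mask: (H, W) bool array
    selecting object pixels. Returns small matrix structures (basis,
    coefficients, mean) representing a subset of extracted features."""
    pixels = cube[mask]                       # (N, bands) object spectra only
    mean = pixels.mean(axis=0)
    # Truncated SVD as a stand-in feature extractor: keep few components.
    _, _, vt = np.linalg.svd(pixels - mean, full_matrices=False)
    basis = vt[:n_features]                   # (n_features, bands)
    coeffs = (pixels - mean) @ basis.T        # (N, n_features)
    return basis, coeffs, mean
```

Storing only `basis`, `coeffs`, and `mean` instead of the full cube is where the compression comes from: background pixels are dropped entirely and object spectra are reduced to a few coefficients each.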
Artificial Aperture Adjustment for Synthetic Depth of Field Rendering
This disclosure relates to various implementations that dynamically adjust one or more shallow depth of field (SDOF) parameters based on a designated, artificial aperture value. The implementations obtain a designated, artificial aperture value that modifies an initial aperture value for an image frame. The designated, artificial aperture value generates a determined amount of synthetically-produced blur within the image frame. The implementations determine an aperture adjustment factor based on the designated, artificial aperture value in relation to a default so-called "tuning aperture value" (for which the camera's operations may have been optimized). The implementations may then modify, based on the aperture adjustment factor, one or more SDOF parameters for an SDOF operation, which may, e.g., be configured to render a determined amount of synthetic bokeh within the image frame. In response to the modified SDOF parameters, the implementations may render an updated image frame that corresponds to the designated, artificial aperture value.
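One plausible reading of the aperture adjustment factor is sketched below: blur-circle size scales with the physical aperture diameter, i.e. inversely with f-number, so the factor relates the tuning f-number to the designated one. The default `tuning_f` value and the linear scaling of blur radius are assumptions of this example, not figures from the disclosure.

```python
def aperture_adjustment_factor(designated_f, tuning_f=4.5):
    """Blur-circle diameter scales roughly inversely with f-number,
    so the adjustment factor is tuning_f / designated_f.
    tuning_f is an assumed default 'tuning aperture value'."""
    return tuning_f / designated_f

def adjusted_blur_radius(base_radius_px, designated_f, tuning_f=4.5):
    """Scale a tuned SDOF blur radius by the aperture adjustment factor."""
    return base_radius_px * aperture_adjustment_factor(designated_f, tuning_f)
```

Under this model, halving the designated f-number (opening the artificial aperture one stop wider in diameter) doubles the rendered blur radius.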
CROSS-MODAL WEAK SUPERVISION FOR MEDIA CLASSIFICATION
Methods, systems, and storage media for classifying content across media formats based on weak supervision and cross-modal training are disclosed. The system can maintain a first feature classifier and a second feature classifier that classify features of content having a first and a second media format, respectively. The system can extract a feature space from a content item using the first feature classifier and the second feature classifier. The system can apply a set of content rules to the feature space to determine content metrics. The system can correlate a set of known labelled data to the feature space to construct determinative training data. The system can train a discrimination model using the content item and the determinative training data. The system can classify content using the discrimination model to assign a content policy to a second content item.
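The "content rules applied to a feature space" step resembles labelling functions in weak supervision: each rule inspects extracted features and either votes a label or abstains. This sketch is a generic weak-supervision pattern, not the patented system; the rule signatures and feature keys are invented for illustration.

```python
def apply_content_rules(feature_space, rules):
    """Apply labelling-function style content rules to an extracted
    feature space; each rule returns a label or None (abstain)."""
    votes = [rule(feature_space) for rule in rules]
    votes = [v for v in votes if v is not None]
    if not votes:
        return None
    # Majority vote over the non-abstaining rules gives a weak label.
    return max(set(votes), key=votes.count)
```

Weak labels produced this way, correlated against a small set of known labelled data, can then serve as the "determinative training data" used to train the discrimination model.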
INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD AND COMPUTER READABLE STORAGE MEDIUM
An information processing device, an information processing method, and a computer readable storage medium are provided. The information processing device comprises processing circuitry configured to: construct, for each of a plurality of indexes, a sample unit set for the index based on a plurality of minimum labeled sample units related to the index which are obtained and labeled from an original sample set; and extract, for at least a part of the constructed plurality of sample unit sets, a minimum labeled sample unit from each sample unit set, and generate a labeled training sample based on the extracted minimum labeled sample units. A sample unit set is constructed based on minimum labeled sample units that are labeled manually, and a labeled training sample is generated automatically from such sample unit sets, thereby automating the generation of labeled training samples to a certain degree and reducing manual participation.
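The construct-then-combine procedure can be sketched as grouping manually labeled minimum units by index, then drawing one unit per index to compose a new labeled sample. The data shapes here (index/unit pairs, random draw per set) are assumptions for illustration.

```python
import random

def build_sample_unit_sets(labeled_units):
    """labeled_units: iterable of (index, unit) pairs labeled manually.
    Group the minimum labeled sample units by their index."""
    sets = {}
    for index, unit in labeled_units:
        sets.setdefault(index, []).append(unit)
    return sets

def generate_training_sample(unit_sets, rng=random):
    """Extract one minimum labeled unit from each sample unit set and
    join them into an automatically generated labeled training sample."""
    return [rng.choice(units) for _, units in sorted(unit_sets.items())]
```

Because the combinations of units across indexes multiply, a handful of manually labeled units can yield many distinct generated training samples.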
TRAINING METHOD FOR CHARACTER GENERATION MODEL, CHARACTER GENERATION METHOD, APPARATUS, AND MEDIUM
Provided are a training method for a character generation model, and a character generation method, apparatus, device, and medium, which relate to the field of artificial intelligence, in particular the technical fields of computer vision and deep learning. The specific implementation scheme is: a source domain sample word and a target domain style word are input into the character generation model to obtain a target domain generation word; the target domain generation word and a target domain sample word are input into a pre-trained character classification model to calculate a feature loss of the character generation model; and a parameter of the character generation model is adjusted according to the feature loss.
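The feature loss computed from the pre-trained character classification model can be sketched as a per-layer distance between the features it extracts for the generated word and for the target-domain sample word. The L1 form and layer averaging are assumptions of this sketch; the patent does not fix the distance.

```python
import numpy as np

def feature_loss(gen_feats, sample_feats):
    """Mean L1 distance between the classifier's features for the
    generated word and the target-domain sample word, averaged over
    layers (each item is one layer's feature map)."""
    return float(np.mean([np.abs(g - s).mean()
                          for g, s in zip(gen_feats, sample_feats)]))
```

Driving this loss down pushes the generator to produce characters whose internal classifier features match real target-domain samples, which is a common way to transfer font style without pixel-exact supervision.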
IMAGE DATA PROCESSING METHOD, APPARATUS AND DEVICE, AND STORAGE MEDIUM
Embodiments of this application provide an image data processing method, apparatus, device, and storage medium. The method includes inputting image data including text information into a text recognition model, and acquiring image representation information corresponding to the image data according to a feature extraction component in the text recognition model; obtaining semantic encoding information corresponding to the image representation information according to an image encoding component; acquiring discrete encoding information corresponding to the image representation information according to code tables included in a discrete encoding module; and correcting network parameters of the text recognition model according to an encoding similarity between the semantic encoding information and the discrete encoding information to obtain a target text recognition model.
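The code-table lookup and encoding-similarity objective can be sketched as vector quantisation followed by cosine similarity. Using nearest-neighbour lookup and cosine similarity here is an assumption of this sketch; the abstract only says the discrete encoding comes from code tables and that a similarity drives training.

```python
import numpy as np

def discrete_encode(reps, code_table):
    """Map each representation vector to its nearest code-table entry
    (vector quantisation). reps: (N, D); code_table: (K, D)."""
    d = ((reps[:, None, :] - code_table[None, :, :]) ** 2).sum(-1)
    return code_table[d.argmin(axis=1)]

def encoding_similarity(semantic, discrete):
    """Mean cosine similarity between semantic and discrete encodings;
    a training objective would push this value up."""
    num = (semantic * discrete).sum(-1)
    den = np.linalg.norm(semantic, axis=-1) * np.linalg.norm(discrete, axis=-1)
    return float((num / den).mean())
```

This pairing is reminiscent of self-supervised schemes where a continuous encoder is trained to agree with a quantised view of the same representation, giving a training signal without character labels.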
CHARACTER RECOGNITION METHOD, MODEL TRAINING METHOD, RELATED APPARATUS AND ELECTRONIC DEVICE
A character recognition method, a model training method, a related apparatus and an electronic device are provided. The specific solution is: obtaining a target picture; performing feature encoding on the target picture to obtain a visual feature of the target picture; performing feature mapping on the visual feature to obtain a first target feature of the target picture, where the first target feature is a feature that shares a matching feature space with the character semantic information of the target picture; inputting the first target feature into a character recognition model for character recognition to obtain a first character recognition result of the target picture.
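The encode-map-recognize chain can be illustrated by projecting the visual feature into the shared space and matching it against character semantic embeddings. The linear mapping and nearest-embedding decision are assumptions of this sketch, standing in for the learned mapping and recognition model.

```python
import numpy as np

def recognize(visual_feat, w, b, char_embeddings, chars):
    """Map the visual feature into the space shared with character
    semantic features (a linear mapping w, b here), then return the
    character whose semantic embedding is closest."""
    target = visual_feat @ w + b                       # feature mapping
    d = np.linalg.norm(char_embeddings - target, axis=1)
    return chars[int(d.argmin())]
```

Mapping visual features into the same space as semantic features is what lets the recogniser compare the two directly, rather than learning a separate classifier head per character set.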
END TO END TRAINABLE DOCUMENT EXTRACTION
A processor may receive an image and identify a plurality of characters in the image using a machine learning (ML) model. The processor may generate at least one word-level bounding box indicating one or more words including at least a subset of the plurality of characters and/or may generate at least one field-level bounding box indicating at least one field including at least a subset of the one or more words. The processor may overlay the at least one word-level bounding box and the at least one field-level bounding box on the image to form a masked image including a plurality of optically-recognized characters and one or more predicted fields for at least a subset of the plurality of optically-recognized characters.