Patent classifications
G06V30/1918
Term weight generation method, apparatus, device and medium
A term weight determination method includes: obtaining a video and video-associated text, the video-associated text including at least one term; generating a halfway vector of the at least one term by performing multimodal feature fusion on features of the video, the video-associated text, and the at least one term; and generating a weight of the at least one term based on the halfway vector of the at least one term.
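A minimal sketch of the idea, assuming (hypothetically) that each modality is already encoded as a fixed-length feature vector, that fusion is an element-wise average, and that per-term weights come from a softmax over fused-vector scores; the function names `fuse` and `term_weights` are illustrative, not from the patent:

```python
import math

def fuse(video_feat, text_feat, term_feat):
    # Hypothetical fusion: element-wise average of the three
    # modality features stands in for the "halfway vector".
    return [(v + t + w) / 3.0 for v, t, w in zip(video_feat, text_feat, term_feat)]

def term_weights(video_feat, text_feat, term_feats):
    # Score each term by the mean of its halfway vector, then
    # softmax-normalize the scores into weights summing to 1.
    scores = [sum(fuse(video_feat, text_feat, f)) / len(f) for f in term_feats]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In practice the fusion and scoring steps would be learned networks; the averaging and softmax here only illustrate the data flow from modality features to per-term weights.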
Dynamic presentation of targeted information in a mixed media reality recognition system
A context-aware targeted information delivery system comprises a mobile device, an MMR matching unit, a plurality of databases for user profiles, user context and advertising information, a plurality of comparison engines and a plurality of weight adjusters. The mobile device delivers an image patch to the MMR matching unit, which in turn performs recognition to produce recognized text. The recognized text is provided to first and second comparison engines to produce relevant topics and relevant ads. The relevant topics and relevant ads are adjusted with information from a user context database, including location, date, time and other information from a user profile. A third comparison engine compares the relevant topics and relevant ads to produce a set of final ads that are most related to the user's topics of interest and are delivered for display on the mobile device.
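The weight-adjustment step can be sketched as follows, under the assumption (mine, not the patent's) that each candidate ad carries a base relevance score and a topic set, and that a matching context signal boosts the score before ranking:

```python
def adjust_and_rank(ads, context):
    # ads: list of (ad_id, base_relevance, topic_set).
    # context: user-context signals, e.g. {"location": "cafe"}.
    # A context signal that matches an ad's topics boosts its
    # score; the final ads are the highest-scoring after boost.
    adjusted = []
    for ad_id, score, topics in ads:
        boost = 1.5 if context.get("location") in topics else 1.0
        adjusted.append((ad_id, score * boost))
    return sorted(adjusted, key=lambda a: a[1], reverse=True)
```

The 1.5 boost factor and the single `location` signal are placeholders; the described system draws on date, time, and profile data as well.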
DOCUMENT ANNOTATION PROCESSING
A method, computer program product, and computer system are provided for document annotation processing. The method includes: providing a target document in a digital format; receiving an image of an annotated hard copy of the target document and extracting annotations from the image as an annotation source; processing the extracted annotations to collect extracted annotation information metadata for the image; merging the extracted annotation information metadata for the image with other extracted annotation information metadata from other annotation sources for the target document to generate master annotation metadata; and providing the master annotation metadata for application to the target document.
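The merging step might look like the following sketch, assuming (hypothetically) that each annotation source is keyed by a (page, anchor) location and that the master metadata keeps every source's note per location:

```python
def merge_annotations(sources):
    # Each source maps (page, anchor) -> annotation text.
    # Annotations from every source are kept per location,
    # in source order, forming the master annotation metadata.
    master = {}
    for source in sources:
        for key, note in source.items():
            master.setdefault(key, []).append(note)
    return master
```

A real implementation would also carry per-annotation provenance (which scanned copy, which reviewer), which this sketch omits.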
CHARACTER COORDINATE EXTRACTION METHOD AND APPARATUS, DEVICE, MEDIUM, AND PROGRAM PRODUCT
Embodiments of the present application disclose a character coordinate extraction method and apparatus, a device, a medium and a program product. The method comprises: inputting a target text image into a feature extraction backbone network, and obtaining character segmentation features and text line segmentation features by means of feature fusion across different layers of the backbone network; respectively inputting the character segmentation features and the text line segmentation features into a character segmentation module and a text line segmentation module to obtain a character segmentation heat map and a text line segmentation heat map of the target text image, wherein the character segmentation module and the text line segmentation module form a segmentation network model; and calculating coordinates of each single character in the target text image according to the character segmentation heat map and the text line segmentation heat map. According to the embodiments of the present application, repeated feature extraction is reduced, character segmentation is highly robust, network convergence is accelerated, segmentation efficiency is improved, and the accuracy of single-character coordinate extraction is increased.
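The final coordinate-calculation step can be illustrated with a minimal sketch: assuming the character heat map is a 2D grid of scores, one common approach (not necessarily the patent's exact method) is to threshold it, flood-fill connected blobs, and take each blob's centroid as a character coordinate:

```python
def char_centroids(heat, thresh=0.5):
    # Flood-fill 4-connected components of above-threshold pixels
    # in a character heat map; each component's centroid (x, y)
    # is taken as one character's coordinate.
    h, w = len(heat), len(heat[0])
    seen = [[False] * w for _ in range(h)]
    cents = []
    for y in range(h):
        for x in range(w):
            if heat[y][x] >= thresh and not seen[y][x]:
                stack, pix = [(y, x)], []
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    pix.append((cy, cx))
                    for ny, nx in ((cy+1, cx), (cy-1, cx), (cy, cx+1), (cy, cx-1)):
                        if 0 <= ny < h and 0 <= nx < w and heat[ny][nx] >= thresh and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                cents.append((sum(p[1] for p in pix) / len(pix),
                              sum(p[0] for p in pix) / len(pix)))
    return cents
```

The text line heat map would additionally be used to group and order these centroids by line, which this sketch leaves out.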
APPARATUS FOR CAPTURING AND PROCESSING IMAGES
An apparatus comprising an optical sensor having a plurality of photosensitive cells arranged in a grid, wherein the optical sensor is configured to capture a first image using a first subset of the photosensitive cells and to capture a second image using a second subset of the photosensitive cells, the second subset not including any of the photosensitive cells in the first subset, wherein the apparatus is configured to process the first and second images using at least one optical character recognition algorithm.
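The disjoint-subset capture can be sketched abstractly; assuming (for illustration only) that the two subsets interleave by column, the split guarantees no photosensitive cell contributes to both images:

```python
def split_exposure(grid):
    # Model the sensor as a 2D grid of cell readings. The first
    # capture uses even columns, the second uses odd columns, so
    # the two subsets are disjoint by construction.
    first = [[px for x, px in enumerate(row) if x % 2 == 0] for row in grid]
    second = [[px for x, px in enumerate(row) if x % 2 == 1] for row in grid]
    return first, second
```

The actual partition pattern (columns, rows, checkerboard) is a design choice the claim leaves open; each resulting image would then be fed to the OCR algorithm independently.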
USING GENERATIVE ARTIFICIAL INTELLIGENCE TO SUPPLEMENT AUTOMATED INFORMATION EXTRACTION
Using generative AI to supplement automated information extraction is disclosed. Computer vision (CV) and/or optical character recognition (OCR) models and a generative artificial intelligence (AI) model are used together to extract information (e.g., names, dates, invoice numbers, etc.) from a source. Acceptance threshold(s) may be used to accept predictions for extracted data elements from the models, and the prediction from the generative AI model may be preferred or a human may be tasked with reviewing the element. If no model meets its respective acceptance threshold (whether common or specific to that model), these element(s) may be marked for subsequent human review, or a human can be looped in to correct these element(s). The models may then be retrained using these labeled elements.
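The acceptance-threshold logic described above can be sketched as a small decision function; the thresholds, the preference for the generative model, and the `(value, confidence)` prediction shape are assumptions for illustration:

```python
def resolve(ocr_pred, genai_pred, ocr_thresh=0.9, genai_thresh=0.8):
    # Each prediction is (value, confidence). The generative AI
    # prediction is preferred when it meets its threshold; OCR is
    # the fallback; if neither passes, the element is flagged for
    # human review (and later used to retrain the models).
    if genai_pred[1] >= genai_thresh:
        return genai_pred[0], "accepted"
    if ocr_pred[1] >= ocr_thresh:
        return ocr_pred[0], "accepted"
    return None, "human_review"
```

Per the abstract, thresholds may be common across models or model-specific; separate defaults are used here to show the model-specific case.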
METHOD AND SYSTEM FOR IMPROVED OPTICAL CHARACTER RECOGNITION
Described herein are systems and methods for performing optical character recognition in documents such as, in certain embodiments, a printed receipt from the sale of an item. In certain embodiments, the systems utilize a time dimension associated with inputs (for example, the expectation that the system will identify components in future related inputs) in order to increase speed and accuracy. The processing time and computing resources required diminish for each subsequent processing stage, and the embodiments described herein have the ability to self-train, attempting computationally more complicated algorithms in the case of a non-match or an ambiguous result at a previous stage.
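The escalation behavior can be sketched as a recognizer cascade; the assumption here (mine) is that stages are ordered cheap-to-expensive and each returns a result or `None` on a non-match:

```python
def recognize(image, stages):
    # stages: recognizers ordered from cheapest to most expensive.
    # Each returns a result or None; a more complicated algorithm
    # is attempted only when the previous stage fails to match.
    for stage in stages:
        result = stage(image)
        if result is not None:
            return result
    return None
```

The described system additionally exploits expectations from prior related inputs (e.g. earlier receipts) to make the cheap stages succeed more often, which is not modeled here.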
Method and apparatus for data processing, computer, storage medium, and program product
Disclosed are a method for data processing and a computer. The method includes the following. A service processing instruction and virtual asset-associated data of an aircraft are input to a target service processing model. Data division is performed on the virtual asset-associated data to obtain S unit virtual assets. Binary group classification information corresponding to each unit virtual asset is determined. Weight model parameters respectively corresponding to the S unit virtual assets are obtained. Data feature vectors are combined with the weight model parameters to obtain S fused feature vectors. A prompt text is generated, a target processing network is determined, and feature processing is performed on the S fused feature vectors and the prompt text via the target processing network, to obtain a feature processing result. The feature processing result is classified and recognized to obtain a data recognition result for responding to the service processing instruction.
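The feature-fusion step can be illustrated minimally; assuming (hypothetically) that combining means element-wise weighting of each unit asset's feature vector by its weight model parameters:

```python
def fuse_features(feat_vectors, weight_params):
    # One feature vector and one weight-parameter vector per unit
    # virtual asset; element-wise products give the S fused vectors.
    return [[f * w for f, w in zip(feat, weights)]
            for feat, weights in zip(feat_vectors, weight_params)]
```

The actual combination in the target service processing model is learned and likely richer than an element-wise product; this only shows the shape of the S-vector-in, S-vector-out step.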
Method and apparatus for data efficient semantic segmentation
A method and system for training a neural network are provided. The method includes receiving an input image, selecting at least one data augmentation method from a pool of data augmentation methods, generating an augmented image by applying the selected at least one data augmentation method to the input image, and generating a mixed image from the input image and the augmented image.
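The pipeline above (select an augmentation, apply it, mix with the original) can be sketched as follows; the pixel-wise blend with a fixed `alpha` is an assumption for illustration, as the claim does not specify how the mixed image is formed:

```python
import random

def mix_images(image, augmentations, alpha=0.5, seed=0):
    # Select one augmentation at random from the pool, apply it to
    # the input image, then blend augmented and original pixel-wise
    # to generate the mixed training image.
    rng = random.Random(seed)
    aug = rng.choice(augmentations)
    augmented = [[aug(px) for px in row] for row in image]
    return [[alpha * a + (1 - alpha) * o
             for a, o in zip(arow, orow)]
            for arow, orow in zip(augmented, image)]
```

In a real training loop the mixed image (and typically its label or segmentation mask, mixed consistently) would then be fed to the network.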
TEXT MATTING METHOD AND APPARATUS BASED ON NEURAL NETWORK, DEVICE, AND STORAGE MEDIUM
The present disclosure provides a text matting method and apparatus based on a neural network, a device, and a storage medium. The text matting method based on a neural network includes: processing a first image with a feature extraction network to obtain feature maps; processing the feature maps with an intermediate processing network to obtain intermediate feature maps; and processing the intermediate feature maps with a feature fusion network to obtain a second image, wherein the second image includes a text feature extracted from the first image.
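The three-stage pipeline is a straight composition of networks, which can be sketched generically; the three callables stand in for the feature extraction, intermediate processing, and feature fusion networks:

```python
def text_matting(first_image, extract, intermediate, fuse):
    # Chain the three networks: backbone feature maps ->
    # intermediate feature maps -> fused second image
    # containing the extracted text feature.
    return fuse(intermediate(extract(first_image)))
```

Each stage would be a trained network in practice; the sketch only fixes the order of the data flow from first image to matted second image.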