G06V30/18143

TRAINING METHOD FOR IMAGE GENERATION MODEL, IMAGE GENERATION METHOD, DEVICE AND STORAGE MEDIUM

Provided are a training method for an image generation model, an image generation method, apparatus, and a device. The training method includes extracting reference keypoints of a character from a sample reference image; based on a model to be trained, performing motion estimation using sample audio data and the reference keypoints to obtain predicted keypoints that match the sample audio data; performing parameter estimation using the reference keypoints and the predicted keypoints to obtain motion parameters of the predicted keypoints, and performing prior motion estimation using the motion parameters of the predicted keypoints to obtain optical flow of non-key pixel points; performing image prediction using the sample reference image and dense optical flow to obtain predicted image data that matches the sample audio data; performing model training using the predicted image data and annotated image data to obtain the image generation model.

Detecting fields in document images

A method of detecting fields in document images includes: receiving a codebook comprising a set of visual words, each visual word corresponding to a center of a cluster of local descriptors; calculating, based on a set of user labeled document images, for each visual word of the codebook, a respective frequency distribution of a field position of a specified labeled field with respect to the visual word; loading a document image for extraction of target fields; calculating a statistical predicate of a possible position of a target field in the document image based on the frequency distributions; and detecting, using the trained model, fields in the document image based on the calculated statistical predicate.

Character recognition model training method and apparatus, character recognition method and apparatus, device and storage medium

The present disclosure provides a character recognition model training method and apparatus, a character recognition method and apparatus, a device and a medium, relating to the technical field of artificial intelligence, and specifically to the technical fields of deep learning, image processing and computer vision, which can be applied to scenarios such as character detection and recognition technology. The specific implementing solution is: partitioning an untagged training sample into at least two sub-sample images; dividing the at least two sub-sample images into a first training set and a second training set; where the first training set includes a first sub-sample image with a visible attribute, and the second training set includes a second sub-sample image with an invisible attribute; performing self-supervised training on a to-be-trained encoder by taking the second training set as a tag of the first training set, to obtain a target encoder.
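The partitioning and visible/invisible split described above can be illustrated with a minimal sketch. The patch size, the random assignment, the `mask_ratio` parameter, and both function names are assumptions made for illustration; the abstract does not state how the two attributes are assigned.

```python
import random

def partition_into_patches(image, patch_h, patch_w):
    """Split a 2D image (list of rows) into non-overlapping patches, row-major."""
    rows, cols = len(image), len(image[0])
    patches = []
    for top in range(0, rows - patch_h + 1, patch_h):
        for left in range(0, cols - patch_w + 1, patch_w):
            patches.append([row[left:left + patch_w] for row in image[top:top + patch_h]])
    return patches

def split_visible_invisible(patches, mask_ratio=0.5, seed=0):
    """Randomly assign patch indices to a visible set and an invisible (masked) set.

    mask_ratio is a hypothetical parameter, not something the abstract specifies.
    """
    rng = random.Random(seed)
    indices = list(range(len(patches)))
    rng.shuffle(indices)
    n_invisible = int(len(indices) * mask_ratio)
    invisible = sorted(indices[:n_invisible])
    visible = sorted(indices[n_invisible:])
    return visible, invisible

# Toy 4x8 "text line" image whose pixel value encodes its position.
image = [[r * 8 + c for c in range(8)] for r in range(4)]
patches = partition_into_patches(image, patch_h=4, patch_w=2)
visible_idx, invisible_idx = split_visible_invisible(patches, mask_ratio=0.5)
```

In a self-supervised setup of this kind, the encoder would see only the visible patches and be trained to predict the invisible ones, which is one way to read "taking the second training set as a tag of the first training set".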

Fraud detection via automated handwriting clustering

A computer-implemented method for automatically analyzing handwritten text to determine a mismatch between a purported writer and an actual writer is disclosed. The method comprises receiving two samples of digitized handwriting each allegedly created by one individual and received and entered into a digital system by another. The method further comprises performing a series of feature extractions to convert the samples into two vectors of extracted features; automatically clustering a set of vectors such that the first vector and the second vector are assigned to the same cluster among multiple clusters, based on vector similarity; and automatically determining that a same individual being associated with both the first and second samples indicates a heightened probability that the individual fraudulently created both samples. Finally, the method comprises automatically transmitting a message to flag additional samples of digitized handwriting entered into a digital system as possibly fraudulent.
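As a rough illustration of the clustering and flagging steps, the sketch below groups feature vectors with a greedy distance threshold and flags any submitter who entered same-cluster samples attributed to different writers. The clustering rule, the `max_dist` threshold, the record fields, and the function names are all assumptions; the abstract does not name a clustering algorithm or a feature set.

```python
import math
from collections import defaultdict

def cluster_by_distance(vectors, max_dist):
    """Greedy single-pass clustering: join the first cluster whose seed is within max_dist."""
    seeds, labels = [], []
    for v in vectors:
        for idx, seed in enumerate(seeds):
            if math.dist(v, seed) <= max_dist:
                labels.append(idx)
                break
        else:
            # No existing seed is close enough, so this vector starts a new cluster.
            seeds.append(v)
            labels.append(len(seeds) - 1)
    return labels

def flag_submitters(samples, labels):
    """Return submitters whose same-cluster samples name more than one purported writer."""
    writers_by_key = defaultdict(set)
    for sample, label in zip(samples, labels):
        writers_by_key[(sample["submitter"], label)].add(sample["purported_writer"])
    return sorted({sub for (sub, _), writers in writers_by_key.items() if len(writers) > 1})

# Hypothetical records: two-dimensional feature vectors stand in for extracted features.
samples = [
    {"submitter": "agent_1", "purported_writer": "alice", "features": [0.10, 0.90]},
    {"submitter": "agent_1", "purported_writer": "bob", "features": [0.12, 0.88]},
    {"submitter": "agent_2", "purported_writer": "carol", "features": [0.80, 0.20]},
]
labels = cluster_by_distance([s["features"] for s in samples], max_dist=0.1)
flagged = flag_submitters(samples, labels)
```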

DETECTING FIELDS IN DOCUMENT IMAGES

A method of detecting fields in document images includes: receiving, by a processing device, a codebook comprising a set of visual words, each visual word corresponding to a center of a cluster of local descriptors, wherein each local descriptor is associated with a respective keypoint region of a first set of document images; calculating, based on a second set of document images, for each visual word of the codebook, a respective frequency distribution of a field position of a specified field with respect to the visual word; loading a document image for extraction of target fields; and detecting fields in the document image based on the calculated frequency distributions.
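One way to picture the per-visual-word frequency distributions is as histograms of quantized offsets from a keypoint to the labeled field, which then cast votes in a new document. The sketch below assumes keypoints have already been assigned visual-word ids; the grid cell size, the data layout, and the function names are illustrative assumptions rather than details from the abstract.

```python
from collections import Counter, defaultdict

def to_cell(point, cell):
    """Quantize an (x, y) pixel position to a coarse grid cell."""
    return (point[0] // cell, point[1] // cell)

def build_offset_histograms(labeled_docs, cell=10):
    """For each visual word, count cell offsets from keypoint to labeled field position."""
    hist = defaultdict(Counter)
    for doc in labeled_docs:
        fx, fy = to_cell(doc["field_pos"], cell)
        for word_id, keypoint in doc["keypoints"]:
            kx, ky = to_cell(keypoint, cell)
            hist[word_id][(fx - kx, fy - ky)] += 1
    return hist

def vote_field_cell(keypoints, hist, cell=10):
    """Accumulate votes for candidate field cells in a new document; return the best cell."""
    votes = Counter()
    for word_id, keypoint in keypoints:
        kx, ky = to_cell(keypoint, cell)
        for (dx, dy), count in hist.get(word_id, {}).items():
            votes[(kx + dx, ky + dy)] += count
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical labeled documents: (visual_word_id, keypoint position) pairs plus a field position.
labeled_docs = [
    {"field_pos": (120, 40), "keypoints": [(0, (20, 40)), (1, (100, 90))]},
    {"field_pos": (125, 45), "keypoints": [(0, (25, 45)), (1, (60, 95))]},
]
hist = build_offset_histograms(labeled_docs)
best_cell = vote_field_cell([(0, (30, 60)), (1, (110, 110))], hist)
```

Here visual word 0 sits at a consistent offset from the field in both labeled documents, so its votes dominate and agree with one of the two offsets recorded for visual word 1.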

METHODS AND APPLICATIONS FOR GENERATING CITATIONS FOR MACHINE-GENERATED CONTENT
20250348659 · 2025-11-13

A citation for output content that is generated by a trained generative machine learning (ML) model is disclosed. A content database is filtered based on a text prompt embedding generated based on the same text prompt input to the ML model to generate the output content. Further filtering may be performed using an output content embedding generated based on the output content generated by the ML model. A base content item is then estimated as being similar to the output content generated by the ML model by filtering the content list using component/content features generated based on the output content. A similarity score is generated and the citation identifying the base content item is provided to the ML model. In response to determining that the similarity meets a first threshold similarity criterion, an alternative output content may be generated with or without further user input.
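The filtering and threshold steps can be sketched with cosine similarity over embeddings. This is a simplified illustration: it omits the component/content-feature filtering stage, and the similarity measure, `top_k`, the `threshold` value, and all names are assumptions not stated in the abstract.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def filter_by_prompt(content_db, prompt_embedding, top_k=2):
    """First-stage filter: keep the top_k items closest to the prompt embedding."""
    ranked = sorted(
        content_db,
        key=lambda item: cosine_similarity(item["embedding"], prompt_embedding),
        reverse=True,
    )
    return ranked[:top_k]

def pick_base_item(candidates, output_embedding, threshold=0.9):
    """Second stage: pick the candidate closest to the output and test the threshold."""
    if not candidates:
        return None, 0.0, False
    best = max(candidates, key=lambda item: cosine_similarity(item["embedding"], output_embedding))
    score = cosine_similarity(best["embedding"], output_embedding)
    return best["id"], score, score >= threshold

# Hypothetical content database with two-dimensional embeddings.
content_db = [
    {"id": "a", "embedding": [1.0, 0.0]},
    {"id": "b", "embedding": [0.8, 0.6]},
    {"id": "c", "embedding": [0.0, 1.0]},
]
candidates = filter_by_prompt(content_db, prompt_embedding=[1.0, 0.2])
base_id, score, meets_threshold = pick_base_item(candidates, output_embedding=[0.8, 0.6])
```

When `meets_threshold` is true, the abstract's flow would cite the base item and could go on to generate alternative output content.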

Model for detecting phishing URLS
12470596 · 2025-11-11

Methods, storage systems and computer program products implement embodiments of the present invention for protecting a computing device. These embodiments include detecting that an email is received by the computing device, the email including a Uniform Resource Locator (URL) for a web page in a first domain. The web page is retrieved from the first domain, and a set of keywords is extracted from the retrieved web page. A query including the set of keywords is submitted to a search engine, and a response to the query is received from the search engine, the response indicating a set of second domains. Finally, in response to detecting that the first domain does not match any of the second domains, an alert for a phishing attack is generated.
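The final comparison step might look like the sketch below, which assumes the search-engine response has already been reduced to a list of result URLs. Page retrieval, keyword extraction, and the search query are out of scope here. Comparing hostnames with a leading `www.` stripped is a simplifying assumption; the abstract does not define how domains are matched, and both function names are hypothetical.

```python
from urllib.parse import urlparse

def domain_of(url):
    """Return the lowercase hostname of a URL, with a leading 'www.' stripped."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

def is_suspected_phishing(email_url, search_result_urls):
    """True when the emailed URL's domain matches none of the search-result domains."""
    first_domain = domain_of(email_url)
    second_domains = {domain_of(u) for u in search_result_urls}
    return first_domain not in second_domains

# Hypothetical search results for keywords extracted from the retrieved page.
results = ["https://www.example-bank.test/", "https://example-bank.test/help"]
lookalike = is_suspected_phishing("http://examp1e-bank.test/login", results)
genuine = is_suspected_phishing("https://www.example-bank.test/login", results)
```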

User interfaces for detecting objects

In some embodiments, a computer system detects objects, such as physical objects in the physical environment of the electronic device. In some embodiments, the computer system presents indications of characteristics of the physical objects. In some embodiments, the physical objects are entry points to physical locations.

Saliency analysis system, saliency analysis method and recording medium
12561778 · 2026-02-24

A saliency analysis system includes: an input receiver that receives an evaluation target image; and a hardware processor, wherein the hardware processor extracts low-order image feature amounts and high-order image feature amounts, from the evaluation target image, and calculates saliencies in the image, based on the low-order image feature amounts and the high-order image feature amounts.

Context-based review translation
12561997 · 2026-02-24

A translation system provides machine translations of review texts on item pages using context from the item pages outside of the review text being translated. Given review text from an item page, context for machine translating the review text is determined from the item page. In some aspects, one or more keywords are determined based on text, images, and/or videos on the item page. The one or more keywords are used as context by the machine translator to translate the review text from a first language to a second language to provide translated review text, which can be presented on the item page.