G06K9/34

Systems And Methods For Determining Dominant Colors In An Image
20210407136 · 2021-12-30

Systems and methods for determining a dominant color in a digital image are provided and include dividing pixels of the digital image into pixel groups, with pixels in a first pixel group being closer to a center of the digital image than pixels in a second pixel group. Pixels in the first and second pixel groups having a chroma value greater than a predetermined chroma value threshold and a lightness greater than a low brightness threshold and less than a high brightness threshold are analyzed using a first sample rate for the first pixel group and a second sample rate for the second group. The first sample rate is greater than the second sample rate. A dominant color for the digital image is determined based on the analyzed pixels in the first and second pixel groups.
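
The sampling scheme in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name, the threshold defaults, the "central half" definition of the first pixel group, the stride-based sampling, and the coarse color bucketing used to pick a winner are all assumptions made for illustration. Chroma is taken here as max(R, G, B) minus min(R, G, B) and lightness as their midpoint, both on a 0 to 1 scale.

```python
from collections import Counter

def dominant_color(pixels, width, height,
                   chroma_min=0.2, light_lo=0.15, light_hi=0.85,
                   center_stride=1, outer_stride=4):
    """pixels: row-major list of (r, g, b) tuples with components in 0..255."""
    counts = Counter()
    cx, cy = (width - 1) / 2, (height - 1) / 2
    for y in range(height):
        for x in range(width):
            # First group: region spanning the central half of the width and
            # height (assumed split); second group: everything else.
            central = abs(x - cx) <= width / 4 and abs(y - cy) <= height / 4
            # A smaller stride means a higher sample rate.
            stride = center_stride if central else outer_stride
            if (y * width + x) % stride:
                continue  # not sampled at this group's rate
            rgb = pixels[y * width + x]
            r, g, b = (c / 255 for c in rgb)
            hi, lo = max(r, g, b), min(r, g, b)
            chroma = hi - lo
            lightness = (hi + lo) / 2
            if chroma > chroma_min and light_lo < lightness < light_hi:
                # Coarse 32-level buckets so similar shades vote together.
                counts[tuple(c // 32 for c in rgb)] += 1
    if not counts:
        return None  # no pixel passed the chroma/lightness filters
    bucket = counts.most_common(1)[0][0]
    return tuple(c * 32 + 16 for c in bucket)  # bucket midpoint

# 8x8 image: a red 4x4 center on a blue background.
img = [(200, 30, 30) if 2 <= x <= 5 and 2 <= y <= 5 else (30, 30, 200)
       for y in range(8) for x in range(8)]
print(dominant_color(img, 8, 8))
```

In this toy image the blue background covers three times as many pixels as the red center, yet the denser sampling near the center lets the red bucket win.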

COMPUTER-IMPLEMENTED METHOD OF TRANSCRIBING AN AUDIO STREAM AND TRANSCRIPTION MECHANISM
20210407515 · 2021-12-30

A computer-implemented method of transcribing an audio stream can include transcribing the audio stream using a first transcribing instance having a first predetermined transcription size that is smaller than the total length of the audio stream. The first transcribing instance can provide a plurality of consecutive first transcribed text data snippets of the audio stream, each corresponding in size to the first predetermined transcription size. The audio stream can also be transcribed using at least a second transcribing instance having a second predetermined transcription size that is smaller than the length of the audio stream. The second transcribing instance can provide a plurality of consecutive second transcribed text data snippets, each corresponding to the second predetermined transcription size.
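
A rough Python sketch of running two transcribing instances with different snippet sizes over the same stream follows. The names `transcribe_in_snippets` and `fake_transcriber` are hypothetical, and the "audio" is a list of words standing in for real samples. The abstract does not say how the two snippet sequences are combined afterwards, so that step is left out.

```python
def transcribe_in_snippets(samples, snippet_size, transcribe_fn):
    """Split `samples` into consecutive windows and transcribe each one."""
    if snippet_size <= 0:
        raise ValueError("snippet_size must be positive")
    return [transcribe_fn(samples[i:i + snippet_size])
            for i in range(0, len(samples), snippet_size)]

def fake_transcriber(chunk):
    # Stand-in for a real speech-to-text call (assumption for the demo).
    return " ".join(chunk)

stream = "the quick brown fox jumps over the lazy dog".split()

# Both sizes are smaller than the stream length, as the abstract requires.
first = transcribe_in_snippets(stream, 2, fake_transcriber)
second = transcribe_in_snippets(stream, 3, fake_transcriber)
print(first)
print(second)
```

Because the two sizes differ, the snippet boundaries of one instance fall inside the snippets of the other.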

MODEL TRAINING METHOD, IDENTIFICATION METHOD, DEVICE, STORAGE MEDIUM AND PROGRAM PRODUCT
20210406579 · 2021-12-30

The present disclosure provides a model training method, an identification method, a device, a storage medium and a program product, relating to computer vision and deep learning technology. In the provided solution, an unlabeled first training image is itself deformed. A first unsupervised identification result is obtained by using the first model to identify the image before deformation, and a second unsupervised identification result is obtained by using the second model to identify the image after deformation. The first unsupervised identification result of the first model is also deformed to produce a scrambled identification result, so that a consistency loss function can be constructed from the second unsupervised identification result and the scrambled identification result. This enhances the constraint effect of the consistency loss function and avoids destroying the scene semantic information of the images used for training.
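
The consistency constraint described here can be sketched in plain Python. This is an illustration under assumptions, not the disclosed method: the deformation is a simple horizontal flip, the loss is a mean squared difference, and the two "models" are toy per-pixel functions. None of these choices comes from the abstract.

```python
def hflip(grid):
    """Example deformation (assumed): mirror each row of a 2-D grid."""
    return [row[::-1] for row in grid]

def consistency_loss(image, model_a, model_b, deform=hflip):
    pred_before = model_a(image)         # first model, image before deformation
    pred_after = model_b(deform(image))  # second model, image after deformation
    scrambled = deform(pred_before)      # deform the first result to align them
    diffs = [(p - q) ** 2
             for row_p, row_q in zip(scrambled, pred_after)
             for p, q in zip(row_p, row_q)]
    return sum(diffs) / len(diffs)       # mean squared difference (assumed form)

def threshold_model(img):
    # Toy per-pixel "segmentation": 1.0 where the pixel is bright.
    return [[1.0 if v > 0.5 else 0.0 for v in row] for row in img]

def blank_model(img):
    # Toy model that predicts background everywhere.
    return [[0.0 for _ in row] for row in img]

image = [[0.9, 0.1],
         [0.2, 0.8]]
print(consistency_loss(image, threshold_model, threshold_model))
print(consistency_loss(image, threshold_model, blank_model))
```

When both models respond consistently to the deformation the loss is zero, and it grows as the second model's output drifts from the deformed output of the first.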

System and method for character recognition model and recursive training from end user input
11210545 · 2021-12-28

One embodiment of a system and process of reading a multi-character code may include identifying regions in which respective characters of the code reside in response to receiving an image of the multi-character code. The identified regions may be applied to a neural network to determine the respective characters in the identified regions. The determined characters may be displayed in an ordered sequence for a user to visually inspect to confirm that each of the determined characters is correct.

Cross-platform content muting

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, facilitate cross-platform content muting. Methods include detecting a request from a user to remove, from a user interface, a media item that is provided by a first content source and presented on a first platform. One or more tags that represent the media item are determined. These tags, which indicate that the user removed the media item represented by the one or more tags from presentation on the first platform, are stored in a storage device. Subsequently, content provided by a second content source (different from the first content source) on a second platform (different from the first platform) is prevented from being presented. This content is prevented from being presented based on a tag representing the content matching the one or more tags stored in the storage device.
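
The tag-matching idea can be shown with a small in-memory Python sketch. The class name `MuteStore`, its method names, and the example tags are hypothetical. A real system would persist the tags in a storage device shared across platforms, which is not modeled here.

```python
from collections import defaultdict

class MuteStore:
    """Maps each user to the tags of media items that user has removed."""

    def __init__(self):
        self._muted = defaultdict(set)

    def record_removal(self, user_id, item_tags):
        # Called when the user removes a media item on any platform.
        self._muted[user_id].update(item_tags)

    def is_muted(self, user_id, content_tags):
        # Content is blocked if any of its tags matches a stored tag.
        return bool(self._muted.get(user_id, set()) & set(content_tags))

store = MuteStore()
# The user removes an item on the first platform.
store.record_removal("user-1", {"brand:acme", "topic:shoes"})

# Later, a second content source on a second platform offers new content.
print(store.is_muted("user-1", {"topic:shoes", "format:video"}))
print(store.is_muted("user-1", {"topic:travel"}))
```

The check depends only on tags, so it applies regardless of which content source or platform supplies the later content.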

Classification of records in a data set

Techniques for classifying records in a dataset are described. The method includes identifying a set of features of a target record and assigning a scoring metric to each feature in the set of features. The method also includes processing the set of features based on the scoring metric to obtain a reduced subset of features and an expanded subset of features. The method also includes searching a store of electronic records using the reduced subset of features to obtain a reduced subset of electronic records. The method also includes searching the reduced subset of electronic records using the expanded subset of features to obtain a set of matching electronic records.
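
One possible reading of the two-stage search is sketched below in Python. The abstract does not define the scoring metric, how the subsets are derived, or what counts as a match, so the thresholds `reduce_at` and `expand_at`, the "contains every reduced feature" filter, and the `min_overlap` rule are all assumptions, as are the function name and the sample data.

```python
def two_stage_match(target_scores, store,
                    reduce_at=0.8, expand_at=0.3, min_overlap=2):
    """target_scores: feature -> score; store: record id -> set of features."""
    # Reduced subset: only the highest-scoring features (assumed rule).
    reduced = {f for f, s in target_scores.items() if s >= reduce_at}
    # Expanded subset: a broader set with a lower score cutoff (assumed rule).
    expanded = {f for f, s in target_scores.items() if s >= expand_at}

    # Stage 1: coarse filter over the whole store using the reduced subset.
    candidates = {rid: feats for rid, feats in store.items()
                  if reduced <= feats}

    # Stage 2: finer search over the candidates using the expanded subset.
    return sorted(rid for rid, feats in candidates.items()
                  if len(expanded & feats) >= min_overlap)

target = {"surname:lee": 0.9, "city:oslo": 0.5,
          "year:1990": 0.4, "title:mr": 0.1}
records = {
    "r1": {"surname:lee", "city:oslo", "year:1990"},
    "r2": {"surname:lee"},
    "r3": {"city:oslo", "year:1990"},
}
print(two_stage_match(target, records))
```

The first stage narrows the store with a small feature set, so the broader comparison in the second stage runs on fewer records.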

FORM RECOGNITION METHODS, FORM EXTRACTION METHODS AND APPARATUSES THEREOF
20210397830 · 2021-12-23

Methods, devices, apparatuses, and systems for form recognition and form extraction are provided. In one aspect, a form recognition method includes: obtaining a form line extraction result of a to-be-recognized form image by performing a form line extraction process on the to-be-recognized form image, obtaining a corrected to-be-recognized form image by performing a correction process on the to-be-recognized form image based on the form line extraction result of the to-be-recognized form image and a preset form template, and performing a text recognition process on the corrected to-be-recognized form image to obtain a form recognition result. The form line extraction result includes at least one of a plurality of first form lines or a plurality of first form line intersections, and the preset form template has at least one of a plurality of preset second form lines or a plurality of preset second form line intersections.

BI-DIRECTIONAL INTERACTION NETWORK (BINET)-BASED PERSON SEARCH METHOD, SYSTEM, AND APPARATUS

A bi-directional interaction network (BINet)-based person search method, system, and apparatus are provided. The method includes: obtaining, as an input image, the t-th image frame of an input video; and normalizing the input image, and obtaining a search result of a to-be-searched target person by using a pre-trained person search model, where the person search model is constructed based on a residual network, and a new classification layer is added to a classification and regression layer of the residual network to obtain an identity classification probability of the target person. The method improves the accuracy of the person search.

Image acquisition method, apparatus, system, and electronic device

The present disclosure provides image acquisition methods, apparatuses, systems and electronic devices. One image acquisition method includes: acquiring an initial face image of a user by a first image acquisition apparatus; controlling a second image acquisition apparatus to acquire an eye print image of the user according to an acquisition parameter, the acquisition parameter being determined based on the initial face image; and synthesizing the initial face image and the eye print image into a target face image of the user.

Determining associations between objects and persons using machine learning models

In various examples, sensor data, such as masked sensor data, may be used as input to a machine learning model to determine a confidence for object-to-person associations. The masked sensor data may focus the machine learning model on particular regions of the image that correspond to persons, objects, or some combination thereof. In some embodiments, coordinates corresponding to persons, objects, or combinations thereof, in addition to area ratios between various regions of the image corresponding to the persons, objects, or combinations thereof, may be used to further aid the machine learning model in focusing on important regions of the image for determining the object-to-person associations.
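
The masking and area-ratio inputs mentioned here can be illustrated with a short Python sketch. The helper names, the bounding-box format `(x0, y0, x1, y1)` with exclusive upper bounds, and the choice to zero out pixels outside the boxes are assumptions. The abstract does not specify how the mask is built or which regions the ratios compare.

```python
def mask_outside(image, boxes):
    """Zero every pixel that is not covered by any (x0, y0, x1, y1) box."""
    return [[v if any(x0 <= x < x1 and y0 <= y < y1
                      for x0, y0, x1, y1 in boxes) else 0
             for x, v in enumerate(row)]
            for y, row in enumerate(image)]

def area(box):
    x0, y0, x1, y1 = box
    return max(0, x1 - x0) * max(0, y1 - y0)

def area_ratio(person_box, object_box):
    # One possible ratio feature: object area relative to person area.
    return area(object_box) / area(person_box)

image = [[1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1]]
person_box = (0, 0, 2, 2)   # hypothetical person detection
object_box = (3, 2, 4, 3)   # hypothetical object detection

masked = mask_outside(image, [person_box, object_box])
print(masked)
print(area_ratio(person_box, object_box))
```

The masked image, the box coordinates, and the ratio would then be passed to the model as inputs. The model itself is outside the scope of this sketch.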