G06V40/169

Tunable models for changing faces in images

Techniques are disclosed for changing the identities of faces in images. In embodiments, a tunable model for changing facial identities in images includes an encoder, a decoder, and dense layers that generate either adaptive instance normalization (AdaIN) coefficients that control the operation of convolution layers in the decoder or the values of weights within those convolution layers, allowing the model to change the identity of a face in an image based on a user selection. A separate set of dense layers may be trained to generate AdaIN coefficients for each of a number of facial identities, and the AdaIN coefficients output by different sets of dense layers can be combined to interpolate between facial identities. Alternatively, a single set of dense layers may be trained to take an identity vector as input and output AdaIN coefficients or values of weights within convolution layers of the decoder.
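The interpolation step described above can be illustrated with a minimal sketch: per-layer AdaIN coefficients from two identity-specific sets of dense layers are linearly blended before being applied. All names, shapes, and the plain-Python representation are illustrative assumptions, not details from the patent.

```python
def interpolate_adain(coeffs_a, coeffs_b, alpha):
    """Linearly blend two per-layer AdaIN coefficient lists.

    coeffs_a, coeffs_b: lists of (scale, bias) pairs, one pair per
    decoder convolution layer; alpha: 0.0 -> identity A, 1.0 -> identity B.
    """
    blended = []
    for (sa, ba), (sb, bb) in zip(coeffs_a, coeffs_b):
        scale = [(1 - alpha) * x + alpha * y for x, y in zip(sa, sb)]
        bias = [(1 - alpha) * x + alpha * y for x, y in zip(ba, bb)]
        blended.append((scale, bias))
    return blended

def adain(features, scale, bias, eps=1e-5):
    """Apply AdaIN to one feature channel: normalize to zero mean and
    unit variance, then re-scale and shift with the given coefficients."""
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    std = (var + eps) ** 0.5
    return [scale * (f - mean) / std + bias for f in features]
```

Because the blend is linear in the coefficients, sweeping `alpha` from 0 to 1 traces a smooth path between the two facial identities.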

Learning apparatus and method for creating emotion expression video and apparatus and method for emotion expression video creation

A learning apparatus for creating an emotion expression video according to a disclosed embodiment includes a first generative adversarial network (GAN) that receives text for creating an emotion expression video, extracts vector information by performing embedding on the input text, and creates an image based on the extracted vector information; and a second generative adversarial network that receives an emotion expression image and a frame of a comparison video, and creates a frame of the emotion expression video from the emotion expression image and the frame of the comparison video.

Systems, Methods and Media for Deep Shape Prediction

Exemplary embodiments include a computer-implemented method of training a neural network for facial reconstruction that includes collecting a set of 3D head scans, combining each feature of each 3D head scan with a weight to create a modified set of 3D head scans, training the neural network using the modified set of head scans, and inputting a real digital facial image into the neural network for facial reconstruction. In further exemplary embodiments, the set of 3D head scans is approximately one tenth or less the size of the modified set of 3D head scans. The modified set of 3D head scans may comprise features found in the set of 3D head scans, or may consist of features found in the set of 3D head scans.
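One way to read the weighted-combination augmentation step is as random convex combinations of the collected scans: each synthetic scan mixes the originals with weights that sum to one, so it contains only features found in the original set. This sketch is a plausible interpretation under that assumption, not the patent's actual procedure.

```python
import random

def augment_scans(scans, n_out, seed=0):
    """Create n_out synthetic head scans as random convex combinations
    of the input scans (each scan here is a flat list of feature values,
    e.g. vertex coordinates).
    """
    rng = random.Random(seed)
    out = []
    for _ in range(n_out):
        weights = [rng.random() for _ in scans]
        total = sum(weights)
        weights = [w / total for w in weights]  # normalize: weights sum to 1
        combo = [sum(w * s[i] for w, s in zip(weights, scans))
                 for i in range(len(scans[0]))]
        out.append(combo)
    return out
```

With, say, 100 real scans and `n_out=1000`, the original set would be a tenth the size of the modified set, consistent with the ratio the abstract mentions.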

Extracting Facial Imagery from Online Sessions
20230237837 · 2023-07-27 ·

A system can determine, from a video of an online session, respective bounding boxes of text names of people, wherein the text names are presented in the video, and wherein images of the people are present in the video. The system can determine, from the video, respective faces of the people. The system can associate a first bounding box of the bounding boxes with a first face of the faces based on the first bounding box satisfying a function of distance with respect to the first face among the faces. The system can extract a name from the first bounding box via optical character recognition. The system can extract a subportion of the video that comprises the first face. The system can store an association between the name and the subportion of the video that comprises the first face.
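A simple instance of the "function of distance" used to associate a name's bounding box with a face is nearest-center matching. The sketch below is illustrative; the patent's distance function and box representation are not specified here.

```python
def center(box):
    """Center point of a box given as (x, y, width, height)."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2)

def associate_names_to_faces(name_boxes, face_boxes):
    """Pair each text-name box with the face whose center is closest.

    Returns a list of (name_index, face_index) pairs, one per name box.
    """
    pairs = []
    for i, nb in enumerate(name_boxes):
        nx, ny = center(nb)
        best = min(
            range(len(face_boxes)),
            key=lambda j: (center(face_boxes[j])[0] - nx) ** 2
                        + (center(face_boxes[j])[1] - ny) ** 2,
        )
        pairs.append((i, best))
    return pairs
```

In a video-conference grid, a participant's name label typically sits just below their tile, so the closest face center is usually the correct association.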

Face recognition method, terminal device using the same, and computer readable storage medium

A backlight face recognition method, a terminal device using the same, and a computer readable storage medium are provided. The method includes: performing a face detection on each original face image in an original face image sample set to obtain a face frame corresponding to the original face image; capturing the corresponding original face images from the original face image sample set, and obtaining a new face image containing background pixels corresponding to each captured original face image; preprocessing all the obtained new face images to obtain a backlight sample set and a normal lighting sample set; and training a convolutional neural network using the backlight sample set and the normal lighting sample set until the convolutional neural network reaches a preset stopping condition. The trained convolutional neural network improves the accuracy of face recognition against complex backgrounds and strong light.
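One plausible reading of the preprocessing step is a brightness-based partition of the face crops into the two sample sets. The threshold and grayscale representation below are illustrative assumptions, not values from the patent.

```python
def split_by_lighting(images, threshold=0.35):
    """Partition face crops into backlight vs. normal-lighting sets by
    mean face-region brightness.

    images: list of (image_id, pixels) pairs, pixels as 0..1 grayscale
    values; a dark face region against a bright scene suggests backlight.
    Returns (backlight_ids, normal_ids).
    """
    backlight, normal = [], []
    for image_id, pixels in images:
        mean_brightness = sum(pixels) / len(pixels)
        (backlight if mean_brightness < threshold else normal).append(image_id)
    return backlight, normal
```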

HUMAN ABNORMAL BEHAVIOR RESPONSE METHOD AND MOBILITY AID ROBOT USING THE SAME

Response methods to human abnormal behaviors for a mobility aid robot having a user-facing camera are disclosed. The mobility aid robot responds to human abnormal behaviors by: detecting, through the camera, a face of a human while the robot aids the human to move; comparing an initial size of the face with an immediate size of the face in response to the face of the human having been detected while the robot aids the human to move; determining that the human is in abnormal behavior(s) in response to the immediate size of the face being smaller than the initial size of the face; and performing response(s) corresponding to the abnormal behavior(s) in response to the human being in the abnormal behavior(s), where the response(s) include slowing down the robot.
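The core check reduces to comparing face areas and slowing the robot when the face shrinks (the user may be falling behind or slumping away from the camera). The ratio and slow-down factor below are illustrative tuning values, not figures from the disclosure.

```python
def respond_to_fall_risk(initial_area, immediate_area, speed,
                         shrink_ratio=0.8, slow_factor=0.5):
    """Flag abnormal behavior when the tracked face has shrunk below
    shrink_ratio of its initial area, and slow the robot in response.

    Returns (abnormal, new_speed).
    """
    abnormal = immediate_area < shrink_ratio * initial_area
    new_speed = speed * slow_factor if abnormal else speed
    return abnormal, new_speed
```

A tolerance ratio below 1.0 avoids triggering on ordinary frame-to-frame jitter in the detected face size.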

Systems, Methods, and Platform for Facial Identification within Photographs

In an illustrative embodiment, systems and methods for assisting users in identifying unknown individuals in photographs first apply facial recognition to obtain a first likelihood of match between a target face and other faces in a corpus of images provided by users of a genealogy platform, and then adjust the first likelihood of match according to similarities and dissimilarities in attributes supplied by users regarding individuals represented by each face. Resultant likelihoods drive presentation of potential matches for consideration by a requesting user.
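The two-stage scoring can be sketched as a face-recognition score nudged up or down by agreement on user-supplied attributes. The attribute keys and the additive weights here are hypothetical; the platform's actual adjustment function is not described in the abstract.

```python
def adjusted_match_score(face_score, target_attrs, candidate_attrs,
                         boost=0.05, penalty=0.05):
    """Adjust a facial-recognition likelihood using user-supplied
    attributes (e.g. surname, birth decade).

    Each attribute present on both sides nudges the score up on a match
    and down on a mismatch; the result is clamped to [0, 1].
    """
    score = face_score
    for key, value in target_attrs.items():
        if key in candidate_attrs:
            score += boost if candidate_attrs[key] == value else -penalty
    return max(0.0, min(1.0, score))  # clamp to a valid probability
```

Candidates would then be ranked by the adjusted score before being presented to the requesting user.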

NEURAL NETWORK FOR AUDIO AND VIDEO DUBBING WITH 3D FACIAL MODELLING
20230015971 · 2023-01-19 ·

A computer-implemented method includes obtaining source video data comprising a plurality of image frames, and using a face tracker to detect one or more instances of faces within respective sequences of image frames of the source video data. For a first instance of a given face detected within a first sequence of image frames, the method includes determining a framewise location and size of the first instance of the given face in the first sequence of image frames, using a neural renderer to obtain replacement video data comprising a replacement instance of the given face, and using the determined framewise location and size to replace at least part of the first instance of the given face with at least part of the replacement instance of the given face.
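The final compositing step, using the tracker's framewise location and size to overwrite part of the original face with the rendered replacement, can be sketched as a per-frame paste. The row-of-pixels representation and function names are illustrative; real implementations would also blend edges and handle color spaces.

```python
def paste_replacement(frame, replacement, location):
    """Overwrite the tracked face region of one frame with the
    neurally rendered replacement patch.

    frame and replacement are row-major lists of pixel rows;
    location is the (row, col) of the region's top-left corner,
    as determined framewise by the face tracker.
    """
    top, left = location
    for r, row in enumerate(replacement):
        for c, pixel in enumerate(row):
            frame[top + r][left + c] = pixel
    return frame
```

Repeating this per frame, with the location and size updated from the tracker, keeps the replacement locked to the moving face across the sequence.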

METHOD AND APPARATUS FOR DISTRIBUTING COSMETICS FOR USERS BASED ON MANUFACTURERS AND DISTRIBUTORS WHO USE COMPREHENSIVE COSMETICS PROVIDING SERVICES
20230214904 · 2023-07-06 ·

Provided is a comprehensive cosmetic provision platform server for providing a service that recommends a cosmetic to a user, the comprehensive cosmetic provision platform server including: a manufacturer management unit configured to conclude a contract with a manufacturer server and to obtain information about a manufacturer and information about the cosmetic provided by the manufacturer from the manufacturer server; a distributor management unit configured to conclude a contract with a distributor server and to obtain distributor information from the distributor server; a data management unit configured to be interlocked with the comprehensive cosmetic provision platform server and to obtain a face image of the user from a kiosk installed in a distribution store of the distributor server; a skin characteristic determination unit configured to determine skin characteristics of the user based on the obtained face image; a cosmetic determination unit configured to determine a cosmetic including an ingredient and effect necessary for the user based on the determined skin characteristics; a kiosk control unit configured to display information about the determined cosmetic through the kiosk; a payment processing unit configured to obtain a payment request signal for the cosmetic from the kiosk and to perform payment for the cosmetic according to the obtained payment request signal; and a settlement processing unit configured to settle a sales amount for the cosmetic and to charge a fee to the manufacturer server and the distributor server based on the sales amount.

Intelligent mixing and replacing of persons in group portraits
11551338 · 2023-01-10 ·

The present disclosure is directed toward intelligently mixing and matching faces and/or people to generate an enhanced image that reduces or minimizes artifacts and other defects. For example, the disclosed systems can selectively apply different alignment models to determine a relative alignment between a reference image and a target image having an improved instance of the person. Upon aligning the digital images, the disclosed systems can intelligently identify a replacement region based on a boundary that includes the target instance and the reference instance of the person without intersecting other objects or people in the image. Using the size and shape of the replacement region around the target instance and the reference instance, the systems replace the instance of the person in the reference image with the target instance. The alignment of the images and the intelligent selection of the replacement region minimize inconsistencies and/or artifacts in the final image.
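A simplified stand-in for the replacement step: take a region covering both the reference and target instances of the person (here, just their union bounding box) and copy that region from the aligned target image into the reference image. The patent's region selection is smarter, avoiding intersections with other objects and people; boxes and pixel lists here are illustrative.

```python
def union_box(box_a, box_b):
    """Smallest box covering both instances, boxes given as (x0, y0, x1, y1)."""
    return (min(box_a[0], box_b[0]), min(box_a[1], box_b[1]),
            max(box_a[2], box_b[2]), max(box_a[3], box_b[3]))

def replace_region(reference, target, ref_box, tgt_box):
    """Copy the union region of the two instances from the aligned
    target image into the reference image.

    reference and target are row-major lists of pixel rows of equal size.
    """
    x0, y0, x1, y1 = union_box(ref_box, tgt_box)
    for y in range(y0, y1):
        for x in range(x0, x1):
            reference[y][x] = target[y][x]
    return reference
```

Covering both instances in one region avoids a half-replaced face when the person moved between the two shots; good prior alignment is what keeps the copied region's seams from producing visible artifacts.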